mirror of
https://codeberg.org/andyscott/HashMaps.git
synced 2025-01-04 14:35:55 -05:00
README: added method details with time complexity analysis
parent 155571e227
commit 6ae584ec8c
1 changed files with 69 additions and 9 deletions
78
README.md
@@ -1,12 +1,72 @@
# HashMaps

These two hash map implementations feature open addressing with quadratic probing
and separate chaining to handle collisions. The `hm_include` module provides the
underlying data structures and two hash functions.

This hash map library features two methods for collision resolution: separate
chaining, and open addressing with quadratic probing. All methods for both
classes were implemented iteratively to guarantee straightforward time and space
complexity. Further, no built-in Python methods or data structures were used;
this library was written from the ground up to avoid all current and future
hidden surprises.

## Separate Chaining

This implementation leverages a dynamic array of singly linked lists to create
chains of key/value pairs. The time complexities below assume that your hash
function runs in O(1).


| Method            | Time Complexity (worst case) | Description                                                       |
|-------------------|------------------------------|-------------------------------------------------------------------|
| `put`             | O(n)                         | Adds (or updates) a key/value pair to the hash map                |
| `empty_buckets`   | O(n)                         | Gets the number of empty buckets in the hash table                |
| `table_load`      | O(1)                         | Gets the current hash table load factor                           |
| `clear`           | O(n)                         | Clears the contents of the hash map without changing its capacity |
| `resize_table`    | O(n)                         | Changes the capacity of the hash table                            |
| `get`             | O(n)                         | Gets the value associated with the given key                      |
| `contains_key`    | O(n)                         | Checks if a given key is in the hash map                          |
| `remove`          | O(n)                         | Removes a key/value pair from the hash map                        |
| `get_keys`        | O(n)                         | Gets an array that contains all the keys in the hash map          |

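
To make the chaining idea concrete, here is a minimal sketch of `put`, `get`, and `table_load` over an array of singly linked lists. It is not the code from this repository: the class names `ChainNode` and `ChainedMapSketch` are hypothetical, a plain Python list and the built-in `hash()` stand in for the library's own dynamic array and hash functions, and returning `None` for a missing key is an assumption.

```python
class ChainNode:
    """One key/value pair in a bucket's singly linked list."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class ChainedMapSketch:
    """Hypothetical stand-in for a separate chaining map (not the library's class)."""

    def __init__(self, capacity=8):
        self._buckets = [None] * capacity  # head node of each chain
        self._size = 0

    def _index(self, key):
        # Assumption: built-in hash(); the library ships its own hash functions.
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        index = self._index(key)
        node = self._buckets[index]
        while node is not None:  # walk the chain looking for the key
            if node.key == key:
                node.value = value  # key already present: update in place
                return
            node = node.next
        # Key not found: push a new node onto the front of the chain.
        self._buckets[index] = ChainNode(key, value, self._buckets[index])
        self._size += 1

    def get(self, key):
        node = self._buckets[self._index(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return None  # assumption: a missing key yields None

    def table_load(self):
        return self._size / len(self._buckets)
```

The worst case of O(n) for `put` and `get` shows up when every key lands in the same bucket, so the loop has to walk one chain containing all n entries.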

This implementation also includes a standalone function, `find_mode`, which
returns a tuple containing an array of the mode (the elements with the highest
number of occurrences) and the frequency (the number of times the mode
appears).

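
The shape of that return value can be illustrated with a short sketch. The name `find_mode_sketch` and its list input are assumptions, and a built-in `dict` is used for brevity where the library would use its own hash map.

```python
def find_mode_sketch(values):
    """Return (modes, frequency) for a list of hashable values.

    Assumption: the input is a plain Python list; the real `find_mode`
    may take a different container.
    """
    counts = {}
    for value in values:  # one pass to count occurrences
        counts[value] = counts.get(value, 0) + 1

    modes = []
    frequency = 0
    for value, count in counts.items():  # one pass to pick the winners
        if count > frequency:
            modes = [value]  # new highest count: restart the list
            frequency = count
        elif count == frequency:
            modes.append(value)  # tie: several values share the mode
    return modes, frequency


print(find_mode_sketch(["a", "b", "a", "c", "b"]))  # (['a', 'b'], 2)
```

Both passes are linear, so the whole function runs in O(n) as long as the counting map offers average O(1) updates.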
## Open Addressing

This hash map uses a dynamic array to create a series of individual
buckets. Each bucket contains a key/value pair as well as a flag to indicate
whether the entry has been deleted. This flag is also commonly known as a
*tombstone*. The open addressing implementation also resizes the table
automatically to ensure efficient insertion of new elements as the size
increases. For the purpose of calculating time complexity, this implementation
also assumes that your hash function runs in constant time.


| Method            | Time Complexity (worst case) | Description                                                       |
|-------------------|------------------------------|-------------------------------------------------------------------|
| `put`             | O(n)                         | Adds (or updates) a key/value pair to the hash map                |
| `empty_buckets`   | O(n)                         | Gets the number of empty buckets in the hash table                |
| `table_load`      | O(1)                         | Gets the current hash table load factor                           |
| `clear`           | O(n)                         | Clears the contents of the hash map without changing its capacity |
| `resize_table`    | O(n)                         | Changes the capacity of the hash table                            |
| `get`             | O(n)                         | Gets the value associated with the given key                      |
| `contains_key`    | O(n)                         | Checks if a given key is in the hash map                          |
| `remove`          | O(n)                         | Removes a key/value pair from the hash map                        |
| `get_keys`        | O(n)                         | Gets an array that contains all the keys in the hash map          |

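
The interplay of quadratic probing, tombstones, and automatic resizing can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: `Entry`, `OpenAddressingSketch`, and `_next_prime` are hypothetical names, the 0.5 load factor threshold and the prime capacities are choices made here so that a probe sequence is guaranteed to reach a usable bucket, and a plain list and built-in `hash()` replace the library's own structures.

```python
def _next_prime(n):
    """Smallest prime >= n (trial division; good enough for a sketch)."""
    while True:
        if n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1)):
            return n
        n += 1


class Entry:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.is_tombstone = False  # set to True when the entry is removed


class OpenAddressingSketch:
    def __init__(self, capacity=11):
        self._buckets = [None] * _next_prime(capacity)
        self._size = 0

    def _probe(self, key):
        """Yield indices start, start + 1, start + 4, start + 9, ... (mod capacity)."""
        capacity = len(self._buckets)
        start = hash(key) % capacity
        for j in range(capacity):
            yield (start + j * j) % capacity

    def put(self, key, value):
        # Assumed policy: grow before the load factor would exceed 0.5.
        if (self._size + 1) / len(self._buckets) > 0.5:
            self._resize(_next_prime(2 * len(self._buckets)))
        reusable = None  # first tombstone or empty bucket seen on the path
        for index in self._probe(key):
            entry = self._buckets[index]
            if entry is None:
                if reusable is None:
                    reusable = index
                break  # a never-used bucket ends the search
            if entry.is_tombstone:
                if reusable is None:
                    reusable = index
            elif entry.key == key:
                entry.value = value  # key already present: update
                return
        self._buckets[reusable] = Entry(key, value)
        self._size += 1

    def get(self, key):
        for index in self._probe(key):
            entry = self._buckets[index]
            if entry is None:
                return None
            if not entry.is_tombstone and entry.key == key:
                return entry.value
        return None

    def remove(self, key):
        for index in self._probe(key):
            entry = self._buckets[index]
            if entry is None:
                return
            if not entry.is_tombstone and entry.key == key:
                entry.is_tombstone = True  # keep the bucket so later probes continue
                self._size -= 1
                return

    def _resize(self, new_capacity):
        old_buckets = self._buckets
        self._buckets = [None] * new_capacity
        self._size = 0
        for entry in old_buckets:  # rehash live entries, drop tombstones
            if entry is not None and not entry.is_tombstone:
                self.put(entry.key, entry.value)
```

The tombstone matters in `get` and `remove`: a removed entry must not stop the probe sequence, otherwise keys that were inserted after a collision with it would become unreachable.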
## Notes on Time Complexity

While I have provided theoretical "worst case" time complexities in the tables
above, the actual time complexity is highly dependent on a hash map's *load
factor*. In short, we can consider the load factor to be *λ = n/m*, where *n* is
the number of elements and *m* is the number of available spaces. In the case of
open addressing, the average expected time for adding an element is *1/(1-λ)*,
where λ is the load factor. Thus, as long as λ stays safely below 1, we should
expect that the average and amortized time complexity for the `put` operation
will actually be O(1). On the other hand, we can consider the expected time for
separate chaining to be *λ + 1*, where the 1 represents the hashing operation.
Once again, time complexity is dependent on the load factor, and we should
expect that the average and amortized time complexity will be O(1). However, it
should be noted that the separate chaining hash table provided here will not
automatically resize itself. I have left it to the user to decide when it is
appropriate for their own program to resize.

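
To get a feel for how those two formulas behave, the short sketch below evaluates them for a few load factors. The function names are hypothetical and exist only for this illustration.

```python
def expected_probes_open_addressing(load_factor):
    """1 / (1 - λ): expected cost of an insert, valid for 0 <= λ < 1."""
    if not 0 <= load_factor < 1:
        raise ValueError("load factor must be in [0, 1)")
    return 1 / (1 - load_factor)


def expected_cost_separate_chaining(load_factor):
    """1 + λ: one hashing step plus an average chain of length λ."""
    return 1 + load_factor


for lam in (0.25, 0.5, 0.75, 0.9):
    print(
        lam,
        expected_probes_open_addressing(lam),
        expected_cost_separate_chaining(lam),
    )
```

At λ = 0.5 open addressing expects 2 steps and at λ = 0.75 it expects 4, and the cost grows without bound as λ approaches 1, which is why automatic resizing matters there. Separate chaining degrades only linearly, but its load factor can exceed 1 if the table is never resized.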

Both implementations use the included `DynamicArray` class for the underlying
hash table; however, `hash_map_sc.py` uses a singly linked list for each bucket
while `hash_map_oa.py` uses a `HashEntry` object. Additionally, `hash_map_sc.py`
includes a separate function, `find_mode()`, that finds the value that occurs
most frequently in the hash map and how many times it occurs, with an O(n) time
complexity. Finally, both implementations include some basic testing when run as
a script.