Interactive Visualization Features
This visualization tool helps you understand extendible hashing through interactive demonstrations:
- Real-time bucket splitting animations
- Directory doubling visualization
- MSB-based hash addressing
- Collision detection and prevention
- Configurable parameters (hash modulo, bucket capacity, depth)
- Binary representation of hash values
About Extendible Hashing Algorithm
What is Extendible Hashing?
Extendible hashing is a dynamic hashing technique used in database management systems (DBMS) to efficiently handle
large datasets with minimal reorganization. Unlike static hashing, extendible hashing allows the hash table to grow
and shrink dynamically as data is inserted or deleted, making it ideal for applications where the data size is unpredictable.
How Does Extendible Hashing Work?
The algorithm uses a directory (or index) that points to buckets where actual data is stored. Each bucket has a
local depth, and the directory has a global depth. The hash function converts keys into binary addresses, and the
most significant bits (MSB) of these addresses determine which bucket a key belongs to.
When a bucket becomes full (overflow), it splits into two buckets, redistributing entries based on an additional
bit from the hash value. If necessary, the directory doubles in size to accommodate the split. This approach
minimizes the need for complete reorganization when data grows.
Key Components
- Directory:
These containers store pointers to buckets. Each directory is given a unique id which may change each time when expansion takes place. The hash function returns this directory id which is used to navigate to the appropriate bucket. Number of Directories = 2^Global Depth.
- Buckets: They store the hashed keys. Directories point to buckets. A bucket may contain more than one pointers to it if its local depth is less than the global depth.
- Global Depth: It is associated with the Directories. They denote the number of bits which are used by the hash function to categorize the keys. Global Depth = Number of bits in directory id.
- Local Depth: It is the same as that of Global Depth except for the fact that Local Depth is associated with the buckets and not the directories. Local depth in accordance with the global depth is used to decide the action that to be performed in case an overflow occurs. Local Depth is always less than or equal to the Global Depth.
- Bucket Splitting: When the number of elements in a bucket exceeds a particular size, then the bucket is split into two parts.
- Directory Expansion: Directory Expansion Takes place when a bucket overflows. Directory Expansion is performed when the local depth of the overflowing bucket is equal to the global depth.
Advantages of Extendible Hashing
- Data retrieval is less expensive (in terms of computing).
- No problem of Data-loss since the storage capacity increases dynamically.
- With dynamic changes in hashing function, associated old values are rehashed with respect to the new hash function.
Use Cases in Database Systems
Extendible hashing is widely used in modern database management systems for indexing and data retrieval.
It's particularly effective for:
- B-tree alternatives in database indexes
- File systems and directory structures
- Distributed hash tables (DHT)
- Cache implementations
- Dynamic symbol tables in compilers