Ticker

6/recent/ticker-posts

Hash Partitioning Techniques - Parallel Database

 

Hash Partitioning Techniques

Hash Partitioning Techniques: With hash partitioning, Adaptive Server uses a hash function to specify the partition assignment for each row. You select the partitioning key columns, but Adaptive Server chooses the hash function that controls the partition assignment. 

Hash partitioning is a good choice for :

i) Large tables with many partitions particularly in decision-support environments

ii) Efficient equality searches on hash key columns 

iii) Data with no particular order, for example, alphanumeric product code keys. 

If you choose an appropriate partition key, hash partitioning distributes data evenly across all partitions. However, if you choose an inappropriate key 

Advantages Hash Partitioning 

i) Hash partitioning has the advantage of providing for even distribution of data across the available disk, helping to prevent skewing. 

ii) Skew can slow the performance caused by one or more CPUs and disks getting more work than others. Has partitioning is best suited for point queries (involving exact matches) based on the partitioning attribute. For example, if a relation is partitioned on the employee identification numbers (EMP-ID), then we can answer the query "Find the record of the employee with employee identification number = 4134" using SQL statement as follows: 
SELECT* FROM EMPLOYEE WHERE EMP-ID = 4134; 

iii) Hash partitioning is also useful for sequential scans of the entire relation (table) placed on n number of disks. The time taken to scan the relation is approximately 1/n of the time required to scan the relation in a single disk system.
 
Disadvantages Hash Partitioning 

i) Hash partitioning technique is not well suited for point queries on non-partitioning attributes. 

ii) It is also not well suited for answering range queries, since, typically hash functions do not preserve proximity within a range. For example, hash partitioning will not perform for queries involving range searches such as:
SELECT* FROM EMPLOYEE WHERE EMP-ID > 4134 and EMP-ID < 4156; 

iii) In such a case, the search (scanning) would have to involve most (or all) disks over which the relation has been partitioned.