QUERY PARALLELISM - Parallel Database

 

QUERY PARALLELISM - Introduction

Parallelism is used to provide speed-up and scale-up: queries execute faster, and a growing workload can be handled without increasing response time, simply by increasing the degree of parallelism.
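The two terms can be made concrete with the usual textbook ratios. The sketch below is only an illustration: the function names and the timings are hypothetical, not measurements from any real system. Speed-up compares the same workload on a small and a larger system; scale-up compares a small workload on a small system with a proportionally larger workload on a larger system.

```python
def speedup(time_small_system: float, time_large_system: float) -> float:
    """Same workload, more resources: ratio of elapsed times.

    Linear speed-up means N times the resources gives a ratio of N.
    """
    return time_small_system / time_large_system


def scaleup(time_small_problem_small_system: float,
            time_large_problem_large_system: float) -> float:
    """Workload and resources grow together: ratio of elapsed times.

    Linear scale-up means the ratio stays at 1.0, i.e. response time
    does not grow as the workload grows.
    """
    return time_small_problem_small_system / time_large_problem_large_system


# Hypothetical elapsed times in seconds (assumed values for illustration).
print(speedup(120.0, 30.0))   # 4x the resources finishing in a quarter of the time
print(scaleup(120.0, 125.0))  # slightly below 1.0: nearly linear scale-up
```

A speed-up equal to the increase in resources, or a scale-up close to 1.0, is the ideal case; real systems typically fall short because of startup cost, interference, and skew.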

Parallelism here means that data can be partitioned across multiple disks for parallel I/O, and that individual relational operations (e.g., sort, join, and aggregation) can be executed in parallel, because each processor works independently on its own partition.
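As a rough sketch of this idea, the Python example below partitions rows round-robin, computes a partial SUM and COUNT per group on each partition, and then merges the partial results to obtain an average. All names (`partition_round_robin`, `partial_aggregate`, `parallel_avg`) and the sample rows are hypothetical. Threads are used only to show the structure; a real parallel database would run each partition on its own processor and disk.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_round_robin(rows, n_partitions):
    """Distribute rows over n_partitions in round-robin order."""
    partitions = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        partitions[i % n_partitions].append(row)
    return partitions


def partial_aggregate(partition):
    """Compute a local (sum, count) per group over one partition."""
    acc = {}
    for dept, salary in partition:
        total, count = acc.get(dept, (0, 0))
        acc[dept] = (total + salary, count + 1)
    return acc


def merge(partials):
    """Combine the per-partition (sum, count) pairs into global ones."""
    merged = {}
    for acc in partials:
        for dept, (total, count) in acc.items():
            t, c = merged.get(dept, (0, 0))
            merged[dept] = (t + total, c + count)
    return merged


def parallel_avg(rows, degree):
    """AVG(salary) GROUP BY dept, with `degree` independent workers."""
    partitions = partition_round_robin(rows, degree)
    with ThreadPoolExecutor(max_workers=degree) as pool:
        partials = list(pool.map(partial_aggregate, partitions))
    return {dept: total / count
            for dept, (total, count) in merge(partials).items()}


# Hypothetical (dept, salary) rows.
rows = [("eng", 100), ("ops", 80), ("eng", 120), ("ops", 60), ("eng", 110)]
print(parallel_avg(rows, degree=3))
```

The average is computed from merged sums and counts rather than by averaging per-partition averages, since partitions may hold different numbers of rows per group.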

Queries are expressed in a high-level language (SQL, translated to relational algebra), which makes parallelization easier. Different queries can also run in parallel with each other, with concurrency control taking care of conflicts. Thus, databases naturally lend themselves to parallelism.

If query predicates enable the optimizer to eliminate some fragments, it is possible that only a small number of fragments will be scanned in parallel, leaving the CPU idle while I/O completes. Increasing the number of fragments, and hence the number of active scan threads, can improve this situation. Striping the fragments across multiple disks is another way to ensure that the available disk bandwidth is used effectively.
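The effect of fragment elimination on parallelism can be illustrated with a small model. The sketch below assumes a table range-partitioned on an integer key into equal-width fragments, and one scan thread per surviving fragment up to a fixed number of workers. The scheme, the function names, and the numbers are assumptions for illustration, not the behavior of any particular database.

```python
def make_fragments(lo, hi, n_fragments):
    """Range-partition the key space [lo, hi) into equal-width fragments.

    Assumes (hi - lo) is divisible by n_fragments.
    """
    width = (hi - lo) // n_fragments
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n_fragments)]


def surviving_fragments(fragments, pred_lo, pred_hi):
    """Keep fragments whose [start, end) overlaps the predicate [pred_lo, pred_hi)."""
    return [(s, e) for s, e in fragments if s < pred_hi and e > pred_lo]


def active_scan_threads(n_fragments, n_workers, pred_lo, pred_hi, lo=0, hi=1200):
    """Number of busy scan threads, assuming one thread per surviving fragment."""
    fragments = make_fragments(lo, hi, n_fragments)
    survivors = surviving_fragments(fragments, pred_lo, pred_hi)
    return min(n_workers, len(survivors))


# Predicate: 300 <= key < 600, with 8 available workers.
for n in (4, 12, 48):
    print(n, active_scan_threads(n, n_workers=8, pred_lo=300, pred_hi=600))
# 4 fragments  -> 1 active thread
# 12 fragments -> 3 active threads
# 48 fragments -> 8 active threads (all workers busy)
```

With few, coarse fragments the predicate leaves a single fragment to scan, so seven of the eight workers sit idle. Finer fragmentation leaves more surviving fragments for the same predicate and keeps more threads busy.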

QUERY PARALLELISM