Professional Documents
Culture Documents
Query optimization process of producing a query execution plan which represents an execution strategy for the query. Input: Fragment query
Search Space
Restrict by means of heuristics
Perform unary operations before binary operations Restrict the shape of the join tree linear versus bushy trees Linear tree is a tree such that at least one operand of each operator node is a base relation. Bushy tree both operands are intermediate relations
Search Strategy
The most popular search strategy used by query optimizers is dynamic programming
Deterministic
Start from base relations and build plans by adding one relation at each step Dynamic programming: breadth-first Greedy: depth-first
Randomized
Search for optimalities around a particular starting point Trade optimization time for execution time Better when > 5-6 relations
Total Time =
another
Response_time =
Tcpu * seq_#insts + TI/Os * seq_#I/Os + Tmsg * seq_#msgs + TTR * seq_#bytes Seq.#x x can be instructions(insts), I/O, messages or bytes
Database Statistics
The primary factor affecting the performance of an execution strategy is the size of the intermediate relations that are produced during the execution. For each relation R defined over the attributes A= {A1, A2,.., An} fragmented as R1, , Rr the statistical data length of each attribute: length(Ai) the number of distinct values for each attribute in each fragment: card(Ai (Rj)) maximum and minimum values in the domain of each attribute: min(Ai), max(Ai) the cardinality of each domain: card(dom[Ai]) the cardinality of each fragment: card(Rj)
Selectivity factor of each operation for relations The join selectivity factor denoted SFj of relations R and S is a real value between 0 and 1
INGRES Algorithm
Decompose each multi-variable query into a sequence of mono-variable queries with a common variable Process each by a one variable query processor
sizes
INGRES AlgorithmDecomposition
Replace an n variable query q by a series of queries q1 q2 qn where qi uses the result of qi-1. Detachment Query q decomposed into q' q" where q' and q" have a common variable which is the result of q' Tuple substitution Replace the value of each tuple with actual values and simplify the query q(V1, V2, ... Vn) (q' (t1, V2, V3, ... , Vn), t1 R)
System R Algorithm
Simple (i.e., mono-relation) queries are executed according to the best access path Execute joins
2.1 Determine the possible ordering of joins 2.2 Determine the cost of each ordering 2.3 Choose the join ordering with minimal cost