Task Decomposition
Tasks: programmer-defined units of computation. Tasks can be of arbitrary size. Decomposition: dividing the main computation into multiple tasks. Execution time is reduced by executing multiple tasks in parallel. Ideal decomposition:
All tasks can be executed in parallel with comparable execution times. The computation requires little or no sharing of data among tasks.
Task-Dependency Graph
Some tasks may use data produced by other tasks. Problem: How to express such dependencies? Task-dependency graph: a directed acyclic graph (DAG) in which nodes represent tasks and directed edges represent dependencies among them. Rule: A task can be executed only when all the tasks connected to it by incoming edges have completed. Ideal task-dependency graph for parallel computing: very few or no directed edges. A task-dependency graph is influenced by the decomposition of the computation into tasks and by the organization of those tasks.
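The execution rule above can be sketched in a few lines: a task becomes ready once every task on one of its incoming edges has completed. A minimal sketch with hypothetical task names:

```python
# Task-dependency graph: task -> set of tasks it depends on (incoming edges).
deps = {
    "A": set(),
    "B": set(),
    "C": {"A"},
    "D": {"A", "B"},
    "E": {"C", "D"},
}

done = set()
schedule = []                     # "waves" of tasks that could run in parallel
while len(done) < len(deps):
    # a task is ready when all of its predecessors have completed
    ready = [t for t in sorted(deps) if t not in done and deps[t] <= done]
    if not ready:
        raise ValueError("cycle detected: not a DAG")
    schedule.append(ready)
    done |= set(ready)

print(schedule)   # [['A', 'B'], ['C', 'D'], ['E']]
```

Each inner list is a set of tasks with no unmet dependencies, so they could execute concurrently.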
Granularity
The number and size of tasks determine the granularity of the decomposition. Fine-grained decomposition: large number of small tasks. => suitable when the computation has a lot of parallelism and the underlying architecture can provide low-latency, high-bandwidth communication. Coarse-grained decomposition: small number of large tasks.
Granularity: Example
Coarse-grained decomposition => 4 tasks, each computing n/4 entries of the output vector.
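A minimal sketch of this coarse-grained decomposition, assuming the computation is a dense matrix-vector product y = A·b (the matrix and vector here are made-up test data):

```python
n = 8
A = [[i + j for j in range(n)] for i in range(n)]   # hypothetical n x n matrix
b = list(range(n))

def task(rows):
    # one coarse-grained task: compute y[i] for each row index i in its block
    return [sum(A[i][j] * b[j] for j in range(n)) for i in rows]

# 4 tasks, each responsible for n/4 entries of the output vector
blocks = [range(k * n // 4, (k + 1) * n // 4) for k in range(4)]
y = [entry for block in blocks for entry in task(block)]

# serial reference computation for comparison
assert y == [sum(A[i][j] * b[j] for j in range(n)) for i in range(n)]
```

A fine-grained alternative would make each of the n output entries its own task.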
Concurrency
Degree of concurrency: the number of tasks that can be performed concurrently. Maximum degree of concurrency: the maximum number of tasks that can be performed simultaneously at any given time. Average degree of concurrency: the average number of tasks that can run concurrently over the entire duration of the program. Both are influenced by the decomposition and the corresponding task-dependency graph. Path length: sum of the weights of all nodes in the path, where a node's weight is the amount of work corresponding to that node. Critical path: the longest directed path between any pair of start and finish nodes. Average degree of concurrency = (total work done) / (critical path length)
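The formula can be evaluated directly from a weighted task-dependency graph. A small sketch with hypothetical tasks and work weights:

```python
from functools import lru_cache

# task -> list of tasks it depends on (incoming edges)
deps = {"A": [], "B": [], "C": ["A"], "D": ["A", "B"], "E": ["C", "D"]}
work = {"A": 10, "B": 10, "C": 5, "D": 8, "E": 2}   # node weight = work of task

@lru_cache(maxsize=None)
def path_len(t):
    # longest weighted path ending at task t (includes t's own work)
    return work[t] + max((path_len(p) for p in deps[t]), default=0)

total_work = sum(work.values())
critical_path = max(path_len(t) for t in deps)
avg_concurrency = total_work / critical_path

print(total_work, critical_path, avg_concurrency)   # 35 20 1.75
```

Here the critical path A -> D -> E carries 10 + 8 + 2 = 20 units of work, so the average degree of concurrency is 35/20 = 1.75.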
Concurrency: Example
Total work = 63 Critical path length = 27 Max. Deg. Conc. = 4 Ave. Deg. Conc. = 63/27 = 2.33
Total work = 64 Critical path length = 34 Max. Deg. Conc. = 4 Ave. Deg. Conc. = 64/34 = 1.88
Task Interactions
Task interaction = sharing/exchanging data among the tasks. Task-dependency graphs indicate a specific form of interaction where the output of a task is the input of other tasks. Tasks that are executed in parallel also interact by sharing/exchanging data. Task-interaction graph: represents the pattern of interactions among tasks. A task-dependency graph is a subgraph of the task-interaction graph.
Task-interaction graph
Example: matrix-vector multiplication, where task i computes the entry y[i] = Σ_j A[i, j] b[j].
Processes and Mapping
Process: abstract entity that uses the code and data corresponding to a task to produce output in a finite amount of time. A process may communicate or synchronize with other processes as needed. Mapping = assigning tasks to processes for execution. Good mapping:
maximize the degree of concurrency, minimize the total completion time, and minimize interaction among processes.
Mapping determines how much concurrency is actually utilized and how efficiently. Processes are mapped to physical processors.
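One simple static mapping strategy is a greedy load balancer: give the next-largest task to the least-loaded process. This is only a sketch with hypothetical task costs; a real mapping must also respect dependencies and interaction costs.

```python
import heapq

task_costs = [7, 5, 4, 4, 3, 2]   # estimated work per task (made-up values)
n_procs = 2

# min-heap of (current load, process id, assigned task costs)
loads = [(0, p, []) for p in range(n_procs)]
heapq.heapify(loads)
for cost in sorted(task_costs, reverse=True):
    load, p, tasks = heapq.heappop(loads)       # least-loaded process
    heapq.heappush(loads, (load + cost, p, tasks + [cost]))

for load, p, tasks in sorted(loads, key=lambda x: x[1]):
    print(f"process {p}: tasks {tasks}, load {load}")
```

With these costs the two processes end up with loads 13 and 12, close to the ideal 25/2; the completion time is bounded below by the heaviest process.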
Decomposition Techniques
Decomposition: split the computation into a set of tasks for concurrent execution. => fundamental step in designing parallel algorithms. Decomposition techniques: 1. Recursive decomposition 2. Data decomposition 3. Exploratory decomposition 4. Speculative decomposition
Recursive Decomposition
Suitable for problems solved using the divide-and-conquer strategy. Divide-and-conquer strategy:
Decompose a problem into multiple smaller problems of the same type. Each of these problems is further decomposed recursively until it is simple enough to solve directly. Results of the smaller problems are combined to provide the complete solution. Example: quicksort
Example 1: Quicksort
Task dependency graph based on recursive decomposition for sorting a sequence of 12 numbers. Task = partitioning a given subsequence.
Serial algorithm: no concurrency. Divide-and-conquer algorithm: recursive decomposition used to extract concurrency.
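A sketch of recursive decomposition applied to quicksort: each partitioning step is a task, and the two subproblems it creates are independent and can run concurrently. Spawning a thread per recursive call (up to a small depth) is purely illustrative, not an efficient implementation.

```python
import threading

def parallel_quicksort(a, lo=0, hi=None, depth=2):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # task: partition a[lo..hi] around a pivot (Lomuto scheme)
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    if depth > 0:
        # recursive decomposition: sort the two halves as concurrent tasks;
        # they touch disjoint index ranges, so no data is shared
        left = threading.Thread(
            target=parallel_quicksort, args=(a, lo, i - 1, depth - 1))
        left.start()
        parallel_quicksort(a, i + 1, hi, depth - 1)
        left.join()
    else:
        parallel_quicksort(a, lo, i - 1, 0)
        parallel_quicksort(a, i + 1, hi, 0)

data = [5, 12, 11, 1, 10, 6, 8, 3, 7, 4, 9, 2]
parallel_quicksort(data)
print(data)   # [1, 2, 3, ..., 12]
```

The task-dependency graph is exactly the recursion tree: a partitioning task must finish before the tasks for its two subsequences can start.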
Task-dependency graph for finding the minimum number in the sequence {4, 9, 1, 7, 8, 11, 2, 12}
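The decomposition behind that graph is a pairwise reduction: the sequence is split in half recursively, the minimum of each half is found independently, and the two results are combined. A minimal sequential sketch (the tasks at each level of the recursion tree are the ones that could run in parallel):

```python
def min_tree(xs):
    # base case: a single element is its own minimum
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    # the two recursive calls are independent tasks
    return min(min_tree(xs[:mid]), min_tree(xs[mid:]))

print(min_tree([4, 9, 1, 7, 8, 11, 2, 12]))   # 1
```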
Data Decomposition
Often used for extracting concurrency in algorithms that operate on large data structures. Decomposition is done in two steps:
1. Input or output data are partitioned. 2. Computations associated with the data are partitioned into tasks. The partitioning in step 1 can be based on: input data, intermediate data, output data, or a combination of input and output data.
Final step: combine the intermediate results from the two tasks
Final step: combine the intermediate results from the four tasks
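A sketch of input-data decomposition with a combining step, using a hypothetical reduction (summing an array): the input is partitioned among four tasks, each task produces an intermediate result on its chunk, and a final step combines the intermediate results.

```python
data = list(range(1, 101))          # hypothetical input: 1..100
n_tasks = 4
chunk = len(data) // n_tasks
parts = [data[k * chunk:(k + 1) * chunk] for k in range(n_tasks)]

# each task computes an intermediate (partial) result on its partition
intermediate = [sum(p) for p in parts]

# final step: combine the intermediate results from the four tasks
total = sum(intermediate)
print(intermediate, total)   # [325, 950, 1575, 2200] 5050
```

An output-data decomposition of the same problem is not possible (there is a single output value), which is why reductions typically partition the input and pay for a combining step.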
Exploratory Decomposition
Used for problems in which the underlying computation is a search in a solution space. Idea: Partition the solution space into smaller parts and search each part concurrently until the desired solution is found. Example: solving games - Only one of the tasks needs to find the solution for the whole computation to stop. - Could lead to anomalous speedups.
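A sketch of exploratory decomposition on a toy search problem (the solution space and target value are made up): the space is split four ways, each part is searched by a separate task, and all tasks stop as soon as any one of them succeeds.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

target = 73_421
found = threading.Event()           # shared "solution found" flag

def search(part):
    # one task: scan its part of the solution space
    for candidate in part:
        if found.is_set():
            return None             # another task already succeeded: stop early
        if candidate == target:
            found.set()
            return candidate
    return None

parts = [range(k, 100_000, 4) for k in range(4)]   # 4-way partition
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(search, parts))

print([r for r in results if r is not None])   # [73421]
```

The early-exit behavior is what produces anomalous speedups: if the parallel search happens to probe the part containing the solution first, it can do far less total work than the serial search would.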
Speculative Decomposition
Used for programs involving many possible computational branches. Idea: while one task is performing the computation whose output is used in deciding the branch, other tasks start the computation of the next stage speculatively. Examples: - Concurrent execution of some of the cases of a switch statement in a C program. - Discrete-event simulation.
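A sketch of speculative decomposition with a hypothetical two-way branch: while one task computes the value that selects the branch, two other tasks speculatively start the work for both possible branches; the result of the branch not taken is simply discarded.

```python
from concurrent.futures import ThreadPoolExecutor

def choose_branch():
    # (stand-in for a) slow computation whose output decides the branch
    return sum(range(1000)) % 2      # evaluates to 0

def branch_even():
    return "handled even case"       # speculative work for branch 0

def branch_odd():
    return "handled odd case"        # speculative work for branch 1

with ThreadPoolExecutor(max_workers=3) as pool:
    selector = pool.submit(choose_branch)
    speculative = {0: pool.submit(branch_even), 1: pool.submit(branch_odd)}
    # keep only the result of the branch the selector actually chose
    result = speculative[selector.result()].result()

print(result)   # handled even case
```

The cost of speculation is the wasted work on the untaken branch, so it pays off only when spare processors would otherwise sit idle.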
Hybrid Decomposition
Finding the minimum element in an array of 16 elements using four tasks: data decomposition into four blocks, followed by recursive combination of the partial minima. Pure recursive decomposition => 8 tasks.