Thinking Parallel

A Blog on Parallel Programming and Concurrency by Michael Suess

How-to Split a Problem into Tasks

The very first step in every successful parallelization effort is always the same: you take a look at the problem that needs to be solved and start splitting it into tasks that can be computed in parallel. This sounds easy, but I can see from my students' reactions that at least for some of them it is not. This article shows five ways to do just that. It is the short, blog-sized version; the long version can be found in the book Introduction to Parallel Computing by Grama and others.

Let’s start with a short paragraph on terminology: what I am describing here is also called problem decomposition. The goal is to divide the problem into several smaller subproblems, called tasks, that can be computed in parallel later on. The tasks can be of different sizes and need not necessarily be independent (although, of course, independence would make your work easier later on :smile: ).

If many tasks are created, we speak of a fine-grained decomposition; if few tasks are created, it is called a coarse-grained decomposition. Which one is better depends heavily on the problem: many tasks allow for more concurrency and better scalability, while fewer tasks usually need less communication and synchronization.

There are several ways to do problem decomposition, the most well-known probably being recursive decomposition, data decomposition, functional decomposition, exploratory decomposition, and speculative decomposition. The next few paragraphs briefly explain how each is carried out in practice.

Recursive Decomposition

Divide-and-Conquer is a widely deployed strategy for writing sequential algorithms. In it, a problem is divided into subproblems, which are again divided into subproblems recursively until a trivial solution can be calculated. Afterwards, the results of the subproblems are merged together when needed. This strategy can be applied as is to achieve a recursive problem decomposition. As the smaller tasks are usually independent of one another, they can be calculated in parallel, often leading to well-scaling parallel algorithms. Parallel sorting algorithms often use recursive decompositions.
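To make this concrete, here is a minimal sketch of a recursively decomposed merge sort using Python's `concurrent.futures` module. The function names (`merge`, `parallel_mergesort`) and the depth cutoff are my own illustration, not from the original article; note also that CPython threads will not speed up pure-Python CPU work because of the GIL, so this sketch only illustrates the structure of the decomposition:

```python
from concurrent.futures import ThreadPoolExecutor

def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def parallel_mergesort(data, pool, depth=2):
    # Below the cutoff depth (or for trivial input), solve sequentially.
    if depth == 0 or len(data) < 2:
        return sorted(data)
    mid = len(data) // 2
    # The two halves are independent subproblems: sort one half as a
    # concurrent task while the current thread sorts the other half.
    left_future = pool.submit(parallel_mergesort, data[:mid], pool, depth - 1)
    right = parallel_mergesort(data[mid:], pool, depth - 1)
    # Merge the subresults, exactly as in the sequential algorithm.
    return merge(left_future.result(), right)

with ThreadPoolExecutor(max_workers=4) as pool:
    result = parallel_mergesort([5, 3, 1, 4, 2], pool)
```

Computing one half in the calling thread (rather than submitting both halves) keeps the number of blocked pool threads bounded, which avoids deadlocking a small, fixed-size pool.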

Data Decomposition

When data structures with large amounts of similar data need to be processed, data decomposition is usually a well-performing decomposition technique. The tasks in this strategy consist of groups of data. These can be input data, output data, or even intermediate data; decompositions based on any of these varieties are possible and may be useful. All processors perform the same operations on their data, which are often independent of one another. This is my favorite decomposition technique, because it is usually easy to apply, often has no dependencies between tasks, and scales really well.

Functional Decomposition

For functional decomposition, the functions to be performed on the data are split into multiple tasks. These tasks can then be performed concurrently by different processes on different data. This often leads to so-called pipelines. Although this decomposition technique is usually easy to apply as well, it often does not scale too well, because each program has only a limited number of functions to split up.
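The pipeline idea can be sketched with one thread per function, connected by queues. This is an illustrative toy (the names `stage` and `run_pipeline` and the `None` sentinel convention are my own), assuming Python 3.8+:

```python
import queue
import threading

def stage(fn, inbox, outbox):
    """One pipeline stage: apply fn to each item until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:       # sentinel: propagate it and stop this stage
            outbox.put(None)
            return
        outbox.put(fn(item))

def run_pipeline(items, functions):
    # Functional decomposition: one task (here, a thread) per function.
    queues = [queue.Queue() for _ in range(len(functions) + 1)]
    threads = [threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
               for i, fn in enumerate(functions)]
    for t in threads:
        t.start()
    for item in items:         # feed the first stage
        queues[0].put(item)
    queues[0].put(None)
    out = []
    while (item := queues[-1].get()) is not None:
        out.append(item)
    for t in threads:
        t.join()
    return out
```

Note that the number of stages, and thus the maximum concurrency, is fixed by the number of functions, which is exactly the scalability limit mentioned above.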

Exploratory Decomposition

Exploratory decomposition is a special case for algorithms that search through a predefined space for solutions. In this case, the search space can often be partitioned into tasks, which can be processed concurrently. Exploratory decomposition is a special-case technique that is not generally applicable; an example where it works is breadth-first search in trees.
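As a minimal sketch of the idea, the search space below is a plain list partitioned into slices that are explored concurrently (the names `search_part` and `exploratory_search` are my own, and a real solver would search a tree or graph rather than a list):

```python
from concurrent.futures import ThreadPoolExecutor

def search_part(part, predicate):
    """Explore one partition of the search space; return the matches in it."""
    return [x for x in part if predicate(x)]

def exploratory_search(space, predicate, n_tasks=4):
    # Partition the search space and explore the partitions concurrently.
    size = (len(space) + n_tasks - 1) // n_tasks
    parts = [space[i:i + size] for i in range(0, len(space), size)]
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        found = pool.map(lambda p: search_part(p, predicate), parts)
    return [x for part in found for x in part]
```

If only the first solution is needed, a real implementation would also cancel the remaining tasks once one partition reports a hit, instead of exhausting the whole space as this sketch does.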

Speculative Decomposition

Another special-purpose decomposition technique is called speculative decomposition. When only one of several functions is carried out depending on a condition (think: a switch statement), these functions are turned into tasks and carried out before the condition is even evaluated. As soon as the condition has been evaluated, only the results of one task are used; all others are thrown away. This decomposition technique is quite wasteful of resources and seldom used (personally, I have never used it).
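A speculative switch can be sketched like this (my own illustration; `speculative_switch` and the dictionary-of-branches shape are assumptions, not from the article):

```python
from concurrent.futures import ThreadPoolExecutor

def speculative_switch(branches, condition):
    """Start every branch before the condition is known; keep one result."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        # All candidate branches run speculatively, in parallel,
        # before we know which one will actually be needed.
        futures = {key: pool.submit(fn) for key, fn in branches.items()}
        chosen = condition()   # meanwhile, evaluate the condition
        # Only the selected branch's result is used; the rest is wasted work.
        return futures[chosen].result()
```

The waste is visible in the code: every branch runs to completion (the `with` block waits for all of them), and all results but one are discarded.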

The different decomposition methods described above can of course also be combined into hybrid decompositions. I think these are the most common techniques. Do you happen to know any other important ones? Then please don’t hesitate to share them in the comments!

11 Responses to How-to Split a Problem into Tasks »»


Comments

  1. Comment by Petrov Alexander | 2007/09/06 at 17:32:59

    Good things at the same time, in one place. Thank you.

  2. Comment by Minesh B. Amin | 2007/09/06 at 21:37:42

    Have one interesting case: a collection of fine grain tasks that:
    +) need to be executed in parallel (obvious goal)
    +) but Quality of Result depends on each compute resource
    having access to the __complete__ list of the aforementioned
    fine grain tasks (the trick is to do this in a scalable manner)

    The reason: a task, when done, can be leveraged to prune a list of
    pending tasks, but this information is only available in real-time; i.e.
    one cannot precompute this, rather critical, piece of information
    ahead of time.

    Now, why is this important? Simply because this is the key when implementing
    scalable SAT solvers; in fact, this is the only way to do it :)

  3. Comment by Dmitriy Setrakyan | 2007/09/13 at 20:46:24

    Hi.

    Great outline of various decomposition techniques. I actually never separated Recursive Decomposition and Data Decomposition in my mind, but you are right – they are different, although often used together.

    A good special case to mention is Recursive Decomposition with depth of one (1). This one is most commonly used in my opinion.

    Best,
    Dmitriy Setrakyan
    GridGain – Grid Computing Made Simple

  4. Comment by Steve Reinhardt | 2007/10/14 at 15:39:41

    We’ve been working on parallel constructs for very high-level languages (Python, MATLAB(R), and R) that are suitable for programmers who are not expert in parallelism and who do not want to invest the time to become expert in parallelism. We now have a fair amount of evidence that data decomposition as a concept works for many programmers. We’ve implemented both data-parallel and task-parallel flavors, with each having its strengths and weaknesses. (See discussion of these at http://isc.hubspot.com/starp_blog/tabid/8706/bid/2464/Types-of-Parallelism-Task-Parallelism.aspx.) Extending the task-parallel approach to include communication looks like it will require some careful thought about what the right abstractions are, to keep the conceptual complexity within reason.

    Functional decomposition looks valuable, though as you note, its scalability is often limited. I’ve run into an interesting twist on functional decomposition (perhaps with a note of speculative decomposition in it as well), where people have different algorithms (one example is from image processing) and they don’t know in advance which algorithm will work best on the data in question. So, they just run all the candidate algorithms in parallel and choose the best answer.

    As for recursive decomposition, I question whether this is something that typical programmers will readily accept. I know for us computer science types recursion is a powerful tool, but for most of the rest of the planet it doesn’t seem intuitive. Perhaps the right abstractions could help overcome this.

  5. Comment by Carl Turner | 2007/12/28 at 16:40:17

    I linked to your description of problem decomposition in my own blog. My blog entry describes the results of failing to properly decompose a problem, a design anti-pattern I called Exploding Whale.

  6. Comment by Aamod | 2011/09/28 at 06:16:28

    Wow… so nicely explained… Gotcha… Thanks a lot….. :smile:

  7. Comment by woodylcy | 2011/11/29 at 10:35:56

    Now I am studying how to decompose applications into task on a realtime OS, OSEK. One of my friend told me, it needs experience.
    But Now I have no experience in embedded developing on a system. So I am so eagering to know how to do this. I am so happy to found your article here. I hope you can give me more materials on this topic and give me a hand.
    thanks

  8. Comment by mteffaha | 2012/10/26 at 23:16:08

    very nice explanation , even though it’s a little bit too late i would like to mention that there is another kind of decomposition i have seen in school recently wich is pipeline decomposition where we have multiple tasks each handling a data entry and passing it to the next task.


Trackbacks & Pingbacks »»

  1. Pingback by Decomposition Techniques | iface thoughts | 2007/09/10 at 15:35:56

[...] Suess gives an introduction to various decomposition techniques to split the problem into sub-problems. While Michael starts with reference to parallel [...]

  2. Pingback by problem decomposition « eORmap | 2007/09/15 at 12:09:15

[...] and speculative decomposition. If you run into difficulties, read How-to Split a Problem into Tasks. The article is especially interesting for people with experience [...]

  3. [...] make things simpler, let me quote the excellent blog post How-to Split a Problem into Tasks. The very first step in every successful parallelization effort is always the same: you take a [...]
