Thinking Parallel

A Blog on Parallel Programming and Concurrency by Michael Suess

Christopher Smith has tried very hard to come up with a case where a recursive lock is a good thing (TM). And while the article is very well thought out and presented, apparently David Butenhof (the author of my favorite book on POSIX Threads) disagrees. Actually, it was the other way around, as Butenhof’s article is way older. But anyway, as I have already voiced my own opinion on the subject in this article (executive summary of my opinion on recursive locks: I have not seen them used in practice a whole lot, and they are usually a sign that something is wrong with your synchronisation patterns anyway), I wanted to comment on some of the points Christopher makes. (more…)

Joe from scalability.org points to an interesting article. Apparently, ATI is moving into the field of stream processing. In this post, I will tell you a little about what stream processing is, what your graphics processor has to do with it and also what problems I see with it. But let me start by having a little fun with their announcement:

ATI has invited reporters to a Sept. 29 event in San Francisco at which it will reveal “a new class of processing known as Stream Computing.”

Wow. Looks like they invented a whole new paradigm. Or maybe not? I guess some smart guys at Stanford have been into stream processing for almost ten years now. And I bet they weren’t the first ones either. (more…)

Time to write about some of the things I am doing at work again. As you may remember from my last article related to work, I am presently toying around with OpenMP and C++, trying to implement commonly used patterns there. What I want to write about today is not exactly a pattern, but rather a language feature OpenMP does not have – the ability to specify a function that is supposed to be carried out exactly once. (more…)
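To make the idea of a "run exactly once" function concrete, here is a minimal sketch. It deliberately uses the C++11 standard library facility `std::call_once` rather than OpenMP, so it is an analogy and not the approach from the article. The names `initialize`, `init_calls` and `run_once_demo` are made up for illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical names for illustration; none of them come from the article.
static std::once_flag init_flag;
static int init_calls = 0;

// The function that must be carried out exactly once,
// no matter how many threads request it.
void initialize() { ++init_calls; }

// Starts num_threads threads that all request the initialization and
// returns how often initialize() has actually been executed so far.
int run_once_demo(int num_threads) {
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([] { std::call_once(init_flag, initialize); });
    }
    for (auto& w : workers) w.join();
    return init_calls;  // safe to read: all workers have been joined
}
```

However many threads reach `std::call_once` with the same flag, only one of them executes `initialize()`, and the others wait until it has finished.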

I am writing this quick note because today I wasted at least an hour looking for a bug in my code when it was really a bug in the compiler. And if there is one thing I like even less than bugs, it is wasted time :x. Therefore, I will repeat my simple advice to myself here, to make sure I remember it next time: Sometimes, it IS the compiler’s fault!

Huh, that felt good :). Perhaps a little more explanation about my background and why this is not self-evident for me is in order: When I started programming (yes, I still remember that time 8)) and something did not work out as expected, I was very quick to blame the compiler (the joys of youth :D). I see my students doing this today as well. There is one thing to keep in mind, though: I started programming in C (actually, I did Pascal before that, but that’s not the point) in about 1994. By that time, the language and its compilers were quite old and well-tested already. Therefore, most of the time, the compiler was right, and the real error-source was sitting in front of the keyboard. I do not remember finding a single real bug in a compiler at that time, especially since I kept away from the more advanced language features, because after all I was a beginner back then. At that time, I learned this simple lesson: the compiler is always right. (more…)

Do you know the difference between a mutex and a spinlock? You know exactly what kind of locks are employed in e.g. OpenMP? You find hierarchical locks boring and recursive spinlocks only make you yawn? Then you will most likely NOT enjoy reading this article, and I encourage you to stop reading now, as it is way too basic for you – don’t worry, though, as I have a way more advanced article in the pipeline :-). Seriously, stop!

Now wouldn’t that be an interesting statistic as to how many of my readers just stopped reading at this point, because they already know everything I have to offer in this article :-). Unfortunately, I will have to cope without this knowledge for now. Anyways, since you are still reading I will assume that I can tell you something interesting today (or so I hope) – and if not, you would CFLink, right? (Now, that’s an innovation! A call to action right at the beginning of my post! Will see how this turns out 😉 ) (more…)
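For readers who did keep reading, the difference between a spinlock and a mutex can be sketched in a few lines of standard C++. This is only an illustration under simple assumptions: `SpinLock` and `count_with` are hypothetical names, and the code is not taken from the article. A spinlock keeps polling until the lock is free, whereas a `std::mutex` typically lets the operating system put the waiting thread to sleep.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Minimal spinlock (hypothetical helper): a waiting thread burns CPU time
// polling the flag instead of going to sleep.
class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock() {
        while (flag.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // be polite while spinning
        }
    }
    void unlock() { flag.clear(std::memory_order_release); }
};

// Increments a shared counter from several threads, protected by any lock
// type that offers lock() and unlock(). Returns the final counter value.
template <typename Lock>
long count_with(Lock& lock, int num_threads, int increments) {
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                std::lock_guard<Lock> guard(lock);  // works for both lock types
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Both lock types give the same result here. Which one is faster depends on how long the lock is held and how many threads compete for it, so it is worth measuring rather than guessing.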

There have been various articles on the merits of different programming languages, both parallel and sequential, during the last week (notice how elegantly I have avoided the term language wars 😉 ):

I don’t feel like commenting on all of these articles and getting involved in a language war, so I will leave them uncommented for now. Maybe at some later time…

Chris over at HPCAnswers has written a nice introductory article about locality optimization. He uses matrix multiplication as an example. The rest of this article assumes you have read his post and therefore know at least some basics about loop reordering and locality optimization.

I remember very clearly when my supervisor first introduced the concept of locality to us in her parallel programming class a couple of years ago: I was stunned. I could not believe that such a simple change made such a huge difference in performance (I was younger then 😉 ). And above all: I could not believe that the compiler was not smart enough to reorder that loop by itself. After all, there are no complicated dependencies or function calls in there! But I did some tests, and the performance difference between the loop versions was there and clearly visible. (more…)
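As a rough sketch of the kind of loop reordering meant here, the two functions below multiply square matrices stored in row-major order. The exact loop orders used in Chris's post and in the tests mentioned above are not reproduced here, so treat the `ijk` and `ikj` variants and all names as assumptions for illustration.

```cpp
#include <cstddef>
#include <vector>

// n x n matrix stored row-major in a flat vector (illustrative choice).
using Matrix = std::vector<double>;

// Textbook order: the innermost loop walks down a column of b,
// jumping n elements in memory on every step.
Matrix multiply_ijk(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}

// Reordered: the innermost loop walks along rows of b and c,
// so consecutive iterations touch neighbouring memory.
Matrix multiply_ikj(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += aik * b[k * n + j];
        }
    return c;
}
```

Both functions compute the same product; only the memory access pattern differs. How large the speed difference is depends on matrix size, cache sizes and compiler flags.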

Larry O’Brien has written an introductory article on parallel programming with OpenMP on Windows and announced it in his blog. I enjoyed reading the article and think it is a really nice resource for people new to parallel programming. I would like to comment on some parts of his article, and since it does not have a comment section (and my comment would be quite big anyway), I will do it here: (more…)

When reading any recent book about using C++ and parallel programming, you will probably be told that scoped locking is the greatest way to handle mutual exclusion and is in general the greatest invention since hot water. Yet, most of these writers are not talking about OpenMP, but about lower level threading systems. OpenMP has its own rock-star directive for taking care of synchronisation, and it is called critical. Nevertheless, it is possible to implement scoped locking in OpenMP and I was curious: What would an honest comparison between the two contenders result in?

Before I start, a word of warning: This article requires some C++ and OpenMP knowledge, as well as some knowledge on general parallel programming concepts. It also concentrates on a very special problem, which may not be interesting for everyone. You have been warned :P. (more…)
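For readers unfamiliar with the term, scoped locking means acquiring a lock in a constructor and releasing it in the destructor. The sketch below shows the pattern with a `std::mutex` so that it builds without OpenMP. An OpenMP flavour would presumably wrap `omp_set_lock` and `omp_unset_lock` in the same way, whereas the directive-based alternative is a `#pragma omp critical` block. `ScopedLock`, `withdraw` and `lock_is_free` are hypothetical names and not the code compared in the article.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical guard class: locks in the constructor, unlocks in the
// destructor, so every way of leaving the scope releases the lock.
template <typename Lock>
class ScopedLock {
    Lock& lock_;
public:
    explicit ScopedLock(Lock& lock) : lock_(lock) { lock_.lock(); }
    ~ScopedLock() { lock_.unlock(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
};

static std::mutex balance_mutex;
static int balance = 100;

// Withdraws amount from the shared balance. The guard releases the mutex
// on the normal path and on the exception path alike.
void withdraw(int amount) {
    ScopedLock<std::mutex> guard(balance_mutex);
    if (amount > balance) throw std::runtime_error("insufficient funds");
    balance -= amount;
}

// Reports whether the mutex is currently free
// (call it only from a thread that does not hold the mutex).
bool lock_is_free() {
    if (!balance_mutex.try_lock()) return false;
    balance_mutex.unlock();
    return true;
}
```

The point of the pattern is the exception path: even when `withdraw` throws, the destructor of `guard` runs and the mutex is released, which is easy to get wrong with manual lock and unlock calls.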

Skimming through the activity logs of this blog, I can see that many people come here looking for information about pthread_setaffinity_np. I mentioned it briefly in my article about Opteron NUMA-effects, but barely touched it because I had found a more satisfying solution for my personal use (taskset). And while I do not have in-depth knowledge of the function, maybe the test programs I wrote will be of some help to someone to understand the function better. I will also post my test program for sched_setaffinity here while I am at it, simply because the two offer similar functionality. (more…)
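The test programs themselves are not part of this excerpt. As a stand-in, here is a minimal sketch of how the two calls are commonly used on Linux with glibc. It is not the author's code, the helper names are invented, and it assumes a compiler such as g++ that enables the GNU extensions needed for `cpu_set_t` and `pthread_setaffinity_np`.

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical helper: pins the calling thread to one CPU using
// pthread_setaffinity_np. Returns true on success.
bool pin_with_pthread(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}

// Hypothetical helper: the same thing via sched_setaffinity,
// where a pid of 0 refers to the calling thread.
bool pin_with_sched(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Returns the lowest-numbered CPU the calling thread is currently
// allowed to run on, or -1 if the affinity mask cannot be read.
int first_allowed_cpu() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set)) return cpu;
    }
    return -1;
}
```

Querying the current mask first, as `first_allowed_cpu` does, avoids asking for a CPU the process is not allowed to use, which would make either call fail.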