Thinking Parallel

A Blog on Parallel Programming and Concurrency by Michael Suess

A Short Guide to Mastering Thread-Safety

When I am talking on forums or to some of my students, from time to time I realize that I have gotten so used to expecting a certain vocabulary and certain concepts to be known to others that it isn’t even funny. In German we call this being betriebsblind, which basically means nothing more than expecting everyone to know what you know (at least to a certain extent), because the knowledge is so essential for what you do every day. The concept of Thread-Safety definitely belongs in this category for me, as I usually have to force myself to explain it and its implications to my students new to parallel programming, because it seems so trivial. Unfortunately, it is not. Therefore, I will spend the rest of this article explaining what Thread-Safety means and why it is important, and also highlight some ways to achieve it without throwing your performance out of the window. I would also like to issue the usual warning to my more advanced readers: the beginning of this article is most definitely not for you, but the end may be worth checking out :P.

What is Thread-Safety?

One of the defining properties of threads is that multiple threads have access to the same address space. Therefore, they are able to write to the exact same memory location at the same time. I have explained why this is a problem in one of my earlier articles about mutual exclusion, so I can use the short version here:

Writing to the same memory location from multiple threads at the same time without any form of protection is like playing Russian Roulette: you may survive at first, but sooner or later you run out of luck and the results are disastrous.

You may quote me on that 8). For a more thorough explanation, please read the former article. Now that we have this part established, the rest is easy:

Thread-safe code is basically any code that can be called from multiple threads concurrently without breaking.

Or phrased differently: if your code does not access the same memory location from multiple threads without protection, it is most likely thread-safe. We can have thread-safe functions, thread-safe objects or thread-safe programs, although the term is most commonly used when talking about functions.

So much for the easy part. The difficult part is actually spotting these accesses to the same memory location in your code and fixing them without losing too much performance. But contrary to what some message-passing fans may want you to believe, this is not impossible. My students usually get it right after we have shown it to them in their programs once or twice. And with experience you will notice the little voice in the back of your head that warns you every time you want to write to a shared variable or call a function that may not be thread-safe. And if you don’t want to bet your career on little voices (probably a smart move on your part), let’s not forget about the high-quality tools that are available to help you with this as well, e.g. the Intel Thread Checker or the Sun Data Race Detection Tool. I have had very good experiences with the former tool, and as soon as I find some time I will also try out the second one (especially since it is free – on the other hand, the Intel tool is very reasonably priced as well, especially for universities).

How to Achieve Thread-Safety

OK, so there are ways to reliably find the dangerous spots in your code. What do we do once we have found them? Let’s say we have isolated a function that is not thread-safe because it was written at a time when parallelism was not an issue. We have multiple ways to deal with it, some of which are highlighted in the following list:

  • Put a big, fat lock right at the start of the function and unlock it at the end. This is the easy solution, because it is doable in every threading system I know and it always works. Unfortunately, it is also the one that may cost you the most performance, since only one thread can execute the function at a time now. On the other hand, I did not write the magic word may in the last sentence without reason: if the function is only rarely called, that big, fat lock may not be a problem for performance at all. Your profiler will tell you, especially if it knows about parallelism.
  • Put one or many smaller locks in the function around the data that actually need protection. This finer grained approach may work better than the one sketched above, especially when there are only a few places in the function that actually need protection and there are larger areas that are thread-safe and can be carried out concurrently. It may also turn out to be a bad idea (TM), because setting and unsetting the locks also takes some time, even when there is no contention. As you can see by my repeated usage of the fine word may, there is no way to know which technique is better without taking a look at the actual code, all I am telling you here are just ideas which may or may not work.
  • Protect the data instead of the code. No, I am not senile (yet), this is actually different advice than what I just gave you. Instead of protecting a small region of code with a lock where critical data is accessed, associate a lock with the critical data in question. This may allow you to use two different locks for the same region of code when working on different data, allowing concurrent execution by multiple threads. An example: when you have written a function that multiplies each element of an array by two, you can associate a different lock with each array, allowing two threads to carry out the function concurrently when working on two different arrays.
  • Use atomic operations to change the data that need protection. The availability of these operations varies from system to system: POSIX Threads does not have them at all, OpenMP gives you the possibility to mark certain simple operations on plain old data types as atomic via the atomic directive, and Java offers atomic variable classes (e.g. AtomicInteger), whose operations are all carried out atomically. Check the documentation of your parallel programming system of choice, and if you have atomic operations, it is most likely a good idea to use them (gosh, I cannot believe I wrote that whole bullet point without using may even once! :))
  • Change your functions to only work on thread-private data. This may sound harder than it actually is, but many times shared data are not really necessary. Even when they seem necessary, there is a little trick that can often change this: delegate the shared data to the caller of the function, which passes it in as a parameter (most likely a pointer or a reference in C++), and document that the caller is responsible for protecting that data before the function is called. This may not sound like a solution, but rather like a delegation of the whole problem. But considering that the caller may have more knowledge about the present state of the program and its data, it makes a whole lot of sense. As an example, the caller may know that no other thread presently calls the function with the same data, and therefore no protection from concurrent access is necessary. Of course, the moment you employ this last technique, the function itself is not thread-safe anymore, so make really sure to document this properly! It also changes the interface of the function, which may not be doable in your particular case.

Further Reading

If you want to know more about thread-safety and related topics, I wholeheartedly recommend what I like to call the POSIX Threads Bible by David Butenhof – Programming with POSIX Threads. Or if you are more into Java, I hear Java Concurrency in Practice is a very good read. I have not really found a great book on Thread-Programming with C# – the best one appears to be Programming C#, although it contains only a single chapter on threaded programming. If you find a better one, please let me know. All of them contain useful information on Thread-Safety as well.

Summary

This closes my little trip into the world of Thread-Safety. I hope I have brought across that Thread-Safety is a challenging problem, but one that can be managed with a little experience and the right tools. I also tried to show you some techniques that may (that was the last time I used that verb for today, I promise :P) be useful to achieve Thread-Safety for your code.

The name-patron of my old high school, Wilhelm Ostwald, had a famous maxim, which can be (roughly) translated into English like this:

Good theory must soon become practical. That is how its value can be seen.

Following this spirit, I am planning on writing a followup article on this topic, explaining how the techniques introduced here may (guess I lied :twisted:) be used to make the rand()-function thread-safe (and explain in the process, why its performance often sucks in multi-threaded code).

P.S.: This essay was sponsored by the word may, thanks for listening! ;)

12 Responses to A Short Guide to Mastering Thread-Safety


Comments

  1. Comment by Vivek | 2006/10/18 at 22:32:03

    Hi

    You didn’t mention that it’s possible to work out lock-free data structures in some cases.
    A circular queue I’m writing needs no lock, as one thread enqueues and the other dequeues. There are just two 32-bit values that define the queue, and on x86 int reads and writes are atomic… so look ma, no locks…

  2. Comment by Stefan | 2006/10/30 at 09:34:10

    I wouldn’t expect reading about W. Ostwald on a website like this. Good translation (as far as I can evaluate this). You got the point so far. Now I’m looking forward to the next article on that. And to which extent the word ‘may’ will guide you again *g*.

  3. Comment by Lionell Griffith | 2006/11/06 at 15:41:04

    I suggest that you shift your thinking to dealing with shared resources. It’s not only memory and code that are involved, and it’s not only parallel processing that has the problem. Preemptive multi-tasking/threading is also challenged by shared resource access, even on a single-core CPU.

    I know you think you are discussing ONLY multi-processing issues. Preemptive multi-tasking/threading has EXACTLY the same set of issues, which are best dealt with by EXACTLY the same set of strategies suitable for parallel processing. Solve the shared resource contention issues for one domain and you have solved them for the other.
    The only necessary implementation difference between the two domains is the assignment of priorities to threads on a single CPU and the assignment of threads to specific CPUs in multi-processing systems.

    I suspect the challenge is that few people programming today have ever had to deal with multi-priority real-time interrupt-driven data-acquisition and process-control systems. Especially ones on room sized computers with less compute power than your typical digital watch.

  4. Comment by Michael Suess | 2006/11/08 at 10:48:58

    @Vivek: I have not mentioned lock-free data structures because I don’t consider them safe for use (not to mention portable) yet, except when hidden from the user in a well-implemented library. I will probably write an article about these some time later, though.

    @Lionell: I am not sure I understand your comment. I am not only discussing multi-processing issues, I am discussing multi-threading issues. And I do know that the problems I have outlined do happen on single-processor machines, as the scheduler is free to preempt threads at any time. Will try to make that more clear in future articles…

  5. Comment by Dieter an Mey | 2006/11/23 at 18:21:13

    The problem with a lack of thread-safety is that data races may not appear during a thousand program runs, but then suddenly they do occur and cause the program to deliver wrong results or crash.

    A good idea to make a buggy program fail earlier is to start it on a machine which has fewer processors (or cores) than you are employing threads.

    So a single processor box may be a good platform for debugging (if your threads do not use spin-waiting, in which case the program may take a while … )

  6. Comment by James Brock | 2007/06/20 at 07:54:13

    Vivek, your thread-safe circular queue that leverages atomic increment-and-return integer operations is possible–if you only allow two threads. I tried to write a general thread-safe constant-time circular queue in C# using System.Threading.Interlocked. I thought I had it perfected, but my unit test kept failing. It took me a long time to discover the case for which it fails, and it is a case that I have no answer for:

    1. Thread A enters Enqueue
    2. Thread B enters Enqueue
    3. Thread B exits Enqueue
    4. Thread C enters Dequeue
    5. Thread C exits Dequeue
    6. Thread A exits Enqueue

    Thread C will Dequeue a nonsense value because thread A has not yet written to the underlying array, but thread B has already incremented the queue pointer.

    In general, your data structure will not be thread safe if you have three or more threads executing the Enqueue and Dequeue methods at different speeds. You must either abandon constant-time or use locks.

  7. Comment by Leniel Macaferi | 2008/01/06 at 20:55:30

    Hello,

    I’m also implementing a thread-safe circular queue, and as far as I got, the only way I could achieve that was by using locks.

    So I got the first option of the listing discussed in this post: put a big, fat lock right at the start of the function and unlock it at the end of the function.

    By the way Michael, your post is really enlightening.

    Leniel Macaferi

  8. Comment by Doug | 2008/01/30 at 01:22:56

    Or create worker threads that only read shared data and write to their own memory. Threading is much easier when threads don’t talk to each other and when everything they share is read only.

  9. Comment by Rachna | 2008/07/17 at 06:34:06

    I heard somewhere that if the data being shared by different threads is atomic, there is no need to use locks and it will automatically be thread-safe. Is this true? If it is, then please tell me what atomic data types are available in VC++. Are atomic data types always thread-safe, regardless of whether they are used on a 32-bit OS or a 64-bit OS?

  10. Comment by Michael Suess | 2008/07/20 at 15:13:36

    It could be true, depending on the memory model of the particular threading system you are using. But believe me, you probably don’t want to rely on stuff as low-level as this, because it’s very easy to introduce mistakes there.

  11. Comment by Doug | 2008/07/21 at 19:48:55

    Rachna, atomic data is not thread-safe when you write to it, because of CPU memory caching issues. Read-only access is safe because the cached data is always the same as what exists in memory. Even if you have atomic variables, you must mark them as volatile and access them under a critical section or some other locking mechanism if you wish to update them.

  12. Comment by mathk | 2012/07/31 at 14:24:28

    You’ve said that
    “Therefore, they are able to write to the exact same memory location at the same time. I have explained why this is a problem”

    I don’t totally agree with that, because there are situations where two threads write to the same location and it is still thread-safe. For example, suppose you want to do something when one of your child threads has finished a job. In this situation you could decide to have a shared boolean, and the order in which the threads write to it does not matter. I think the definition of thread-safety needs a more thorough explanation.

