Thinking Parallel

A Blog on Parallel Programming and Concurrency by Michael Suess

More information on pthread_setaffinity_np and sched_setaffinity

Skimming through the activity logs of this blog, I can see that many people come here looking for information about pthread_setaffinity_np. I mentioned it briefly in my article about Opteron NUMA-effects, but barely touched on it because I had found a more satisfying solution for my personal use (taskset). And while I do not have in-depth knowledge of the function, maybe the test programs I wrote will help someone understand it better. I will also post my test program for sched_setaffinity here while I am at it, simply because the two offer similar functionality.

Problem description

Short recap: The problem both functions are trying to solve is to bind a thread (pthread_setaffinity_np) or process (sched_setaffinity) to one or more user-defined processors. You may want to do this because the scheduler is doing something stupid or because you want to keep your caches hot at all costs in a multithreaded program. More information about affinity on Linux can be found in this article on CPU affinity by Robert Love.
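Before changing the affinity, it can help to see which processors a process is allowed to run on in the first place. The following is a minimal sketch, assuming Linux with glibc and `_GNU_SOURCE` defined; the helper name `print_allowed_cpus` is made up for this illustration and is not part of any library.

```c
#define _GNU_SOURCE /* exposes cpu_set_t, the CPU_* macros and sched_getaffinity */
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: print the CPUs the caller may run on.
 * Returns the number of allowed CPUs, or -1 on error. */
int print_allowed_cpus(void)
{
    cpu_set_t mask;
    int cpu;

    /* pid 0 refers to the caller itself */
    if (sched_getaffinity(0, sizeof(mask), &mask) < 0) {
        perror("sched_getaffinity");
        return -1;
    }

    printf("allowed CPUs:");
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &mask)) {
            printf(" %d", cpu);
        }
    }
    printf("\n");

    return CPU_COUNT(&mask);
}
```

Calling `print_allowed_cpus()` from `main` before and after a `sched_setaffinity` call shows whether the mask really changed. Inside a container or under `taskset` the list may be shorter than the number of processors in the machine.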

sched_setaffinity example

Here it goes. There is not much left for me to explain, as the code is documented in its comments.

/* Short test program to test sched_setaffinity
 * (which sets the affinity of processes to processors).
 * Compile: gcc sched_setaffinity_test.c
 *          -o sched_setaffinity_test -lm
 * Usage: ./sched_setaffinity_test
 *
 * Open a "top"-window at the same time and see all the work
 * being done on CPU 0 first and after a short wait on CPU 1.
 * Repeat with different numbers to make sure it is not a
 * coincidence.
 */

#define _GNU_SOURCE /* needed for cpu_set_t and the CPU_* macros */
#include <math.h>
#include <sched.h>
#include <stdio.h>

double waste_time(long n)
{
    double res = 0;
    long i = 0;
    while (i < n * 200000) {
        i++;
        res += sqrt(i);
    }
    return res;
}

int main(int argc, char **argv)
{
    cpu_set_t mask;

    /* bind process to processor 0 */
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) < 0) {
        perror("sched_setaffinity");
    }

    /* waste some time so the work is visible with "top" */
    printf("result: %f\n", waste_time(2000));

    /* process switches to processor 1 now */
    CPU_ZERO(&mask);
    CPU_SET(1, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) < 0) {
        perror("sched_setaffinity");
    }

    /* waste some more time to see the processor switch */
    printf("result: %f\n", waste_time(2000));
    return 0;
}

The waste_time function does nothing meaningful at all; it is merely there to crank up the CPU. It returns a result, and that result is printed so the compiler cannot optimize away the whole calculation. You may have to adjust the parameter for the function if the program finishes too fast for you to see the results with top. Also, do not forget to press the “1”-key once while watching the output of top to see the CPU usage split by CPU.
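Watching top is one way to check the result; another is to ask the kernel directly which processor the caller is running on. The sketch below assumes glibc 2.6 or later (for `sched_getcpu`) and `_GNU_SOURCE`; the helper name `pin_and_report` is hypothetical and only serves the illustration.

```c
#define _GNU_SOURCE /* exposes cpu_set_t, the CPU_* macros and sched_getcpu */
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: bind the caller to a single CPU and return the
 * CPU it is running on afterwards, or -1 on error. */
int pin_and_report(int cpu)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) < 0) {
        perror("sched_setaffinity");
        return -1;
    }

    /* If the caller was running on a CPU outside the new mask, the kernel
     * has migrated it by the time sched_setaffinity returns. */
    return sched_getcpu();
}
```

If `pin_and_report(1)` returns 1, the switch worked without having to catch it in top. Requesting a processor that does not exist, or one the process is not allowed to use, makes `sched_setaffinity` fail, and the helper returns -1.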

pthread_setaffinity_np example

This is almost the same program, except adapted to use a thread doing all the work. Documentation included again. Beware that you need the NPTL-version of the pthreads library for this to work (if you do not have it installed, the compiler will complain that it does not know the pthread_setaffinity_np function).

/* Short test program to test pthread_setaffinity_np
 * (which sets the affinity of threads to processors).
 * Compile: gcc pthread_setaffinity_np_test.c
 *          -o pthread_setaffinity_np_test -lm -lpthread
 * Usage: ./pthread_setaffinity_np_test
 *
 * Open a "top"-window at the same time and see all the work
 * being done on CPU 0 first and after a short wait on CPU 1.
 * Repeat with different numbers to make sure it is not a
 * coincidence.
 */

#define _GNU_SOURCE /* needed for pthread_setaffinity_np and cpu_set_t */
#include <math.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

double waste_time(long n)
{
    double res = 0;
    long i = 0;
    while (i < n * 200000) {
        i++;
        res += sqrt(i);
    }
    return res;
}

void *thread_func(void *param)
{
    cpu_set_t mask;
    int err;

    /* bind thread to processor 0
     * (pthread functions return the error number, they do not set errno) */
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);
    err = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
    if (err != 0) {
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
    }

    /* waste some time so the work is visible with "top" */
    printf("result: %f\n", waste_time(2000));

    /* thread switches to processor 1 now */
    CPU_ZERO(&mask);
    CPU_SET(1, &mask);
    err = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
    if (err != 0) {
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
    }

    /* waste some more time to see the processor switch */
    printf("result: %f\n", waste_time(2000));
    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t my_thread;
    int err = pthread_create(&my_thread, NULL, thread_func, NULL);

    if (err != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
    }
    /* let the worker thread finish before the process ends */
    pthread_exit(NULL);
}

So much for my short trip into the world of CPU-affinity on Linux, hope you enjoyed the ride...
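A thread can also be bound before it starts, so that it never runs anywhere else. The following is a rough sketch using `pthread_attr_setaffinity_np`, which glibc offers next to `pthread_setaffinity_np`; it assumes `_GNU_SOURCE` and linking with -lpthread, and the names `report_cpu` and `run_thread_on_cpu` are invented for this example.

```c
#define _GNU_SOURCE /* exposes cpu_set_t and pthread_attr_setaffinity_np */
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Thread body: record which CPU this thread is running on. */
static void *report_cpu(void *param)
{
    int *observed = param;
    *observed = sched_getcpu();
    return NULL;
}

/* Hypothetical helper: start a thread that is bound to `cpu` from the
 * beginning, wait for it, and return the CPU it observed (-1 on error). */
int run_thread_on_cpu(int cpu)
{
    pthread_attr_t attr;
    pthread_t thread;
    cpu_set_t mask;
    int observed = -1;
    int err;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

    pthread_attr_init(&attr);
    /* pthread functions return the error number instead of setting errno */
    err = pthread_attr_setaffinity_np(&attr, sizeof(mask), &mask);
    if (err == 0) {
        err = pthread_create(&thread, &attr, report_cpu, &observed);
    }
    pthread_attr_destroy(&attr);
    if (err != 0) {
        fprintf(stderr, "run_thread_on_cpu: %s\n", strerror(err));
        return -1;
    }

    pthread_join(thread, NULL);
    return observed;
}
```

Calling `run_thread_on_cpu(0)` and `run_thread_on_cpu(1)` should return 0 and 1 on a machine where both processors are available to the process. Setting the mask in the attributes avoids the short window in which a freshly created thread may run on an arbitrary processor before it gets around to binding itself.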

14 Responses to More information on pthread_setaffinity_np and sched_setaffinity


Comments

  1. Comment by Christopher Aycock | 2006/08/19 at 11:49:05

    I thought I should include a link to the Portable Linux Processor Affinity from the same team behind Open MPI: http://www.open-mpi.org/software/plpa/

    They claim that this package provides a more portable, as well as more consistent, version of sched_setaffinity().

  2. Comment by Michael Suess | 2006/08/21 at 22:59:06

    Christopher, this is certainly an interesting project (and I was not aware that it existed). Nevertheless, the first thought that came to my mind when I read up on it was: “It gives a bad impression on the state of processor affinity in the Linux kernel when a project like this becomes necessary.” And this is how I still feel about it…

  3. Comment by Nils | 2007/02/21 at 00:47:02

    I have

    nils@virtualdream:~$ getconf GNU_LIBPTHREAD_VERSION
    NPTL 2.4

    But the compiler still does not know the pthread_setaffinity_np function..

    any ideas? my distro is ubuntu edgy

  4. Comment by Michael Suess | 2007/02/21 at 23:48:43

    Nils: are you sure you included pthread.h in your program? And did you link it using -lpthread? Other than that, you should have to do nothing to get it to work on Ubuntu Edgy…

  5. Comment by charles | 2007/03/02 at 00:37:46

    you have to #include “nptl/pthread.h” not pthread.h, which will probably use the linuxthreads header file which doesn’t have the affinity methods. you will also need to do a -I/usr/include/nptl and -L/usr/lib/nptl. be aware you’re breaking your program’s compatibility with linuxthreads as a result. http://osmirrors.cerias.purdue.edu/pub/slackware/slackware-10.2/README.NPTL explains it all.

  6. Comment by Bert Wesarg | 2007/04/11 at 15:55:22

    two notes:

    (1) pthread_setaffinity_np() is just a wrapper around sched_setaffinity() with the tid of the thread described by the pthread_t. This can be seen here http://tinyurl.com/2slkve and here http://tinyurl.com/yobmfb (first note).

    (2) Because of this, the sched_setaffinity() call cannot handle the whole affinity mask of the process, so it is impossible to move a process with many threads to another set of CPUs with one sched_setaffinity() call. The kernel also does not make a difference between a process and a thread, which can be seen here http://tinyurl.com/2kdu4f.

  7. Comment by antx | 2008/06/06 at 02:40:43

    After inspecting my headers a week ago I found that the pthread_setaffinity_np() is located inside a #ifdef GNU or something similar,
    then after some googling I found that in order to link the function you MUST use -D_GNU_SOURCE, so try appending it to GCC!

    Luck & Happy hacking.

  8. Comment by Ramki | 2009/02/28 at 01:47:00

    Turns out that RHEL doesn’t find the right NPTL enabled libpthread.so by default. Trying with -L/lib64/tls/nptl on RHEL works (without the _GNU_SOURCE define).


    Ramki Balasubramanian
    ramki_b@acm.org

  9. Comment by Ramki | 2009/02/28 at 01:54:39

    in the previous post, s,/lib64/tls/nptl,/usr/lib64/nptl,

    – Ramki

  10. Comment by Pradeep | 2009/06/30 at 02:40:34

    Hi,
    Any idea about a “pthread_setaffinity” version, not the _np (non preemptive) version?
    I am on a project that requires a job to be preempted on a single processor. (This is an experiment)

    -Pradeep

  11. Comment by tommy | 2010/02/10 at 08:16:32

    np stands for “non-portable”
    http://www.kernel.org/doc/man-pages/online/pages/man3/pthread_setaffinity_np.3.html

    I tried this function on an Intel Core2 Duo processor with two threads to run on the two cores. printf outputs tell me that all was well but the system monitor shows me that only one of the cores is being utilized. (The same application with OpenMP uses both cores). Anything i missed? Thanks

  12. Comment by John | 2010/03/26 at 13:17:49

    I have run my multithreaded application on a quad core. I used pthread_setaffinity_np and it seems to set the affinity of each thread to the desired CPU core, and the top utility also shows that each core is loaded. But time reports that a given task using one thread takes T seconds, while the same task spread over N=4 threads takes 4xT seconds… so four times longer, which proves that there isn’t parallelism… strange

  13. Comment by manas | 2010/04/16 at 17:58:04

    I tried using this function to bind two threads to two cores… but I noticed that
    the function is not binding the threads to the cores. When the threads are reading values from shared memory, meaning they have the same data set to work on, the scheduler runs both threads on the same core irrespective of this function. I came to know this by looking at the system monitor and the execution time. Even with the threads, the execution time came out approximately equal to that of the sequential code. Can anybody help me with this… can somebody give us a good example like the above one on sched_setaffinity…

    When I tried to use this function in threads that use totally different data sets, the function worked perfectly. Thanks for the example.

  14. Comment by glass | 2010/05/29 at 04:46:11

    call can not handle the whole affinity mask of the process, so it is impossible to move a process with many threads with one sched_setaffinity

