Bryan O’Sullivan has a beautiful summary of the present state of NVIDIA’s CUDA. He explains the programming model, along with the many different levels of memory and their restrictions (there are many). I had been quite optimistic in my last post about CUDA (just from taking a quick glance at their source code), but Bryan’s very educated opinion brought me back to earth, as they say. Just a small quote so you can see what I mean exactly:
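To give you a taste of what those memory levels look like in practice, here is a minimal sketch of my own (not from Bryan's post): each thread block stages data from slow global memory into fast, block-local `__shared__` memory before working on it, with an explicit barrier in between. The names and sizes are made up for illustration.

```cuda
#include <cstdio>

// Illustrative sketch: one block of 256 threads copies a tile from
// global memory into on-chip __shared__ memory, then writes it back
// reversed. Shared memory is visible only within a block, and every
// thread must hit the barrier before anyone reads what others wrote.
__global__ void reverseTile(const float *in, float *out, int n)
{
    __shared__ float tile[256];          // fast, per-block scratch space

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        tile[threadIdx.x] = in[i];       // global -> shared

    __syncthreads();                     // barrier: tile is now complete

    if (i < n)
        out[i] = tile[blockDim.x - 1 - threadIdx.x];  // shared -> global
}

int main()
{
    const int n = 256;
    const size_t bytes = n * sizeof(float);
    float h_in[256], h_out[256];
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    // Explicit host <-> device copies: yet another memory level to manage.
    float *d_in, *d_out;
    cudaMalloc((void **)&d_in, bytes);
    cudaMalloc((void **)&d_out, bytes);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    reverseTile<<<1, 256>>>(d_in, d_out, n);

    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    printf("out[0] = %.0f\n", h_out[0]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Even this toy example has to juggle three address spaces (host, device global, and shared), which is exactly the kind of bookkeeping Bryan is talking about.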
People with the expertise, persistence, and bloody-mindedness to keep slogging away will undoubtedly see phenomenal speedups for some application kernels. I’m sure that the DOE and NSA, in particular, are drooling over this stuff, as are the quants on Wall Street. But those groups have a tolerance for pain that is fairly unique. This technology is a long way from anything like true accessibility, even to those already versed with parallel programming using environments like MPI or OpenMP. Still, it’s a great first step.
I talked to one of our students who started looking into CUDA some time ago and asked him to compare it to his work on the Cell processor (he did an internship at IBM). His comment was (I forget his exact words, but the spirit is kept):
Working on the Cell was soo much easier!
Oh well. Anyway, I wish I had more time to look into this (even reading and understanding the user manual would take hours I just don’t have at the moment). But having the power of a supercomputer under your desk is just sooo tempting…