Thinking Parallel

A Blog on Parallel Programming and Concurrency by Michael Suess

Archive for the 'Optimization' Category

OpenMP Does Not Scale – Or Does It?

While at the Parco-Conference two weeks ago, I had the pleasure of meeting Ruud van der Pas again. He is a Senior Staff Engineer at Sun Microsystems and gave a very enlightening talk called Getting OpenMP Up To Speed. What I would like to post about is not the talk itself (although it contains some […]

Parallel Programming Fun with Loop Carried Dependencies

There was a very interesting discussion on the OpenMP Mailing List last week about a for loop and how to parallelize / optimize it. I will comment on the problems and solutions presented there and also have an interesting observation to add at the end of this article. Although the examples are in OpenMP, the […]

Matrix Optimization Gone Wrong – Reloaded

Who would have thought that such a small article on a Matrix Optimization and its impact on execution time would generate that many comments in such a short time? Thanks for all of them! The original point I was trying to make in that article was to always check whether an optimization actually improves matters […]

Matrix Optimization Gone Wrong

The other day, two students of mine showed me some of their code. Part of their task was to fill a two-dimensional matrix using the formula: A[i][j] = 2 * i + 3 * j + 1; They showed me an optimization they did to speed up that calculation and I was skeptical. This article […]

Locality optimization experiments

Chris over at HPCAnswers has written a nice introductory article about locality optimization. He uses matrix multiplication as an example. The rest of this article assumes you have read his post and therefore know at least some basics about loop reordering and locality optimization. I remember very clearly when my supervisor first introduced the concept […]

My views on high-level optimization

This article is about high-level optimization, i.e. I will explain how I usually approach optimizing a program without going into the gory details. There are a million books and web pages out there about low-level optimizations and the tricks involved, so I will not dive into that (could not possibly hope to cover this in a single […]