<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Matrix Optimization Gone Wrong - Reloaded</title>
	<atom:link href="http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/</link>
	<description>A Blog on Parallel Programming and Concurrency by Michael Suess</description>
	<pubDate>Fri, 16 May 2008 23:29:29 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5</generator>
		<item>
		<title>By: Michael Suess</title>
		<link>http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1992</link>
		<dc:creator>Michael Suess</dc:creator>
		<pubDate>Fri, 09 Feb 2007 15:17:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1992</guid>
		<description>@gwenhwyfaer: yes, it appears you were right. Sorry, I don't have a prize to award :-).

What I don't get is why this is not a safe optimization. Obviously, there is some room for discussion here, since the Intel people seem to consider it unsafe and the gcc people do not. David Howard hints in an earlier comment that "A[i][j-1] could be aliased with a pointer somewhere else" - but where should that somewhere else be? After all, this is not a volatile variable and we are not even talking about multiple threads here (and even if we were - most memory models from threaded systems I know would still allow it). So I am afraid I don't get it. Can anyone enlighten me as to why this optimization is not safe? Thanks a lot.</description>
		<content:encoded><![CDATA[<p>@gwenhwyfaer: yes, it appears you were right. Sorry, I don&#8217;t have a prize to award :-).</p>
<p>What I don&#8217;t get is why this is not a safe optimization. Obviously, there is some room for discussion here, since the Intel people seem to consider it unsafe and the gcc people do not. David Howard hints in an earlier comment that &#8220;A[i][j-1] could be aliased with a pointer somewhere else&#8221; - but where should that somewhere else be? After all, this is not a volatile variable and we are not even talking about multiple threads here (and even if we were - most memory models from threaded systems I know would still allow it). So I am afraid I don&#8217;t get it. Can anyone enlighten me as to why this optimization is not safe? Thanks a lot.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: gwenhwyfaer</title>
		<link>http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1954</link>
		<dc:creator>gwenhwyfaer</dc:creator>
		<pubDate>Fri, 09 Feb 2007 00:54:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1954</guid>
		<description>So I was right then? :)

As I said, it's not register allocation; it's whether or not the read from A[i][j-1] has to wait for the previous write to A[i][j] (when j was 1 less) to complete, because the processor *itself* can't tell that the read is just recycling a value. In modern CPU architectures, writes can proceed asynchronously - but any subsequent reads from the same address (especially in the presence of multiprocessor systems) must wait for the write to complete first. I'm guessing that gcc spots that the read will actually fetch back the same value that was just written and substitutes in a temp to avoid that - but that's actually not a safe optimisation, so I'm not too surprised the Intel compiler didn't go for it.</description>
		<content:encoded><![CDATA[<p>So I was right then? <img src='http://www.thinkingparallel.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>As I said, it&#8217;s not register allocation; it&#8217;s whether or not the read from A[i][j-1] has to wait for the previous write to A[i][j] (when j was 1 less) to complete, because the processor *itself* can&#8217;t tell that the read is just recycling a value. In modern CPU architectures, writes can proceed asynchronously - but any subsequent reads from the same address (especially in the presence of multiprocessor systems) must wait for the write to complete first. I&#8217;m guessing that gcc spots that the read will actually fetch back the same value that was just written and substitutes in a temp to avoid that - but that&#8217;s actually not a safe optimisation, so I&#8217;m not too surprised the Intel compiler didn&#8217;t go for it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Marc Brooks</title>
		<link>http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1554</link>
		<dc:creator>Marc Brooks</dc:creator>
		<pubDate>Thu, 01 Feb 2007 08:58:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1554</guid>
		<description>Why not hoist that if out of the inner loop, huh?

&lt;code&gt;
for (i = 0; i &lt; N; i++) {
    int temp = 2 * i + 1;
    A[i][0] = temp;
    for (j = 1; j &lt; N; j++) {
        temp += 3;
        A[i][j] = temp;
    }
}
&lt;/code&gt;</description>
		<content:encoded><![CDATA[<p>Why not hoist that if out of the inner loop, huh?</p>
<pre><code>for (i = 0; i &lt; N; i++) {
    int temp = 2 * i + 1;
    A[i][0] = temp;
    for (j = 1; j &lt; N; j++) {
        temp += 3;
        A[i][j] = temp;
    }
}</code></pre>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Suess</title>
		<link>http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1526</link>
		<dc:creator>Michael Suess</dc:creator>
		<pubDate>Wed, 31 Jan 2007 20:08:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1526</guid>
		<description>Thanks for the link, Daniel, it appears to work. Just wrap your code in code tags and it should not be reformatted anymore.

franjesus: the comments preview plugin I used turns all my articles into blank pages, so I am not going to turn it on again.</description>
		<content:encoded><![CDATA[<p>Thanks for the link, Daniel, it appears to work. Just wrap your code in code tags and it should not be reformatted anymore.</p>
<p>franjesus: the comments preview plugin I used turns all my articles into blank pages, so I am not going to turn it on again.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: franjesus</title>
		<link>http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1509</link>
		<dc:creator>franjesus</dc:creator>
		<pubDate>Wed, 31 Jan 2007 10:47:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1509</guid>
		<description>test


#include 


#include \


#include \


Preview of comments would be nice ;-)</description>
		<content:encoded><![CDATA[<p>test</p>
<p>#include </p>
<p>#include \</p>
<p>#include \</p>
<p>Preview of comments would be nice <img src='http://www.thinkingparallel.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: david howard</title>
		<link>http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1487</link>
		<dc:creator>david howard</dc:creator>
		<pubDate>Wed, 31 Jan 2007 02:13:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1487</guid>
		<description>&#62;Who would have thought that todays compilers still have issues with register allocation.

I would expect the compiler is not allowed to move the expression from the line

    A[i][j] = A[i][j - 1] + 3;

into a register because A[i][j-1] could be aliased with a pointer somewhere else. Using the 'temp' variable eliminates the aliasing problem. So the slow version with A[i][j-1] is forced to do an access to memory in the inner loop that is not present in the original or final optimization.</description>
		<content:encoded><![CDATA[<p>&gt;Who would have thought that todays compilers still have issues with register allocation.</p>
<p>I would expect the compiler is not allowed to move the expression from the line</p>
<p>    A[i][j] = A[i][j - 1] + 3;</p>
<p>into a register because A[i][j-1] could be aliased with a pointer somewhere else. Using the 'temp' variable eliminates the aliasing problem. So the slow version with A[i][j-1] is forced to do an access to memory in the inner loop that is not present in the original or final optimization.</p>
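<p>For illustration only, here is a minimal sketch of the two inner-loop variants being compared in this thread. The function names, the fixed size N and the array-parameter form are assumptions made for the sketch, not code from the article; the start value 2 * i + 1 is taken from the snippet posted elsewhere in these comments.</p>

```c
enum { N = 4 };  /* hypothetical matrix size, chosen only for this sketch */

/* Variant with A[i][j - 1]: each element is computed from its left
   neighbour, which is read back from the array on every iteration. */
void fill_reload(int A[N][N])
{
    for (int i = 0; i < N; i++) {
        A[i][0] = 2 * i + 1;
        for (int j = 1; j < N; j++)
            A[i][j] = A[i][j - 1] + 3;
    }
}

/* Variant with a local 'temp': the running value is kept in a local
   variable, so the inner loop only writes to the array. */
void fill_temp(int A[N][N])
{
    for (int i = 0; i < N; i++) {
        int temp = 2 * i + 1;
        A[i][0] = temp;
        for (int j = 1; j < N; j++) {
            temp += 3;
            A[i][j] = temp;
        }
    }
}
```

<p>Both functions should fill the matrix with identical values; whether a given compiler turns the first form into the second on its own is exactly the question under discussion here.</p>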
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Michels</title>
		<link>http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1484</link>
		<dc:creator>Daniel Michels</dc:creator>
		<pubDate>Wed, 31 Jan 2007 01:20:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.thinkingparallel.com/2007/01/31/matrix-optimization-gone-wrong-reloaded/#comment-1484</guid>
		<description>Uh. It ate my post. So here's the link again:
http://www.coffee2code.com/archives/2005/03/29/plugin-preserve-code-formatting/</description>
		<content:encoded><![CDATA[<p>Uh. It ate my post. So here's the link again:<br />
<a href="http://www.coffee2code.com/archives/2005/03/29/plugin-preserve-code-formatting/" rel="nofollow">http://www.coffee2code.com/archives/2005/03/29/plugin-preserve-code-formatting/</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
