Optimizing Read/Write of a Variable

Started by Haytil January 12, 2009 12:19 PM

1 comment, last by Zahlman 15 years, 3 months ago

Haytil

525

Author

January 12, 2009 12:19 PM

I'm reading a programming book, in which the following bit of code is written:


if (zi < z_ptr[xi])
{
   // write texel
   screen_ptr[xi] = color;

   //update z-buffer
   z_ptr[xi] = zi;
}

(The code is a basic software implementation of a Z-buffer for a software rasterizer. This bit of code is called quite a lot, but other than that, the purpose of the code is, I think, irrelevant.)

Afterwards, in a side-box, is written: "Notice that even after I determine that the Z-buffer should be updated, I do not update it with the value of zi. This is an optimization trick. It's usually a bad idea to immediately read and then write the same value - better to put something in between and then write the value."

I presume the author is referring to the fact that the "if" statement reads the value of z_ptr[xi], and the code then writes to screen_ptr before writing back to z_ptr (rather than writing to z_ptr first, before writing to screen_ptr). In other words, he is suggesting that the above code is marginally faster (more optimized) than the following, similar code:


if (zi < z_ptr[xi])
{
   //update z-buffer
   z_ptr[xi] = zi;

   // write texel
   screen_ptr[xi] = color;
}

My questions are: Is this true? If so, why?

I don't care if it's "marginally" true, or only true "some of the time." And I realize that little optimizations like this are generally irrelevant and should be put off till the end (more readable code is better than marginally faster code, and most optimizations that will make a difference are algorithmic in nature anyway). If this optimization IS real or possible, then please tell me - and explain WHY.

Also, if this optimization IS real or possible, is this the kind of thing that the compiler will automatically handle during internal optimization?

I'm very interested in people's explanations and thoughts. Thank you.

-Gauvir_Mucca

outRider

852

January 12, 2009 12:40 PM

1) Write-after-read, as that process is called, is more of a problem on in-order processors. On out-of-order processors it's not nearly as big a penalty.

2) Yes, the compiler can (and many will) do such optimizations for you. It's basic instruction scheduling.

Technically that is not a pure write-after-read, since there is a comparison in between the two, but the above still applies.

Zahlman

1,682

January 12, 2009 02:32 PM

Quote:Original post by Gauvir_Mucca
My questions are: Is this true? If so, why?

On processors created in the last X years, this may make a difference. It's not something that can be easily explained in much detail, because modern processors are very complex. But the general idea is that the processor has a limited ability to work on multiple things at a time, and it can't start work on "write address A" while work on "read address A" is still in progress, because otherwise the read might not see the value it was supposed to see. And that means that it has to drop everything it's doing, wait for the read to finish, and then start the write (and thereafter resume operating in full swing). Putting an unrelated operation between the read and the write gives the processor something useful to do while it waits.

However, while this has been true for the last X years, it has also been true for at least X years that the order in which things are described in your C++ code will NOT be preserved in the generated assembly. In fact, I would bet that it's been true for at least 2X years. And since compiler writers can generally be assumed to be operating in good faith to make your code faster ;) (at least in release mode :) ), you can be sure that if the code "suggested" by the first version is faster, then the compiler will produce that faster code if you supply either version of the source - certainly for examples this trivial, anyway.
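To make the "either version of the source" point concrete, here is a minimal sketch in C that wraps both orderings from the original post in functions. The function names, the loop over a span, and the choice of unsigned int for both buffers are assumptions made for illustration; the thread does not say what types the book uses. As long as the two buffers do not overlap, both functions leave memory in exactly the same state, which is what allows a compiler to pick whichever store order it prefers.

```c
#include <assert.h>
#include <stddef.h>

/* Ordering from the book: the color store sits between the
 * read of z_ptr[xi] and the write back to z_ptr[xi]. */
void span_color_first(unsigned int *z_ptr, unsigned int *screen_ptr,
                      size_t n, unsigned int zi, unsigned int color)
{
    assert(z_ptr != NULL && screen_ptr != NULL);
    for (size_t xi = 0; xi < n; ++xi) {
        if (zi < z_ptr[xi]) {
            screen_ptr[xi] = color; /* write texel */
            z_ptr[xi] = zi;         /* update z-buffer */
        }
    }
}

/* Alternative ordering from the question: the z-buffer is
 * written immediately after it has been read and compared. */
void span_z_first(unsigned int *z_ptr, unsigned int *screen_ptr,
                  size_t n, unsigned int zi, unsigned int color)
{
    assert(z_ptr != NULL && screen_ptr != NULL);
    for (size_t xi = 0; xi < n; ++xi) {
        if (zi < z_ptr[xi]) {
            z_ptr[xi] = zi;         /* update z-buffer */
            screen_ptr[xi] = color; /* write texel */
        }
    }
}
```

Whether a given compiler actually emits identical machine code for the two is something to check in the generated assembly rather than take on faith.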

Caveat: The compiler might not be able to prove that this is safe, and thus err on the side of caution, if it can't prove (for example) that screen_ptr and z_ptr refer to non-overlapping areas of memory.
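One way to address this caveat, sketched here in C99, is to promise the compiler that the buffers do not overlap by using the restrict qualifier. The function name and the unsigned int element type are hypothetical; both buffers are given the same type on purpose, because with identical element types the compiler cannot rule out overlap from the types alone. Note that restrict is a promise made by the programmer: calling this function with overlapping buffers is undefined behavior. Standard C++ has no restrict keyword, though common compilers offer __restrict as an extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical span writer. The restrict qualifiers state that
 * z_ptr and screen_ptr never refer to overlapping memory, so the
 * compiler may schedule the two stores in whichever order it likes. */
void span_restrict(unsigned int *restrict z_ptr,
                   unsigned int *restrict screen_ptr,
                   size_t n, unsigned int zi, unsigned int color)
{
    assert(z_ptr != NULL && screen_ptr != NULL);
    for (size_t xi = 0; xi < n; ++xi) {
        unsigned int old_z = z_ptr[xi]; /* single read of the z-buffer */
        if (zi < old_z) {
            z_ptr[xi] = zi;             /* update z-buffer */
            screen_ptr[xi] = color;     /* write texel */
        }
    }
}
```

The source order inside the if block is deliberately the "slower" one from the question; with the no-overlap promise in place, the order written in the source should matter even less to an optimizing compiler.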
