Stack in raytracing

Started by
11 comments, last by Cypher19 15 years, 10 months ago
Hi, I have a question regarding Jacco Bikker's ray tracer tutorial. m_Stack = new kdstack[64]; m_Stack = (kdstack*)((((unsigned long)m_Stack) + 32) & (0xffffffff - 31)); I don't understand what the second line means. Please help me! Thanks! JJ
Advertisement
Then it would be a very bad tutorial.

I'm not sure myself. =/
Whatever it is actually supposed to do, it just looks as an atrociously ugly hack. Converting pointers into integers (or longs) and then converting them back is certainly not a good idea. And I really feel like this pointer-integer arithmetic will end up with a corrupt pointer.
What he's done is make a 64-element array of kdstack items. That's obvious.

What he does next is downright horrifying. He is using the worst pointer arithmetic I've ever seen to select the location in that stack closest to the middle which is aligned on a 32-byte boundary.

Pay no attention to that little monstrosity. Could you give us a link, so I can see if he provided any reason for creating that nasty little bit of memory tweaking?
RIP GameDev.net: launched 2 unusably-broken forum engines in as many years, and now has ceased operating as a forum at all, happy to remain naught but an advertising platform with an attached social media presense, headed by a staff who by their own admission have no idea what their userbase wants or expects.Here's to the good times; shame they exist in the past.
If I'm not entirely mistaken (which is a posibility given the time of day), the code adjusts the pointer so that it's aligned on a 32 byte boundary. The "& (0xffffffff - 31)" part forces the last 5 bits bits to zero, while the "+ 32" part makes sure the new pointer is inside the allocated memory range. Imagine if the initial pointer was 0x1014, then setting the last 5 bits to zero would make it 0x1000, so it would point before the original location. Adding 32 would fix this, making the new pointer 0x1020.

For this to work, you'll need to allocate at least 32 bytes more than you actually need, so you won't be able to use all of the 64 elements allocated.

Since it won't affect the function of the code, I'd say such a trick is a bit missplaced in a tutorial, unless it was specifically about maxing the speed of your ray tracer :)
Yuck... yeah at the very least that sort of stuff should be wrapped up in an aligned allocation routine. Ideally you should just use declspec align (and _aligned_malloc or similar) in this day and age :)
I mean, the least the original coder could've done is not assume that a pointer is 32 bits, and write "0xffffffff-31" as "~31" instead (~ is the C/++ ones-complement operator). And that's assuming he's already made the compile-time assertion that sizeof(unsigned long) == sizeof(kdstack*).
RIP GameDev.net: launched 2 unusably-broken forum engines in as many years, and now has ceased operating as a forum at all, happy to remain naught but an advertising platform with an attached social media presense, headed by a staff who by their own admission have no idea what their userbase wants or expects.Here's to the good times; shame they exist in the past.
Quote:Original post by Wyrframe
I mean, the least the original coder could've done is not assume that a pointer is 32 bits, and write "0xffffffff-31" as "~31" instead (~ is the C/++ ones-complement operator). And that's assuming he's already made the compile-time assertion that sizeof(unsigned long) == sizeof(kdstack*).


I expect that if he is going through the effort of trying to cache align his data structures, then the extra 4 bytes per 64bit pointer is probably just too much overhead to be considered a viable option... but yeah, in principle, I agree. ;-)
Ah, you encountered my little aligned malloc thingy. I agree it's not pretty, it's not 64-bit compatible and so on, but it's definitely relevant. For something close to the core like a kd-tree traversal, you don't want kd-tree nodes straddling cache boundaries. Not even if you're not in a hurry. It would just be silly. That being said, it should have been mentioned in the tutorial why this is done this way. And it should have been implemented using a _aligned_malloc instead.

I still hope you find the rest informative. ;)
Quote:Original post by phantomus
That being said, it should have been mentioned in the tutorial why this is done this way.

Actually, it should not have. A tutorial is to teach a person a concept, and possibly some highlevel optimizations (e.g. using a tree of some sort to check the ray's collisions against objects as opposed to a straight foreach(object) loop). If you include a line of code that would take an experienced C++ coder several minutes to decipher and understand, if ever, and have no notation or explanation on what it does in a tutorial*, then you have just made your tutorial worse, evne if the intent was to help make the code faster. If a person wants to squeeze all of the performance they can out of a program after learning the concept, then teach them about aligned malloc's (certainly NOT that atrocious line) afterwards.

This topic is closed to new replies.

Advertisement