# Stack in raytracing


Hi, I have a question regarding Jacco Bikker's ray tracer tutorial:

`m_Stack = new kdstack[64];`
`m_Stack = (kdstack*)((((unsigned long)m_Stack) + 32) & (0xffffffff - 31));`

I don't understand what the second line means. Please help me! Thanks! JJ

##### Share on other sites
Then it would be a very bad tutorial.

I'm not sure myself. =/

##### Share on other sites
Whatever it is actually supposed to do, it just looks like an atrociously ugly hack. Converting pointers into integers (or longs) and then converting them back is certainly not a good idea. And I really feel like this pointer-integer arithmetic will end up with a corrupt pointer.

##### Share on other sites
What he's done is make a 64-element array of kdstack items. That's obvious.

What he does next is downright horrifying. He is using the worst pointer arithmetic I've ever seen to select the location in that stack closest to the middle which is aligned on a 32-byte boundary.

Pay no attention to that little monstrosity. Could you give us a link, so I can see if he provided any reason for creating that nasty little bit of memory tweaking?

##### Share on other sites
If I'm not entirely mistaken (which is a possibility given the time of day), the code adjusts the pointer so that it's aligned on a 32-byte boundary. The "& (0xffffffff - 31)" part forces the last 5 bits to zero, while the "+ 32" part makes sure the new pointer is inside the allocated memory range. Imagine the initial pointer was 0x1014: setting the last 5 bits to zero would make it 0x1000, so it would point before the original location. Adding 32 first fixes this, making the new pointer 0x1020.

For this to work, you'll need to allocate at least 32 bytes more than you actually need, so you won't be able to use all of the 64 elements allocated.
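To make the arithmetic above concrete, here is a small sketch that applies the same two operations to plain 32-bit integers instead of real pointers. The names `mask_only`, `tutorial_align` and `worked_example` are made up for illustration and do not appear in the tutorial.

```cpp
#include <cassert>
#include <cstdint>

// Clears the low 5 bits only: rounds DOWN to a multiple of 32.
constexpr std::uint32_t mask_only(std::uint32_t addr) {
    return addr & (0xffffffffu - 31u);
}

// Same expression as the tutorial line, on a 32-bit integer:
// add 32 first, then clear the low 5 bits.
constexpr std::uint32_t tutorial_align(std::uint32_t addr) {
    return (addr + 32u) & (0xffffffffu - 31u);
}

// The example values from the post above.
inline void worked_example() {
    assert(mask_only(0x1014u) == 0x1000u);       // points before the allocation
    assert(tutorial_align(0x1014u) == 0x1020u);  // first boundary inside it
    assert(tutorial_align(0x1000u) == 0x1020u);  // already aligned, still moves
}
```

Note that an address which is already aligned (0x1000) still moves forward by a full 32 bytes, so the result is always between 1 and 32 bytes past the original pointer.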

Since it won't affect the function of the code, I'd say such a trick is a bit misplaced in a tutorial, unless it was specifically about maxing the speed of your ray tracer :)

##### Share on other sites
Yuck... yeah, at the very least that sort of stuff should be wrapped up in an aligned allocation routine. Ideally you should just use `__declspec(align(...))` (and `_aligned_malloc` or similar) in this day and age :)
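For what it's worth, a rough sketch of what such a wrapper could look like in portable C++11, using `std::align` rather than the MSVC-specific `_aligned_malloc`. The class name `AlignedBuffer` and the `KdStackEntry` struct are hypothetical placeholders; the tutorial's real `kdstack` layout isn't shown in this thread. One thing a wrapper gets you is that the original pointer is kept around for freeing, whereas the tutorial line overwrites `m_Stack` and loses it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for the tutorial's kdstack entry.
struct KdStackEntry {
    void* node;
    float t;
};

// Owns a raw block and exposes an aligned pointer into it.
class AlignedBuffer {
public:
    AlignedBuffer(std::size_t bytes, std::size_t alignment)
        : raw_(std::malloc(bytes + alignment)), aligned_(raw_) {
        // Alignment must be a non-zero power of two.
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        std::size_t space = bytes + alignment;
        if (raw_ != nullptr) {
            // Moves aligned_ forward to the next boundary inside the block.
            aligned_ = std::align(alignment, bytes, aligned_, space);
        }
    }
    ~AlignedBuffer() { std::free(raw_); }  // free the ORIGINAL pointer

    AlignedBuffer(const AlignedBuffer&) = delete;
    AlignedBuffer& operator=(const AlignedBuffer&) = delete;

    void* get() const { return aligned_; }

private:
    void* raw_;      // what malloc returned
    void* aligned_;  // aligned address handed to the caller
};
```

Usage would be something like `AlignedBuffer buf(64 * sizeof(KdStackEntry), 32);` followed by `auto* stack = static_cast<KdStackEntry*>(buf.get());`. The buffer holds raw bytes, so this only suits trivially constructible entry types.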

##### Share on other sites
I mean, the least the original coder could've done is not assume that a pointer is 32 bits, and write "0xffffffff-31" as "~31" instead (~ is the C/++ ones-complement operator). And that's assuming he's already made the compile-time assertion that sizeof(unsigned long) == sizeof(kdstack*).
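A minimal sketch of that idea, assuming `std::uintptr_t` is available (it is on mainstream platforms). The helper name `align_up` is made up here. It builds the mask with `~` on a pointer-sized integer, so nothing assumes 32-bit pointers. With the original constant, a 64-bit `unsigned long` would have its upper 32 bits cleared by the mask, and a 32-bit `unsigned long` on a 64-bit target would truncate the pointer in the cast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds p up to the next multiple of `alignment` (a power of two).
// Unlike the tutorial line, an already aligned pointer is returned unchanged.
inline void* align_up(void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    // ~mask has all high bits set at full pointer width.
    return reinterpret_cast<void*>((addr + mask) & ~mask);
}
```

Because it adds `alignment - 1` rather than `alignment`, the worst-case padding drops to 31 bytes for a 32-byte boundary.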

##### Share on other sites
Quote:
 Original post by Wyrframe: I mean, the least the original coder could've done is not assume that a pointer is 32 bits, and write "0xffffffff-31" as "~31" instead (~ is the C/++ ones-complement operator). And that's assuming he's already made the compile-time assertion that sizeof(unsigned long) == sizeof(kdstack*).

I expect that if he is going through the effort of trying to cache-align his data structures, then the extra 4 bytes per 64-bit pointer is probably just too much overhead to be considered a viable option... but yeah, in principle, I agree. ;-)

##### Share on other sites
Ah, you encountered my little aligned malloc thingy. I agree it's not pretty, it's not 64-bit compatible and so on, but it's definitely relevant. For something close to the core like a kd-tree traversal, you don't want kd-tree nodes straddling cache line boundaries. Not even if you're not in a hurry. It would just be silly. That being said, it should have been mentioned in the tutorial why this is done this way. And it should have been implemented using `_aligned_malloc` instead.
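To illustrate what "straddling" means here, a small sketch that checks whether an object of a given size crosses a line boundary. The helper name `straddles_line`, the 32-byte line size and the 8-byte object size in the usage note are assumptions for illustration, not values taken from the tutorial.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if the byte range [addr, addr + size) touches more than one
// cache line of `line_size` bytes.
inline bool straddles_line(std::uintptr_t addr, std::size_t size,
                           std::size_t line_size) {
    assert(size > 0 && line_size > 0);
    const std::uintptr_t first_line = addr / line_size;
    const std::uintptr_t last_line = (addr + size - 1) / line_size;
    return first_line != last_line;
}
```

With a 32-byte line, an 8-byte object at 0x101c covers 0x101c to 0x1023 and so touches two lines, while the same object at 0x1020 fits in one. If the array starts on a line boundary and the element size divides the line size, no element can straddle.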

I still hope you find the rest informative. ;)

##### Share on other sites
Quote:
 Original post by phantomus: That being said, it should have been mentioned in the tutorial why this is done this way.

Actually, it should not have. A tutorial is there to teach a person a concept, and possibly some high-level optimizations (e.g. using a tree of some sort to check the ray's collisions against objects as opposed to a straight foreach(object) loop). If you include a line of code in a tutorial that would take an experienced C++ coder several minutes to decipher and understand, if ever, and give no notation or explanation of what it does, then you have just made your tutorial worse, even if the intent was to help make the code faster. If a person wants to squeeze all of the performance they can out of a program after learning the concept, then teach them about aligned mallocs (certainly NOT that atrocious line) afterwards.
