Making an octree with OpenCL, thread-safe issues

12 comments, last by spek 11 years, 4 months ago
I can just say thanks for the paper (I kind of missed it) and you earned at least +1.

To the topic: I haven't worked much with OpenGL compute shaders yet (I'm currently spending more time in the wonderful land of OpenCL), but couldn't you just store everything in a single 3D buffer, where each node isn't just a color but a whole ambient cube around the point? With some additional data I think you could fit it into 128 bytes per node: 8 * 4 bytes (for the child 'pointers') + 6 * 16 bytes for the colors = 32 + 96 = 128. Yeah, it could work, and in the case of textures you could actually put the whole thing into an RGBA32F texture.
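A minimal sketch of that 128-byte node, just to check the arithmetic. The type and field names (Rgba32f, OctreeNode, child, ambient), the use of indices instead of raw pointers, and the face order are all assumptions for illustration, not something settled in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* One color with the same footprint as a single RGBA32F texel (16 bytes). */
typedef struct {
    float r, g, b, a;
} Rgba32f;

/* Hypothetical node layout: names and face order are illustrative only. */
typedef struct {
    uint32_t child[8];    /* 8 * 4 bytes  = 32: indices into the node buffer */
    Rgba32f  ambient[6];  /* 6 * 16 bytes = 96: one color per cube face      */
} OctreeNode;

/* Fail at compile time if the layout does not add up to 128 bytes. */
static_assert(sizeof(Rgba32f) == 16, "one color should match one RGBA32F texel");
static_assert(sizeof(OctreeNode) == 128, "node should be 32 + 96 = 128 bytes");

/* How many RGBA32F texels one node would occupy if stored in a texture. */
enum { TEXELS_PER_NODE = sizeof(OctreeNode) / sizeof(Rgba32f) };
```

With this layout a node spans 8 consecutive RGBA32F texels: 2 for the child indices and 6 for the ambient cube.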

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

Thanks a lot, the tiny stinky details are becoming clear now. I promise if, IF!, I manage to get it working, I'll try to write it all down in small "understandable" bits. The interpolation part makes sense; sampling 8 voxels instead of 27 sounds a lot better. So each node, including those from the higher "mipmapped" levels, contains a brick... or actually multiple bricks, in case we want to store light fluxes from 6 directions or store them as SH.

@Vilem
That would certainly be possible, although you won't benefit from hardware tri-linear filtering when sampling colors during raymarching later on. I believe that's pretty much the main reason why they keep one or more separate 3D textures (I already asked this a while ago and got that for an answer :) ). Nevertheless, I'll start with storing colors in the octree buffer itself first, just for easy testing.
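For reference, this is roughly what has to be done by hand when the colors live in a plain buffer instead of a 3D texture: fetch the 8 surrounding values and blend them. A minimal single-channel sketch; the function names lerpf and trilerp and the c[z][y][x] corner ordering are my own assumptions.

```c
/* Linear blend between a and b, with t in [0, 1]. */
static float lerpf(float a, float b, float t)
{
    return a + (b - a) * t;
}

/* Manual tri-linear filter over 8 corner values.
 * c[z][y][x] holds the corners; fx, fy, fz are the fractional
 * position inside the cell, each in [0, 1]. */
static float trilerp(const float c[2][2][2], float fx, float fy, float fz)
{
    /* 4 blends along x */
    float x00 = lerpf(c[0][0][0], c[0][0][1], fx);
    float x01 = lerpf(c[0][1][0], c[0][1][1], fx);
    float x10 = lerpf(c[1][0][0], c[1][0][1], fx);
    float x11 = lerpf(c[1][1][0], c[1][1][1], fx);

    /* 2 blends along y */
    float y0 = lerpf(x00, x01, fy);
    float y1 = lerpf(x10, x11, fy);

    /* 1 blend along z */
    return lerpf(y0, y1, fz);
}
```

That is 8 fetches and 7 blends per channel per sample, which the texture unit would otherwise do for free.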

Thanks, you guys, I'm finally getting a grip on this thing (I think).
Enjoy your weekends!
* Note about semaphores & atomics (for those who struggle as well hehe)
I thought my videocard didn't support them because the card driver would crash/time out each time. But that was a bug on my side: you can't have multiple "workers" in the same wavefront / warp spinning on a semaphore. AFAIK they all follow the same execution path in lockstep, so if one waits in a loop until the semaphore gets unlocked, the worker holding the lock never gets to release it and the whole wavefront / warp waits... forever.

If that's the case, I guess the splitting process described above can only use 1 worker per wavefront/warp. At least, I'm still using a semaphore to allocate subnodes:

if ( parentNode.markedForSplitting )
{
    // Lock, read the counter, reserve 8 child slots, unlock
    getSemaphor( &lock );
    int firstIndex = _globalVarNodeCounter[0]; // use a global int to keep track of the used nodes
    _globalVarNodeCounter[0] += 8;
    releaseSemaphor( &lock );

    parentNode.childPtr1 = &octreeNodes[ firstIndex ];
    ...
}


However... the OpenCL docs say that the "atomic_add" function returns the old value. So the code above could be replaced with:

if ( parentNode.markedForSplitting )
{
    // atomic_add returns the counter value from before the addition
    int firstIndex = atomic_add( &_globalVarNodeCounter[0], 8 );
    parentNode.childPtr1 = &octreeNodes[ firstIndex ];

    ...
}

Correct me if I'm telling crap though, it's all a bit new to me too.
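The "returns the old value" behaviour can be checked on the host with C11 atomics, where atomic_fetch_add behaves the same way in that respect. This is only a sketch in plain C, not OpenCL kernel code; the helper name alloc_children and the pool_size bound check are my own additions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CHILDREN_PER_NODE 8

/* Reserve 8 consecutive node slots without a lock.
 * Returns the index of the first reserved slot, or -1 if the pool
 * (pool_size slots in total) cannot hold another block of 8. */
static int alloc_children(atomic_int *node_counter, int pool_size)
{
    assert(node_counter != NULL);

    /* Like OpenCL's atomic_add, atomic_fetch_add returns the value the
     * counter held *before* the addition, so every caller gets its own
     * non-overlapping range [first, first + 8). */
    int first = atomic_fetch_add(node_counter, CHILDREN_PER_NODE);

    /* The counter has already been bumped at this point; on overflow we
     * only report failure and never touch the out-of-range slots. */
    if (first + CHILDREN_PER_NODE > pool_size)
        return -1;

    return first;
}
```

Starting from a counter of 0 and a pool of 20 slots, three calls return 0, 8 and -1. Because there is no lock, there is also no spin loop that could stall a wavefront / warp.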
Sorry to bring this thread back to life, but I thought this question was related, so maybe it's better to keep it all together. And I was helped so well here last time :P

I managed to get VCT "working", but without smooth interpolation / bricks yet. I just stored the outgoing voxel radiance directly in the octree nodes. The performance was horrible btw, but that probably also has to do with the age of my videocard. Anyway, about bricks and maybe OpenCL in general...

* How to draw brick pixels on an image?
I know how to use write_imagef, but the problem is that multiple voxels may write to the same brick, so I need some sort of blending rather than just overwriting. I thought using a "max filter" would be best, though an average may do as well. Additive blending is not a good option in my case because some spots will get affected by more voxels than others. The real question is, is it possible to apply a blending method when writing pixels via OpenCL?

* If not, how did the VCT guys deal with multiple voxels on one node/brick, or did they pre-filter the voxels somehow to assure only 1 would be inserted per node?

* I can't write to volume textures to make bricks btw; the compiler also warns that it doesn't know the "cl_khr_3d_image_writes" extension. Am I doing something wrong, or is it quite possible that my nVidia GeForce 9800M (~2009) just doesn't support it? A workaround would be to write to a 2D texture first, then let regular shaders construct a volume texture out of it afterwards.

* Another way to create bricks & work around the volume-texture issue
I could write colors into the octree struct first (like I already did), then let ordinary shaders convert that buffer to a brick 3D texture somehow. But... again, I still need to blend multiple voxels at a single node (corner). I tried the "atomic_max" function, but as far as I can see, that doesn't work on my structs in the __global address space. Aside from that, I fear some hang-ups with atomic operations.
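On the "max filter": atomic_max works on 32-bit integers rather than on whole structs, so one possible workaround is to give every color channel its own integer slot. For non-negative floats the raw bit pattern sorts the same way as the value, so an integer max on the bits is also a float max. Below is a host-side C11 sketch of that idea using a compare-and-swap loop; the helper names are my own, and in a kernel this would presumably be an atomic_max on as_uint(value) instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern and back. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Raise *slot to max(*slot, value), where *slot stores float bits.
 * Only valid for values with a clear sign bit (>= +0.0). */
static void atomic_max_nonneg_float(_Atomic uint32_t *slot, float value)
{
    uint32_t desired = float_bits(value);
    assert((desired & 0x80000000u) == 0);

    uint32_t current = atomic_load(slot);
    /* Retry until our value is stored or someone else stored a larger one.
     * A failed exchange refreshes `current` with the latest contents. */
    while (current < desired &&
           !atomic_compare_exchange_weak(slot, &current, desired)) {
    }
}
```

Writing 0.25, 0.75 and 0.5 into the same slot in any order leaves 0.75 behind. Radiance values are never negative, so the sign-bit restriction should not hurt here.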


Well, I just think I'm not shitting bricks in a proper way. Anyone managed to implement this part of the VCT technique?


EDIT
--------------
Now I see section 22.4.2 of the paper Eternal posted here above explains some of my "blending" questions... Seems they use atomic operations to first sum, then later divide to get an average. Got to read it further though.
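A rough host-side sketch of the "sum first, divide later" idea, using C11 atomics instead of OpenCL ones. This is not the paper's exact scheme; the struct layout (separate 32-bit sums per channel plus a counter) and all names are assumptions of mine.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Running sums for one brick texel (hypothetical layout). */
typedef struct {
    atomic_uint sum_r, sum_g, sum_b;
    atomic_uint count;
} TexelAccum;

typedef struct {
    uint8_t r, g, b;
} Rgb8;

static void accum_reset(TexelAccum *t)
{
    assert(t != NULL);
    atomic_init(&t->sum_r, 0u);
    atomic_init(&t->sum_g, 0u);
    atomic_init(&t->sum_b, 0u);
    atomic_init(&t->count, 0u);
}

/* Pass 1: every voxel that lands on this texel adds its color.
 * The order of the additions does not matter. */
static void accum_add(TexelAccum *t, Rgb8 c)
{
    atomic_fetch_add(&t->sum_r, c.r);
    atomic_fetch_add(&t->sum_g, c.g);
    atomic_fetch_add(&t->sum_b, c.b);
    atomic_fetch_add(&t->count, 1u);
}

/* Pass 2: after all voxels are written, divide once to get the average. */
static Rgb8 accum_resolve(TexelAccum *t)
{
    Rgb8 out = {0, 0, 0};
    unsigned n = atomic_load(&t->count);
    if (n == 0)
        return out; /* nothing was written to this texel */
    out.r = (uint8_t)(atomic_load(&t->sum_r) / n);
    out.g = (uint8_t)(atomic_load(&t->sum_g) / n);
    out.b = (uint8_t)(atomic_load(&t->sum_b) / n);
    return out;
}
```

Unlike additive blending, spots that receive more voxels do not end up brighter, because each texel is divided by its own count. With 8-bit inputs, a 32-bit sum has room for millions of contributions per texel.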


Cheers,
Rick

This topic is closed to new replies.
