http://gamedeveloper...1109?pg=37#pg37

The article suggests you could use a GPU to accelerate large-scale occlusion detection. I think I get their general approach: you have a scene containing a number of objects which are in themselves complex, like people or buildings. If one of these objects is completely obscured by another object, you'd prefer not to even try to render it -- maybe these objects are so complex that even with z-buffer testing you're spending a lot by trying to draw them only to have their pixels discarded. So before you actually draw, you render a simplified version of the scene -- instead of the building, you draw a plain box with the same dimensions as the building -- and then you use a shader to report back which of the boxes wound up visible in the final scene. Okay, that makes sense. But then I start thinking about how to implement the technique they describe and I just wind up feeling like an idiot.
[font="Verdana, Helvetica, sans-serif"][/font]
[font="Verdana, Helvetica, sans-serif"]Here's where I'm lost:[/font]
[font="Verdana, Helvetica, sans-serif"][/font]
[font="Verdana, Helvetica, sans-serif"]- When do you do this? Are they really proposing doing this every frame?[/font]
[font="Verdana, Helvetica, sans-serif"]- Once the GPU has created the list of "here's what is and isn't occluded", the CPU has to act on it. Is it really cheaper to ship back data every frame from the GPU to the CPU than to do the occlusion tests in CPU? I thought shipping data GPU->CPU was basically the most expensive thing you could do.[/font]
[font="Verdana, Helvetica, sans-serif"]- My big point of confusion. They say:[/font][font="Verdana, Helvetica, sans-serif"]
[/font]
[font="Verdana, Helvetica, sans-serif"]
After dispatching the compute shader to test all the bounds for visibility, you'll have time to do some processing on the CPU while you wait for the results... Once the compute shader finishes, it's just a matter of locking the writable buffer from steps 3-5, and copying the float values representing visible or hidden objects into a CPU-accessible buffer so that a final list of visible objects can be prepared for rendering.[/quote][/font][font="Verdana, Helvetica, sans-serif"]
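
On that readback question: the only way I know of to copy data GPU->CPU in OpenGL without stalling the pipeline is a pixel buffer object, so maybe that's the kind of thing they mean by "you'll have time to do some processing on the CPU while you wait". A sketch, assuming the visibility flags end up in a 16x16 RGBA texture -- the PBO calls are real GL, but the framing around them is my guess:

[code]
// Kick off an asynchronous readback of the visibility results.
// With a GL_PIXEL_PACK_BUFFER bound, glReadPixels returns immediately
// and the copy happens in the background.
GLuint pbo;
glGenBuffers(1, &pbo);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
glBufferData(GL_PIXEL_PACK_BUFFER, 16 * 16 * 4, NULL, GL_STREAM_READ);
glReadPixels(0, 0, 16, 16, GL_RGBA, GL_UNSIGNED_BYTE, 0); // 0 = byte offset into the PBO

// ...do other CPU work here while the copy is in flight...

// Later (ideally a frame later), map the buffer and inspect the flags.
const unsigned char* flags =
    (const unsigned char*)glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
int objectIndex = 0;                              // whichever object we're asking about
bool objectVisible = flags[objectIndex * 4] != 0; // red channel = visible
glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
[/code]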

They suggest storing the floats in a "writable buffer". I don't think I've ever heard of that concept. They also refer (sec. 3) to a "compute shader", a piece of vocabulary I'm not familiar with. Are these DirectX things? Do they have OpenGL analogues? It sounds as if they are running a shader program that has access to a plain array, can sample a texture at will, and can write into random-access places in that array. Individual indexes in the array correspond to the objects being tested for occlusion. Somehow the shader also has access to the bounding boxes of the individual objects it's working on: they have a diagram where they show a 3D object and then draw a bounding box for the object on the screen, and they seem to think they can get this bounding-box information for any given object -- in fact, they seem to believe getting it is so easy they don't even bother telling you how. Once they have this box, they plan to test its four corners for visibility and conclude that if any corner is visible, so is the object. (That doesn't sound like a fair assumption for a sphere, but...) Are these techniques that translate to OpenGL-land at all?
[font="Verdana, Helvetica, sans-serif"][/font]
[font="Verdana, Helvetica, sans-serif"]The closest I can get to making something like this work with the OpenGL tools I know of are:[/font][font="Verdana, Helvetica, sans-serif"]
[/font]
[font="Verdana, Helvetica, sans-serif"]1. Render the "simplified" scene such that the depth buffer goes into a texture.[/font]
[font="Verdana, Helvetica, sans-serif"]2. Render a very small scene (like, 16x16) consisting of one polygon covering the screen, with the vertex shader just passing a varying to each pixel which tells it its screen coordinate. This will give me 256 runs into a pixel shader.[/font]
[font="Verdana, Helvetica, sans-serif"]3. Each pixel shader runthrough uses its coordinate to determine which object it's responsible for testing; it samples the depth texture from (1), computes whether the object was visible or not, and writes either black or white at its pixel. [/font]
[font="Verdana, Helvetica, sans-serif"]4. Copy the small 16x16 texture back to the CPU and read each pixel to decide whether to draw the object.[/font]
[font="Verdana, Helvetica, sans-serif"][/font]
[font="Verdana, Helvetica, sans-serif"]...but, this doesn't work because I have NO IDEA how I would get the "bounding box", or, if I magically had a list of screen-bounding-boxes-for-objects. how I would pass this list of information into the pixel shader.[/font]
[font="Verdana, Helvetica, sans-serif"]
[/font][font="Verdana, Helvetica, sans-serif"]Am I missing anything?![/font]
[font="Verdana, Helvetica, sans-serif"][/font]
[font="Verdana, Helvetica, sans-serif"]Thanks for any responses! I don't actually exactly plan to use this technique, but I do want to develop my shader programming skillset to the point where I can at least UNDERSTAND an article like this :/[/font]