Fast grass rendering

16 comments, last by Bagel Man 18 years, 8 months ago
Quote:Original post by spek
About building those matrices, you mean I should pass arrays with the positions, rotations, and maybe one with scales?

That would mean to keep a dedicated VBO per terrain patch. While this will work (and doesn't require CPU interference), it will take more memory and is less flexible.

If you have SM 3.0 capable hardware, then one could build it like this:

* Create a single generic VBO that contains a number of origin-centered squares, all aligned along the z axis. In addition to the vertex positions, each vertex gets a vertex attribute that assigns it to a patch ID. So, all four vertices forming a quad would get the same ID. This is so that the vertex shader can see which patch a vertex belongs to.

* Create a patch attribute texture map: a group of texels represents one patch, and contains data about its position, rotation and scale, as well as type (index into a texture atlas), and several other optional parameters.

* When rendering a terrain patch set, bind the corresponding grass patch attribute map so that you can access it in a vertex shader. Then, render the shared VBO, with as many quads as you need for the current patch.

* The vertex shader will use the patch ID index from the vertex to address the right attribute texels in the attribute map. It will then read back the patch attributes, and use them (combined with the current camera matrix) to form a billboard alignment matrix. Each pregenerated standard quad vertex gets transformed by this matrix. Finally, additional vertex parameters, such as texcoords, colours, etc, are also generated using the patch attributes.
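To make the last step more concrete, here is a minimal CPU-side sketch in Python of what such a vertex shader might compute per vertex. It is not actual shader code: the dictionary lookup stands in for the vertex texture fetch, the names `QUAD_CORNERS`, `billboard_vertex` and `expand_patch` are made up for illustration, and the per-patch rotation is left out so the quad simply faces the camera.

```python
# Unit quad corners, origin centered, as stored once in the shared VBO.
QUAD_CORNERS = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]


def billboard_vertex(corner, patch_pos, scale, cam_right, cam_up):
    """Place one quad corner in world space so the quad faces the camera.

    corner            : (x, y) of the pregenerated origin-centered quad vertex
    patch_pos         : world-space position read from the attribute map
    scale             : (width, height) read from the attribute map
    cam_right, cam_up : unit camera basis vectors in world space
    """
    sx = corner[0] * scale[0]
    sy = corner[1] * scale[1]
    return tuple(
        patch_pos[i] + cam_right[i] * sx + cam_up[i] * sy for i in range(3)
    )


def expand_patch(patch_id, attribute_map, cam_right, cam_up):
    """Emulate the shader for the four vertices that share one patch ID."""
    attrs = attribute_map[patch_id]  # stands in for the vertex texture read
    return [
        billboard_vertex(c, attrs["pos"], attrs["scale"], cam_right, cam_up)
        for c in QUAD_CORNERS
    ]
```

With a camera looking down the z axis (`cam_right = (1, 0, 0)`, `cam_up = (0, 1, 0)`), a patch at `(10, 0, 5)` with scale `(2, 4)` expands into a quad spanning x from 9 to 11 and y from -2 to 2.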

Of course, depending on how cleverly you compress the attributes, this can take a few texture reads in the VS. And such reads are not (yet) as fast as one would like them to be. On the other hand, the fact that vertex attributes are stored as textures opens a whole new world for animation: you can use the GPU to modify the map. Using pixel shader tricks, you can animate the grass patches without ever needing the CPU (by rendering the original attribute map to a second one, and swapping them - think double buffering).
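The double-buffering idea can be sketched on the CPU as follows. This is only an illustration under assumptions: each texel is taken to hold a sway angle and angular velocity for one patch, the damped-spring update is an arbitrary stand-in for whatever the pixel shader would do, and `simulate_pass` and `animate` are hypothetical names.

```python
def simulate_pass(src, dst, dt, stiffness=4.0, damping=0.5):
    """One 'render' pass: read every texel from src, write the result to dst.

    Each texel holds (sway_angle, angular_velocity) for one grass patch.
    src is never written, just as a bound texture cannot be its own target.
    """
    for i, (angle, velocity) in enumerate(src):
        accel = -stiffness * angle - damping * velocity
        new_velocity = velocity + accel * dt
        dst[i] = (angle + new_velocity * dt, new_velocity)


def animate(initial, steps, dt=0.1):
    """Ping-pong between two attribute maps for the given number of passes."""
    front = list(initial)                # map currently read by rendering
    back = [(0.0, 0.0)] * len(initial)   # map being rendered into
    for _ in range(steps):
        simulate_pass(front, back, dt)
        front, back = back, front        # swap roles, like double buffering
    return front
```

The point of the two buffers is that a pass never reads a texel it has already overwritten, which is the same constraint a render-to-texture pass has on real hardware.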
I don't know what SM 3 exactly is... I have a GeForce 5700 ultra at the moment.

The attribute map, could that be a 1D or 2D image with for example,
pixel1.xyz=position
pixel2.xyz=rotation
pixel1.w=material ID
<and maybe more stuff like width/height or something>
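A layout like the one asked about could be packed as two RGBA texels per patch. The sketch below is one possible arrangement, not a prescribed format: putting the scale into `pixel2.w`, assuming a floating-point texture, and the names `pack_patches`, `texel_coord` and `read_patch` are all assumptions made for illustration.

```python
TEXELS_PER_PATCH = 2  # texel 0: position + material ID, texel 1: rotation + scale


def pack_patches(patches):
    """Flatten patch records into a linear list of RGBA texels."""
    texels = []
    for p in patches:
        px, py, pz = p["position"]
        rx, ry, rz = p["rotation"]
        texels.append((px, py, pz, float(p["material_id"])))
        texels.append((rx, ry, rz, p["scale"]))
    return texels


def texel_coord(patch_id, texel, map_width):
    """Return (column, row) of a patch texel in a 2D map of the given width."""
    linear = patch_id * TEXELS_PER_PATCH + texel
    return linear % map_width, linear // map_width


def read_patch(texels, patch_id):
    """Inverse of pack_patches for one patch, as a shader lookup would do."""
    t0 = texels[patch_id * TEXELS_PER_PATCH]
    t1 = texels[patch_id * TEXELS_PER_PATCH + 1]
    return {
        "position": t0[:3],
        "material_id": int(t0[3]),
        "rotation": t1[:3],
        "scale": t1[3],
    }
```

For a 2D map, `texel_coord` turns the patch ID into a column and row, so the second texel of patch 3 in a map 4 texels wide sits at column 3, row 1.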

As far as I know, vertex shaders on my card can't read images, only the pixel shaders can (is that what you mean by SM3?). But even if my card could do that, it would still need the CPU in my case, I think. Not every frame, but once the camera moves and new grass comes into sight (or goes out of sight), this map has to be recreated and uploaded to the video card, I think. Or is this not what you mean?

Anyway, I think passing data like position with something like glDrawArrays in combination with this VBO would already be a big improvement. By the way, would the usage of indices improve the rendering speed in this case?

Thanks!
Rick
I think it may only be supported on some of the newer graphics cards and I don't know what the technology is called, but I remember watching the premiere of the GeForce 6800s. The speaker talked about some kind of geometry technology that did fast rendering of multiple instances of the same object. I think one of the examples was an asteroid field.

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]


sm3 = shader model 3

it is only supported on newer cards (some functionality needing beta drivers to function appropriately, or close to appropriately)

I don't believe your card (5700) supports sm3 - it looks like your solution will have to differ from some of the suggestions for implementation
SM3 is shader model 3, and is supported on GeForce 6 series graphics cards and up.

As a result I don't believe that you will have much luck with the texture generating method.
Quote:Original post by spek
As far as I know, vertex shaders on my card can't read images, only the pixel shaders can (is that what you mean by SM3?). But even if my card could do that, it would still need the CPU in my case, I think. Not every frame, but once the camera moves and new grass comes into sight (or goes out of sight), this map has to be recreated and uploaded to the video card, I think. Or is this not what you mean?

You don't need to touch the map while rendering at all (unless you want to animate the grass). Each terrain tile has its own map (or submap) that remains static. The grass system has to be integrated into your terrain engine, as it will share the visibility culling with the terrain tiles. Geomipmapping would be perfect. Visibility is determined hierarchically for each terrain tile, using frustum culling, occlusion culling, whatever. If a tile is visible, all grass on it is also assumed visible. A few patches will obviously be outside the view, but it's going to be faster to let the GPU cull those away than to fiddle around with data in VRAM using the CPU.
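A rough sketch of the per-tile decision, in Python for readability. It makes several simplifying assumptions: tiles are tested in a flat loop rather than hierarchically, each tile is approximated by a bounding sphere, frustum planes are given as `(nx, ny, nz, d)` with normals pointing inward, and the tile fields and function names are invented for the example.

```python
def sphere_in_frustum(center, radius, planes):
    """True unless the sphere lies entirely behind one of the planes."""
    for nx, ny, nz, d in planes:
        distance = nx * center[0] + ny * center[1] + nz * center[2] + d
        if distance < -radius:
            return False
    return True


def visible_grass_draws(tiles, planes):
    """Collect (attribute_map, quad_count) for every visible terrain tile.

    Grass is never culled per patch on the CPU: if the tile passes the test,
    all of its grass is submitted and the GPU clips whatever is off screen.
    """
    draws = []
    for tile in tiles:
        if sphere_in_frustum(tile["center"], tile["radius"], planes):
            draws.append((tile["attribute_map"], tile["grass_count"]))
    return draws
```

Each returned pair corresponds to one "bind the tile's attribute map, render that many quads from the shared VBO" step. The static per-tile maps are never edited here, only selected.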

But this approach will not work on your GF 5700, since it doesn't support vertex texture access. Still, you can use the "one VBO per tile" approach, streaming in the data over vertex streams. The results will be the same, although it will take more memory. You could even animate it on the GPU, by using the render-to-vertex-array feature (which is supported by your chipset).

Quote:Original post by spek
Anyway, I think passing data like position with something like glDrawArrays in combination with this VBO would already be a big improvement. By the way, would the usage of indices improve the rendering speed in this case?

Most probably. In your specific case the speedup will be less pronounced than in common rendering applications (where glDrawArrays should be avoided like the plague), because you share considerably fewer vertices. But it would still be advisable to switch to indexed VBOs.
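To put numbers on the sharing, here is a small sketch that builds an index list for quads drawn as two triangles each. It assumes triangle lists (with GL_QUADS the non-indexed case would also need only four vertices per quad), and `quad_indices` and `vertex_counts` are illustrative names.

```python
def quad_indices(quad_count):
    """Index list drawing each quad as two triangles over four vertices."""
    indices = []
    for q in range(quad_count):
        base = 4 * q
        # Triangles (0, 1, 2) and (0, 2, 3) share the diagonal 0-2.
        indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    return indices


def vertex_counts(quad_count):
    """Vertices that must be stored with and without an index buffer."""
    return {"indexed": 4 * quad_count, "non_indexed": 6 * quad_count}
```

Only two of the four corners per quad are reused, so the saving is a third of the vertex data, which is modest next to a closed mesh where most vertices are shared by several triangles.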
Quote:Original post by Yann L
Quote:Original post by spek
How to sort on depth on the GPU to avoid Z-problems (in case I don't want to use alpha-test)?

That's tough. For now, I wouldn't worry about sorting, and use alpha test instead.


Out of interest how would you sort them or at least have alpha blending work correctly without needing a Z-sort?

There is one way I can think of (well, know of), but to do it you'd need to hold the positions of all the grass blades in a texture, which you update over many passes that perform the sort.
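The post does not say which algorithm is meant. As one illustration of a sort that fits the "many passes over a texture" pattern, the sketch below uses odd-even transposition sort, emulated on the CPU in Python: every output slot looks at exactly one neighbour per pass, which is the kind of work a pixel shader could do per texel. The function names are hypothetical, and plain depth values stand in for the position texels.

```python
def sort_pass(src, pass_index):
    """One pass: each output slot compares itself with a single neighbour."""
    n = len(src)
    dst = list(src)
    for i in range(n):
        if i % 2 == pass_index % 2:
            partner = i + 1   # this slot is the lower half of its pair
            keep_far = True   # lower slot keeps the larger depth
        else:
            partner = i - 1   # this slot is the upper half of its pair
            keep_far = False  # upper slot keeps the smaller depth
        if 0 <= partner < n:
            a, b = src[i], src[partner]
            dst[i] = max(a, b) if keep_far else min(a, b)
    return dst


def sort_back_to_front(depths):
    """Run as many passes as there are elements, which suffices for this sort."""
    buf = list(depths)
    for p in range(len(buf)):
        buf = sort_pass(buf, p)  # read the old buffer, write a new one
    return buf
```

This needs one pass per element, so it is only a way to show the read-compare-write structure. A sorting network with fewer passes would be the more practical choice for large patch counts.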
I've found that when rendering a large number of semi-transparent overlapping objects (grass in particular) it can help to render them last and just turn off z-buffer writes. Even intersecting planes seem to look acceptable in most cases. Not sure if it would work for your case though.

This topic is closed to new replies.
