Updating vertex normals on the CPU versus GPU

Started by CC Ricers
6 comments, last by CC Ricers 11 years, 1 month ago

Right now I am taking care of the final steps to make a terrain geo-mipmapping system. It creates all vertex data for every mesh LOD at loading time. There are two concerns with this: the CPU time required to calculate normals, tangents, and bitangents when creating the meshes for each terrain tile, and the memory footprint the data takes.

Because I was careless and put buffer creation inside too many loops, I ran into OutOfMemory exceptions before the code was even stable. Doing more work on the GPU would also help when I start on the terrain editor, so that vertices update quickly with minimal lag.

So to alleviate these two problems to some degree, I want to know whether there are big benefits to calculating some of the vertex data on the GPU in real time. I have a custom vertex declaration structure that's 56 bytes per vertex, and on 2048x2048 maps this can easily exceed 300 megabytes (lower LOD meshes included). That structure includes Position, Texture Coordinate, Normal, Bitangent and Tangent. It might be possible to reduce this to just Position and Texture Coordinate, which would be 20 bytes per vertex, and have the shaders figure out the other three values.

I'm using XNA, so geometry shaders are not available. I know it's possible to compute normals using partial derivatives, but I don't know how to achieve a smooth, non-faceted shading look that way.
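For reference, the partial-derivative version is roughly this one-liner in the pixel shader, and it is exactly what produces the faceted look:

```
// Flat (faceted) normal from screen-space derivatives (ps_3_0).
// Every fragment of a triangle gets the same normal, hence the faceted look.
// Depending on winding/handedness you may need to swap the cross arguments.
float3 FaceNormal(float3 worldPos)
{
    return normalize(cross(ddy(worldPos), ddx(worldPos)));
}
```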

My first thought is to have a stream with a vertex buffer containing the XZ locations, and a transformation matrix for each tile so every terrain LOD can share the same buffer. Then I could add a separate stream with Y (height) locations to push up each vertex. I'm just not sure if, or how, it's possible to look up adjacent vertices in the shader to calculate per-vertex normal data.
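One possibility I can picture is to not look up neighbouring vertices at all, but to sample the heightmap itself in the vertex shader with central differences; a sketch, assuming vs_3_0 vertex texture fetch (which DX9 hardware only supports for certain formats) and shader constants I would have to supply (TexelSize, HeightScale, GridSpacing):

```
// Smooth per-vertex normal via central differences on the heightmap.
// vs_3_0 vertex texture fetch; on DX9 hardware the heightmap usually needs
// to be a single-channel float format (e.g. R32F) with point sampling.
sampler2D HeightSampler;
float TexelSize;    // 1.0 / heightmap resolution
float HeightScale;  // world units per heightmap unit
float GridSpacing;  // world-space distance between adjacent grid vertices

float3 ComputeSmoothNormal(float2 uv)
{
    float hL = tex2Dlod(HeightSampler, float4(uv - float2(TexelSize, 0), 0, 0)).r;
    float hR = tex2Dlod(HeightSampler, float4(uv + float2(TexelSize, 0), 0, 0)).r;
    float hD = tex2Dlod(HeightSampler, float4(uv - float2(0, TexelSize), 0, 0)).r;
    float hU = tex2Dlod(HeightSampler, float4(uv + float2(0, TexelSize), 0, 0)).r;

    // Slope in X and Z, constant "up" term in Y; normalize to finish.
    return normalize(float3((hL - hR) * HeightScale,
                            2.0 * GridSpacing,
                            (hD - hU) * HeightScale));
}
```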

New game in progress: Project SeedWorld

My development blog: Electronic Meteor


Do you really need the tangent and bitangent? At the very least, you could supply just the normal and tangent, and then calculating the bitangent is just a cross product away. But unless you have algorithms that actually need the data, I would just dump them...
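For example, in the vertex shader (names illustrative):

```
// Reconstruct the bitangent in the vertex shader instead of storing it.
// The handedness factor is only needed if your UV mapping can mirror;
// for a regular terrain grid it is typically just +1.
float3 bitangent = cross(input.Normal, input.Tangent); // * handedness
```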

Have you also considered using instancing to get your vertex data into the pipeline? That would let you have just one grid representation for each of the LOD levels, with instance vertex buffers to place each tile instance in your terrain, and you could sample from a buffer / texture in your vertex shader to get the rest of the per-vertex data (height, normal, tangent, and bitangent (see above...)). This would cut your position information down by around two thirds.
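The shader inputs for that might look something like this sketch (the exact semantics and the split between streams are placeholders):

```
// Stream 0: one shared grid mesh per LOD level.
// Stream 1: per-instance tile data, advanced once per instance.
struct InstancedInput
{
    float2 GridXZ     : POSITION0;  // shared grid vertex (stream 0)
    float3 TileOrigin : TEXCOORD1;  // tile world offset (stream 1)
    float  TileScale  : TEXCOORD2;  // tile world scale  (stream 1)
};

float3 WorldPosition(InstancedInput v, float height)
{
    // Height is sampled from a per-tile texture/buffer, as described above.
    return float3(v.GridXZ.x * v.TileScale, height, v.GridXZ.y * v.TileScale)
         + v.TileOrigin;
}
```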

Hey Jason, so I took your advice and removed some of the vertex element components, like the bitangent, which I now compute in the vertex shader. I also noticed the texture coordinate wasn't being used in the shaders anymore since I switched to triplanar texture mapping, so I removed that element too, and I derive the tangent from the normal direction on the CPU. Then I reduced the space needed even further by making all the tiles of a given LOD share the same index buffer. Right now the program uses about 80 MB less memory. I could do better, but it's a start. I was mainly stuck on how to offload the tangent/bitangent calculation to the GPU because I had overlooked how bitangents are calculated; once I saw that, the rest of the optimizations became more straightforward.

When you said to use instancing for the vertex buffers, that's about what I was thinking too. Since all tiles of a given LOD look the same from the top down, only the XZ coordinates matter for the reusable buffers. So: a handful of small vertex buffers with just XZ coordinates, and each tile with its own Y-coordinate vertex buffer plus a world position to transform the tile's mesh into world space. I'm also getting ideas from this map editor, which is helping me streamline the data. Would you then use the Y values, paired with the world position, as the second vertex stream?
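Something like this sketch is what I have in mind for the shader side (semantics and names are placeholders):

```
// Stream 0: XZ grid shared by every tile of this LOD.
// Stream 1: this tile's heights, same vertex count and order as stream 0.
struct TwoStreamInput
{
    float2 GridXZ : POSITION0;  // shared, stream 0
    float  Height : TEXCOORD0;  // per tile, stream 1
};

float4x4 World;  // per-tile transform placing the grid in world space

float4 WorldPosition(TwoStreamInput v)
{
    return mul(float4(v.GridXZ.x, v.Height, v.GridXZ.y, 1), World);
}
```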

For now, though, I'll keep a non-instancing version alongside an instancing version as the best configuration, since I plan to port to MonoGame and it doesn't support instancing in its current state.

New game in progress: Project SeedWorld

My development blog: Electronic Meteor

You can even get rid of the position altogether: if you pass the 'tile' position and scale as constants to your vertex shader, you can use SV_VertexID to generate the X and Z coordinates, and then look up the Y coordinate from a shader resource view (probably a buffer...). I would just sample/load the Y data from a resource in the shader rather than adding it as another vertex buffer.
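A sketch of that idea in D3D10+ HLSL (all of the names here are illustrative):

```
// D3D10/11 sketch: no position data in the vertex buffer at all.
// Grid coordinates come from SV_VertexID, the height from a buffer SRV.
Buffer<float> HeightBuffer;  // one height per grid vertex

cbuffer TileConstants
{
    float3 TileOrigin;  // tile position, passed as a constant
    float  TileScale;   // tile scale
    uint   GridWidth;   // vertices per grid row
};

float3 WorldPosition(uint vertexID : SV_VertexID)
{
    uint x = vertexID % GridWidth;
    uint z = vertexID / GridWidth;
    float y = HeightBuffer[vertexID];
    return TileOrigin + float3(x * TileScale, y, z * TileScale);
}
```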

I think that would save a significant amount of memory, and cost very little to generate in the shader. Note that if you use instancing, you would have to pass the position and scale information as per-instance vertex data, but the memory used is still significantly lower than directly putting it into the per-vertex format.

Additionally, your vertices are too large. Our verts that contain pos+uv+nrm+tan+bin are 28 bytes. Use compressed (4 byte) encodings for normal, tangent and binormal. Use half-precision (2 bytes per component) for UVs.
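For example, a sketch of that layout and the decode step (assuming signed-normalized byte and half-float vertex element formats; if you store unsigned bytes instead, remap in the shader):

```
// 28-byte vertex layout sketch:
//   Position : float3            (12 bytes)
//   TexCoord : half2             ( 4 bytes)
//   Normal   : signed-norm byte4 ( 4 bytes)
//   Tangent  : signed-norm byte4 ( 4 bytes)
//   Binormal : signed-norm byte4 ( 4 bytes)
// Signed-normalized bytes arrive in the shader already expanded to [-1, 1];
// if you store unsigned bytes (a Color element) instead, remap with * 2 - 1:
float3 DecodePackedDirection(float4 packed)
{
    return normalize(packed.xyz * 2.0 - 1.0);
}
```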

osmanb, I only just realized you can encode the normals in a more compact format, the way you would in a G-buffer, but I've already figured out a solution for that. I completely removed UVs, since my texture mapping is now triplanar and the UV lookup is entirely normal-based and done in the shader.

Jason Z, I cannot use the SV_VertexID semantic because it's not supported in DirectX 9/XNA, but I'll keep it in mind for a future project.

New game in progress: Project SeedWorld

My development blog: Electronic Meteor

You could still use a single vertex attribute to identify the vertex order, then proceed in the same fashion. You would still remove the majority of the position data from your vertex format, replacing it with an ID sized according to the maximum number of vertices in an instance. That should fit safely within a 16-bit integer...
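In DX9 terms, the expansion might look something like this sketch (the index comes in as a regular vertex element; the constants are illustrative):

```
// DX9/XNA stand-in for SV_VertexID: store one 16-bit index per vertex
// and expand it to grid coordinates in the shader.
float GridWidth;    // vertices per grid row (shader constant)
float TileScale;
float3 TileOrigin;

float3 ExpandGridPosition(float vertexIndex, float height)
{
    float z = floor(vertexIndex / GridWidth);
    float x = vertexIndex - z * GridWidth;  // vertexIndex % GridWidth
    return TileOrigin + float3(x * TileScale, height, z * TileScale);
}
```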

Thanks for your help, everyone. I have moved the calculation of tangents and bitangents to the vertex shader with no real impact on performance. And I have settled on a vertex format to use later on: 8 bytes, as four short integers. Two are for the vertex ID and height, and two for the normal, encoded either as two half-precision values or as two normalized shorts. 16-bit precision should still be good enough for normals, I would think.
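For reference, the decode I have in mind looks something like this sketch (normalized-short variant; the tangent basis assumes terrain normals never point fully sideways):

```
// Decoding the 8-byte vertex: four shorts { vertexID, height, nX, nZ }.
// Terrain normals always point upward, so Y can be reconstructed.
float3 DecodeNormal(float2 nXZ)  // normalized shorts, already in [-1, 1]
{
    float nY = sqrt(saturate(1.0 - dot(nXZ, nXZ)));
    return float3(nXZ.x, nY, nXZ.y);
}

// Tangent frame derived in the shader rather than stored per vertex.
// Assumes a tangent roughly along +X, a common choice for terrain.
void TangentFrame(float3 n, out float3 t, out float3 b)
{
    t = normalize(float3(1, 0, 0) - n * n.x);  // Gram-Schmidt against +X
    b = cross(n, t);
}
```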

New game in progress: Project SeedWorld

My development blog: Electronic Meteor
