CDProp

A few general performance questions...

This topic is 2998 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Just a few questions from a noob:

1. I've seen some people use a texture to perform skinned animation. Instead of sending the bone matrices down as uniforms, they'll bake them into a texture, which uses just one sampler uniform. But is this really quicker? Every frame, you have to lock textures and write bone information to them, and then you add like 4 texture samples to every vertex.

2. Even if you combined the above into just 1 large texture for every character on screen, I'm usually quite worried about locks. I don't want to serialize the CPU and GPU. Am I being overly-worried?

3. I want to split my level geometry into chunks, so that I can do some frustum culling and also some LOD. I recently read a quote from John Carmack who said that, with the way hardware works today, LOD is a waste of time and it's better just to have it all in one buffer and draw the whole thing every frame. It sounds tempting, because splitting the level into chunks would require 1 draw call per chunk, whereas drawing the whole thing at once requires one draw call total. But I try to think of the expense of transforming all of those unnecessary vertices and it makes me cringe. What do you think?

To add to question #3, let's suppose I'm rendering an ocean surface. Close to the camera, I may want to resolve geometric detail as small as 10cm. But if I created a 6km grid of 10cm squares ... that's a lot of verts, man. Is it really easier to just draw the whole thing, rather than splitting the water surface into chunks and using chunks of lesser tessellation at further distances? It seems that, even without the LOD, the savings that come from frustum culling would be worth it.
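To put a number on "a lot of verts", here is a minimal sketch of the arithmetic, plus one possible distance-based rule for picking a coarser cell size. The function names and the doubling rule are assumptions made for illustration, not something from a particular engine. A 6 km grid of 10 cm cells works out to 60,001 x 60,001, or about 3.6 billion vertices, which is roughly 43 GB for positions alone if each position is three 4-byte floats.

```cpp
#include <cassert>
#include <cstdint>

// Number of vertices in a square grid, given side length and cell size.
// Lengths are in whole centimetres to avoid floating-point rounding.
std::uint64_t gridVertexCount(std::uint64_t sideCm, std::uint64_t cellCm) {
    assert(cellCm > 0 && sideCm % cellCm == 0);
    const std::uint64_t vertsPerSide = sideCm / cellCm + 1;
    return vertsPerSide * vertsPerSide;
}

// Hypothetical LOD rule: use the base cell size inside baseRangeCm, then
// double the cell size every time the distance doubles beyond that range.
std::uint64_t lodCellSizeCm(std::uint64_t distanceCm,
                            std::uint64_t baseCellCm,
                            std::uint64_t baseRangeCm) {
    assert(baseRangeCm > 0);
    std::uint64_t cell = baseCellCm;
    for (std::uint64_t d = distanceCm / baseRangeCm; d >= 1; d /= 2) {
        cell *= 2;
    }
    return cell;
}
```

For example, `gridVertexCount(600000, 10)` gives the full-resolution count for the 6 km case, while `lodCellSizeCm(45000, 10, 10000)` gives the cell size this rule would pick for a chunk 450 m away when full detail is kept out to 100 m.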

I'm not sure what Carmack was referring to, specifically, but I can attest from personal (and very recent) experience on current-gen games with high-res assets that both LODs and basic visibility/occlusion culling are still very much worth the effort.
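For reference, basic per-chunk frustum culling can be quite cheap. The following is only a sketch under assumptions: each chunk has an axis-aligned bounding box, and the frustum is given as six planes whose normals point inward. The type and function names are made up for the example.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };
// A point p is on the inner side of the plane when dot(n, p) + d >= 0.
struct Plane { Vec3 n; float d; };
struct Aabb { Vec3 lo, hi; };

// True if the whole box lies on the outer side of the plane.
bool aabbOutsidePlane(const Aabb& box, const Plane& p) {
    // Pick the box corner that lies farthest along the plane normal.
    const Vec3 farthest{
        p.n.x >= 0.0f ? box.hi.x : box.lo.x,
        p.n.y >= 0.0f ? box.hi.y : box.lo.y,
        p.n.z >= 0.0f ? box.hi.z : box.lo.z};
    const float dist = p.n.x * farthest.x + p.n.y * farthest.y +
                       p.n.z * farthest.z + p.d;
    return dist < 0.0f;
}

// Conservative test: may keep a few chunks that are actually off screen,
// but never rejects a chunk that intersects the frustum.
bool chunkVisible(const Aabb& box, const std::array<Plane, 6>& frustum) {
    assert(box.lo.x <= box.hi.x && box.lo.y <= box.hi.y && box.lo.z <= box.hi.z);
    for (const Plane& p : frustum) {
        if (aabbOutsidePlane(box, p)) {
            return false;
        }
    }
    return true;
}
```

You would call `chunkVisible` once per chunk per frame and only issue the draw call for chunks that pass, so the cost is six dot products per chunk in the worst case.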

I thought so, and thank you very much. It's probable that I misunderstood Carmack.

I suppose I could have experimented to find this out for myself, but that takes a bit of work and I'm not confident that the results I would see would truly answer my question. It seems probable at this stage that I would mess something else up, unknowingly, that would affect my results. So thanks!

Quote:
Original post by CDProp
1. I've seen some people use a texture to perform skinned animation. Instead of sending the bone matrices down as uniforms, they'll bake them into a texture, which uses just one sampler uniform. But is this really quicker? Every frame, you have to lock textures and write bone information to them, and then you add like 4 texture samples to every vertex.


There are a few things to consider here:

1. Indexing into shader constants forces vertex shader executions to stall due to an issue known as "shader constant waterfalling". Basically, since constant fetches are synchronous reads, they have to be serialized whenever the vertices being processed together each need a different constant value.

2. Texture fetches are asynchronous cached reads. This means there's potentially a lot of latency and some bandwidth use for cache misses, but the scheduler can potentially hide that latency by switching to another thread.

3. In DX9 you don't have constant buffers; instead you have a flat register file. This limits the number of bones you can use in a single draw, which means you may have to split meshes.

4. With textures your number of bones is only limited by the max dimensions of a texture, which gives you wayyy more than enough to work with.

5. In DX9, shader constant values go into the command buffer, which slows things down if you set them a lot.

6. Textures are resources even in DX9, which means you can just lock and memcpy.

7. Vertex texturing is dog-slow on GeForce 6/7 hardware. Any DX10-capable GPU does it just fine.
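To make points 3, 4 and 6 concrete, here is a rough CPU-side sketch of how bones might be laid out in a texture. Everything in it is an assumption for illustration: one 4x3 matrix per bone stored as three float4 texels, bones packed in row-major texel order, and a 256-register constant file with some registers reserved for other data. The vector returned by `packBones` stands in for the block of memory you would memcpy into the locked texture.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Float4 { float x, y, z, w; };
struct Bone4x3 { Float4 row[3]; };        // affine bone transform, 3 rows
struct TexelCoord { std::size_t x, y; };

constexpr std::size_t kTexelsPerBone = 3; // one float4 texel per matrix row

// Texel that holds the given row of the given bone (row-major packing).
TexelCoord boneTexel(std::size_t bone, std::size_t row, std::size_t width) {
    assert(row < kTexelsPerBone && width > 0);
    const std::size_t linear = bone * kTexelsPerBone + row;
    return TexelCoord{linear % width, linear / width};
}

// Builds the staging copy of the bone texture.
std::vector<Float4> packBones(const std::vector<Bone4x3>& bones,
                              std::size_t width, std::size_t height) {
    assert(bones.size() * kTexelsPerBone <= width * height);
    std::vector<Float4> texels(width * height, Float4{0.0f, 0.0f, 0.0f, 0.0f});
    for (std::size_t i = 0; i < bones.size(); ++i) {
        for (std::size_t r = 0; r < kTexelsPerBone; ++r) {
            const TexelCoord t = boneTexel(i, r, width);
            texels[t.y * width + t.x] = bones[i].row[r];
        }
    }
    return texels;
}

// Bone budget when matrices live in float4 constant registers.
std::size_t maxBonesInConstants(std::size_t float4Registers,
                                std::size_t reservedRegisters) {
    assert(reservedRegisters <= float4Registers);
    return (float4Registers - reservedRegisters) / kTexelsPerBone;
}

// Bone budget when matrices live in a width x height float4 texture.
std::size_t maxBonesInTexture(std::size_t width, std::size_t height) {
    return (width * height) / kTexelsPerBone;
}
```

With 256 registers and, say, 16 of them reserved, `maxBonesInConstants(256, 16)` allows 80 bones per draw, while even a small 64x64 texture holds 1365 under the same layout. The vertex shader would do the same index math as `boneTexel` to find its texels.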

So the word isn't necessarily "faster", but "different". You should check out this presentation, it has some good information.

Quote:
Original post by CDProp
LOD is a waste


IMO it is not. Even if we step away from the optimization point of view, there is the quality/antialiasing aspect: a LOD of a distant object will fight aliasing a ton better than any AA solution.
Besides, there would be a performance hit too, especially today, when pixel shaders keep getting heavier: lots of small triangles in the distance will cut your pixel shader performance down to 25% very quickly.
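The 25% figure presumably comes from the fact that GPUs shade pixels in 2x2 quads, so a triangle that covers a single pixel still pays for four pixel shader invocations. A small sketch of that arithmetic follows. The function name is made up, and the model ignores everything except quad occupancy.

```cpp
#include <cassert>

// Fraction of pixel shader invocations that produce a visible pixel,
// given how many pixels a triangle covers and how many 2x2 quads it touches.
double quadShadingEfficiency(unsigned coveredPixels, unsigned touchedQuads) {
    assert(touchedQuads > 0 && coveredPixels <= 4u * touchedQuads);
    return static_cast<double>(coveredPixels) / (4.0 * touchedQuads);
}
```

A one-pixel triangle gives `quadShadingEfficiency(1, 1)`, which is 0.25, and a sliver covering two pixels in two different quads is just as bad. A large triangle that mostly fills its quads approaches 1.0, which is why a coarser LOD in the distance helps pixel cost as well as vertex cost.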
