UpdateSkinnedMesh vs GPU Update

Started by
6 comments, last by InvalidPointer 12 years, 2 months ago

[font=verdana,geneva,sans-serif]Hi~~~[/font]

[font=verdana,geneva,sans-serif]It's a simple question...[/font]

[font=verdana,geneva,sans-serif]Lock()[/font]

[font=verdana,geneva,sans-serif]UpdateSkinnedMesh() vs. GPU (.fx) blend-weight update[/font]

[font=verdana,geneva,sans-serif]Unlock()[/font]

[font=verdana,geneva,sans-serif]Which one is more efficient?[/font]

[font=verdana,geneva,sans-serif]Simple, right? ^^a[/font]

The only accurate answer for questions like this is that you need to measure the performance of both options on the hardware you care about and compare the results.

Performance will vary based on details like:

- How many times per frame are you rendering the mesh?
- What GPU and CPU do you have?
- How many bones are there?
- How many vertices are there?
- Do you need to read back the results on the CPU?
- Which thread is this being calculated on?
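As a starting point for the measurement suggested above, here is a minimal timing sketch. The helper name `average_ms` is hypothetical, not part of any library. Note that wall-clock timing around GPU draw calls typically captures only the cost of submitting the commands, so for the GPU path it is usually more meaningful to compare whole-frame times.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not a library function): runs `work` `iterations`
// times and returns the mean wall-clock time per call in milliseconds.
template <typename Fn>
double average_ms(Fn&& work, int iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / iterations;
}
```

You would wrap each candidate update path in a callable, run both on the target machine with the same mesh and bone count, and compare the averages.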

[font=verdana,geneva,sans-serif]My answer is: all conditions are the same...[/font]

I'm going to echo Adam here and say 'depends what else you're doing.' 'Same' does not answer questions like the relative power of the GPU/CPU combination(s) installed, extra data availability requirements or additional workload.
clb: At the end of 2012, the positions of jupiter, saturn, mercury, and deimos are aligned so as to cause a denormalized flush-to-zero bug when computing earth's gravitational force, slinging it to the sun.

[font=verdana,geneva,sans-serif]Yes Commander ![/font]

[font=verdana,geneva,sans-serif]How many times per frame are you rendering the mesh?[/font]

[font=verdana,geneva,sans-serif]- Once per frame[/font]

[font=verdana,geneva,sans-serif]What GPU and CPU do you have?[/font]

[font=verdana,geneva,sans-serif]- ATI 3870[/font]

[font=verdana,geneva,sans-serif]- 2.0 core2duo[/font]

[font=verdana,geneva,sans-serif]How many bones are there?[/font]

[font=verdana,geneva,sans-serif]- basic biped : 30[/font]

[font=verdana,geneva,sans-serif]How many vertices are there?[/font]

[font=verdana,geneva,sans-serif]- 5430[/font]

[font=verdana,geneva,sans-serif]Do you need to read back the results on the CPU?[/font]

[font=verdana,geneva,sans-serif]- I don't understand what "read back" means.[/font]

[font=verdana,geneva,sans-serif]- Does "read back" mean loading again?[/font]

[font=verdana,geneva,sans-serif]- If so, my answer is: load once and reuse.[/font]

[font=verdana,geneva,sans-serif]Which thread is this being calculated on?[/font]

[font=verdana,geneva,sans-serif]- 1 thread[/font]

[font=verdana,geneva,sans-serif]Please give me an answer, Sir![/font]

[font=verdana,geneva,sans-serif]I'm sorry for my very, very short and rigid answers...[/font]

[font=verdana,geneva,sans-serif]I just want to know the absolute answer for the common case.[/font]

[font=verdana,geneva,sans-serif]^^a[/font]

There's still no absolute answer. OK, in general terms, and everything else being equal, a GPU update is going to run much faster, maybe twice as fast or more, depending on the hardware. If, however, your GPU is loaded with other work and your CPU is relatively free, then you'll probably want to do it on the CPU. Or if your CPU is loaded and your GPU is free, then you'll want to put it on the GPU. If you can spawn an extra thread on the CPU and update in that, then doing it on the CPU might get your update for free (aside from thread overhead). If you've a fast CPU but a slow GPU, you might want to put it on the CPU. If you've a fast GPU but a slow CPU, you'll put it on the GPU. If you're tight on constant register space or you can't do vertex texture fetches, you'll consider the CPU. But drawing multiple models of the same type on the CPU path would mean a lot of Lock/Unlock calls and vertex uploads, so there the GPU wins.
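For reference, the CPU path boils down to linear blend skinning over every vertex. The following is a minimal sketch of that idea, not D3DX's actual `UpdateSkinnedMesh` implementation; the types `Vec3`, `Mat3x4`, `SkinVertex` and the function `skin_on_cpu` are hypothetical names introduced for illustration, and four influences per vertex is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Row-major 3x4 affine transform: rotation/scale in columns 0..2,
// translation in column 3. (Hypothetical type for this sketch.)
struct Mat3x4 { float m[3][4]; };

struct SkinVertex {
    Vec3 position;                      // bind-pose position
    std::array<unsigned char, 4> bone;  // indices into the bone palette
    std::array<float, 4> weight;        // blend weights, expected to sum to 1
};

inline Vec3 transform_point(const Mat3x4& t, const Vec3& p) {
    return {
        t.m[0][0] * p.x + t.m[0][1] * p.y + t.m[0][2] * p.z + t.m[0][3],
        t.m[1][0] * p.x + t.m[1][1] * p.y + t.m[1][2] * p.z + t.m[1][3],
        t.m[2][0] * p.x + t.m[2][1] * p.y + t.m[2][2] * p.z + t.m[2][3],
    };
}

// Writes one skinned position per input vertex into `out`.
void skin_on_cpu(const std::vector<SkinVertex>& in,
                 const std::vector<Mat3x4>& palette,
                 std::vector<Vec3>& out) {
    out.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        Vec3 acc{0.0f, 0.0f, 0.0f};
        for (std::size_t k = 0; k < 4; ++k) {
            const float w = in[i].weight[k];
            if (w == 0.0f) continue;  // skip unused influences
            assert(in[i].bone[k] < palette.size());
            const Vec3 p = transform_point(palette[in[i].bone[k]], in[i].position);
            acc.x += w * p.x;
            acc.y += w * p.y;
            acc.z += w * p.z;
        }
        out[i] = acc;
    }
}
```

On the CPU path this loop runs once per posed instance and its output is then copied into a locked vertex buffer; on the GPU path the same weighted sum runs in the vertex shader and only the bone palette is uploaded.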

So like the others said, there is no right answer. You need to know what kind of environment you're running in and what the rest of the program is doing, how your GPU vs CPU usage is balancing out, where your bottlenecks are, and so on.
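The constant register concern mentioned above can be checked with simple arithmetic. This is a rough sketch: the function name and the "reserved" count are hypothetical, and the budgets in the usage note (96 for vs_1_1, 256 for vs_2_0) are the commonly documented Direct3D 9 minimums, so treat them as assumptions and query `MaxVertexShaderConst` in `D3DCAPS9` for the real device limit.

```cpp
// One float4 register per matrix row: a 4x3 bone matrix needs 3, a 4x4 needs 4.
constexpr int kRegistersPer4x3Matrix = 3;
constexpr int kRegistersPer4x4Matrix = 4;

// Hypothetical helper: returns how many float4 constant registers remain
// after the bone palette and other shader constants are accounted for.
// A negative result means the palette does not fit in the budget.
constexpr int registers_left(int registerBudget, int boneCount,
                             int registersPerBone, int reservedForOtherUses) {
    return registerBudget - reservedForOtherUses - boneCount * registersPerBone;
}
```

For the 30-bone biped in this thread, assuming 8 registers reserved for things like the view-projection matrix and lighting, a 96-register budget is exceeded with either matrix layout, while a 256-register budget leaves plenty of room.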

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

[font=verdana,geneva,sans-serif]I just wanted to know the absolute, common-case answer.[/font]

[font=verdana,geneva,sans-serif]My English writing skill is very low...[/font]

[font=verdana,geneva,sans-serif]I'm sorry --a[/font]

[font=verdana,geneva,sans-serif]Your answer is the answer I wanted to know.[/font]

[font=verdana,geneva,sans-serif]Thank you very much ^^a[/font]

[font=verdana,geneva,sans-serif]Have a nice day!!![/font]

It's fine. We're not yelling at you (or at least, I'm not) or questioning your English ability; we're just pointing out that you're asking for an absolute answer that doesn't exist. It's much like asking "How do I downsample depth?" In that case, you can take a local maximum, a local minimum or a regular, boring old mean; all are useful for different purposes, and we need to know more about what you need it for in order to give you a meaningful answer.
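To make the depth analogy concrete, here is a small sketch of a 2x2 depth downsample where the caller has to pick the reduction. The names `DepthReduce` and `downsample_depth_2x2` are hypothetical, and the buffer is assumed to be a row-major array of floats with even dimensions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class DepthReduce { Min, Max, Mean };

// Halves a row-major depth buffer in each dimension, combining every
// 2x2 block of source texels with the requested reduction.
std::vector<float> downsample_depth_2x2(const std::vector<float>& src,
                                        std::size_t width, std::size_t height,
                                        DepthReduce mode) {
    assert(width % 2 == 0 && height % 2 == 0);
    assert(src.size() == width * height);
    const std::size_t outW = width / 2;
    const std::size_t outH = height / 2;
    std::vector<float> dst(outW * outH);
    for (std::size_t y = 0; y < outH; ++y) {
        for (std::size_t x = 0; x < outW; ++x) {
            const float a = src[(2 * y) * width + 2 * x];
            const float b = src[(2 * y) * width + 2 * x + 1];
            const float c = src[(2 * y + 1) * width + 2 * x];
            const float d = src[(2 * y + 1) * width + 2 * x + 1];
            float result = 0.0f;
            switch (mode) {
                case DepthReduce::Min:  result = std::min({a, b, c, d}); break;
                case DepthReduce::Max:  result = std::max({a, b, c, d}); break;
                case DepthReduce::Mean: result = (a + b + c + d) * 0.25f; break;
            }
            dst[y * outW + x] = result;
        }
    }
    return dst;
}
```

All three modes are valid and give different outputs for the same input, which is why the question cannot be answered without knowing what the result will be used for.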


Since you're asking about 'readbacks,' this is simply the process of reading the skinned vertex data back onto the CPU, probably for something like physics calculations. More on that in a second.

As an example of 'what else are you doing,' the use of shadow volumes is an excellent argument in favor of keeping everything CPU-side, since (assuming the volumes are built on the CPU) that step needs the skinned positions in main memory. Getting this sort of data back from the GPU can be rather slow under typical game circumstances because of how the graphics driver implements your draw calls: generally, they're queued up into a big FIFO list behind the scenes, and the GPU will need to burn through a few frames' (!) worth of commands before it is in a state where it can copy things out of VRAM. Again, the specifics of what else you're doing will influence the commands actually sent to the GPU and can in turn affect that latency. We just want some detailed information about what you plan on doing with your mesh(es) :)

