How to design a mesh class?

Hi! I recently started thinking about what the mesh class of my graphics engine should look like. What I want to achieve (hopefully one day :-)) is something like a 3D platformer - Mario, Zelda etc. So I basically need skeletal or other deformations of meshes, and collision proxies too - sure ;-) I am using vertex/fragment programs, which require different vertex formats. That's where my problem starts.

Say I would like my deformations to run on the CPU; then I would like to access all vertices of my mesh in a uniform way. What I thought in the past was that it would be a great idea to use a memory layout in my mesh class which can be passed to the hardware without further processing. The vertex data should be structured in a way that can be directly understood by the GPU programs - in this case, interleaved buffers. I thought it would be a good idea to avoid a data conversion every frame.

Let's imagine a character with a bump-mapped body and some nice hair on his head, created by a fur shader. So we need two different vertex formats (bump/fur) to render the character. How to access that uniformly? We don't want to care about the vertex format when we apply the deformation. This could be solved via an iterator that has a built-in case distinction (after vertex #1000 skip 6 bytes, after vertex #1500 skip 8 bytes, and so on).

What I am asking myself (err, you ;-)) is: is it really worth the trouble, or is it just easier to have two different formats - one to run the algorithm on, which is then compiled into a hardware-friendly format? Maybe there is no big difference runtime-wise at all, because all vertices of the model have to be touched every frame in a typical animation, regardless of which method is used.

This is my first attempt at writing an engine and I really lack the experience of what a properly designed mesh class should look like. I would be really glad if I could get some advice. Maybe there are requirements on the mesh class down the road I don't even see now. Also I am sure there MUST be some thread relating to this rather basic question, but I apologize, I couldn't find any that helped me out.

Thanks a lot!!! Looking forward to your suggestions.

Frederick
Try to separate static geometry from dynamic geometry. Static geometry can be uploaded to VRAM and never touched with the CPU. Dynamic geometry should be kept in AGP memory for easy CPU access. Other data can be uploaded via shader constants. So, going back to your bump mapped character, I would split the data this way:

* position, normal, binormal, UVs -> base geometry, static vertex buffer in VRAM, never changes
* delta position, delta normal, delta binormal -> deform geometry, dynamic vertex buffer in AGP memory, changes every frame
* bone matrices -> animation, vertex shader constants in system RAM, changes every frame

When you split the data this way you can access it independently without caring about different formats. If you want to deform/morph your characters, simply change the delta position, normal and binormal stream, no need to touch the base geometry with the CPU. Need to animate your characters? Simply calculate the bone matrices and upload them as vertex shader constants. This way you can also save memory and have only 1 copy of base geometry, while having multiple copies of deltas and bone matrices.
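To make the split concrete, here is a rough sketch of what it could look like on the CPU side (a Java sketch with made-up names; an illustration of the separation, not a definitive design):

class MeshData {
    // Base geometry: filled once, uploaded to a static vertex buffer in VRAM, never touched again.
    float[] basePositions, baseNormals, baseBinormals, baseUVs;

    // Deform stream: recomputed by the CPU and re-uploaded to a dynamic buffer every frame.
    float[] deltaPositions, deltaNormals, deltaBinormals;

    // Animation: one 4x4 matrix per bone (16 floats), uploaded as vertex shader constants every frame.
    float[] boneMatrices;
}

The deformation code then only ever touches the delta arrays and the bone matrices, so it never needs to know how the base geometry is laid out.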

Another thing to keep in mind is that, while keeping the data in a GPU-friendly format is a good thing, it might be a real pain to manipulate this data on the CPU. So, in the case of deform geometry, the GPU wants it in a compressed format, while the CPU wants to use raw floats. I recommend calculating the data in a CPU-friendly format and then compressing it to a GPU-friendly format every frame. Obviously, you can keep the base geometry in a compressed format, since you won't be touching it with the CPU.

EDIT:

I know your question was about class design, but my answer was about data design. In the case of GPU programming it's important to design your class around the data, not the other way around. Ultimately the GPU will be processing your data, and if it's not designed right there will be performance hits. Also, I forgot to mention that in order for any of my ideas to work you'll need to write a vertex shader that supports skinning and morphing.
deathkrush
PS3/Xbox360 Graphics Programmer, Mass Media. Completed Projects: Stuntman Ignition (PS3), Saints Row 2 (PS3), Darksiders (PS3, 360)
Quote: Original post by deathkrush
In the case of GPU programming it's important to design your class around the data, not the other way around.


Seconded. Unlearn the old ways of thinking!
--== discman1028 ==--
Hi deathkrush,

thanks a lot for your kind answer !

Well, that is a different approach to consider, thank you very much! All the papers/forum posts I have read in the past stated that it is advantageous to put all vertex data into interleaved buffers, and that that method is preferable performance-wise to using separate buffers (cache, maybe? really no clue ;-)). So I simply didn't come up with this.

I am still not perfectly happy with the use of delta vectors. Do they have any advantage compared to absolute position vectors? It is the same amount of data that needs to be transferred to the GPU, and with absolute positions no vector addition needs to be performed. Also I believe it is easier, at least in my terms of thinking, to generate a series of vertex positions rather than a series of delta vectors. Let's say I want to do cage deformations - there is a recent nice paper from Pixar about harmonic coordinates (most likely I won't do the harmonic stuff, but maybe I will try cage deformations) - then I would retrieve each deformed vertex as a weighted sum of all cage vertices, and the result is a position. It seems kind of awkward to calculate a delta vector from it and send that to the GPU, where it will just be added back again.

So could you explain, if you like of course, why using delta vectors may be advantageous?
Maybe it's because you have a uniform framework then - do you ever switch shaders?

Quote:
So, in the case of deform geometry, the GPU wants it in a compressed format, while the CPU wants to use raw floats.


Another surprise ;-) What kind of compressed format do you mean here? At the moment I crunch raw floats into a vertex array, which is then sent to OpenGL. I believed CPU floats and GPU floats are the same? Maybe I am doing something suboptimal here - is "float" not the right format to talk to the GPU, and if not, which one is?


Quote:
I know your question was about class design, but my answer was about data design. In the case of GPU programming it's important to design your class around the data, not the other way around.


No, no, no, no - that was the perfect answer, it was exactly what I asked. I wanted to know how to combine a convenient interface with a good low-level memory layout. I like to abstract those things away and forget about them :-)
And class design is both data and interface, isn't it?!

Thank you very much for your time, and by the way, cool job ;-)

Quote:
Seconded. Unlearn the old ways of thinking!

What old ways - I am a newbie ;-)
Quote: Original post by Frederick
Well, that is a different approach to consider, thank you very much! All the papers/forum posts I have read in the past stated that it is advantageous to put all vertex data into interleaved buffers, and that that method is preferable performance-wise to using separate buffers (cache, maybe? really no clue ;-)). So I simply didn't come up with this.


Yes, it is advantageous for the GPU to have vertex data arranged in a single interleaved buffer. However, two interleaved buffers should be just as fast. The main advantage of two different buffers is that one can be completely static, while another can be dynamic and discarded each frame.
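A rough sketch of that split on the API side, assuming LWJGL-style OpenGL bindings (the class and method names are made up; the buffer usage hints are the relevant part):

import java.nio.FloatBuffer;
import static org.lwjgl.opengl.GL15.*;

class VboHelper {
    // Base geometry: upload once with GL_STATIC_DRAW and never touch it with the CPU again.
    static int createStaticVbo(FloatBuffer baseGeometry) {
        int vbo = glGenBuffers();
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferData(GL_ARRAY_BUFFER, baseGeometry, GL_STATIC_DRAW);
        return vbo;
    }

    // Deform stream: re-specify ("discard") and refill with GL_STREAM_DRAW every frame.
    static void updateDeformVbo(int vbo, FloatBuffer deformData) {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferData(GL_ARRAY_BUFFER, deformData, GL_STREAM_DRAW);
    }
}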

Quote:
So could you explain, if you like of course, why using delta vectors may be advantageous?


Using delta vectors is advantageous if the data is pre-computed in an offline tool. Or if the runtime calculations don't depend on the absolute position and can use relative positions instead. Since delta vectors require very little precision, they can be compressed in a GPU-friendly format (UBYTE4), whereas absolute positions require full float precision. So, if you have a single base geometry mesh and multiple pre-computed sets of delta vectors you can save a lot of memory.
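To put some rough, made-up numbers on that: a 10,000-vertex mesh with position/normal/binormal deltas stored as three FLOAT3s costs 10,000 x 3 x 12 bytes = ~360 KB per pre-computed deform set, while the same deltas packed as three UBYTE4s cost 10,000 x 3 x 4 bytes = ~120 KB. Ten stored deform sets then drop from roughly 3.6 MB to 1.2 MB, while the single copy of base geometry stays untouched.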

If delta vectors are awkward to use in your situation, then don't use them! I was just giving an example of how you can separate static data from dynamic data so that there is complete independence between them. Delta vectors probably have some other advantages (tweening?) but I'm not an expert at that - maybe someone can shed some light on that subject?

Quote:
Another surprise ;-) What kind of compressed format do you mean here? At the moment I crunch raw floats into a vertex array, which is then sent to OpenGL. I believed CPU floats and GPU floats are the same? Maybe I am doing something suboptimal here - is "float" not the right format to talk to the GPU, and if not, which one is?


Floats are not great for the GPU because they use a lot of memory and bandwidth. The GPU is much happier with compressed formats such as UBYTE4: they use 4x less memory and the "decompression" is free. For the highest GPU performance and better memory usage, always compress vertex data as much as possible - how far you can go depends on how much precision you need. Usually this minimum precision is enough:

* positions: FLOAT3
* normals, binormals, tangents, blendweights, deltas: UBYTE4
* UVs: USHORT2, maybe FLOAT16_2

However, if memory usage and vertex fetch performance are not a problem, then use floats, because they are much easier!
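For what it's worth, a sketch of how such a compressed, interleaved layout could be declared to OpenGL (LWJGL-style GL11/GL20 calls; the attribute locations 0-3 and the 24-byte layout are just an assumed example, not a fixed rule):

import static org.lwjgl.opengl.GL11.*;
import static org.lwjgl.opengl.GL20.*;

class CompressedVertexLayout {
    // 12 (FLOAT3 position) + 4 (UBYTE4 normal) + 4 (UBYTE4 tangent) + 4 (USHORT2 uv) = 24 bytes
    static final int STRIDE = 24;

    static void setup() {
        glVertexAttribPointer(0, 3, GL_FLOAT,          false, STRIDE,  0); // position, raw floats
        glVertexAttribPointer(1, 4, GL_UNSIGNED_BYTE,  true,  STRIDE, 12); // normal, normalized to 0..1 ("UBYTE4N")
        glVertexAttribPointer(2, 4, GL_UNSIGNED_BYTE,  true,  STRIDE, 16); // tangent, normalized to 0..1
        glVertexAttribPointer(3, 2, GL_UNSIGNED_SHORT, true,  STRIDE, 20); // UVs, normalized to 0..1
    }
}

The "normalized = true" flag is what gives you the free decompression back to 0..1 in the vertex shader.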
deathkrush
PS3/Xbox360 Graphics Programmer, Mass Media. Completed Projects: Stuntman Ignition (PS3), Saints Row 2 (PS3), Darksiders (PS3, 360)
Hi deathkrush,

I finally found some time to think about my mesh class. Just wanted to let you know that I find the separation between static geometry and deform geometry the most elegant approach in my case - I am pretty happy with it and will start to implement it right now ;-)

Thanks a lot for the hint with the datatypes - without your advice I would have continued to use floats all over the place.

Thanks a lot,
Frederick
Hmm, maybe the use of UBYTE4 is too troublesome for me...

1) I use Java as my programming language. You may know that it is not as well suited as C++ to inventing new elementary datatypes, so I would have to "bit-fiddle" a Java float into a ubyte. Not nice, I think.

2) I can't find much useful information on UBYTE4... at least not on how to fiddle floating-point information into a single byte.

Did I get something wrong?

Maybe I will have to stick with floats... :-/
Hey deathkrush (or anyone else :-)),

I am really confused about ubytes. How may I ever extract a component of a deform vector from a byte?

A byte will give me a range from -128 to 127. I read somewhere that these values get normalized to the interval [-1,1] (actually I read that 0..255 gets normalized to [0,1], but I am trying to apply that to my case, where I need directions). I could use a UBYTE4 to describe vectors within the unit sphere then, but what if I want to move a vertex by 1.5 units? I could choose a different scale then, but... hmm, that's complicated.

I believe I misunderstood something - as I said, this is my very first engine ;-)

It would be nice if you or anybody else could help me out - I am not very good at low-level "bit-touching" programming. Also I find that information on this is really scarce on the internet. I would read up on my own if I could :-/

Thanks!!!
I guess manipulating unsigned bytes is hard in Java, because the only unsigned type is the 16-bit char! Which graphics API are you using? There might be an API-specific function to pack 4 floats into a UBYTE4. Have you tried something like this:

float x, y, z, w; // assuming these are in the range 0..1
int ubyte4 = (((int)(x * 255.0f) & 0xff) << 24)
           | (((int)(y * 255.0f) & 0xff) << 16)
           | (((int)(z * 255.0f) & 0xff) << 8)
           |  ((int)(w * 255.0f) & 0xff);
// the endianness might be wrong, but you get the idea
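Since Java has no unsigned byte type, it may also be less fiddly to skip the int packing entirely and write the four bytes straight into the (native-order) ByteBuffer you hand to OpenGL anyway. A minimal sketch - the class/method names and vertexCount in the usage comment are made up:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class DeltaBuffer {
    // Write one UBYTE4 (x, y, z, w assumed to be in the range 0..1) straight into the buffer.
    // The casts keep the low 8 bits, which is exactly the bit pattern the GPU reads as an unsigned byte.
    static void putUByte4(ByteBuffer buf, float x, float y, float z, float w) {
        buf.put((byte) (x * 255.0f));
        buf.put((byte) (y * 255.0f));
        buf.put((byte) (z * 255.0f));
        buf.put((byte) (w * 255.0f));
    }
}

// usage: ByteBuffer buf = ByteBuffer.allocateDirect(vertexCount * 4).order(ByteOrder.nativeOrder());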


UBYTE4 is great for bone indexes, because it's expanded to the range (0..255). UBYTE4N is the same format, but expanded to (0..1), so it's good for color, blendweights and normals. If you need signed values, simply scale and bias. For example, if you need (-1..1) range, do this in the shader:

float4 n : NORMAL; // passed in as UBYTE4N
float4 normal = n * 2.0f - 1.0f;
deathkrush
PS3/Xbox360 Graphics Programmer, Mass Media. Completed Projects: Stuntman Ignition (PS3), Saints Row 2 (PS3), Darksiders (PS3, 360)
To answer your other question, if you need a range other than (0..1) or (0..255), just apply a scale and bias to the values in the shader. Be careful though, UBYTE4 format has very low precision, only 256 discrete values can be represented by each component.

// (0..2) range
float4 value = v * 2.0f;
// (0..10) range
float4 value = v * 10.0f;
// (-2..2) range
float4 value = v * 4.0f - 2.0f;
// (-10..10) range
float4 value = v * 20.0f - 10.0f;
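On the CPU side, the matching step is to quantize each float into that agreed-on range before writing it into the buffer. A minimal Java sketch, assuming you pick a fixed range per stream (the class/helper names and the [-2, 2] range are just an example):

class DeltaQuantizer {
    // Map a float in [min, max] to one byte; the GPU reads it back as 0..255 (or 0..1 for UBYTE4N).
    static byte quantize(float value, float min, float max) {
        float t = (value - min) / (max - min);   // remap to 0..1
        t = Math.max(0.0f, Math.min(1.0f, t));   // clamp, just in case
        return (byte) (t * 255.0f);              // the cast keeps the low 8 bits
    }
}

So a 1.5-unit displacement stored with a [-2, 2] range becomes quantize(1.5f, -2.0f, 2.0f), and the vertex shader recovers it with value = v * 4.0f - 2.0f as shown above.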


Also, are you compressing vertices to save memory, increase performance or both? If you don't have memory or performance problems in the vertex fetch pipeline, then compressing vertices won't buy much.
deathkrush
PS3/Xbox360 Graphics Programmer, Mass Media. Completed Projects: Stuntman Ignition (PS3), Saints Row 2 (PS3), Darksiders (PS3, 360)
