Frederick

How to design a mesh class ?


Hi! I recently started thinking about what the mesh class of my graphics engine should look like. What I want to achieve (hopefully one day :-)) is something like a 3D platformer such as Mario or Zelda. So I basically need skeletal or other deformations of meshes, and collision proxies too, sure ;-)

I am using vertex/fragment programs, which require different vertex formats. That's where my problem starts. Say I would like my deformations to run on the CPU; then I would like to access all vertices of my mesh in a uniform way. What I thought in the past is that it would be a great idea to use a memory layout in my mesh class that can be passed to the hardware without further processing. The vertex data should be structured in a way that can be directly understood by the GPU programs, in this case interleaved buffers. I thought it would be a good idea to avoid a data conversion every frame.

Let's imagine a character with a bump-mapped body and some nice hair on his head, created by a fur shader. So we need two different vertex formats (bump/fur) to render the character. How do we access that uniformly? We don't want to care about the vertex format when we apply the deformation. This could be solved with an iterator that has a built-in case distinction (after vertex #1000 skip 6 bytes, after vertex #1500 skip 8 bytes ... and so on).

What I am asking myself (err, you ;-)) is: is it really worth the trouble, or is it just easier to have two different formats, one to run the algorithm on, which is then compiled into a hardware-friendly format? Maybe there is no big difference runtime-wise at all, because all vertices of the model have to be touched every frame in a typical animation, regardless of which method is used.

This is my first attempt at writing an engine and I really lack the experience to know what a properly designed mesh class should look like. I would be really glad if I could get some advice.
Maybe there are requirements for the mesh class down the road that I don't even see now. Also, I am sure there MUST be some thread on this rather basic question, but I apologize, I couldn't find any that helped me out. Thanks a lot!!! Looking forward to your suggestions. Frederick
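
One way to picture the "uniform access" idea from the question is a small view object that knows only the stride and the position offset of a vertex format. The following is a minimal sketch under assumptions: the class name `PositionView`, the 32-byte and 24-byte formats, and the `translateX` deformation are all made up for illustration, and positions are assumed to be three floats.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: reads and writes positions in an interleaved vertex buffer. */
final class PositionView {
    private final ByteBuffer data;
    private final int stride;   // bytes from one vertex to the next
    private final int offset;   // byte offset of the position inside a vertex

    PositionView(ByteBuffer data, int stride, int offset) {
        this.data = data;
        this.stride = stride;
        this.offset = offset;
    }

    int vertexCount() {
        return data.capacity() / stride;
    }

    float get(int vertex, int component) {
        return data.getFloat(vertex * stride + offset + 4 * component);
    }

    void set(int vertex, int component, float value) {
        data.putFloat(vertex * stride + offset + 4 * component, value);
    }

    /** Example deformation that never looks at the vertex format: move every vertex along x. */
    static void translateX(PositionView view, float dx) {
        for (int v = 0; v < view.vertexCount(); v++) {
            view.set(v, 0, view.get(v, 0) + dx);
        }
    }

    public static void main(String[] args) {
        // Two made-up formats: 32 bytes per vertex ("bump") and 24 bytes per vertex ("fur").
        ByteBuffer bump = ByteBuffer.allocateDirect(2 * 32).order(ByteOrder.nativeOrder());
        ByteBuffer fur = ByteBuffer.allocateDirect(3 * 24).order(ByteOrder.nativeOrder());
        translateX(new PositionView(bump, 32, 0), 1.5f);
        translateX(new PositionView(fur, 24, 0), 1.5f);
        System.out.println(new PositionView(fur, 24, 0).get(2, 0));
    }
}
```

With one view per sub-mesh, the deformation code sees the same interface for both formats, so no per-vertex case distinction is needed inside the loop.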

Try to separate static geometry from dynamic geometry. Static geometry can be uploaded to VRAM and never touched with the CPU. Dynamic geometry should be kept in AGP memory for easy CPU access. Other data can be uploaded via shader constants. So, going back to your bump mapped character, I would split the data this way:

* position, normal, binormal, UVs -> base geometry, static vertex buffer in VRAM, never changes
* delta position, delta normal, delta binormal -> deform geometry, dynamic vertex buffer in AGP memory, changes every frame
* bone matrices -> animation, vertex shader constants in system RAM, changes every frame

When you split the data this way you can access it independently without caring about different formats. If you want to deform/morph your characters, simply change the delta position, normal and binormal stream, no need to touch the base geometry with the CPU. Need to animate your characters? Simply calculate the bone matrices and upload them as vertex shader constants. This way you can also save memory and have only 1 copy of base geometry, while having multiple copies of deltas and bone matrices.
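
The three-way split described above could look roughly like this on the Java side. This is only a sketch under assumptions: `SplitMesh` and its field names are invented here, every attribute is stored as plain floats, and the actual upload to the graphics API is left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical mesh layout: one static stream, one dynamic stream, plus bone matrices. */
final class SplitMesh {
    static final int BASE_FLOATS = 3 + 3 + 3 + 2;   // position, normal, binormal, UV
    static final int DELTA_FLOATS = 3 + 3 + 3;      // delta position, normal, binormal

    final int vertexCount;
    final FloatBuffer base;        // filled once at load time, then only read by the GPU
    final FloatBuffer deltas;      // rewritten by the CPU every frame
    final float[] boneMatrices;    // 16 floats per bone, uploaded as shader constants

    SplitMesh(int vertexCount, int boneCount) {
        this.vertexCount = vertexCount;
        this.base = newFloatBuffer(vertexCount * BASE_FLOATS);
        this.deltas = newFloatBuffer(vertexCount * DELTA_FLOATS);
        this.boneMatrices = new float[boneCount * 16];
    }

    private static FloatBuffer newFloatBuffer(int floats) {
        return ByteBuffer.allocateDirect(floats * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    /** Deformation code only ever writes here; the base stream stays untouched. */
    void setDeltaPosition(int vertex, float dx, float dy, float dz) {
        int i = vertex * DELTA_FLOATS;
        deltas.put(i, dx);
        deltas.put(i + 1, dy);
        deltas.put(i + 2, dz);
    }
}
```

Because the delta stream has a single fixed layout, the deformation code does not need to know whether the base stream belongs to a bump-mapped or a fur-shaded part of the character.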

Another thing to keep in mind is that, while keeping the data in a GPU-friendly format is a good thing, it might be a real pain to manipulate this data on the CPU. So, in the case of deform geometry, the GPU wants it in a compressed format, while the CPU wants to use raw floats. I recommend calculating the data in a CPU-friendly format and then compressing it to a GPU-friendly format every frame. Obviously, you can keep the base geometry in a compressed format, since you won't be touching it with the CPU.

EDIT:

I know your question was about class design, but my answer was about data design. In the case of GPU programming it's important to design your class around the data, not the other way around. Ultimately the GPU will be processing your data, and if it's not designed right, there will be performance hits. Also, I forgot to mention that in order for any of my ideas to work you'll need to write a vertex shader that supports skinning and morphing.

Hi Deathkrush,

thanks a lot for your kind answer !

Well, that is a different approach to consider, thank you very much! All the papers and forum posts I have read in the past stated that it is advantageous to put all vertex data into interleaved buffers, and that this is preferable performance-wise to using separate buffers (cache maybe? - really no clue ;-)). So I simply didn't come up with this.

I am still not perfectly happy with the use of delta vectors. Do they have any advantage over absolute position vectors? With absolute positions, the same amount of data needs to be transferred to the GPU and no vector addition needs to be performed. Also, I believe it is easier, at least for my way of thinking, to generate a series of vertex positions rather than a series of delta vectors. Let's say I want to do cage deformations (there is a nice recent paper from Pixar about harmonic coordinates; most likely I won't do the harmonic stuff, but maybe I will try cage deformations). Then I would compute each deformed vertex as a weighted sum of all cage vertices, and the result is a position. It seems kind of awkward to calculate a delta vector from it and send it to the GPU, where it will be added again.

So could you explain, if you like of course, why using delta vectors may be advantageous?
Maybe it's because you then have a uniform framework - do you ever switch shaders?

Quote:

So, in the case of deform geometry, the GPU wants it in a compressed format, while the CPU wants to use raw floats.


Another surprise ;-) What kind of compressed format do you mean here? At the moment I crunch raw floats in a vertex array, which is then sent to OpenGL. I believed CPU floats and GPU floats are the same? Maybe I am doing something suboptimal here: is "float" not the right format to talk to the GPU, and if not, which one is?


Quote:

I know your question was about class design, but my answer was about data design. In the case of GPU programming it's important to design your class around the data, not the other way around.


No, no, no, no - that was the perfect answer - it was exactly what I asked. I wanted to know how to combine a convenient interface with a good low-level memory layout. I like to abstract those things away and forget about them :-)
And class design is both data and interface, isn't it?

Thank you very much for your time and by the way coool job ;-)

Quote:

Seconded. Unlearn the old ways of thinking!

What old ways - I am a newbie ;-)

Quote:
Original post by Frederick
Well, that is a different approach to consider, thank you very much! All the papers and forum posts I have read in the past stated that it is advantageous to put all vertex data into interleaved buffers, and that this is preferable performance-wise to using separate buffers (cache maybe? - really no clue ;-)). So I simply didn't come up with this.


Yes, it is advantageous for the GPU to have vertex data arranged in a single interleaved buffer. However, two interleaved buffers should be just as fast. The main advantage of two different buffers is that one can be completely static, while another can be dynamic and discarded each frame.

Quote:

So could you explain, if you like of course, why using delta vectors may be advantageous?


Using delta vectors is advantageous if the data is pre-computed in an offline tool, or if the runtime calculations don't depend on the absolute position and can use relative positions instead. Since delta vectors require very little precision, they can be compressed into a GPU-friendly format (UBYTE4), whereas absolute positions require full float precision. So, if you have a single base geometry mesh and multiple pre-computed sets of delta vectors, you can save a lot of memory.

If delta vectors are awkward to use in your situation, then don't use them! I was just giving an example how you can separate static data from dynamic data so that there is complete independence between them. Delta vectors probably have some other advantages (tweening?) but I'm not an expert at that, maybe someone can shed some light on that subject?
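
For the cage-deformation case raised earlier, where the algorithm naturally produces absolute positions, turning them into deltas is a single subtraction per component. A minimal sketch, assuming positions are stored as x, y, z triples in plain float arrays (`DeltaStream` and its method names are hypothetical):

```java
/** Hypothetical helper: turns absolute deformed positions into per-vertex deltas. */
final class DeltaStream {
    /** base and deformed hold x, y, z per vertex; the result has the same layout. */
    static float[] computeDeltas(float[] base, float[] deformed) {
        if (base.length != deformed.length) {
            throw new IllegalArgumentException("base and deformed must have the same length");
        }
        float[] deltas = new float[base.length];
        for (int i = 0; i < base.length; i++) {
            deltas[i] = deformed[i] - base[i];
        }
        return deltas;
    }

    /** What the vertex shader would do with the two streams: base + delta. */
    static float[] apply(float[] base, float[] deltas) {
        float[] result = new float[base.length];
        for (int i = 0; i < base.length; i++) {
            result[i] = base[i] + deltas[i];
        }
        return result;
    }
}
```

The subtraction is cheap compared to the deformation itself; whether the extra stream is worth it depends mostly on whether the deltas can be stored in a smaller format than the positions.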

Quote:

Another surprise ;-) What kind of compressed format do you mean here? At the moment I crunch raw floats in a vertex array, which is then sent to OpenGL. I believed CPU floats and GPU floats are the same? Maybe I am doing something suboptimal here: is "float" not the right format to talk to the GPU, and if not, which one is?


Floats are not great for the GPU because they use a lot of memory and bandwidth. The GPU is much happier with compressed formats such as UBYTE4: a UBYTE4 takes a quarter of the memory of four floats, and the "decompression" is free. For the highest GPU performance and better memory usage, always compress vertex data as much as possible; how far you can go depends on how much precision you need. Usually, this kind of minimum precision is required:

positions: FLOAT3
normals, binormals, tangents, blendweights, deltas: UBYTE4
UVs: USHORT2, maybe FLOAT16_2

However, if memory usage and vertex fetch performance are not a problem, then use floats, because they are much easier!
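
To get a feeling for the savings, the per-vertex sizes can simply be added up. The sketch below assumes a vertex with position, normal, binormal, tangent and one UV set, and compares storing everything as floats against the minimum-precision formats listed above; the class name and the mesh size are made up.

```java
/** Hypothetical size calculator for two vertex layouts. */
final class VertexSize {
    // Sizes in bytes of the vertex element formats named above.
    static final int FLOAT2 = 8, FLOAT3 = 12, UBYTE4 = 4, USHORT2 = 4;

    /** Position, normal, binormal, tangent and one UV set, all stored as floats. */
    static int uncompressedBytes() {
        return FLOAT3 + FLOAT3 + FLOAT3 + FLOAT3 + FLOAT2;   // 56 bytes
    }

    /** Same attributes using the minimum-precision formats from the list. */
    static int compressedBytes() {
        return FLOAT3 + UBYTE4 + UBYTE4 + UBYTE4 + USHORT2;  // 28 bytes
    }

    public static void main(String[] args) {
        int vertices = 10_000; // made-up mesh size
        System.out.println("floats:     " + (long) vertices * uncompressedBytes() + " bytes");
        System.out.println("compressed: " + (long) vertices * compressedBytes() + " bytes");
    }
}
```

For this particular layout the compressed vertex is half the size, because the position stays at full float precision.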

Hi deathkrush,

I finally found some time to think about my mesh class. Just wanted to let you know, that I find that the separation between static geometry and deform geometry is the most elegant approach in my case - I am pretty happy with it and will start to implement it right now ;-)

Thanks a lot for the hint with the datatypes - without your advice I would have continued to use floats all over the place.

Thanks a lot,
Frederick

Hmm maybe the use of ubyte4 is too troublesome for me...

1) I use Java as my programming language. You may know that it is not as well suited as C++ to inventing new elementary data types. So I would have to "bit-fiddle" a Java float into a ubyte-float. Not nice, I think.

2) I can't find much useful information on ubyte4... at least not on how to fiddle floating-point information into a single byte.

Did I get something wrong ?

Maybe I will have to stick with floats... :-/

Hey Deathkrush, (or anyone else :-))

I am really confused about ubytes. How would I ever extract a component of a deform vector from a byte?

A byte will give me a range from -128 to 127. I read somewhere that these values get normalized to the interval [-1,1] (actually I read that 0..255 gets normalized to [0,1], but I am trying to apply that to my case, where I need directions). I could then use the ubyte4 to describe vectors inside the unit sphere, but what if I want to move a vertex by 1.5 units? I could choose a different scale then, but... hmm, that's complicated.

I believe I misunderstood something - as I said, that's my very first engine ;-)

It would be nice if you or anybody else could help me out - I am not very good at low-level "bit-touching" programming. Also, I find that information is really scarce on the internet. I would read up on my own, if I could :-/

Thanks!!!

I guess manipulating unsigned bytes is hard in Java because the only unsigned type is 16-bit char! Which graphics API are you using? There might be an API-specific function to pack 4 floats into a UBYTE4. Have you tried something like this:


float x, y, z, w; // assuming these are in the range 0..1

// one byte per component; the endianness might be wrong, but you get the idea
int ubyte4 = (((int)(x * 255.0f) & 0xff) << 24)
           | (((int)(y * 255.0f) & 0xff) << 16)
           | (((int)(z * 255.0f) & 0xff) << 8)
           |  ((int)(w * 255.0f) & 0xff);




UBYTE4 is great for bone indexes, because it's expanded to the range (0..255). UBYTE4N is the same format, but expanded to (0..1), so it's good for color, blendweights and normals. If you need signed values, simply scale and bias. For example, if you need (-1..1) range, do this in the shader:


float4 n : NORMAL; // passed in as UBYTE4N
float4 normal = n * 2.0f - 1.0f;
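
On the Java side, the matching encode step for a signed value is the inverse of that scale and bias. The following is a sketch under assumptions: the class `Ubyte4` and its methods are hypothetical, components are kept in an `int` and masked because Java's `byte` is signed, and the byte order (x in the highest byte) follows the packing snippet above rather than any particular API.

```java
/** Hypothetical helper: packs four values from [-1, 1] into one 32-bit int. */
final class Ubyte4 {
    /** Maps a value in [-1, 1] to 0..255, the inverse of the shader's n * 2 - 1. */
    static int encodeSigned(float v) {
        float clamped = Math.max(-1.0f, Math.min(1.0f, v));
        return Math.round((clamped * 0.5f + 0.5f) * 255.0f);
    }

    /** Packs four components with x in the highest byte. */
    static int pack(float x, float y, float z, float w) {
        return (encodeSigned(x) << 24) | (encodeSigned(y) << 16)
             | (encodeSigned(z) << 8) | encodeSigned(w);
    }

    /** Extracts component 0..3 (x..w) and applies the same scale and bias as the shader. */
    static float unpack(int packed, int component) {
        int bits = (packed >>> (24 - 8 * component)) & 0xff;
        return (bits / 255.0f) * 2.0f - 1.0f;
    }

    public static void main(String[] args) {
        int packed = pack(0.0f, 1.0f, -1.0f, 0.25f);
        for (int c = 0; c < 4; c++) {
            System.out.println(unpack(packed, c));
        }
    }
}
```

The round trip is not exact: with 256 levels spread over [-1, 1], each component can be off by up to about 1/255.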


To answer your other question, if you need a range other than (0..1) or (0..255), just apply a scale and bias to the values in the shader. Be careful though: the UBYTE4 format has very low precision, and each component can represent only 256 discrete values.


// v is the UBYTE4N input, already expanded to (0..1)

// (0..2) range
float4 value = v * 2.0f;

// (0..10) range
float4 value = v * 10.0f;

// (-2..2) range
float4 value = v * 4.0f - 2.0f;

// (-10..10) range
float4 value = v * 20.0f - 10.0f;
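
The CPU side of a custom range is the same idea in reverse: subtract the minimum, divide by the width of the range, and scale to 0..255. Here is a minimal sketch, with `RangeQuantizer` as a hypothetical helper; `decode` mirrors what the shader would compute as v * (max - min) + min, and `step` shows how coarse the result is.

```java
/** Hypothetical helper: stores values from [min, max] in a single unsigned byte. */
final class RangeQuantizer {
    final float min;
    final float max;

    RangeQuantizer(float min, float max) {
        this.min = min;
        this.max = max;
    }

    /** Returns the byte value 0..255 as an int, because Java has no unsigned byte. */
    int encode(float value) {
        float clamped = Math.max(min, Math.min(max, value));
        return Math.round((clamped - min) / (max - min) * 255.0f);
    }

    /** Same scale and bias the shader would apply to the normalized input. */
    float decode(int code) {
        return code / 255.0f * (max - min) + min;
    }

    /** Distance between two neighbouring representable values. */
    float step() {
        return (max - min) / 255.0f;
    }

    public static void main(String[] args) {
        RangeQuantizer q = new RangeQuantizer(-2.0f, 2.0f);
        int code = q.encode(1.5f);   // the "move a vertex by 1.5 units" case
        System.out.println(code + " -> " + q.decode(code) + ", step " + q.step());
    }
}
```

The wider the range, the larger the step: a (-2..2) range has steps of 4/255, roughly 0.016 units, so the range should be chosen as tightly as the data allows.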



Also, are you compressing vertices to save memory, increase performance or both? If you don't have memory or performance problems in the vertex fetch pipeline, then compressing vertices won't buy much.

Hi Deathkrush,

thank you for your answer =)

Quote:

To answer your other question, if you need a range other than (0..1) or (0..255), just apply a scale and bias to the values in the shader.


Okay, so it really is that complicated - I will have to decide which range of values I need, for example for my deformations. I believe that is really too much trouble, because I don't have performance problems at all at the moment and there is so much else to care about. I am not pushing anything, just building the engine and game framework. I don't even have a mesh loader at the moment, because I am still figuring out how to lay out my meshes in memory.
Then again, I am not striving too much for performance - it is a single-person project and there have to be some cuts. I will have to do (and actually like to do) the 3D art by myself - so average performance would probably do for me.

Quote:

Also, are you compressing vertices to save memory, increase performance or both?


Oh, I thought about it because it was your advice, and it sounded pretty cool to do updates in 1/4 of the memory and time. But I believe this is an optimization that can be done later, when everything is already working and some extra performance is needed.

Quote:

I guess manipulating unsigned bytes is hard in Java because the only unsigned type is 16-bit char! Which graphics API are you using?


I am using OpenGL (with the option to do DirectX later, if ever :-). Actually it is a mixed C++/Java project: C++ for talking to the hardware and Java for everything else. So I always have the option to do the data conversion in the C++ part, for example pass a buffer of Java floats to C++ and convert them to ubytes there. Unfortunately I won't be able to use them directly.

So I was not sure if I would stay with delta vectors, because they double the vertex data of my deformable geometry (static + delta buffer), and I will also have to do an addition for every vertex (maybe that's not too expensive, even if it's done 9 times in a fur shader). But it would reduce complexity, and I wouldn't need to deform normals and binormals (I will have to read up on what those are ;-)) on the CPU separately. So it's maybe worth the additional memory. While writing this I have decided to stay with it :-) It's just fine. Thank you!
