
Compressing vertex attribute normals



#1 theagentd   Members   -  Reputation: 602


Posted 29 July 2012 - 05:16 PM

Hello. I am making a game which generates a huge amount of geometry, and it is using up too much VRAM. I am using almost 250 MB of data for just a simple test level, and I expect levels up to 4 times as large, so I'm in a pretty bad situation. Only a small part of the level is visible at any moment, and the driver seems really good at swapping unused data out to RAM, so even if I go over the amount available it won't grind to a halt. It still produces pretty bad stuttering if I quickly turn around for the first time in a while and a large amount of data has to be swapped back in. All in all, reducing the amount of data seems like a good first step, but I might need other measures too.

The geometry is static and stored in VBOs. Each vertex contains the following information:

Position, 3 x 4 bytes.
Normal. 3 x 4 bytes.
Texture coordinates. 2 x 2 bytes
Animation data, 4 x 2 bytes.


Total: 36 bytes per vertex

I definitely need 4 bytes per dimension for position and 16-bit texture coordinates. Animation data can't be compressed either. Therefore I was hoping to at least be able to compress the normals as much as possible. They do not need to be very accurate. I was hoping to compress each normal to only 4 bytes, meaning I'll save 8 bytes from each vertex (~22% less data per vertex). Is this possible? Packing would be done at load time, but unpacking will be done in the vertex shader, meaning that it'll have to be pretty fast to be a worthwhile trade-off.


TL;DR: Is it possible to compress a 3-float world space vertex normal to only 4 bytes and quickly unpack it in a vertex shader?
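For reference, one common way to hit exactly 4 bytes per normal on GL 3.3+ hardware is the GL_INT_2_10_10_10_REV vertex format, which the input assembler expands back to floats for free before the vertex shader runs. Below is only an illustrative CPU-side packing sketch; the helper name and quantization details are not from this thread.

#include <algorithm>
#include <cmath>
#include <cstdint>

// Pack a unit normal into GL_INT_2_10_10_10_REV layout: 10 signed bits each for
// x, y, z (x in the lowest bits), 2 unused bits for w. Declared to OpenGL with
// glVertexAttribPointer(loc, 4, GL_INT_2_10_10_10_REV, GL_TRUE, stride, offset),
// so the shader just sees a normalized vec3/vec4 in [-1, 1].
static uint32_t packNormal1010102(float x, float y, float z)
{
    auto quantize = [](float v) -> uint32_t {
        v = std::clamp(v, -1.0f, 1.0f);
        int32_t q = (int32_t)std::lround(v * 511.0f);  // 10-bit signed range
        return (uint32_t)q & 0x3FFu;                   // two's complement, masked to 10 bits
    };
    return quantize(x) | (quantize(y) << 10) | (quantize(z) << 20);
}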


#2 dpadam450   Members   -  Reputation: 946


Posted 29 July 2012 - 05:46 PM

I am using almost 250 MB of data for just a simple test level,

Yeah, I would post where your memory is going, because that sounds terrible when you say the word "test".

#3 theagentd   Members   -  Reputation: 602


Posted 29 July 2012 - 07:37 PM

Yeah, I would post where your memory is going, because that sounds terrible when you say the word "test".

Like I said, it's going to vertex data for terrain. A majority of levels will be about the size of the "test" level, but larger levels are also possible. Running at the maximum supported world size results in 1080.4 MB of data: 997.3 MB of vertex data and 83.1 MB of index data. With compressed normals the vertex data would drop by 22% to around 774.9 MB, putting the total at 858 MB, which barely fits on my 896 MB graphics card.

Like I said, most of the level will be far away from the player, so I'll probably add some code to unload data for far-away terrain. Please try to focus on the question. =S

#4 dpadam450   Members   -  Reputation: 946


Posted 29 July 2012 - 09:32 PM

Well, your question basically leads to: you are doing something wrong. 858 MB / 36 bytes is roughly 24 million verts. You need to LOD your terrain. I asked where the memory is going because you have no memory set aside for textures or shadow maps.

Is it possible to compress a 3-float world space vertex normal to only 4 bytes and quickly unpack it in a vertex shader?

Put 2 components in and cross product to get the 3rd. (2 bytes per component)

Edited by dpadam450, 29 July 2012 - 09:33 PM.


#5 larspensjo   Members   -  Reputation: 1557


Posted 30 July 2012 - 12:23 AM

Are you using indexed drawing? That way, up to 6 triangles can share the same vertex. Normals can sometimes be shared for terrain. It may even be desirable, to get smooth shading.

Texture coordinates can be more tricky.
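To illustrate the sharing, here is a sketch of index generation for a regular heightfield grid (the function name and layout are just an example): every vertex is stored once in the VBO, and interior vertices end up referenced by up to 6 triangles through the index buffer.

#include <cstdint>
#include <vector>

// Index buffer for a grid of w x h quads laid out as (w+1) x (h+1) vertices in
// row-major order. Each quad becomes two triangles; shared corners reuse indices.
std::vector<uint32_t> buildGridIndices(uint32_t w, uint32_t h)
{
    std::vector<uint32_t> indices;
    indices.reserve((size_t)w * h * 6);
    for (uint32_t y = 0; y < h; ++y) {
        for (uint32_t x = 0; x < w; ++x) {
            uint32_t i0 = y * (w + 1) + x;   // top-left corner of this quad
            uint32_t i1 = i0 + 1;            // top-right
            uint32_t i2 = i0 + (w + 1);      // bottom-left
            uint32_t i3 = i2 + 1;            // bottom-right
            indices.insert(indices.end(), { i0, i2, i1, i1, i2, i3 });
        }
    }
    return indices;
}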
Current project: Ephenation.
Sharing OpenGL experiences: http://ephenationopengl.blogspot.com/

#6 L. Spiro   Crossbones+   -  Reputation: 14245


Posted 30 July 2012 - 05:29 AM

Put 2 components in and cross product to get the 3rd. (2 bytes per component)

A cross-product is between 2 vectors. To get the 3rd component of a vector you have to determine what value normalizes the vector.
sqrt( 1.0 - ((X * X) + (Y * Y)) )
Assuming Z is the forward vector and thus always positive.
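Written out in code (a C++ mirror of what the vertex shader would compute; the max() is an added guard against the sum creeping past 1.0 after quantization):

#include <algorithm>
#include <cmath>

// Reconstruct the dropped component from the two stored ones, per the formula above.
float reconstructZ(float x, float y)
{
    return std::sqrt(std::max(0.0f, 1.0f - (x * x + y * y)));
}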


L. Spiro
It is amazing how often people try to be unique, and yet they are always trying to make others be like them. - L. Spiro 2011
I spent most of my life learning the courage it takes to go out and get what I want. Now that I have it, I am not sure exactly what it is that I want. - L. Spiro 2013
I went to my local Subway once to find some guy yelling at the staff. When someone finally came to take my order and asked, “May I help you?”, I replied, “Yeah, I’ll have one asshole to go.”
L. Spiro Engine: http://lspiroengine.com
L. Spiro Engine Forums: http://lspiroengine.com/forums

#7 japro   Members   -  Reputation: 887


Posted 30 July 2012 - 06:31 AM

I think what he meant is: store two components (x, z) and then get the third one from y = sqrt(1 - (x^2 + z^2)). That doesn't give you the sign of y, but in a heightfield, for example, you can usually assume that the y component has to point "up".

Edited by japro, 30 July 2012 - 06:37 AM.


#8 theagentd   Members   -  Reputation: 602


Posted 30 July 2012 - 08:48 AM

Well, your question basically leads to: you are doing something wrong. 858 MB / 36 bytes is roughly 24 million verts. You need to LOD your terrain. I asked where the memory is going because you have no memory set aside for textures or shadow maps.

I am well aware of this. I will take steps to reduce the amount of data too. Are you saying I should focus on that instead?

Are you using indexed drawing? That way, up to 6 triangles can share the same vertex. Normals can sometimes be shared for terrain. It may even be desirable, to get smooth shading.
Texture coordinates can be more tricky.

Yes, vertices are reused whenever possible.

Assuming Z is the forward vector and thus always positive.

This is not the case. The normals are in world space.

I think what he meant is: store two components (x, z) and then get the third one from y = sqrt(1 - (x^2 + z^2)). That doesn't give you the sign of y, but in a heightfield, for example, you can usually assume that the y component has to point "up".

The normals can face any direction, so this is probably not going to work well.


I was recommended this article in another forum: http://aras-p.info/texts/CompactNormalStorage.html. I may use method #4 from that article, but for now I'll focus on reducing the amount of data.

#9 L. Spiro   Crossbones+   -  Reputation: 14245


Posted 31 July 2012 - 05:29 AM

Assuming Z is the forward vector and thus always positive.

This is not the case. The normals are in world space.

Not after they have been transformed into view space and culled.
My assumption is correct and the technique works. You aren’t doing lighting in world space (if you are, shame on you) and you don’t need the Z component until you are in view-space.


I think what he meant is: store two components (x, z) and then get the third one from y = sqrt(1 - (x^2 + z^2)). That doesn't give you the sign of y, but in a heightfield, for example, you can usually assume that the y component has to point "up".


The normals can face any direction, so this is probably not going to work well.

Read above. They can’t face any direction, they can only face towards the viewer. We are not talking about world space here.
So again, I restate, store the X and Y only and calculate the Z in the vertex shader.
The Z will always be positive, but a false positive will never reach the screen—any faces that point away from the screen will be culled, thus all false positives will be culled.


L. Spiro

#10 Hodgman   Moderators   -  Reputation: 31800


Posted 31 July 2012 - 05:46 AM

We are not talking about world space here.

Then "we" aren't helping -- theagentd posted the thread and is reiterating that the original (unanswered) question of the thread was about compressing world-space normals in a vertex buffer...
Also, view-space normals can have both + and - z values with a perspective projection! Making the assumption that z is always positive will cause strange lighting artefacts on surfaces at glancing angles.

You can drop one component and reconstruct it with this trick, but you do need to store a sign-bit somewhere to correctly reconstruct the missing component.
 
32 bits per component for normals is definitely overkill; 16 bits per component will suffice. You can either use a normalized ("fractional") integer format (where -32768 is unpacked to -1 and 32767 to +1, automatically and for free, by the input assembler before the data is fed into the vertex shader), or half-floats (which are also automatically unpacked for free).
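As a sketch of the 16-bit normalized-integer option (the quantization helper and the attribute names in the comment are illustrative):

#include <algorithm>
#include <cmath>
#include <cstdint>

// Quantize a [-1, 1] value to signed 16-bit, using the symmetric range so 0.0
// stays exactly representable.
int16_t toSnorm16(float v)
{
    return (int16_t)std::lround(std::clamp(v, -1.0f, 1.0f) * 32767.0f);
}

// GL side: the input assembler unpacks it for free, e.g.
//   glVertexAttribPointer(normalLoc, 3, GL_SHORT, GL_TRUE /*normalized*/, stride, offset);
// and the vertex shader receives an ordinary vec3 in [-1, 1].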

Even 8 bits per component may be enough if you actually make use of all 24 bits of data. When storing normalized values, most of the possible values in that 24-bit space are unusable (because they don't represent normalized vectors), but there is a Crytek presentation somewhere (search for "best fit normals") that makes the observation that these unusable/non-normalized values can be normalized in your vertex shader to re-create the original input value fairly accurately. The idea is that at data-build time you take your high-precision normal and search the full non-normalized 3x8-bit space for the value that, when normalized, best recreates the input. At runtime, the decoding cost is one normalize in the vertex shader (again, the input assembler should be able to do the scale/bias from int8 to float for free). This increases the precision of 8-bit world-space normals by roughly 50x.
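A naive build-time sketch of that search, assuming signed 8-bit components and a simple walk along the normal's direction (the actual Crytek implementation uses a precomputed lookup, so treat this as illustration only):

#include <array>
#include <cmath>
#include <cstdint>

// For a unit-length input normal, try every length at which its largest component
// hits an exact 8-bit value, quantize, and keep the candidate that renormalizes
// closest to the original direction.
std::array<int8_t, 3> bestFitNormal8(float nx, float ny, float nz)
{
    std::array<int8_t, 3> best = {0, 0, 0};
    float bestDot = -2.0f;
    float maxAbs = std::fmax(std::fabs(nx), std::fmax(std::fabs(ny), std::fabs(nz)));
    for (int target = 1; target <= 127; ++target) {
        float s = (float)target / maxAbs;
        int8_t qx = (int8_t)std::lround(nx * s);
        int8_t qy = (int8_t)std::lround(ny * s);
        int8_t qz = (int8_t)std::lround(nz * s);
        float len = std::sqrt((float)(qx * qx + qy * qy + qz * qz));
        if (len == 0.0f) continue;
        float dot = (qx * nx + qy * ny + qz * nz) / len;  // cosine of the error angle
        if (dot > bestDot) {
            bestDot = dot;
            best[0] = qx; best[1] = qy; best[2] = qz;
        }
    }
    return best;  // shader side: one normalize() after the free int8 -> float scale
}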

Also, you may be able to get away with using half-floats for your positions if you break the mesh up into several local chunks -- each chunk with its own local origin at its center -- and use an appropriate transformation matrix for each chunk to move it back to the correct world location.
Half-floats have amazing precision between 0-1, decent precision between 1-2, but start getting pretty inaccurate for large values, so you'd have to experiment with scaling factors too. For vertices that are a large distance from the origin, the visible artefact will be a quantisation of positions, which will make originally-straight lines look wobbly and smooth curves look faceted.
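A sketch of the per-chunk half-float idea (the float-to-half conversion here is a simplified, truncating one with no NaN/denormal handling, and the struct/function names are illustrative):

#include <cmath>
#include <cstdint>
#include <cstring>

// Simplified float32 -> float16 conversion: truncates the mantissa and flushes
// values that don't fit to zero / infinity. Good enough to illustrate the idea.
uint16_t floatToHalf(float f)
{
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    uint32_t sign = (bits >> 16) & 0x8000u;
    int32_t  exp  = (int32_t)((bits >> 23) & 0xFFu) - 127 + 15;
    uint32_t mant = (bits >> 13) & 0x3FFu;
    if (exp <= 0)  return (uint16_t)sign;              // too small: signed zero
    if (exp >= 31) return (uint16_t)(sign | 0x7C00u);  // too large: infinity
    return (uint16_t)(sign | ((uint32_t)exp << 10) | mant);
}

struct HalfPos { uint16_t x, y, z; };

// Store each position relative to its chunk's center so the stored values stay
// small (where half precision is best); the chunk's world matrix adds the center
// back at draw time. Uploaded as GL_HALF_FLOAT attributes.
HalfPos encodeChunkLocal(float wx, float wy, float wz,
                         float cx, float cy, float cz)
{
    return { floatToHalf(wx - cx), floatToHalf(wy - cy), floatToHalf(wz - cz) };
}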

Edited by Hodgman, 31 July 2012 - 06:35 AM.


#11 L. Spiro   Crossbones+   -  Reputation: 14245


Posted 31 July 2012 - 08:47 AM

We are not talking about world space here.

Then "we" aren't helping -- theagentd posted the thread and is reiterating that the original (unanswered) question of the thread was about compressing world-space normals in a vertex buffer...
Also, view-space normals can have both + and - z values with a perspective projection! Making the assumption that z is always positive will cause strange lighting artefacts on surfaces at glancing angles.

My first reply is in regards to compressing world-space normals.
Yes, they are world-space within the vertex buffer, but you must recognize that they are not actually used until later. The values at the point where they are used are what matter, so approximating them, or modifying them in any other way before that point, really has no meaning. They could be anything up until the moment they are used within a lighting equation, as long as the equation is using the standard terms.

Basically, we are helping because the original topic poster assumed the normals need to be used in world coordinates. You can define 2 components in world space and then derive the third in view space later, inside the shaders, at whatever point lighting needs to be done.

If you think about it you can see how the sign for the Z component is always positive and then how that leads to my previous suggestion.
But if you need more than just my word, I can tell you that we use this type of compression at work, and I can promise with actual hands-on experience that it works.
The only way in which it fails is when you don’t reject back-facing polygons, but that is a rare condition and we have special cases for that.

That being said, I also recommend 16-bit normals. Normals are usually confined to a small range and you generally don’t need to exceed this range.
Using 2 16-bit floats instead of 3 32-bit floats will save you a lot of memory.


L. Spiro

Edited by L. Spiro, 31 July 2012 - 08:04 PM.


#12 Hodgman   Moderators   -  Reputation: 31800


Posted 31 July 2012 - 08:30 PM

If you think about it you can see how the sign for the Z component is always positive

Think some more. The rasterized triangles are not in view-space; they have gone through a perspective transformation, which allows them to be facing slightly away from the camera and still be visible, and still pass the 'backfacing' test (i.e. they are not back-facing in post-projection space, but are back-facing in view-space, so they aren't culled, but do have negative view-space z).
To visualise, imagine standing in a room where the floor is slightly sloping downwards away from you, and the roof is slightly sloping upwards away from you -- both the floor and roof normals are pointing away from the camera, but because of perspective, they are both visible!

But if you need more than just my word, I can tell you that we use this type of compression at work, and I can promise with actual hands-on experience that it works.

Yes, it mostly works, but has artefacts on glancing angles (and gets worse the larger the FOV).
Tactfully raise this issue with your senior graphics programmer, or show the artefacts to a senior artist, and make yourself look good.

Basically, we are helping because the original topic poster assumed the normals need to be used in world coordinates. You can define 2 components in world space and then derive the third in view space later, inside the shaders, at whatever point lighting needs to be done.

Would you care to explain to the OP how to compress his 3 world-space values down to just two values, how to transform these 2 values into view-space, and then how to reconstruct the missing 3rd view-space component correctly?
e.g. say I've got two test world-space normals, for the floor and roof of a room: A=(0,0,1), B=(0,0,-1), and my view-matrix is identity, to simplify things.
Step 1) Drop the world-space z component: A=(0,0) B=(0,0)
Step 2) Transform to view-space: A=(0,0) B=(0,0)
Step 3) Reconstruct view-space z component: A=(0,0,1) B=(0,0,1) ... now the floor and the roof have the same normal.

Again, if you want to use this technique and correctly reconstruct the missing component, you need to store a sign-bit somewhere so that you can flip the reconstructed component in some cases. e.g. you could store x in 16-bits, y in 15 bits, and the sign of z in y's spare bit.
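That layout spelled out as a sketch (x in 16 bits, y in 15 bits, the sign of z in the spare bit; written in C++ for clarity, with the unpack half being the same math a vertex shader would run; the function names are illustrative):

#include <algorithm>
#include <cmath>
#include <cstdint>

// Pack a unit normal into 4 bytes: 16-bit x, 15-bit y, 1 sign bit for z.
uint32_t packNormal16_15_1(float x, float y, float z)
{
    uint32_t xi = (uint32_t)std::lround((x * 0.5f + 0.5f) * 65535.0f); // 16 bits
    uint32_t yi = (uint32_t)std::lround((y * 0.5f + 0.5f) * 32767.0f); // 15 bits
    uint32_t zs = (z >= 0.0f) ? 1u : 0u;                               // 1 sign bit
    return xi | (yi << 16) | (zs << 31);
}

// Unpack: rebuild x and y, reconstruct |z| from unit length, apply the sign bit.
void unpackNormal16_15_1(uint32_t p, float& x, float& y, float& z)
{
    x = (float)(p & 0xFFFFu) / 65535.0f * 2.0f - 1.0f;
    y = (float)((p >> 16) & 0x7FFFu) / 32767.0f * 2.0f - 1.0f;
    float zAbs = std::sqrt(std::max(0.0f, 1.0f - (x * x + y * y)));
    z = (p >> 31) ? zAbs : -zAbs;
}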

Edited by Hodgman, 31 July 2012 - 11:40 PM.


#13 L. Spiro   Crossbones+   -  Reputation: 14245


Posted 31 July 2012 - 08:42 PM

Now that I am sober I can easily see that you are correct.
I also now remember more accurately that at work we are not sending only 2 components to the GPU. I was completely mixed up regarding the point in the pipeline where that happens and made an idiotic “promise” regarding hands-on experience.

My previous advice in this thread can be disregarded.


L. Spiro

#14 tanzanite7   Members   -  Reputation: 1378


Posted 01 August 2012 - 04:56 AM

1) format
Position, 3 x 4 bytes.
Normal. 3 x 4 bytes.
Texture coordinates. 2 x 2 bytes
Animation data, 4 x 2 bytes.

2) I definitely need 4 bytes per dimension for position

3) I was hoping to compress each normal to only 4 bytes, meaning I'll save 8 bytes from each vertex (~22% less data per vertex). Is this possible? Packing would be done at load time, but unpacking will be done in the vertex shader, meaning that it'll have to be pretty fast to be a worthwhile trade-off.

I had a similar problem with terrain data not fitting into my target memory spec and had to shrink my 32-byte vertex format to 16 bytes. So, based on that experience:

1) My format ended up being (2 variants):
* 3x2B - position
* 2x1B - misc material params
* 4x1B - more material data OR 2x2B texcoord
* 4x1B - normal + 1 byte of extra data, OR a quaternion encoding the normal & tangent vectors

2) ... so, I have to challenge this. I use a dangling attribute to move a whole "chunk" of vertices to where it should be (hence why a full float is not needed to get the exact same precision and range). Understanding how floating-point numbers work is a must though, or cracks can appear - when done correctly it is guaranteed to give the exact same end result as when one would have used 4-byte floats for positions. Might be worth considering.

3) Using bytes has been sufficient for me; it might be for you too. Unpacking is just one MAD instruction, which is so ridiculously cheap that it won't have any effect on performance at all.
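For what it's worth, a sketch of that byte-sized normal (the packing helper is illustrative): when the attribute is declared as a normalized GL_UNSIGNED_BYTE, the shader-side unpack really is the single MAD mentioned above.

#include <algorithm>
#include <cmath>
#include <cstdint>

// Map a [-1, 1] normal component to an unsigned byte (0..255).
uint8_t packUnorm8(float v)
{
    return (uint8_t)std::lround((std::clamp(v, -1.0f, 1.0f) * 0.5f + 0.5f) * 255.0f);
}

// With glVertexAttribPointer(loc, 4, GL_UNSIGNED_BYTE, GL_TRUE, stride, offset)
// the attribute arrives in the shader as a float in [0, 1], so unpacking is just
//   n = attr * 2.0 - 1.0;   // one MAD
// (the fourth byte is then free for extra data, as in the format suggested below).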

So, I would recommend:
Position, 3 x 4 bytes. => 3x2B (+ dangling attribute)
. => 2B padding data for sane alignment
Normal. 3 x 4 bytes. => 4x1B (normal + something else, to not completely screw up the alignment)
Texture coordinates. 2 x 2 bytes => 2x2B (if you really, really need it; I generate them in the vertex shader based on position)
Animation data, 4 x 2 bytes. => 4x2B (as you said you cannot shrink it - are you sure?)

=> 6+2+4+4+8 = 24B

Edit: or use a 3x2B normal instead of the 2B padding + 4x1B.

Edited by tanzanite7, 01 August 2012 - 05:06 AM.




