very compact reflective shadow maps format

Started by
10 comments, last by Lightness1024 11 years, 7 months ago
Hello gamedev,

There is something I've been wanting to do for quite some time, I have an RSM that is stored in 3 textures:
one R32F, and two RGBA8.

  • R32F : depth in meters (from camera that rendered the RSM)
  • RGBA8 : normals, shifted into color space the classical way with " * 0.5 + 0.5 "
  • RGBA8 : albedo

so we have 3 render targets bound simultaneously, and two alpha components go to waste.

As an optimisation, because 3 RTs can be very heavy for some cards, I thought about compacting ALL of that into ONE RGBA16F texture.

  • R : depth, part 1
  • G : depth, part 2 (+ sign bit for normal)
  • B : normal
  • A : color

It must be compatible with DX9, so no integer targets and no bit fiddling in shaders.

For the depth, I thought a simple range splitting should do the trick:
we decide on some multiple of a distance that the "MSB" will store, and the LSB will store the rest.
example:

R = floor(truedepth / 100) * 100;
G = truedepth - R;
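A quick sketch of that split and its inverse (Python just to illustrate the arithmetic; the 100 m slice is only the example divisor above):

```python
SLICE = 100.0  # metres per coarse slice; the example divisor from above

def pack_depth(truedepth):
    # coarse part: a multiple of SLICE (the "MSB" half)
    r = (truedepth // SLICE) * SLICE
    # fine remainder in [0, SLICE) (the "LSB" half)
    g = truedepth - r
    return r, g

def unpack_depth(r, g):
    # reading back is a simple addition
    return r + g
```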


for the normal, we could store x in the first 8 MSbits using the same trick, and y in the 8 LSBs.
z can be reconstructed using the sign stored in the depth; knowing we are on a unit sphere, there is just a square root to evaluate.
(and when reading the depth we just always remember to do abs(tex2D(depth).r))
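As a sketch of that reconstruction (Python for illustration, the helper names are mine): store x and y, and carry the sign of z on the depth channel:

```python
import math

def encode_normal_xy(n):
    # keep x and y; only the sign of z needs to travel (on the depth's sign bit)
    x, y, z = n
    return x, y, (1.0 if z >= 0.0 else -1.0)

def decode_normal(x, y, sign_z):
    # unit length gives z^2 = 1 - x^2 - y^2; the stored sign picks the hemisphere
    z = sign_z * math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return x, y, z
```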

for the color, it would be a 16-bit color stored in HSL, with the same "floor" and "modulo" trick to park the values in 6/5/5 bits.
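The floor/modulo packing amounts to plain integer arithmetic; a sketch of the 6/5/5 layout (hypothetical helper names, Python for illustration, the shader would use floor/fmod instead of // and %):

```python
def pack_655(h, s, l):
    # h in [0, 63], s and l in [0, 31] -> a single value in [0, 65535]
    return (h * 32 + s) * 32 + l

def unpack_655(v):
    # peel the fields back off with modulo and integer division
    l = v % 32
    s = (v // 32) % 32
    h = v // 1024
    return h, s, l
```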

now,

knowing we have 16-bit IEEE 754 half floats per channel here:
checking Wikipedia, integers are represented exactly up to 2048,
the step grows to 32 between 32k and the max of 65504,
and is at most 2^-11 (about 0.00049) between 0 and 1.
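Those steps are easy to check offline: Python's struct module can round-trip a value through IEEE 754 binary16 with the 'e' format character (a sketch, not shader code):

```python
import struct

def to_half(x):
    # round-trip a double through IEEE 754 binary16 (half precision)
    return struct.unpack('<e', struct.pack('<e', x))[0]
```

For example, to_half(2049.0) comes back as 2048.0: the step is 2 between 2048 and 4096, and 32 between 32768 and the half-float max of 65504.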

the issue is: what space slices should we use for the depth divisor?
And would the depth be better stored logarithmically?
But in that case it would still need slicing, since we need to store the depth in 32 bits, i.e. on two components; I suppose the slicing would then be logarithmic too?

about the normals, I feel there is a danger in storing them like this, because some directions will have more precision than others.

the color is not really an issue; RSMs don't need a precise albedo.

what do you think ?

thanks in advance !
Interesting idea.
Packing data into F16 components is possible, but it is tricky. It's best to think of F16 as a 1.5.10 format, where you've got a sign bit, an exponent and a fraction.
The sign field can be 0/1. The exponent is an integer from -14 to +15. The fraction is a binary fraction with a hidden/non-stored "1." in front of it, ranging from 1.0 to 1.9990234375 (incrementing at a resolution of 1/1024).

As an alternative, you could consider outputting to 2x regular 8888 textures, which is the same output bandwidth. The advantage is that it's easier to deal with simple 8-bit fractions, and you've got 8 of these components to split your data over.
e.g. you could store depth over 3 (24-bit), the normal over 2 and the albedo over the remaining 3 (maybe with the sign of normal.z packed into one of the albedo channels).
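A sketch of that 24-bit depth split over three 8-bit channels (Python for the arithmetic; in the shader each byte would be written as its value / 255, and the shifts would again be floor/fmod since DX9 has no integer bit fiddling):

```python
def encode_depth24(d01):
    # d01: depth normalised to [0, 1); quantise to 2^24 steps
    v = min(int(d01 * 16777216.0), 16777215)
    # split into three bytes, most significant first
    return v >> 16, (v >> 8) & 255, v & 255

def decode_depth24(hi, mid, lo):
    # reassemble the three bytes and renormalise
    return ((hi << 16) | (mid << 8) | lo) / 16777216.0
```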
Yes, I thought of the 2x RGBA8 targets; it seems to have advantages. The only thing: I was afraid of a possibility of inferior perf, even if theoretically there shouldn't be one.
Because we verified that it happens on some cards. The example here is the ATI FireGL 5600 (which is a bad card, really): if you bind 4x 32-bit render targets you get much less perf than with a 2x 64-bit render target setup.
This is pretty crazy and, in my opinion, could be due to a driver voluntarily sub-optimized for marketing reasons.
It is not impossible, we have seen examples: NVIDIA is allegedly doing that with OpenGL pixel readback between the GeForce and Quadro lines; they also limit double-precision compute performance on non-Tesla cards...

Well, while this reasoning could hold between 2 and 4 RTs, since it was verified once, between 1 and 2 it is a bit paranoid.
So I might as well go for that; we can never tell before trying anyway.

24-bit depth: yes, that should be more than enough. (quick calculation: a map 1 km thick would have 60 µm precision!)
Also, the normal could be stored as 2 angles, but that requires some acos and atan2, and I have no idea of the actual cost of those operations.
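For reference, the two-angle storage is just standard spherical coordinates (a sketch with hypothetical helper names; whether acos/atan2 are cheap enough is exactly the open question):

```python
import math

def normal_to_angles(n):
    x, y, z = n
    theta = math.acos(max(-1.0, min(1.0, z)))  # polar angle in [0, pi]
    phi = math.atan2(y, x)                     # azimuth in (-pi, pi]
    return theta, phi

def angles_to_normal(theta, phi):
    # decoding costs a sin/cos pair instead
    s = math.sin(theta)
    return s * math.cos(phi), s * math.sin(phi), math.cos(theta)
```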

Thanks :)
I just had another crazy idea: we can push it down to one RGBA8. 16 bits for depth is still enough, because we get 1.5 cm precision for a 1 km thick map (providing we store the depth fitting the map's bounding cube into the 0-65536 range),
then 8 bits for the normal and 8 bits for the color.

for the color: we know how 8-bit color looks. Not terrible, and once dithered it is not bad; you can then reconstruct a higher-precision color by blurring with neighbor pixels. We lose spatial resolution for colors, but with this trick even 8-bit storage is near perfect. I proved it in Paint Shop Pro: convert an image to dithered 8 bits, reconvert to 24 bits, blur, and you get the same thing as the original blurred image! Though this is to be expected, because it amounts to saying "I store my color in 4x8 bits, but I put that info into my neighbor cells".
Doing the dithering in a shader should be feasible with a fixed pattern. I could even make the reconstruction aware of depth/normal differences to preserve high frequencies. (same issue as depth-of-field blurring, and PCF)
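A fixed-pattern (ordered) dither is indeed straightforward; a sketch with the classic 4x4 Bayer matrix, quantising one channel to 8 levels (Python for illustration; a shader would index the matrix with the pixel coordinates or sample it from a tiny texture):

```python
# classic 4x4 Bayer threshold matrix
BAYER4 = [[0, 8, 2, 10],
          [12, 4, 14, 6],
          [3, 11, 1, 9],
          [15, 7, 13, 5]]

def dither_quantize(v, x, y, levels=8):
    # add a position-dependent threshold before truncating,
    # so the quantisation error becomes high-frequency noise
    t = (BAYER4[y % 4][x % 4] + 0.5) / 16.0
    q = int(v * (levels - 1) + t)
    return min(q, levels - 1) / (levels - 1)
```

Averaging the dithered values over a neighbourhood recovers the original value to well under one quantisation step, which is the reconstruction-by-blur effect described above.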

then the problem will be to store the normal in 8 bits.
with Euler angles: 4 bits for theta, 4 bits for phi, it means only 22.5° precision.
or maybe with a hemisphere projection principle: 8 bits slice each hemisphere quadrant into 64 values
-> a 16*16 square to project the hemisphere -> which means roughly 5° of precision?
but it will not be distributed evenly; the poles will have better precision than the equator.

hm..
[quote]I proved it in paint shop pro: convert an image into dithered 8 bits[/quote]
Was this converting it to a 256-colour palettized image, or actually a 3.3.2 mode? The former will give much better results than the latter, but will be hard to pull off in a single-pass shader.
You won't be able to generate a 256-colour palette on the fly (as that would require every pixel to be able to inspect every other pixel), so you'd have to use a fixed palette, and even then, choosing which palette entry to quantize your input to will be difficult -- the naive solution requires you to compare against every palette entry. You could precompute a lookup table, but it would be a few MB.

Another option; in GPU Pro 2, there's a chapter "Shader Amortization using Pixel Quad Message Passing" which explains that the pixel shader can actually share information with the neighbouring pixels in a 2x2 area, via the ddx/ddy functions. You could use this to share the 4 albedo values, average them so that you're only outputting a single albedo per 2x2 area, and then split the storage of the colour over that whole area (such as top-left writes red, bottom-right writes blue, other two write green).
e.g. In digital cameras, every pixel either captures a red, green OR blue value, and then a demosaicing filter merges them into an RGB image.
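A sketch of that split storage for one 2x2 quad (plain Python standing in for the quad; the actual sharing would go through ddx/ddy as in the chapter):

```python
def mosaic_2x2(quad_colors):
    # quad_colors: the four (r, g, b) values of a 2x2 pixel quad.
    # Average them, as if shared across the quad via ddx/ddy...
    r = sum(c[0] for c in quad_colors) / 4.0
    g = sum(c[1] for c in quad_colors) / 4.0
    b = sum(c[2] for c in quad_colors) / 4.0
    # ...then top-left writes red, bottom-right writes blue,
    # and the other two pixels write green
    return [r, g, g, b]

def demosaic_2x2(stored):
    # gather the quad back into one RGB value per 2x2 area
    r, g1, g2, b = stored
    return (r, (g1 + g2) / 2.0, b)
```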


I'm not sure about 8-bit normals, but this page here is the bible for 16-bit normal formats. Maybe start with the spheremap transform, but halve the number of output bits.
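For reference, one common form of the spheremap transform, with the output quantised to a chosen bit count (a sketch; note it degenerates at z = -1, where the divisor goes to zero):

```python
import math

def encode_spheremap(n, bits=8):
    # project the unit normal to 2D: enc = n.xy / sqrt(8*n.z + 8) + 0.5
    x, y, z = n
    s = math.sqrt(8.0 * z + 8.0)
    q = (1 << bits) - 1
    return round((x / s + 0.5) * q), round((y / s + 0.5) * q)

def decode_spheremap(qx, qy, bits=8):
    q = (1 << bits) - 1
    fx, fy = qx / q * 4.0 - 2.0, qy / q * 4.0 - 2.0
    f = fx * fx + fy * fy          # equals 2 * (1 - z) for an exact input
    g = math.sqrt(1.0 - f / 4.0)
    return fx * g, fy * g, 1.0 - f / 2.0
```

With bits=8, i.e. the 16-bit format halved as suggested, a round-trip of a unit normal stays within a couple of hundredths per component.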
Hey, sorry for kind of offtopic but,

[quote]RGBA8 : albedo[/quote]
What is albedo? Is it the flux of the object?

[quote]Hey, sorry for kind of offtopic but,
[quote name='Lightness1024' timestamp='1346410821' post='4975106']RGBA8 : albedo[/quote]
What is albedo? Is it the flux of the object?[/quote]

Albedo = diffuse color (I think)


Lightness1024 - 8-bit normals will look just awful when specular is applied (maybe you could make it a 'low graphics' option?). I recommend you stick with a 64-bit buffer. Also, see this slideshow
http://www.insomniacgames.com/tech/articles/0409/files/GDC09_Lee_Prelighting.pdf
slides 12-14 for why you should store normals as spherical coordinates rather than reconstructing z = sqrt(1 - x^2 - y^2)
I think the misconception about "view space normals" is the assumption that the z component is always negative, hence thinking that storing only x and y is enough.
In fact, storing the sign of z is needed.
The normal can be stored (in world space directly) via its x and y components, plus 1 bit for the sign of its z component.
And then: z = sign * sqrt(1 - x^2 - y^2)
You don't get negative z values if you use "per-pixel view space normals". The obvious drawback is that you have to compute the required matrix on a per-pixel basis.
Hodgman : ah yes, I've vaguely looked at this ddx/ddy article before, but I feel it should be simpler to do the exploded color storage; just using "floor"/"fmod" will give a pixel index that lets us select between r, g and b. Thanks for the idea :)
I stumbled upon the aras-p article comparing normal storage quality while looking for material on this problem; I recommend it to future googlers of this thread.

Hupsilardee : they may look bad, but it doesn't matter much. It is important to keep nice normals in a deferred G-buffer, but for an RSM we don't really care. In my case it will be used to initialize cells of a light propagation volume (LPV).

