Mipmapping 3D texture

12 comments, last by JoeJ 6 years, 11 months ago

Alright, I did those 2 things on the 3-level version, and the timings dropped down to:

Dispatch[0] (64 64 64): 7.839040ms
Dispatch[1] (8 8 8): 0.017760ms
Dispatch[2] (2 2 2): 0.000320ms
Total Time: 7.859200ms
Call overhead: 0.002080ms

Dispatch[0] (64 64 64): 7.937120ms
Dispatch[1] (8 8 8): 0.018400ms
Dispatch[2] (2 2 2): 0.000320ms
Total Time: 7.957600ms
Call overhead: 0.001760ms

Dispatch[0] (64 64 64): 8.029280ms
Dispatch[1] (8 8 8): 0.018560ms
Dispatch[2] (2 2 2): 0.000320ms
Total Time: 8.050240ms
Call overhead: 0.002080ms

Let me try the same on 2-level version.

EDIT: To answer you - all threads do work when loading from the source, but right after that only one thread of each 2x2x2 subgroup builds the 1st mip level of the source - that's why the masking is in that place.

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com


EDIT: To answer you - all threads do work when loading from the source, but right after that only one thread of each 2x2x2 subgroup builds the 1st mip level of the source - that's why the masking is in that place.

Ah, finally get it.

There is a better approach:

Let's say we have an 8-pixel 1D texture in LDS, and t is the thread index within a 4-thread workgroup:


float4 reg = lds[t*2] + lds[t*2+1]; // 0+1, 2+3, 4+5, 6+7
write mipmap1(reg)
lds[t*2] = reg; // results now live in slots 0,2,4,6
ldsBarrier();

if (t<2)
{
reg = lds[t*4] + lds[t*4+2]; // 0+2, 4+6
write mipmap2(reg)
lds[t*4] = reg; // results now live in slots 0,4
}

ldsBarrier();
if (t==0)
{
reg = lds[t*8] + lds[t*8+4]; // 0+4
write mipmap3(reg)
}

As you can see, all threads have work initially, but each thread needs to load multiple texels.
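To check the indexing, the reduction above can be simulated on the CPU. This is only a sketch: the function name `build_mips_1d` is made up for illustration, a plain Python list stands in for LDS, and the read and write phases are separated the way the barriers separate them on the GPU. Like the pseudocode, it sums the pairs; a real mip level would also divide by 2.

```python
def build_mips_1d(texels):
    """Simulate the in-place LDS reduction for a power-of-two 1D row."""
    lds = list(texels)      # stands in for the shared-memory array
    mips = []
    stride = 1              # distance between the two inputs of one thread
    active = len(lds) // 2  # number of threads that still have work
    while active >= 1:
        # Read phase: each active thread t reads only its own pair.
        regs = [lds[t * 2 * stride] + lds[t * 2 * stride + stride]
                for t in range(active)]
        # Write phase: the result goes back into the first slot of the pair.
        for t, reg in enumerate(regs):
            lds[t * 2 * stride] = reg
        mips.append(regs)   # corresponds to "write mipmapN(reg)"
        stride *= 2
        active //= 2
    return mips


if __name__ == "__main__":
    for level, values in enumerate(build_mips_1d([1, 2, 3, 4, 5, 6, 7, 8]), 1):
        print("mip", level, values)
```

With 8 texels this runs three levels using 4, 2 and 1 active threads, matching the three steps in the pseudocode.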

I did not realize this initially, but I wondered why your LDS requirements are so low.

The bad news is that with 8 texels per thread your LDS usage would become too high, so there is no simple answer for how to balance all this to get the best possible result.
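A rough way to see how LDS usage scales with texels per thread is just to multiply it out. This is a back-of-the-envelope sketch: the workgroup size of 64 threads is an assumption (the thread does not state it), and `lds_bytes` is a hypothetical helper, not part of any API.

```python
FLOAT4_BYTES = 16  # four 32-bit floats per texel
GROUP_SIZE = 64    # assumption: threads per workgroup, not stated in the thread


def lds_bytes(threads_per_group, texels_per_thread, bytes_per_texel):
    """LDS needed if every loaded texel is kept in shared memory."""
    return threads_per_group * texels_per_thread * bytes_per_texel


if __name__ == "__main__":
    for texels in (1, 2, 8):
        print(texels, "texels/thread ->",
              lds_bytes(GROUP_SIZE, texels, FLOAT4_BYTES), "bytes")
```

Whether a given figure is acceptable depends on the LDS budget and occupancy target of the GPU in question, which is why profiling is needed.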

Trial and profiling fun expected... :D

Edit: You could use float3 and add a tiny value to the colors, so that (0,0,0) can be used to mark an empty voxel.
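A minimal sketch of that idea, assuming a bias of 1/1024 (the actual value is not given in the thread) and hypothetical helper names:

```python
EPSILON = 1.0 / 1024.0   # assumed bias; any small positive value would do
EMPTY = (0.0, 0.0, 0.0)  # reserved value meaning "no voxel here"


def encode_color(rgb):
    """Bias a color so that even pure black never equals EMPTY."""
    return tuple(c + EPSILON for c in rgb)


def is_empty(stored):
    """True only for the reserved all-zero value."""
    return stored == EMPTY


def decode_color(stored):
    """Remove the bias again, clamping at zero."""
    return tuple(max(c - EPSILON, 0.0) for c in stored)


if __name__ == "__main__":
    black = encode_color((0.0, 0.0, 0.0))
    print(is_empty(EMPTY), is_empty(black), decode_color(black))
```

This frees the fourth channel, at the cost of one add on store and one subtract on load.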

Yeah, it might be interesting. Building mipmaps is, after all, very similar to parallel sums. I'll be trying that shortly.

Using float3 or different type is probably going to be the way in the end. What I need to store:

  • Color - For me as low as RGB8 is enough
  • Normals - I'm going to use some packing scheme, most likely stored as RG16F
  • Roughness, Metallic and Emissivity each need only an 8-bit value (maybe even less, if I'd really want to compress down)
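To get a feel for the storage cost, the list above adds up to 24 + 32 + 24 = 80 bits per voxel. The sketch below packs exactly those fields with Python's `struct` module; the field order and the function names are assumptions for illustration, not the layout actually used on the GPU.

```python
import struct

# Assumed layout: 3 bytes RGB8 color, 2 half floats for the packed normal,
# 3 bytes for roughness, metallic and emissivity. '<' disables padding.
VOXEL_FORMAT = "<3B2e3B"


def pack_voxel(color_rgb8, normal_rg16f, roughness, metallic, emissive):
    """Pack one voxel's attributes into 10 bytes."""
    r, g, b = color_rgb8
    nx, ny = normal_rg16f
    return struct.pack(VOXEL_FORMAT, r, g, b, nx, ny,
                       roughness, metallic, emissive)


def unpack_voxel(data):
    """Inverse of pack_voxel."""
    r, g, b, nx, ny, rough, metal, emis = struct.unpack(VOXEL_FORMAT, data)
    return (r, g, b), (nx, ny), rough, metal, emis


if __name__ == "__main__":
    blob = pack_voxel((255, 128, 0), (0.5, -0.25), 200, 0, 16)
    print(len(blob), unpack_voxel(blob))
```

On the GPU these fields would more likely be split across textures with hardware formats rather than stored as one 10-byte record.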

Now, none of this actually needs to be mip-mapped (unless I'd want multi-bounce GI/reflections). On this storage I need to perform the lighting calculation for each voxel, outputting an RGBA8 value. That result needs to be mip-mapped to be used later for cone tracing (GI/reflections/AO).
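For reference, the mip-mapping step on the lit result boils down to averaging each 2x2x2 block into one voxel, repeated until a single voxel remains. The following is a plain CPU sketch of that box filter, assuming a cubic power-of-two volume of RGBA tuples indexed as `volume[z][y][x]`; the function names are hypothetical and it makes no attempt at the LDS optimizations discussed above.

```python
def downsample_3d(volume):
    """Average each 2x2x2 block of RGBA tuples into one voxel."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    out = []
    for z in range(0, depth, 2):
        plane = []
        for y in range(0, height, 2):
            row = []
            for x in range(0, width, 2):
                block = [volume[z + dz][y + dy][x + dx]
                         for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)]
                row.append(tuple(sum(v[c] for v in block) / 8.0
                                 for c in range(4)))
            plane.append(row)
        out.append(plane)
    return out


def build_mip_chain(volume):
    """Return all mip levels below the source, down to 1x1x1."""
    chain = []
    while len(volume) > 1:
        volume = downsample_3d(volume)
        chain.append(volume)
    return chain


if __name__ == "__main__":
    size = 4
    lit = [[[(1.0, 0.5, 0.0, 1.0) for _ in range(size)]
            for _ in range(size)] for _ in range(size)]
    print(len(build_mip_chain(lit)), "mip levels")
```

A plain average treats empty voxels as black; weighting by alpha is a common alternative, but which one suits cone tracing here is a design choice the thread leaves open.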


Half floats would be nice too, you could load the packed stuff directly to LDS and un/pack it from there as discussed here: https://www.gamedev.net/topic/688773-how-will-fp16-affect-games/

How do you plan to deal with the thin wall problem? Think of a thin wall where both front and backside fall into one voxel. You can't build a proper normal.
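The problem is easy to reproduce numerically: averaging the front and back normals of a thin wall gives a zero vector, so there is nothing left to normalize. A tiny sketch with hypothetical helper names:

```python
import math


def average_normal(normals):
    """Component-wise mean of a list of 3D normals."""
    count = len(normals)
    return tuple(sum(n[i] for n in normals) / count for i in range(3))


def length(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(c * c for c in v))


if __name__ == "__main__":
    front = (0.0, 0.0, 1.0)   # side of the wall facing +Z
    back = (0.0, 0.0, -1.0)   # opposite side, inside the same voxel
    avg = average_normal([front, back])
    print(avg, length(avg))
```

Any scheme that stores a single averaged direction per voxel runs into this cancellation.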

I've tried spherical harmonics for this purpose - 4 values did not really help and 9 were still not good enough (I tried to encode directional surface area instead of normals, but the problem is similar).

