Increase texture reading speed

2 comments, last by StanIAm 10 years, 2 months ago

Hello, I have a question about some HLSL stuff. I implemented a global illumination computation with RSM. With the Cornell box model and 12*12 lightmap samples for each pixel, I get a framerate of about 60 FPS. That means I have to sample 3 textures (diffuse/flux, position and normal, all in R16G16B16A16_FLOAT format) 144 times per pixel, and that makes the computation slow.

When I don't sample the textures and give them constant values instead (just as a test), I get about 800 FPS, so this is a big difference.

Now the question is: how can I increase the speed of reading my textures to get a better framerate?


First of all, 144 texture lookups per pixel is a lot. One solution would be to simply use fewer (find a balance between performance and visuals).

Try to make your textures smaller and/or slimmer.

Fat textures (like your R16G16B16A16) are slower, so try to pack your data.

Normals, for example, could be stored as X8Y8, with Z reconstructed inside the shader.
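To make the X8Y8 idea concrete, here is a minimal CPU-side sketch of the round trip, written in C++ so it can be checked outside a shader. The names (`packNormal`, `unpackNormal`, `PackedNormal`) are made up for illustration, and the sketch assumes unit-length normals stored in a space where Z is known to be non-negative (for example the light's view space, depending on your conventions). If Z can be negative, its sign has to be stored somewhere as well.

```cpp
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };
struct PackedNormal { std::uint8_t x, y; };  // hypothetical two-channel 8-bit texel

// Map a value in [-1, 1] to an 8-bit UNORM value in [0, 255].
inline std::uint8_t toUnorm8(float v) {
    float t = (v * 0.5f + 0.5f) * 255.0f;
    if (t < 0.0f) t = 0.0f;
    if (t > 255.0f) t = 255.0f;
    return static_cast<std::uint8_t>(std::lround(t));
}

// Map an 8-bit UNORM value back to [-1, 1].
inline float fromUnorm8(std::uint8_t v) {
    return (v / 255.0f) * 2.0f - 1.0f;
}

// Keep only X and Y; Z is dropped.
inline PackedNormal packNormal(Vec3 n) {
    return { toUnorm8(n.x), toUnorm8(n.y) };
}

// Rebuild Z from the unit-length constraint x^2 + y^2 + z^2 = 1.
// Assumption: z >= 0, because the sign is lost by the packing.
inline Vec3 unpackNormal(PackedNormal p) {
    float x = fromUnorm8(p.x);
    float y = fromUnorm8(p.y);
    float zz = 1.0f - x * x - y * y;
    float z = zz > 0.0f ? std::sqrt(zz) : 0.0f;  // guard against quantization error
    return { x, y, z };
}
```

The same arithmetic carries over to a shader more or less line by line. Expect some precision loss at 8 bits, most noticeably for normals whose Z is close to zero.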

Also look at the access patterns and filtering.

Reduce filtering quality or try to use the Gather instruction if possible.

You could even refactor your renderer to make use of a compute shader, where you can sample your textures into group-shared memory.

Some ideas:

1) Reduce the amount of bits per sample.
e.g.
* Maybe for normals you can get away with 8-bit channels instead of 16. You might also be able to store them using two channels instead of three.
* For diffuse/flux, maybe you can use a "packed HDR" format with 8 bits per channel, like RGBM/RGBE/LogLuv/etc.
* Instead of storing positions, maybe you can just store depth from the light-source, and use these depth values along with the light's projection matrix to reconstruct the positions.
* Try and make use of your alpha channels -- e.g. you might be able to store normals and depth in the one texture.
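As a rough illustration of the depth-only idea from the list above, the following C++ sketch rebuilds a light-view-space position from an RSM texel coordinate and a stored depth. Everything here is an assumption made for the example: the light uses a symmetric perspective projection, the stored value is linear view-space depth (not post-projection z/w), +Z points away from the light, and texture V runs top to bottom as in Direct3D. The type and function names are hypothetical.

```cpp
#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// The two scale terms on the diagonal of a symmetric perspective
// projection matrix are all that the reconstruction needs.
struct LightProjection { float xScale, yScale; };

inline LightProjection makeProjection(float fovY, float aspect) {
    float yScale = 1.0f / std::tan(fovY * 0.5f);
    return { yScale / aspect, yScale };
}

// Forward mapping: light-view-space point -> RSM texture coordinate.
inline Vec2 projectToUv(Vec3 p, LightProjection proj) {
    float ndcX = p.x * proj.xScale / p.z;
    float ndcY = p.y * proj.yScale / p.z;
    return { ndcX * 0.5f + 0.5f, 0.5f - ndcY * 0.5f };  // flip Y: V runs downwards
}

// Inverse mapping: texture coordinate + stored linear depth -> position.
inline Vec3 reconstructPosition(Vec2 uv, float viewDepth, LightProjection proj) {
    float ndcX = uv.x * 2.0f - 1.0f;
    float ndcY = 1.0f - uv.y * 2.0f;
    return { ndcX * viewDepth / proj.xScale,
             ndcY * viewDepth / proj.yScale,
             viewDepth };
}
```

The result is in the light's view space, so a world-space position would need one more transform by the inverse of the light's view matrix. The payoff is that a single depth channel replaces a three-channel position texture, which also frees room to pack depth next to the normal.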

2) You could re-use work from previous frames, using a real-time/reverse reprojection cache, etc.

3) Try and make your texture samples coherent in screen-space. i.e. ideally, two pixels that are next to each other on the screen should be sampling two texels that are next to each other in the texture, so that the texture cache is more effective.

4) Reducing the resolution of your textures will help with #3.

Thank you very much for the answer. You're right, it isn't very efficient to take 144 samples from an RSM rendered at 1024*768 pixels...

My idea would be to store the pixel lights in an array structure with position/normal/flux, so that I don't have to sample the points n times for every pixel and instead have only a 1D array and a single for loop (which I think doesn't really increase the speed...).

Storing the VPLs in a buffer would be more efficient, I think, but I have one problem with that: I don't really know how to do it without a render target, i.e. with only a constant buffer or some other type of buffer. Is there a way to save the array of my pixel light structure in a buffer and read it like a constant buffer in each frame?
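For what it's worth, the array idea could look roughly like the C++ sketch below, which gathers a fixed grid of RSM texels into one flat list of VPLs. This is only a sketch of the data layout, not a tested renderer path: the `Vpl` struct, its padding and `buildVplList` are hypothetical, and picking the texel at the centre of each grid cell is just one possible sampling pattern. On the GPU side, Direct3D 11 structured buffers (`StructuredBuffer<T>` in HLSL) are the usual resource type for reading an array of structs in a shader, though whether they actually beat texture fetches here would have to be measured.

```cpp
#include <cstddef>
#include <vector>

struct Float3 { float x, y, z; };

// Hypothetical VPL layout. The pad fields keep every member on a
// 16-byte boundary so the struct maps cleanly onto a GPU buffer element.
struct Vpl {
    Float3 position; float pad0;
    Float3 normal;   float pad1;
    Float3 flux;     float pad2;
};
static_assert(sizeof(Vpl) == 48, "Vpl is expected to be three 16-byte rows");

// Gather a gridSize x gridSize pattern of RSM texels into one flat array,
// so the shading loop becomes a single 1D loop over vpls.size() entries.
inline std::vector<Vpl> buildVplList(const std::vector<Float3>& positions,
                                     const std::vector<Float3>& normals,
                                     const std::vector<Float3>& fluxes,
                                     std::size_t rsmWidth, std::size_t rsmHeight,
                                     std::size_t gridSize) {
    std::vector<Vpl> vpls;
    vpls.reserve(gridSize * gridSize);
    for (std::size_t gy = 0; gy < gridSize; ++gy) {
        for (std::size_t gx = 0; gx < gridSize; ++gx) {
            // Texel at the centre of grid cell (gx, gy).
            std::size_t tx = (2 * gx + 1) * rsmWidth / (2 * gridSize);
            std::size_t ty = (2 * gy + 1) * rsmHeight / (2 * gridSize);
            std::size_t i = ty * rsmWidth + tx;
            Vpl v{ positions[i], 0.0f, normals[i], 0.0f, fluxes[i], 0.0f };
            vpls.push_back(v);
        }
    }
    return vpls;
}
```

With a 12*12 grid this yields 144 entries of 48 bytes each, built once per frame instead of being fetched again for every pixel.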

