Numerous Textures or Numerous Texture Changes

Started by
3 comments, last by xycsoscyx 17 years, 11 months ago
I am overhauling my engine considerably (I've been busy at work, so it's been on ice indefinitely), and am working on shadow mapping again. I am deciding between two different code paths, and was wondering which is the more optimal solution. Here's the setup: I am using deferred shading, which means three huge floating point textures that I render the scene to. These textures are the input data for my lighting calculations, but I assume setting them again and again as textures slows the system down?

A) Create a shadow texture for each light, and fill all of these shadow buffers before rendering the scene. During lighting, just fetch the associated texture and use it. This requires a lot of additional textures for all the shadow maps per light, but it means setting my floating point buffers as textures only once, and setting the shadow texture per light.

B) Create a single shadow map texture. When doing lighting, fill in the shadow map for that light, then set the floating point textures (as well as my shadow map texture), and light the scene. Repeat per light. This means only a single texture for the shadow map, but it means setting my huge floating point textures again and again (once per light, in fact).

Which is more optimal? Wasting memory, or wasting time? Could setting these big floating point textures again and again hurt my performance? Or would saving (potentially a lot of) memory provide better performance?
My PC DX knowledge is sketchy, but I'd say it's totally dependent on the memory available on your graphics card. If everything fits in video memory then A would work great, and would avoid a lot of costly state changes. The moment you start paging between main memory and video memory, however, it becomes seriously inefficient.

Of course in most situations you can't define in advance what hardware the end user will be using.

Quote:Original post by xycsoscyx
This means only a single texture for the shadow map, but it means setting my huge floating point textures again and again (once per light, in fact).

Unless you have a very large number of lights, this will be an insignificant overhead. If you have under 10 lights, which sounds reasonable, don't worry about this.

It is optimal to transfer as much data as possible in a single write to GPU RAM; per-transfer overhead dominates when the same data is split into many small packets.

Transferring a 4K texture between GPU and CPU RAM repeatedly is incredibly slow on an AGP 4x 6800GT. Some algorithms too complex for current GPU capabilities need it anyway, and for those it's sometimes faster to stay in CPU RAM entirely.

Your idea of using a massive texture for multi-image ping-pong (Render To Texture) lighting is in line with what the GPUs are meant to do right now architecturally.

I say go for it. Using a large viewport, a large quad, and a large input texture, you should be able to shade the whole thing precisely using an orthographic projection.

I believe the D3D caps structure (D3DCAPS9, filled in by IDirect3DDevice9::GetDeviceCaps) reports the maximum texture size at runtime, via MaxTextureWidth and MaxTextureHeight, plus MaxVolumeExtent for volume textures.

In OpenGL, 3D textures have their own dimension limit, which is often smaller than the 1D/2D limit. On the 6800GT the 1D/2D limit is 4096 elements per dimension, but the 3D limit may be lower. Your card may differ either way, and both limits can be determined at runtime through the use of:

GLint max_size, max_3d_size;
glGetIntegerv(GL_MAX_TEXTURE_SIZE, &max_size);       // 1D/2D limit per dimension
glGetIntegerv(GL_MAX_3D_TEXTURE_SIZE, &max_3d_size); // 3D limit per dimension


The number of addressable texels grows exponentially with each added dimension. When 3D comes into the picture, it can be used to optimally emulate a large stack of 2D textures. GPU Gems 2 has several chapters on GPU memory packing and related optimizations.

Since you don't really need a full max*max*max sized volume, you can allocate a map_width*map_height*number_of_maps texture instead.

If you manage to implement this algorithm, then you can literally copy in as many light maps as you want in one shot, then never touch it with the CPU again if you can help it.

If in your tests you find that:
1) the pure GPU algorithm runs more efficiently than the non-pure GPU version at the desired workload, and
2) the pure GPU algorithm is even possible with today's GPU capabilities, even if only on the newest technology,

then I have to suggest implementing it on the GPU first.

I personally wouldn't begrudge you for implementing an algorithm that cannot be run on my old-ish GPU. Instead, it would make me want to upgrade. A lot of capitalist techno-geeks would also probably see it the same way.

[Edited by - taby on May 25, 2006 2:11:52 AM]
taby, wow, I never thought about using 3D textures like that. It's not a bad idea, and it saves me from setting the shadow texture over and over, but can I use it as a render target? I'll have to implement it and find out. :D

Yes, currently the engine is already starting to push it, and I haven't even reimplemented glassy surfaces yet. As it is, I even had to make a small additional path for my work computer, which doesn't support post-pixel-shader operations on MRTs (Radeon X600 at work). My card at home (GeForce 7600) does, and it took me a moment to figure that one out when I first ran it at work.

Fortunately, as long as the card supports 3D textures as render targets (which it should, since I can lock a single slice, if I remember correctly), I should be able to use the whole thing like that, but I still need to switch render targets per light source. The only thing it saves is setting this texture for lighting: I no longer have to constantly switch between shadow map textures, only track the layer and change a constant.

