Calculate average scene luminance in HDR

When we calculate the average luminance, why not compute the log value of every pixel and use a static variable to accumulate the sum of all the log values in a pixel shader? (Can we? Is that supported?) The way I found to calculate the average scene luminance in the DirectX SDK is to downscale the texture repeatedly, until at last we get the average scene luminance in a 1x1 texture. Do we have to do it like this? Thanks
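For reference, the first pass of that scale-down technique boils down to something like the fragment shader below: a minimal GLSL sketch (not the SDK's actual HLSL; "sceneTex" and the delta constant are assumed names/values). The later passes just average blocks of this texture down to 1x1, and exp() of that single texel gives the log-average luminance.

uniform sampler2D sceneTex;  // assumed: the HDR scene render target
varying vec2 texCoord;

void main()
{
    vec3 color = texture2D(sceneTex, texCoord).rgb;
    float lum = dot(color, vec3(0.3, 0.59, 0.11));
    gl_FragColor = vec4(log(lum + 0.0001));  // small delta avoids log(0)
}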
I consider myself a newbie in shaders, but I'm guessing that if you use a static variable it takes much longer to calculate, because the pixel shaders would need to wait for each other. The whole thing would then only be as fast as your CPU could do it, or even slower.

So you don't have to, but if you want to make it fast, it's recommended.

Hope my advice was useful!
The only thing a pixel shader can output is pixels, so you can't have it output any variables or anything, unless you're outputting those variables into the values of pixels.
"WARNING: Excessive exposure to politicians and other bureaucrats has been linked to aggressive behavior." - Henk Hopla
Quote:Original post by ttthink
Why not compute the log value of every pixel and use a static variable to accumulate the sum of all the log values in a pixel shader? (Can we? Is that supported?)
Traditional pixel shaders do not have access to shared memory (e.g. a static variable) -- they can only output a pixel.
Quote:Do we have to do it like this?
Valve has published their technique for calculating scene luminance (they actually get a full histogram, instead of a single value), which uses the stencil buffer and occlusion query hardware instead of the "down scale" technique.
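The core of the occlusion-query idea is roughly the fragment shader below (a hedged sketch, not Valve's actual code; "sceneTex", "bucketMin" and "bucketMax" are assumed names). You draw a full-screen quad once per histogram bucket and let an occlusion query count the fragments that survive the discard.

uniform sampler2D sceneTex;
uniform float bucketMin;  // lower luminance bound of this bucket
uniform float bucketMax;  // upper bound
varying vec2 texCoord;

void main()
{
    float lum = dot(texture2D(sceneTex, texCoord).rgb, vec3(0.3, 0.59, 0.11));
    if (lum < bucketMin || lum >= bucketMax)
        discard;                   // killed fragments are not counted
    gl_FragColor = vec4(0.0);      // color writes can be masked off anyway
}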

If you're using CUDA or another low-level GPU language, you can access shared memory from a "shader", so you could possibly implement your idea. I think DX11 shaders (SM 5.0) give access to shared memory as well.
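Modern GLSL compute shaders (GL 4.3+) expose the same shared-memory idea, for what it's worth. A minimal sketch of a shared-memory reduction, where each 16x16 workgroup writes one partial log-luminance sum (the buffer layout and names are assumptions; a small follow-up pass or CPU read would sum the partials and take exp() of the mean):

#version 430
layout(local_size_x = 16, local_size_y = 16) in;

layout(binding = 0) uniform sampler2D sceneTex;
layout(std430, binding = 1) buffer PartialSums { float partial[]; };

shared float tile[256];  // one slot per invocation in the workgroup

void main()
{
    ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
    float lum = dot(texelFetch(sceneTex, coord, 0).rgb, vec3(0.3, 0.59, 0.11));
    uint idx = gl_LocalInvocationIndex;
    tile[idx] = log(lum + 0.0001);  // small delta avoids log(0)
    barrier();

    // Tree reduction in shared memory: 256 -> 128 -> ... -> 1.
    for (uint stride = 128u; stride > 0u; stride >>= 1u)
    {
        if (idx < stride)
            tile[idx] += tile[idx + stride];
        barrier();
    }

    if (idx == 0u)
        partial[gl_WorkGroupID.y * gl_NumWorkGroups.x + gl_WorkGroupID.x] = tile[0];
}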
That Valve technique sounds interesting; I don't like the idea of a bunch of passes just to compute luminance. It makes me nervous :P

Yeah, DX11 has compute shaders, which are going to be amazing for all kinds of GPGPU work, which of course includes computing scene luminance. One YouTube demo I saw of DX11 compute shaders showed them computing the height field of water waves like Crysis did on the CPU, except the compute shader did it over a 512x512 height field instead of the 64x64 one Crysis used, I think.

Edit: I also believe the moderator of the DX forum wrote a really nice dev journal post in May about using compute shaders to calculate the standard deviation of a terrain height field for LOD.

Sorry if I'm getting kind of off track, but I get really excited about new graphics tech :D
"WARNING: Excessive exposure to politicians and other bureaucrats has been linked to aggressive behavior." - Henk Hopla
This may sound stupid, but actually... what prevents you from generating a histogram by doing a glCopyBufferSubData into a vertex buffer object, followed by a call to glDrawElements (on width*height vertices, into a render target of size desired_levels*1), with additive blending enabled, the constant color set to 1.0/255.0 (or 1.0/65535.0 if you use a 16-bit render target), and a vertex shader similar to this:

void main()
{
    // Scatter each pixel into a histogram bucket: its luminance becomes
    // the clip-space x coordinate, so the point lands in the matching
    // column of the 1-pixel-tall histogram render target.
    gl_Position = vec4(dot(gl_Vertex, vec4(0.3, 0.59, 0.11, 0.0)), 0.5, 0.5, 1.0);
}

I'm not sure how well additively blending 1-2 million points performs, but then again, why not -- all the other algorithms take one pass per histogram level, which is not exactly cheap either (though admittedly, you can spread that over several frames).
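One detail the sketch above glosses over: gl_Position.x must land in the clip-space range [-1, 1], so HDR luminance has to be remapped (and clamped) first, or everything bright ends up off-screen. A hedged variant, with "maxLum" as an assumed uniform giving the histogram's upper bound:

uniform float maxLum;  // assumed: luminance mapped to the last bucket

void main()
{
    float lum = dot(gl_Vertex, vec4(0.3, 0.59, 0.11, 0.0));
    // Map [0, maxLum] to clip space [-1, 1]; the 0.999 keeps the
    // brightest values inside the last pixel instead of on its edge.
    float x = clamp(lum / maxLum, 0.0, 0.999) * 2.0 - 1.0;
    gl_Position = vec4(x, 0.0, 0.0, 1.0);
}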
Quote:Original post by samoth
This may sound stupid, but actually... what prevents you from generating a histogram by doing a glCopyBufferSubData into a vertex buffer object, followed by a call to glDrawElements (on width*height vertices, into a render target of size desired_levels*1), with additive blending enabled, the constant color set to 1.0/255.0 (or 1.0/65535.0 if you use a 16-bit render target), and a vertex shader similar to this:

void main()
{
    // Scatter each pixel into a histogram bucket: its luminance becomes
    // the clip-space x coordinate, so the point lands in the matching
    // column of the 1-pixel-tall histogram render target.
    gl_Position = vec4(dot(gl_Vertex, vec4(0.3, 0.59, 0.11, 0.0)), 0.5, 0.5, 1.0);
}

I'm not sure how well additively blending 1-2 million points performs, but then again, why not -- all the other algorithms take one pass per histogram level, which is not exactly cheap either (though admittedly, you can spread that over several frames).


It would work, but to do that you need to synchronize the CPU and GPU. For a drawing operation, the CPU sends commands to the GPU, and the GPU executes the next command in the queue after finishing the current one. If you read a result back from the GPU to the CPU, the GPU needs to finish all of the older commands before replying, so the CPU must wait for the GPU.

Also, something like "for (each pixel on a texture)" would cause too much work on the CPU.
taytay
Quote:Original post by shultays
It would work, but to do that you need to synchronize the CPU and GPU. For a drawing operation, the CPU sends commands to the GPU, and the GPU executes the next command in the queue after finishing the current one.
Synchronisation isn't an issue, and it is the driver's responsibility anyway. For the glCopyBufferSubData (or glReadPixels, if CopyBuffer is not available), the driver can postpone the sync until you actually use the buffer. Unless you force it (e.g. by mapping the buffer), it won't have an issue there. Also, the drawing will in all likelihood be done anyway by the time you're reading the buffer, since you normally do that kind of thing with the previous frame's data.
Besides, both the scale-down technique and Simon Green's GDC 2005 occlusion query technique (now known as "Valve technique") need driver synchronisation, and in practice it's not a problem at all.
The data never leaves the GPU; I'm not sure where the CPU comes into play here.

Quote:Original post by shultays
Also, something like "for (each pixel on a texture)" would cause too much work on the CPU.
How do you think all postprocess effects or all GPGPU algorithms work? The CPU is not involved at all in this. Your "for(each pixel)" runs in a GPU shader (usually a pixel shader, but in this case in a vertex shader).
The CPU is not involved if you draw a bunch of vertices that are in GPU memory (it would not even be involved if the data was in main memory [assuming there's no "weird" vertex format that the driver has to convert first], although in that case it might block if the DMA transfer had not finished before the draw call starts, but this is a totally different story).

I might agree if you said that the massive additive blending is a problem, but CPU load and synchronisation are certainly not the issue.

On the other hand, blending might indeed not be such a terrible problem after all. There are approximately width*height additive blended writes in a single pass (about width*height/N of them landing in each bucket of an N-level histogram). You have width*height*N non-blended stencil writes for Green's algorithm (plus branch incoherence from texkill), and the scale-down algorithm's non-blended color writes are spread over log(max(width,height)) passes, which are all 100% dependent and require new buffer binds. Thus, neither of the alternatives can be called particularly cheap, so blending might actually be faster. This is hard to tell without having tried.
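To put rough numbers on that (back-of-the-envelope arithmetic, assuming a 1024x1024 buffer and N = 64 buckets): the blending approach scatters about 1M additively blended points in a single pass; Green's technique rasterizes the full screen once per bucket, i.e. roughly 67M stencil-tested fragments across 64 passes; and a 2x2 scale-down chain needs 10 fully dependent passes (512x512 down to 1x1), each with its own render-target bind. Which of those wins on real hardware is exactly the kind of thing you'd have to measure.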

I might also agree if you said that accuracy is a problem. Maybe one would need to use an fp16 or fp32 render target to avoid saturation. Since the histogram render target would be very small (maybe something like 1x32), that shouldn't be a problem, though.
But then again, saturation might not be much of a problem anyway, who knows. For example, if the topmost few values in the histogram are saturated, it's a strong hint that the tone mapping is too bright, so even if the values are not all exact, it might just work. This is something one would have to try, I guess.
Quote:Original post by mikfig
That Valve technique sounds interesting; I don't like the idea of a bunch of passes just to compute luminance. It makes me nervous :P


If you're actually worried about it, profile it. I think you'll find that in practice, it won't take a large portion of your frame time.

Valve does things the way they do because they don't have an HDR source buffer to work off of... they directly apply the tone mapping in the shader.
