fastest way to push light data to the gpu

Hi,

As many of you might know from the hundred or so posts I've made about deferred rendering, I am working on, well, deferred rendering. I am nearly done with the first pass (I just need to figure out how to pack two 16-bit floats into a single 16-bit float so I can drop it down to an easy-to-manage 128 bits per pixel, or two MRTs). But now the second pass comes through, and it has left me wondering: what is the best way to pass in light data?

Would passing in a struct (by the way, how DO I pass structs to the GPU?) holding diffuse (vec3), specular (vec3), position (vec3), type (int) and direction (vec3) be faster than simply storing and passing in those values through uniforms?

Also, would it be better to use a single light per pass with a clipping plane, or could I do multiple lights at a time with a stencil (though this might take another shader to generate for multiple lights)?

PS: While reading a presentation on this (Nvidia's 6800 Leagues Deferred Shading), I found that Nvidia has a fast normalize function that works on 'half' values. According to them, the line
half3 n = normalize( tex2D( normalMap, coords).xyz );
will compile down to a single operation, 'nrmh'. Is there any hope of getting this optimization with GLSL? And how can I (if it can be done) make it work with other normalize calls? Like any good coder I want my app to be lean. :P

There is no way to pass a struct in a single call.
You can declare a struct uniform in the shader, but you have to update each member separately with calls to glUniformXX.
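
For anyone wanting to cut down the number of glUniform calls, one common workaround is to flatten the light fields into an array of vec4s on the CPU and upload the whole array at once. The following is only a sketch under assumptions: the Light struct mirrors the fields from the original question, while the name packLights, the four-vec4-per-light layout, and the choice to store the type in a .w component are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Light description using the fields listed in the original question.
struct Light {
    float diffuse[3];
    float specular[3];
    float position[3];
    float direction[3];
    int   type;  // hypothetical encoding, e.g. 0 = directional, 1 = point
};

// Number of vec4 slots one light occupies in the packed array (assumption).
constexpr std::size_t kVec4PerLight = 4;

// Pack lights into a flat float array laid out as consecutive vec4s:
//   [0] diffuse.rgb,   type stored as a float in .w
//   [1] specular.rgb,  .w unused
//   [2] position.xyz,  .w unused
//   [3] direction.xyz, .w unused
std::vector<float> packLights(const std::vector<Light>& lights) {
    std::vector<float> packed;
    packed.reserve(lights.size() * kVec4PerLight * 4);
    for (const Light& l : lights) {
        const float rows[kVec4PerLight][4] = {
            {l.diffuse[0],   l.diffuse[1],   l.diffuse[2],   static_cast<float>(l.type)},
            {l.specular[0],  l.specular[1],  l.specular[2],  0.0f},
            {l.position[0],  l.position[1],  l.position[2],  0.0f},
            {l.direction[0], l.direction[1], l.direction[2], 0.0f},
        };
        for (const auto& row : rows) {
            packed.insert(packed.end(), row, row + 4);
        }
    }
    assert(packed.size() == lights.size() * kVec4PerLight * 4);
    return packed;
}
```

The result could then be handed to a single glUniform4fv(location, count, data) call, with count equal to the number of vec4s, against a shader-side uniform vec4 array. Whether that actually beats per-member updates depends on the driver, so it is worth measuring.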

half3 n = normalize( tex2D( normalMap, coords).xyz );
If you ignore the texture fetch going on in there....let's convert that to
half3 n = normalize(something);
Yes, I believe Nvidia said their GF6 can do it in a single cycle.

Do they say
vec3 n = normalize(something);
can be done in a single cycle?

Quote:
getting this optimization with GLSL

It should be done.

Quote:
Original post by V-man
Do they say
vec3 n = normalize(something);
can be done in a single cycle?


I'd be surprised. It may be that it takes a cycle to issue but you get the result a little later (and incur a stall if the result is used too soon).

Quote:
Original post by V-man
Do they say
vec3 n = normalize(something);
can be done in a single cycle?


Yes and no. They use half3 (which, as I understand it, would make an 8-bit float into a 4-bit float). When doing the normalize call with a float3 (which I understand to be the Cg equivalent of vec3), they state that it is slower and will not compile down to the single call.

There is no such thing as 8 bit float. vec3 and float3 are 3 component 32 bit floats. half3 is 16 bit float, 3 components.

http://www.opengl.org/registry/specs/ARB/half_float_pixel.txt
describes the half float
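
To make the 16-bit layout concrete, here is a small sketch that decodes a half-float bit pattern on the CPU, assuming the usual 1 sign bit, 5 exponent bits (bias 15) and 10 mantissa bits described in that spec. The function name halfToFloat is made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Decode a 16-bit half-float bit pattern into a 32-bit float.
// Layout assumed: bit 15 = sign, bits 14..10 = exponent (bias 15),
// bits 9..0 = mantissa.
float halfToFloat(std::uint16_t bits) {
    const int sign     = (bits >> 15) & 0x1;
    const int exponent = (bits >> 10) & 0x1F;
    const int mantissa = bits & 0x3FF;

    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24, no implicit leading 1.
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 31) {
        // All exponent bits set: infinity or NaN.
        magnitude = (mantissa == 0)
                        ? std::numeric_limits<float>::infinity()
                        : std::numeric_limits<float>::quiet_NaN();
    } else {
        // Normal number: (1 + mantissa / 1024) * 2^(exponent - 15).
        magnitude = std::ldexp(1.0f + static_cast<float>(mantissa) / 1024.0f,
                               exponent - 15);
    }
    return sign ? -magnitude : magnitude;
}
```

For instance, 0x3C00 decodes to 1.0 and 0x7BFF to 65504, the largest finite half value. That is plenty of range for normalized vectors, which is why normals are a natural fit for half3.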

Anyway, go ahead and use half, half2, half3, half4 with GLSL. Just don't define a version number at the top of your shader like this
#version 110

otherwise the compiler thinks you want the basic GLSL support and so half is not legal.
So if you don't add the version, it just gives warnings in the log.

Quote:
Original post by V-man
There is no such thing as 8 bit float. vec3 and float3 are 3 component 32 bit floats. half3 is 16 bit float, 3 components.

http://www.opengl.org/registry/specs/ARB/half_float_pixel.txt
describes the half float

Anyway, go ahead and use half, half2, half3, half4 with GLSL. Just don't define a version number at the top of your shader like this
#version 110

otherwise the compiler thinks you want the basic GLSL support and so half is not legal.
So if you don't add the version, it just gives warnings in the log.


From reading about Nvidia's GLSL (Cg) compiler, that will only work on Nvidia, and I believe they are becoming more strict about it. It is, apparently, because Nvidia takes some liberties during compilation if you do not define a version number, and adds support for their Cg library.

For those googling who still want to do this, you can do something like
#ifndef __GLSL_CG_DATA_TYPES
# define half2 vec2
# define half3 vec3
# define half4 vec4
#endif

While this would allow the code to be run on ATI and NVIDIA cards, I would rather not do this..
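
If you do go this route, the fallback can be injected from the application instead of being pasted into every shader. A minimal sketch, assuming the shader source is held in a std::string: the helper name withHalfFallback is hypothetical, and the extra scalar mapping (half to float) is an addition that is not in the snippet above. It places the block after a #version line if one is present, since GLSL requires #version to come before any other directive.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Fallback block from the post, plus a scalar mapping added for this sketch.
static const char* const kHalfFallback =
    "#ifndef __GLSL_CG_DATA_TYPES\n"
    "# define half float\n"   // addition, not in the original snippet
    "# define half2 vec2\n"
    "# define half3 vec3\n"
    "# define half4 vec4\n"
    "#endif\n";

// Return a copy of the shader source with the fallback block inserted.
// If the source contains a #version line, the block goes right after it;
// otherwise it goes at the very top.
std::string withHalfFallback(const std::string& source) {
    std::string result = source;
    std::size_t insertAt = 0;
    const std::size_t versionPos = result.find("#version");
    if (versionPos != std::string::npos) {
        const std::size_t lineEnd = result.find('\n', versionPos);
        if (lineEnd == std::string::npos) {
            // #version is the last line and has no newline: add one.
            result += '\n';
            insertAt = result.size();
        } else {
            insertAt = lineEnd + 1;
        }
    }
    assert(insertAt <= result.size());
    result.insert(insertAt, kHalfFallback);
    return result;
}
```

glShaderSource also accepts several strings at once, so when the shader has no #version line the fallback could simply be passed as its own string ahead of the shader body instead of being concatenated.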

Also, the info is from download.nvidia.com/developer/presentations/GDC_2004/gdc_2004_NV_GLSL.pdf

GLSL1.3 has support for half.
GF6x,7x,8x indeed do half3 normalization in one cycle, or well at least much faster than with float3.

Almost everyone makes different runtime paths for different generations and vendors of video cards. Otherwise you dumb down and slow down everything to a common denominator.

The true and only fast way to upload uniforms in bulk is glProgramLocalParameters4fvEXT. It requires that you ignore GLSL and use ARB asm (compiled via cgc.exe from Cg or nVidia-GLSL). ARB asm is simplistic and limited to SM2.0. NV asm is extremely powerful and supports everything NV GPUs have, but there's no ATi asm to complement that.

On RadeonHD and GF8x, there are bindable uniforms. (I'm not sure whether ATi expose them yet, though). They are simply buffer-objects (use same API as VBOs and PBOs), that you write data to, and bind to a shader (much like using glUniform1i).
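
Since a bindable uniform is backed by an ordinary buffer object, updating a light comes down to writing bytes at the right offsets and uploading the buffer in one go. The sketch below stages one light into a CPU-side byte array. The struct and function names are hypothetical, and so are the offsets: with EXT_bindable_uniform the layout is implementation-defined, so real offsets would have to be queried from the driver (glGetUniformOffsetEXT) rather than hard-coded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Light description using the fields listed in the original question.
struct Light {
    float diffuse[3];
    float specular[3];
    float position[3];
    float direction[3];
    int   type;
};

// Byte offset of each member inside the buffer. Hypothetical: in a real
// program these would come from the driver, not from constants.
struct LightOffsets {
    std::size_t diffuse;
    std::size_t specular;
    std::size_t position;
    std::size_t direction;
    std::size_t type;
};

// Copy one light into a zero-filled staging buffer of bufferSize bytes.
std::vector<unsigned char> stageLight(const Light& light,
                                      const LightOffsets& at,
                                      std::size_t bufferSize) {
    std::vector<unsigned char> bytes(bufferSize, 0);
    auto put = [&bytes](std::size_t offset, const void* src, std::size_t n) {
        assert(offset + n <= bytes.size());  // guard against bad offsets
        std::memcpy(bytes.data() + offset, src, n);
    };
    put(at.diffuse,   light.diffuse,   sizeof light.diffuse);
    put(at.specular,  light.specular,  sizeof light.specular);
    put(at.position,  light.position,  sizeof light.position);
    put(at.direction, light.direction, sizeof light.direction);
    put(at.type,      &light.type,     sizeof light.type);
    return bytes;
}
```

The staged bytes would then go to the buffer object with the same glBufferData or glBufferSubData calls used for VBOs.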

There was a recent discussion on the opengl forum, that glUniform is too slow in nV drivers; so maybe the best thing to do is provide NV-asm version of your shader for NV, and standard GLSL for ATi/Intel.

Quote:
Original post by idinev
GLSL1.3 has support for half.
GF6x,7x,8x indeed do half3 normalization in one cycle, or well at least much faster than with float3.

Almost everyone makes different runtime paths for different generations and vendors of video cards. Otherwise you dumb down and slow down everything to a common denominator.

The true and only fast way to upload uniforms in bulk is glProgramLocalParameters4fvEXT. It requires that you ignore GLSL and use ARB asm (compiled via cgc.exe from Cg or nVidia-GLSL). ARB asm is simplistic and limited to SM2.0. NV asm is extremely powerful and supports everything NV GPUs have, but there's no ATi asm to complement that.

On RadeonHD and GF8x, there are bindable uniforms. (I'm not sure whether ATi expose them yet, though). They are simply buffer-objects (use same API as VBOs and PBOs), that you write data to, and bind to a shader (much like using glUniform1i).

There was a recent discussion on the opengl forum, that glUniform is too slow in nV drivers; so maybe the best thing to do is provide NV-asm version of your shader for NV, and standard GLSL for ATi/Intel.


Wow, thanks a lot for that information. Thinking about it, a fast way (though probably not the way to go, given the bandwidth) seems to be a light pass that renders the light data to an MRT, and then just throwing that into a third shader that uses those textures to calculate the lighting. The only issue with that is I believe my card only supports 4 sampler2Ds :/ I'll need to check. It won't matter for the final product though, because I'm going to make this an OpenGL 3.0 app with 1.3 shaders. I just have to upgrade my computer so I can stick a GF8 or GF9 in here.

I will check out the asm though. I just don't want to have to do a whole lot of extra work making code for a card (ATI) that I don't have. I tend to forget I do such things, so if a bug on ATI cards were to pop up a few months later, I would be like "wtf? It's a driver issue." lol

