GLSL: What code should we move (precalculate) from fragment to vertex shaders?

In the past week I've been reading about GLSL and I've seen many examples of code around the web. I'm under the impression that for a triangle (for example) the vertex shader runs 3 times and the fragment shader runs more than 3 times, depending on the triangle's size on the screen. Is that correct? I'm asking this because, if the above is correct, it would be more efficient to precalculate as much as we can in the vertex shader and then pass the data to the fragment shader using varying variables. Am I right, or am I missing something here? Thanks in advance! [Edited by - Godlike on June 25, 2008 12:08:18 PM]

Quote:
Original post by Godlike
In the past week I've been reading about GLSL and I've seen many examples of code around the web. I'm under the impression that for a triangle (for example) the fragment shader runs 3 times and the fragment shader more than 3 times, depending on the triangle's size on the screen. Is that correct?

Did you mean the vertex shader? The vertex shader will run 3 times, and the fragment shader will run for as many fragments as "may" end up on the screen (I say "may" because even fragments that are occluded by some other object will still be processed).
Quote:

I'm asking this because, if the above is correct, it would be more efficient to precalculate as much as we can in the vertex shader and then pass the data to the fragment shader using varying variables.

Am I right, or am I missing something here?
Thanks in advance!

Yup, you're right. The catch is that you can only interpolate things linearly from the vertex shader down to the pixel shader, so not everything is suitable for calculating inside the vertex shader. For example, if you had a gigantic quad (only 4 vertices) that you wanted to light, then calculating the eye and light vectors inside the vertex shader would be a big no-no, because the linearly interpolated directions would be badly wrong in the middle of the quad.
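
As a rough sketch of the per-fragment alternative (assuming a single point light in gl_LightSource[0] whose position was specified in eye space, and with made-up varying names), you would pass the raw eye-space position down and build the directions per fragment:

// Vertex shader: pass the raw eye-space position; do not normalize here
varying vec3 v_pos_eye;

void main()
{
    v_pos_eye   = vec3(gl_ModelViewMatrix * gl_Vertex);
    gl_Position = ftransform();
}

// Fragment shader: build the view and light directions per fragment
varying vec3 v_pos_eye;

void main()
{
    vec3 eye_dir   = normalize(-v_pos_eye);
    vec3 light_dir = normalize(gl_LightSource[0].position.xyz - v_pos_eye);
    // ... the actual lighting equation using eye_dir and light_dir goes here ...
    gl_FragColor = vec4(1.0);  // placeholder output
}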

First of all, thanks for the reply.

You talked about interpolation in the fragment shader. Does the interpolation happen when we use normalize()? For example, I want to calculate the eye vector. In the vertex shader I write:

vec3 vert_pos = (gl_ModelViewMatrix * gl_Vertex).xyz;  // eye-space position
eye = -vert_pos;                                        // 'eye' is a varying vec3

and in the fragment shader:

vec3 eye_n = normalize( eye );

Is this the optimal way? Does normalize() in the fragment shader interpolate the vector, or is it a waste of GPU cycles?

Quote:
Original post by Godlike
First of all, thanks for the reply.

You talked about interpolation in the fragment shader. Does the interpolation happen when we use normalize()? For example, I want to calculate the eye vector. In the vertex shader I write:

vec3 vert_pos = (gl_ModelViewMatrix * gl_Vertex).xyz;  // eye-space position
eye = -vert_pos;                                        // 'eye' is a varying vec3

and in the fragment shader:

vec3 eye_n = normalize( eye );

Is this the optimal way? Does normalize() in the fragment shader interpolate the vector, or is it a waste of GPU cycles?


No, what I meant is that when you pass some data down from the vertex shader to the fragment shader, it gets linearly interpolated across the triangle. For example, if each vertex of a triangle had a different colour, then the resulting fragment colour would be a blend of the 3 colours, linearly interpolated according to how close the fragment is to each of its "parent" vertices. The same goes for texture coordinates, etc.
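
As a tiny illustration (the varying name is made up), a per-vertex colour passed down as a varying arrives at each fragment already blended:

// Vertex shader: one colour per vertex
varying vec4 v_colour;

void main()
{
    v_colour    = gl_Color;      // e.g. red, green and blue on the 3 vertices
    gl_Position = ftransform();
}

// Fragment shader: v_colour is the linear blend of the 3 vertex colours
varying vec4 v_colour;

void main()
{
    gl_FragColor = v_colour;     // smooth gradient across the triangle
}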

Your above example is essentially exact: you pass the eye-space position/direction unnormalized, let it get interpolated down to the fragments, and manually normalize it at the fragment level. If you were to normalize it at the vertex level and simply use that without normalizing it again in the fragment shader, it would not be as exact. That can still be fine when the camera is not close to the surface, since it looks "good enough".

Do not forget early-Z culling! It skips the computation of fragments that won't be visible, and it speeds up things like occlusion culling.

When your pixel shaders are not trivial, the decision comes down to simple arithmetic:

const int numVertexProcessors   = 8;
const int numFragmentProcessors = 24;

int NumVerticesInScene = 3000000;
int ScreenWidth = 1280, ScreenHeight = 720;

int vertsCycles = 23;  // GPU cycles your vertex shaders take on average
int fragsCycles = 107; // same, for your fragment shaders

int vertsPerProc = NumVerticesInScene / numVertexProcessors;
int fragsPerProc = ScreenWidth * ScreenHeight / numFragmentProcessors;

if (vertsPerProc * vertsCycles < fragsPerProc * fragsCycles) {
    // vertex work is the cheaper side: make your vertex shaders compute more
} else {
    // fragment work is the cheaper side: keep your vertex shaders as slim as possible
}

As the tendency in newer games is to have more vertices on screen than pixels, while fragment shaders become very intensive in order to compute lighting, and all decent GPUs have early-Z, it's best to first draw the scene depth using a simplified vertex shader, a trivial "gl_FragColor = vec4(0.0)" fragment shader, and colour writes disabled. Then draw the scene again, and only the visible fragments will be processed. I'm not sure whether, if you use dFdx/dFdy in your shader, a whole 2x2 block of processors will be forced to compute the other nearby fragments even if they would be culled (but even in the worst case it's usually not too bad, as in most real scenes at least half of the block isn't z-culled).
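
A minimal sketch of such a depth-only pre-pass (the API-side state changes are only indicated in comments, and this is just one way to set it up):

// Pre-pass: draw depth only.
// API side: glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE), with the depth
// test and depth writes enabled.

// pre-pass vertex shader: position only, no lighting and no texture coordinates
void main()
{
    gl_Position = ftransform();
}

// pre-pass fragment shader: the colour is never written anyway
void main()
{
    gl_FragColor = vec4(0.0);
}

// Second pass: re-enable colour writes, set glDepthFunc(GL_LEQUAL), and draw the
// scene with the full shaders; early-Z then rejects the occluded fragments.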

P.S. Just look at the "occlusion engine" demo at http://www.delphi3d.net/listfiles.php?category=5 to see early-Z culling in action (though it doesn't use complex fragment shaders, so a depth pre-pass is unnecessary and not used there).

Great, thanks for the replies. I have a final question. I have a program that implements Phong shading. Somewhere I have to compute the colour terms:

vec4 diffuse = gl_FrontMaterial.diffuse * gl_LightSource[0].diffuse;
vec4 ambient = gl_FrontMaterial.ambient * gl_LightSource[0].ambient;
vec4 specular = gl_FrontMaterial.specular * gl_LightSource[0].specular;

There are 2 choices: either calculate these colour values in the vertex shader and let them be interpolated for the fragments (even though they don't need interpolation), OR calculate them in the fragment shader. So: X interpolations, or X component-wise vec4 multiplications?


@idinev: I'd never heard of early-Z; it's worth taking a look. I had heard that Croteam (Serious Sam) had a technique for culling using pixel shaders. Is that it?

Quote:
Original post by Godlike
Great, thanks for the replies. I have a final question. I have a program that implements Phong shading. Somewhere I have to compute the colour terms:

vec4 diffuse = gl_FrontMaterial.diffuse * gl_LightSource[0].diffuse;
vec4 ambient = gl_FrontMaterial.ambient * gl_LightSource[0].ambient;
vec4 specular = gl_FrontMaterial.specular * gl_LightSource[0].specular;

These terms are constant for the whole draw call, so there's no need to calculate them in the vertex shader; just calculate them directly in the pixel shader.
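
For example, a bare-bones fragment shader along those lines might look like this (the varying names are made up, and the specular term is left out for brevity):

varying vec3 v_normal;    // eye-space normal from the vertex shader
varying vec3 v_pos_eye;   // eye-space position from the vertex shader

void main()
{
    vec4 ambient = gl_FrontMaterial.ambient * gl_LightSource[0].ambient;
    vec4 diffuse = gl_FrontMaterial.diffuse * gl_LightSource[0].diffuse;

    vec3 n = normalize(v_normal);  // re-normalize the interpolated normal
    vec3 l = normalize(gl_LightSource[0].position.xyz - v_pos_eye);

    gl_FragColor = ambient + diffuse * max(dot(n, l), 0.0);
}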

Quote:

@idinev: I'd never heard of early-Z; it's worth taking a look. I had heard that Croteam (Serious Sam) had a technique for culling using pixel shaders. Is that it?

Not sure about Serious Sam. All this technique does is render the basic geometry (depth only) first, then re-render the scene with colours and let the hardware reject the expensive fragment processing based on the occluding depths laid down in that early-Z pass.

If you do the lighting calculations inside the pixel shader, do not forget to normalize the interpolated normal vector :) (the normal that the vertex shader passes down as a varying after multiplying gl_Normal by the normal matrix, gl_NormalMatrix). On nVidia cards, if you put the normal in a half3 variable, the normalization will take just 1 cycle ;)

AMD's version of early-Z is called HyperZ. You do not need to do anything to enable or use early-Z culling; it's an internal "hidden" feature, just like the L1 cache in a CPU - you can't control it, but it's there, helping you out.
Actually, there is one way to temporarily disable early-Z culling: writing to gl_FragDepth in your fragment shader. (In other words: don't do that!)
