
Topics I've Started

Measuring frame time with AFR enabled

27 March 2015 - 07:41 AM

I am currently doing some measurements of a scene with about 80k vertices that implements soft shadow mapping using a PCF filter.

When I'm measuring the frame time with SLI enabled (using "force alternate frame rendering 1"), both GPUs are utilized but the performance drops. I can understand that behavior at a low resolution where I achieve 1000+ FPS, due to the GPU synchronization overhead. However, I don't understand why the performance is still lower when I increase the rendering load and the resolution. I have already taken a look at the NVIDIA SLI optimization guide here: http://http.download.nvidia.com/developer/presentations/2005/GDC/OpenGL_Day/OpenGL_SLI.pdf but as far as I can tell, none of the points from the presentation should be an issue here.

Can anyone tell me what I may be missing? Do I perhaps need an SLI profile for my application?

I am using two GTX 970s in SLI with the latest drivers installed.

PS: I am currently trying to look into this using NVIDIA Nsight; however, it does not record any GPU frames when I enable force alternate frame rendering 1.

PS2: In the meantime I figured out that glGetQueryObjectui64v takes more time when SLI is enabled. I am now double-buffering my query results, which likely gives me more accurate numbers (http://www.lighthouse3d.com/tutorials/opengl-short-tutorials/opengl-timer-query/). I noticed this because the FPS increases nicely, but the frame time reported by glGetQueryObjectui64v does not.



In the meantime I am assuming that the problem is just related to the way I am measuring things:

So what I am actually interested in is how to correctly measure frame time with AFR enabled, without stalling the GPU/CPU, using the OpenGL timer_query object.
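Since glGetQueryObjectui64v blocks until the result is available, the usual fix is exactly the double-buffering from the lighthouse3d tutorial: read back the query that was issued a frame (or more) earlier. Below is a minimal sketch of just the bookkeeping, assuming GL_TIME_ELAPSED queries created with glGenQueries; the GL calls are left as comments since only the index rotation is shown:

```cpp
// Double-buffered timer-query bookkeeping (sketch). Frame N writes into
// slot N % kLatency and reads back slot (N + 1) % kLatency, i.e. the query
// issued kLatency - 1 frames earlier, so glGetQueryObjectui64v never waits
// on a query the GPU is still executing. The first kLatency - 1 reads hit
// a slot that was never written and should be skipped.
struct TimerQueryRing {
    static constexpr int kLatency = 2; // slots; with AFR, more may be needed
    // GLuint queries[kLatency];       // from glGenQueries(kLatency, queries)
    long frame = 0;

    int writeIndex() const { return static_cast<int>(frame % kLatency); }
    int readIndex()  const { return static_cast<int>((frame + 1) % kLatency); }

    void endFrame() {
        // glEndQuery(GL_TIME_ELAPSED) was issued on queries[writeIndex()];
        // glGetQueryObjectui64v(queries[readIndex()], GL_QUERY_RESULT, &ns);
        ++frame;
    }
};
```

With AFR it may take more than two slots before readbacks stop serializing the GPUs; kLatency is the knob to experiment with.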

Per-pixel displacement mapping GLSL

02 May 2014 - 07:08 AM

I'm trying to implement a per-pixel displacement shader in GLSL. I read through several papers and "tutorials" I found, and ended up trying to implement the approach NVIDIA used in their Cascades demo (http://www.slideshare.net/icastano/cascades-demo-secrets), starting at slide 82.

At the moment I am completely stuck on the following problem: when I am far away, the displacement seems to work. But the closer I move to the surface, the more the texture gets bent along the x-axis, and in general it looks like there is a slight bend in one direction. I added some screenshots below to illustrate the problem.


EDIT: I added a video http://paxi.at/random/sfg_com.avi


Attached File: 1.jpg (1.02MB)
Attached File: 2.jpg (188.9KB)
Attached File: 3.jpg (152.09KB)
Attached File: 4.jpg (132.25KB)

Well, I have tried lots of things already and I am starting to get a bit frustrated, as I am running out of ideas.

I added my full VS and FS code:


#version 400

layout(location = 0) in vec3 IN_VS_Position;
layout(location = 1) in vec3 IN_VS_Normal;
layout(location = 2) in vec2 IN_VS_Texcoord;
layout(location = 3) in vec3 IN_VS_Tangent;
layout(location = 4) in vec3 IN_VS_BiTangent;

uniform vec3 uLightPos;
uniform vec3 uCameraDirection;
uniform mat4 uViewProjection;
uniform mat4 uModel;
uniform mat4 uView;
uniform mat3 uNormalMatrix;

out vec2 IN_FS_Texcoord;
out vec3 IN_FS_CameraDir_Tangent;
out vec3 IN_FS_LightDir_Tangent;

void main( void )
{
   IN_FS_Texcoord = IN_VS_Texcoord;

   vec4 posObject     = uModel * vec4(IN_VS_Position, 1.0);
   vec3 normalObject  = (uModel * vec4(IN_VS_Normal, 0.0)).xyz;
   vec3 tangentObject = (uModel * vec4(IN_VS_Tangent, 0.0)).xyz;
   //vec3 binormalObject = (uModel * vec4(IN_VS_BiTangent, 0.0)).xyz;
   vec3 binormalObject = normalize(cross(tangentObject, normalObject));

   // uCameraDirection is actually the camera position, just badly named
   vec3 fvViewDirection  = normalize( uCameraDirection - posObject.xyz );
   vec3 fvLightDirection = normalize( uLightPos.xyz - posObject.xyz );

   IN_FS_CameraDir_Tangent.x = dot( tangentObject, fvViewDirection );
   IN_FS_CameraDir_Tangent.y = dot( binormalObject, fvViewDirection );
   IN_FS_CameraDir_Tangent.z = dot( normalObject, fvViewDirection );

   IN_FS_LightDir_Tangent.x = dot( tangentObject, fvLightDirection );
   IN_FS_LightDir_Tangent.y = dot( binormalObject, fvLightDirection );
   IN_FS_LightDir_Tangent.z = dot( normalObject, fvLightDirection );

   gl_Position = (uViewProjection * uModel) * vec4(IN_VS_Position, 1.0);
}

The VS just builds the TBN matrix from the incoming normal, tangent and binormal in world space, calculates the light and eye directions in world space, and finally transforms the light and eye directions into tangent space.
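For an orthonormal TBN basis, those three dot products are exactly a multiplication by the transpose (inverse) of the [T B N] matrix. A plain C++ stand-in for that part of the shader, using std::array as a hypothetical vec3 type, shows the transform in isolation:

```cpp
#include <array>

// World-to-tangent-space transform as in the vertex shader above: with an
// orthonormal TBN basis, multiplying by the transpose of [T B N] reduces
// to three dot products. Vec3/dot3/toTangentSpace are illustrative names,
// not from the original code.
using Vec3 = std::array<float, 3>;

float dot3(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Equivalent of the three dot() lines that build IN_FS_CameraDir_Tangent.
Vec3 toTangentSpace(const Vec3& t, const Vec3& b, const Vec3& n, const Vec3& v) {
    return { dot3(t, v), dot3(b, v), dot3(n, v) };
}
```

If T, B and N are not orthonormal (e.g. after non-uniform scaling by uModel), this transpose shortcut is no longer the inverse, which is one classic source of direction-dependent bending.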


#version 400

// uniforms
uniform Light {
    vec4 fvDiffuse;
    vec4 fvAmbient;
    vec4 fvSpecular;
};

uniform Material {
    vec4 diffuse;
    vec4 ambient;
    vec4 specular;
    vec4 emissive;
    float fSpecularPower;
    float shininessStrength;
};

uniform sampler2D colorSampler;
uniform sampler2D normalMapSampler;
uniform sampler2D heightMapSampler;

in vec2 IN_FS_Texcoord;
in vec3 IN_FS_CameraDir_Tangent;
in vec3 IN_FS_LightDir_Tangent;

out vec4 color;

vec2 TraceRay(in float height, in vec2 coords, in vec3 dir, in float mipmap){

    vec2 NewCoords = coords;
    vec2 dUV = -dir.xy * height * 0.08;
    float SearchHeight = 1.0;
    float prev_hits = 0.0;
    float hit_h = 0.0;

    // linear search for the first intersection with the height field
    for(int i = 0; i < 10; i++){
        SearchHeight -= 0.1;
        NewCoords += dUV;
        float CurrentHeight = textureLod(heightMapSampler, NewCoords.xy, mipmap).r;
        float first_hit = clamp((CurrentHeight - SearchHeight - prev_hits) * 499999.0, 0.0, 1.0);
        hit_h += first_hit * SearchHeight;
        prev_hits += first_hit;
    }
    NewCoords = coords + dUV * (1.0 - hit_h) * 10.0 - dUV;

    // refinement search around the first hit
    vec2 Temp = NewCoords;
    SearchHeight = hit_h + 0.1;
    float Start = SearchHeight;
    dUV *= 0.2;
    prev_hits = 0.0;
    hit_h = 0.0;
    for(int i = 0; i < 5; i++){
        SearchHeight -= 0.02;
        NewCoords += dUV;
        float CurrentHeight = textureLod(heightMapSampler, NewCoords.xy, mipmap).r;
        float first_hit = clamp((CurrentHeight - SearchHeight - prev_hits) * 499999.0, 0.0, 1.0);
        hit_h += first_hit * SearchHeight;
        prev_hits += first_hit;
    }
    NewCoords = Temp + dUV * (Start - hit_h) * 50.0;

    return NewCoords;
}

void main( void )
{
   vec3  fvLightDirection = normalize( IN_FS_LightDir_Tangent );
   vec3  fvViewDirection  = normalize( IN_FS_CameraDir_Tangent );

   float mipmap = 0.0;

   vec2 NewCoord = TraceRay(0.1, IN_FS_Texcoord, fvViewDirection, mipmap);

   //vec2 ddx = dFdx(NewCoord);
   //vec2 ddy = dFdy(NewCoord);

   vec3 BumpMapNormal = textureLod(normalMapSampler, NewCoord.xy, mipmap).xyz;
   BumpMapNormal = normalize(2.0 * BumpMapNormal - vec3(1.0, 1.0, 1.0));

   vec3  fvNormal         = BumpMapNormal;
   float fNDotL           = dot( fvNormal, fvLightDirection );

   vec3  fvReflection     = normalize( ( ( 2.0 * fvNormal ) * fNDotL ) - fvLightDirection );
   float fRDotV           = max( 0.0, dot( fvReflection, fvViewDirection ) );

   vec4  fvBaseColor      = textureLod( colorSampler, NewCoord.xy, mipmap );

   vec4  fvTotalAmbient   = fvAmbient * fvBaseColor;
   vec4  fvTotalDiffuse   = fvDiffuse * fNDotL * fvBaseColor;
   vec4  fvTotalSpecular  = fvSpecular * pow( fRDotV, fSpecularPower );

   color = fvTotalAmbient + fvTotalDiffuse + fvTotalSpecular;
}


The FS implements the displacement technique in the TraceRay function, always using mipmap level 0. Most of the code is from the NVIDIA sample and another paper I found on the web, so I guess there cannot be much wrong in there. At the end it uses the modified UV coordinates to fetch the displaced normal from the normal map and the color from the color map.
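The linear search relies on a branchless latch: first_hit becomes 1.0 only on the first step where the sampled height rises above the search height, because prev_hits masks every later step. A CPU re-implementation of just that loop (heightAt is a hypothetical stand-in for the heightMapSampler lookup) makes the trick visible in isolation:

```cpp
#include <algorithm>
#include <functional>

// CPU sketch of the first (linear) search in TraceRay above. `firstHit` is
// 1.0 only on the first step where the sampled height exceeds the search
// height (prevHits >= 1 masks all later steps), so hitH latches the search
// height at the first intersection. heightAt stands in for the texture
// lookup; the 2D UV stepping is reduced to a scalar parameter t.
float firstHitHeight(std::function<float(float)> heightAt) {
    float t = 0.0f;            // parametric position along dUV
    float searchHeight = 1.0f;
    float prevHits = 0.0f;
    float hitH = 0.0f;
    for (int i = 0; i < 10; ++i) {
        searchHeight -= 0.1f;
        t += 1.0f;
        float currentHeight = heightAt(t);
        float firstHit = std::clamp(
            (currentHeight - searchHeight - prevHits) * 499999.0f, 0.0f, 1.0f);
        hitH += firstHit * searchHeight; // latches searchHeight at first hit
        prevHits += firstHit;            // masks every later potential hit
    }
    return hitH;
}
```

With a flat height field of 0.55, for example, the latch fires at the step where the search height drops to 0.5 and that height is returned.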

I am looking forward to your ideas. Thanks in advance!



Getting pixel format with wglChoosePixelFormatARB

10 February 2013 - 04:24 PM

I currently have the following code:




int pixelFormatIndex = 0;
int pixelCount = 0;

// specify the important attributes for the pixel format used by OpenGL
int standardAttribs[] = {
    WGL_SUPPORT_OPENGL_ARB, 1,             // must support OGL rendering
    WGL_DRAW_TO_WINDOW_ARB, 1,             // pf that can run a window
    WGL_RED_BITS_ARB, 8,
    WGL_GREEN_BITS_ARB, 8,
    WGL_BLUE_BITS_ARB, 8,
    WGL_ALPHA_BITS_ARB, 8,
    WGL_DEPTH_BITS_ARB, 16,                // 16 bits of depth precision for window
    //WGL_STENCIL_BITS_ARB, 8,
    WGL_DOUBLE_BUFFER_ARB, GL_TRUE,        // double buffered context
    WGL_PIXEL_TYPE_ARB, WGL_TYPE_RGBA_ARB, // pf should be RGBA type
    0};                                    // zero termination

// the driver returns the best format for the pixel attributes defined above;
// pass standardAttribs here -- an empty attribute list (e.g. an unfilled
// std::vector's data()) would let any pixel format match, accelerated or not
BOOL result = wglChoosePixelFormatARB(mDeviceContext, standardAttribs, NULL, 1, &pixelFormatIndex, (UINT*)&pixelCount);
ASSERT(result != false, "wglChoosePixelFormatARB() failed");
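A pitfall worth flagging: passing an empty std::vector's data() as the attribute list hands the driver zero constraints, so any pixel format (including an unaccelerated one) can match. If a vector is used to build the list, it has to be filled and zero-terminated before the call; a sketch of that pattern (buildAttribList is an illustrative helper, not part of WGL; the keys would be the WGL_*_ARB enums in real code):

```cpp
#include <initializer_list>
#include <utility>
#include <vector>

// Builds a zero-terminated key/value attribute list in a std::vector so
// that data() actually points at the pairs wglChoosePixelFormatARB expects.
// buildAttribList is a hypothetical helper for illustration only.
std::vector<int> buildAttribList(std::initializer_list<std::pair<int, int>> attribs) {
    std::vector<int> list;
    for (const auto& kv : attribs) {
        list.push_back(kv.first);   // attribute key, e.g. WGL_DEPTH_BITS_ARB
        list.push_back(kv.second);  // requested value, e.g. 16
    }
    list.push_back(0);              // zero terminator required by the extension
    return list;
}
```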

The problem is that when I use WGL_ACCELERATION_ARB, WGL_FULL_ACCELERATION_ARB, my screen just stays black, and when I don't use it, the device created is version 1.1.


I attached two screenshots, with and without the hardware-acceleration flag (info obtained via wglGetPixelFormatAttribivARB()):

Attached File: hwacc.png (1.87KB)
Attached File: nohwacc.png (2.51KB)


The format with hardware support has a 24-bit depth buffer, and from some experimenting I thought that might have something to do with the problem, although I don't know why.

I'm using the newest version of GLEW, which is successfully initialized beforehand.

My GPU is a GTX680.


I currently render with the following test code:





    glBegin(GL_TRIANGLES); // immediate mode: only available in a compatibility context
    glColor3f(1.0f, 0.0f, 0.0f); // red (glColor3f expects values in 0.0-1.0)
    glVertex3f(-1.0f, -0.5f, -5.0f);
    glVertex3f(1.0f, -0.5f, -5.0f);
    glVertex3f(0.0f, 0.5f, -5.0f);
    glEnd();

I have already wondered whether OpenGL 4.2 might cause problems with rendering this old way?


Thanks in advance



DirectX 9 and Multisampling

03 January 2013 - 07:45 AM

Hey guys,


I'm currently looking for a good way of changing the multisampling rate at runtime. I would be fine if I could switch between the modes listed here: http://www.nvidia.com/object/coverage-sampled-aa.html (I have a GTX 680).
I thought about creating multiple render targets with different multisampling modes/qualities and, depending on the desired AA mode, setting one of these as the render target and finally copying its contents to the backbuffer. I already started with this approach, but at the moment I am very unsure whether that is a good solution, or whether it can work at all.
I wanted to use

and then use one of

UpdateSurface(...), StretchRect(...), GetRenderTargetData(...)

to copy the contents from the render target into the backbuffer.


Now, after I spent some time with this approach: first, I thought that it is not really good to copy the content from the render target to the backbuffer every frame. And secondly, more importantly: all of these functions state in their remarks that the surface must not have a multisampling type, or it won't work (which would definitely "kill" this approach for me).


Another idea of mine was to just reset/recreate the device every time the user wants to change the AA mode, filling the new device parameters with the new multisampling type. But I guess this is a very bad solution.


Now, is it possible to get something like my first approach with different render targets working? Or if not, how would one do something like this in general? Unfortunately I haven't found any code samples on this. I'm using HLSL, by the way; maybe I can also control the multisampling rate within my shader. Performance in general is no issue for me; it's just a little tech demo.