Community Reputation

199 Neutral

About Paxi

  1. Thanks for your input! You may be talking about these slides: http://developer.download.nvidia.com/whitepapers/2011/SLI_Best_Practices_2011_Feb.pdf right? In the meantime I am quad buffering my queries (the code is not final yet, but it works for a hardcoded quad-buffered example):

     template<size_t QueryCount>
     class BufferedAsyncTimer {
     public:
         void init() {
             mFirst = 0; mSecond = 1; mThird = 2; mFourth = 3;
             glGenQueries(QueryCount, mQueries[mFirst]);
             glGenQueries(QueryCount, mQueries[mSecond]);
             glGenQueries(QueryCount, mQueries[mThird]);
             glGenQueries(QueryCount, mQueries[mFourth]);
         }
         void start() { glBeginQuery(GL_TIME_ELAPSED, mQueries[mFirst][0]); }
         void stop() { glEndQuery(GL_TIME_ELAPSED); }
         double getElapsedTimeMS() {
             GLuint64 result;
             glGetQueryObjectui64v(mQueries[mFourth][0], GL_QUERY_RESULT, &result);
             if (mFirst == 0) { mFirst = 3; mSecond = 2; mThird = 1; mFourth = 0; }
             else if (mFirst == 1) { mFirst = 0; mSecond = 3; mThird = 2; mFourth = 1; }
             else if (mFirst == 2) { mFirst = 1; mSecond = 0; mThird = 3; mFourth = 2; }
             else if (mFirst == 3) { mFirst = 2; mSecond = 1; mThird = 0; mFourth = 3; }
             // GL_TIME_ELAPSED results are in nanoseconds; convert to milliseconds
             return static_cast<double>(result) / 1000000.0;
         }
     private:
         GLuint mQueries[4][QueryCount];
         GLuint mFirst, mSecond, mThird, mFourth;
     };

     With AFR enabled, the measured frame time is no longer slower; it is exactly the same as with a single GPU. However, the FPS increases from ~600 (single) to ~930 even when receiving the results (so there should be no stalls anymore). I guess there is still a little thing I am missing...
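A plausible reading of these numbers, offered as an assumption rather than a confirmed diagnosis: a GL_TIME_ELAPSED query measures how long one GPU spends on one frame, and AFR does not shorten that. It lets two GPUs work on alternate frames at the same time, so frame time can stay flat while FPS rises. The sketch below only does that arithmetic; the function names are made up for illustration.

```cpp
#include <cassert>

// Per-frame time in milliseconds implied by a frame rate on ONE GPU.
double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Upper bound on frames per second if gpuCount GPUs each need frameMs per
// frame and render alternate frames with no synchronization cost at all.
double idealAfrFps(double frameMs, int gpuCount) {
    assert(frameMs > 0.0 && gpuCount >= 1);
    return gpuCount * 1000.0 / frameMs;
}

// Ratio of observed multi-GPU throughput to single-GPU throughput.
double afrScaling(double singleFps, double afrFps) {
    assert(singleFps > 0.0);
    return afrFps / singleFps;
}
```

With the ~600 FPS single-GPU figure, the per-frame time is about 1.67 ms, the ideal two-GPU rate would be about 1200 FPS, and the observed ~930 FPS corresponds to a scaling factor of 1.55.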
  2. I am currently doing some measurements of a scene containing about 80k vertices that implements soft shadow mapping with a PCF filter. When I measure the frame time with SLI enabled (using "force alternate frame rendering 1"), both GPUs are utilized but the performance drops. I can understand that behavior at a low resolution, where I achieve 1000+ FPS, because of the GPU synchronization overhead. However, I don't understand why the performance is still lower when I increase the rendering load and the resolution. I have already taken a look at the NVIDIA SLI optimization guide here: http://http.download.nvidia.com/developer/presentations/2005/GDC/OpenGL_Day/OpenGL_SLI.pdf but none of the issues from the presentation should apply. Can anyone tell me what I may be missing? Do I maybe need an SLI profile for my application? I am using two GeForce GTX 970 cards in SLI with the latest drivers installed. PS: I am currently trying to look into this using NVIDIA Nsight; however, it does not record any GPU frames when I enable "force alternate frame rendering 1". PS2: In the meantime I figured out that glGetQueryObjectui64v takes more time when SLI is enabled. I am now double buffering my results, which likely gives me more accurate results (http://www.lighthouse3d.com/tutorials/opengl-short-tutorials/opengl-timer-query/). I figured that out because the FPS increases well but the frame time does not when using glGetQueryObjectui64v.   ... In the meantime I am assuming that the problem is just related to my way of measuring. What I am actually interested in is how to correctly measure frame time with AFR enabled, without stalling the GPU/CPU, using the OpenGL timer query object.
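One common way to avoid stalling on a timer query is to read a result several frames after it was issued. The sketch below shows only the index bookkeeping for N in-flight slots and makes no OpenGL calls; `QueryRing` and its members are hypothetical names. The slot from `writeSlot()` would go to glBeginQuery, and the slot from `readSlot()` to glGetQueryObjectui64v.

```cpp
#include <cassert>
#include <cstddef>

// Index bookkeeping for N in-flight timer queries. The returned indices are
// meant to select rows of an array such as GLuint queries[N][QueryCount].
template <std::size_t N>
class QueryRing {
    static_assert(N >= 2, "need at least two slots");
public:
    // Slot to write (begin/end the query on) in the current frame.
    std::size_t writeSlot() const { return mWrite; }

    // True once the oldest slot refers to a query that was actually issued.
    bool readValid() const { return mFrames >= N - 1; }

    // Oldest slot, i.e. the one written N-1 frames before the current frame.
    std::size_t readSlot() const {
        assert(readValid() && "no query has been issued for this slot yet");
        return (mWrite + 1) % N;
    }

    // Call once per frame, after the result (if any) has been fetched.
    void advance() {
        mWrite = (mWrite + 1) % N;
        ++mFrames;
    }

private:
    std::size_t mWrite = 0;   // slot written in the current frame
    std::size_t mFrames = 0;  // number of frames completed so far
};
```

With N = 4 the first valid read happens in the fourth frame and returns the slot written in the first one. If I trace the if/else rotation table in the quad-buffered class of item 1 correctly, it ends up reading the slot written in the previous frame, so its effective latency is one frame rather than three. The modulo form here keeps the full N-1 frames of distance.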
  3. Actually, I have no idea why it works. But it works when I do NOT normalize the view and light direction in the VS. So instead of this:

     vec3 fvViewDirection = normalize( uCameraDirection - posObject.xyz);
     vec3 fvLightDirection = normalize( uLightPos.xyz - posObject.xyz );

     I now have this:

     vec3 fvViewDirection = uCameraDirection - posObject.xyz;
     vec3 fvLightDirection = uLightPos.xyz - posObject.xyz;

     Here is a final video: http://www.paxi.at/random/final.avi
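One possible explanation, not confirmed in the thread: the vector from a surface point to the camera is linear in the surface position, so interpolating the unnormalized vector across a triangle gives the exact per-pixel direction. Interpolating vectors that were already normalized per vertex does not, and the error grows as the camera gets close to a large triangle. The C++ sketch below reproduces this on the CPU; all names in it are illustrative.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 mix3(Vec3 a, Vec3 b, double t) {
    return {a.x + t * (b.x - a.x), a.y + t * (b.y - a.y), a.z + t * (b.z - a.z)};
}
double dot3(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 normalize3(Vec3 v) {
    double len = std::sqrt(dot3(v, v));
    assert(len > 0.0);
    return {v.x / len, v.y / len, v.z / len};
}

// Cosine of the angle between the exact view direction at mix3(a, b, t) and
// the direction obtained by interpolating the per-vertex vectors.
// normalizePerVertex = true mimics calling normalize() in the vertex shader.
double viewDirAgreement(Vec3 cam, Vec3 a, Vec3 b, double t, bool normalizePerVertex) {
    Vec3 va = sub(cam, a);
    Vec3 vb = sub(cam, b);
    if (normalizePerVertex) {
        va = normalize3(va);
        vb = normalize3(vb);
    }
    // The fragment shader normalizes whatever the interpolator hands it.
    Vec3 interpolated = normalize3(mix3(va, vb, t));
    Vec3 exact = normalize3(sub(cam, mix3(a, b, t)));
    return dot3(interpolated, exact);
}
```

For a camera at (0, 0, 1) and an edge from (-1, 0, 0) to (3, 0, 0), the unnormalized variant agrees with the exact direction at the midpoint, while the per-vertex-normalized variant is off by roughly 30 degrees.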
  4. Unfortunately no change. I had already tried that before; the uModel matrix is in fact only an identity matrix at the moment.   Some more thoughts from my side:   I have seen another algorithm here (http://www.gamedev.net/page/resources/_/technical/graphics-programming-and-theory/a-closer-look-at-parallax-occlusion-mapping-r3262) that uses a dynamic number of samples depending on this formula: int nNumSamples = (int)lerp( nMaxSamples, nMinSamples, dot( E, N ) ); where E = the eye vector in tangent space, and N = the normal in tangent space. Is it possible that I need to handle some special cases with "my" approach? Is there maybe a problem with transforming my eye direction vector into tangent space? Everything is transformed directly from world space to tangent space.
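For reference, the quoted formula can be written out on the CPU as below. This is a sketch: `numSamples` is an illustrative name, and the clamp of the cosine to [0, 1] is an added safeguard that is not part of the quoted line.

```cpp
#include <algorithm>
#include <cassert>

// Number of ray-march steps for a given cosine between the tangent-space eye
// vector E and normal N. Mirrors (int)lerp(nMaxSamples, nMinSamples, dot(E, N)).
int numSamples(int minSamples, int maxSamples, float cosEN) {
    assert(minSamples > 0 && minSamples <= maxSamples);
    // Assumption: clamp so that back-facing or unnormalized inputs stay in range.
    float t = std::min(1.0f, std::max(0.0f, cosEN));
    // lerp(a, b, t) = a + t * (b - a), here with a = max and b = min.
    return static_cast<int>(maxSamples + t * (minSamples - maxSamples));
}
```

Looking straight down at the surface (cosine 1) yields the minimum sample count, and grazing angles (cosine near 0) yield the maximum, which is where the ray crosses the most texels.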
  5. When I flip the binormal, the effect is more visible on the other side of the quad. In general it seems to me that the UV coordinates are "moved" a little bit when I get closer, and therefore the texture is not applied correctly anymore.   [attachment=21258:5.jpg]   I also added a video: http://paxi.at/random/sfg_com.avi
  6. Thanks for your suggestion. My camera direction really was inverted, but I invert it again in the TraceRay method ( vec2 dUV = - dir.xy * height * 0.08; where dir is the camera direction in tangent space ). I tried changing it to normalize( posObject.xyz - uCameraDirection ); without the inversion in the TraceRay method, but unfortunately I get the same result :(
  7. I'm trying to implement a per-pixel displacement shader in GLSL. I read through several papers and "tutorials" I found and ended up trying to implement the approach NVIDIA used in their Cascades demo (http://www.slideshare.net/icastano/cascades-demo-secrets), starting at slide 82. At the moment I am completely stuck with the following problem: when I am far away, the displacement seems to work. But the closer I move to my surface, the more the texture gets bent along the x-axis, and it somehow looks like there is a slight bend in one direction in general. I added some screenshots below to illustrate the problem.   EDIT: I added a video http://paxi.at/random/sfg_com.avi   [attachment=21194:1.jpg] [attachment=21195:2.jpg] [attachment=21196:3.jpg] [attachment=21197:4.jpg] Well, I have tried lots of things already and I am starting to get a bit frustrated as my ideas run out. Here is my full VS and FS code:

     VS:

     #version 400
     layout(location = 0) in vec3 IN_VS_Position;
     layout(location = 1) in vec3 IN_VS_Normal;
     layout(location = 2) in vec2 IN_VS_Texcoord;
     layout(location = 3) in vec3 IN_VS_Tangent;
     layout(location = 4) in vec3 IN_VS_BiTangent;
     uniform vec3 uLightPos;
     uniform vec3 uCameraDirection;
     uniform mat4 uViewProjection;
     uniform mat4 uModel;
     uniform mat4 uView;
     uniform mat3 uNormalMatrix;
     out vec2 IN_FS_Texcoord;
     out vec3 IN_FS_CameraDir_Tangent;
     out vec3 IN_FS_LightDir_Tangent;
     void main( void ) {
         IN_FS_Texcoord = IN_VS_Texcoord;
         vec4 posObject = uModel * vec4(IN_VS_Position, 1.0);
         vec3 normalObject = (uModel * vec4(IN_VS_Normal, 0.0)).xyz;
         vec3 tangentObject = (uModel * vec4(IN_VS_Tangent, 0.0)).xyz;
         //vec3 binormalObject = (uModel * vec4(IN_VS_BiTangent, 0.0)).xyz;
         vec3 binormalObject = normalize(cross(tangentObject, normalObject));
         // uCameraDirection is the camera position, just badly named
         vec3 fvViewDirection = normalize( uCameraDirection - posObject.xyz);
         vec3 fvLightDirection = normalize( uLightPos.xyz - posObject.xyz );
         IN_FS_CameraDir_Tangent.x = dot( tangentObject, fvViewDirection );
         IN_FS_CameraDir_Tangent.y = dot( binormalObject, fvViewDirection );
         IN_FS_CameraDir_Tangent.z = dot( normalObject, fvViewDirection );
         IN_FS_LightDir_Tangent.x = dot( tangentObject, fvLightDirection );
         IN_FS_LightDir_Tangent.y = dot( binormalObject, fvLightDirection );
         IN_FS_LightDir_Tangent.z = dot( normalObject, fvLightDirection );
         gl_Position = (uViewProjection*uModel) * vec4(IN_VS_Position, 1.0);
     }

     The VS just builds the TBN matrix from the incoming normal, tangent and binormal in world space, calculates the light and eye direction in world space, and finally transforms the light and eye direction into tangent space.

     FS:

     #version 400
     // uniforms
     uniform Light {
         vec4 fvDiffuse;
         vec4 fvAmbient;
         vec4 fvSpecular;
     };
     uniform Material {
         vec4 diffuse;
         vec4 ambient;
         vec4 specular;
         vec4 emissive;
         float fSpecularPower;
         float shininessStrength;
     };
     uniform sampler2D colorSampler;
     uniform sampler2D normalMapSampler;
     uniform sampler2D heightMapSampler;
     in vec2 IN_FS_Texcoord;
     in vec3 IN_FS_CameraDir_Tangent;
     in vec3 IN_FS_LightDir_Tangent;
     out vec4 color;
     vec2 TraceRay(in float height, in vec2 coords, in vec3 dir, in float mipmap){
         vec2 NewCoords = coords;
         vec2 dUV = - dir.xy * height * 0.08;
         float SearchHeight = 1.0;
         float prev_hits = 0.0;
         float hit_h = 0.0;
         for(int i=0;i<10;i++){
             SearchHeight -= 0.1;
             NewCoords += dUV;
             float CurrentHeight = textureLod(heightMapSampler,NewCoords.xy, mipmap).r;
             float first_hit = clamp((CurrentHeight - SearchHeight - prev_hits) * 499999.0,0.0,1.0);
             hit_h += first_hit * SearchHeight;
             prev_hits += first_hit;
         }
         NewCoords = coords + dUV * (1.0-hit_h) * 10.0f - dUV;
         vec2 Temp = NewCoords;
         SearchHeight = hit_h+0.1;
         float Start = SearchHeight;
         dUV *= 0.2;
         prev_hits = 0.0;
         hit_h = 0.0;
         for(int i=0;i<5;i++){
             SearchHeight -= 0.02;
             NewCoords += dUV;
             float CurrentHeight = textureLod(heightMapSampler,NewCoords.xy, mipmap).r;
             float first_hit = clamp((CurrentHeight - SearchHeight - prev_hits) * 499999.0,0.0,1.0);
             hit_h += first_hit * SearchHeight;
             prev_hits += first_hit;
         }
         NewCoords = Temp + dUV * (Start - hit_h) * 50.0f;
         return NewCoords;
     }
     void main( void ) {
         vec3 fvLightDirection = normalize( IN_FS_LightDir_Tangent );
         vec3 fvViewDirection = normalize( IN_FS_CameraDir_Tangent );
         float mipmap = 0;
         vec2 NewCoord = TraceRay(0.1,IN_FS_Texcoord,fvViewDirection,mipmap);
         //vec2 ddx = dFdx(NewCoord);
         //vec2 ddy = dFdy(NewCoord);
         vec3 BumpMapNormal = textureLod(normalMapSampler, NewCoord.xy, mipmap).xyz;
         BumpMapNormal = normalize(2.0 * BumpMapNormal - vec3(1.0, 1.0, 1.0));
         vec3 fvNormal = BumpMapNormal;
         float fNDotL = dot( fvNormal, fvLightDirection );
         vec3 fvReflection = normalize( ( ( 2.0 * fvNormal ) * fNDotL ) - fvLightDirection );
         float fRDotV = max( 0.0, dot( fvReflection, fvViewDirection ) );
         vec4 fvBaseColor = textureLod( colorSampler, NewCoord.xy,mipmap);
         vec4 fvTotalAmbient = fvAmbient * fvBaseColor;
         vec4 fvTotalDiffuse = fvDiffuse * fNDotL * fvBaseColor;
         vec4 fvTotalSpecular = fvSpecular * ( pow( fRDotV, fSpecularPower ) );
         color = ( fvTotalAmbient + (fvTotalDiffuse + fvTotalSpecular) );
     }

     The FS implements the displacement technique in the TraceRay method, always using mipmap level 0. Most of the code is from the NVIDIA sample and another paper I found on the web, so I guess there cannot be much wrong in here. At the end it uses the modified UV coords to fetch the displaced normal from the normal map and the color from the color map. I am looking forward to some ideas. Thanks in advance!   UPDATE: I added a video http://paxi.at/random/sfg_com.avi
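The first loop in TraceRay finds the first step at which the height map rises above the ray without using a branch. A CPU-side C++ sketch of that loop may make the clamp trick easier to follow; `coarseHitHeight` and the `heights` vector, which stands in for the texture samples along the ray, are illustrative and not part of the shader.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

float clamp01(float v) { return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v); }

// heights[i] stands in for the height-map sample at step i (expected in [0, 1]).
// Returns the search height of the first step whose sample lies above the ray,
// or 0 if no step hits.
float coarseHitHeight(const std::vector<float>& heights) {
    assert(heights.size() <= 10);  // the shader takes 10 steps of 0.1 from 1.0
    float searchHeight = 1.0f;
    float prevHits = 0.0f;
    float hitH = 0.0f;
    for (std::size_t i = 0; i < heights.size(); ++i) {
        searchHeight -= 0.1f;
        // The large factor turns "sample is above the ray" into 0 or 1.
        // After the first hit prevHits is 1, which pushes the difference
        // below zero for samples under 1, so later steps contribute nothing.
        float firstHit = clamp01((heights[i] - searchHeight - prevHits) * 499999.0f);
        hitH += firstHit * searchHeight;
        prevHits += firstHit;
    }
    return hitH;
}
```

For samples {0, 0, 0, 0.65, 0.9, ...} the ray is at about 0.6 on the fourth step, the sample 0.65 is above it, and the function returns about 0.6. The later sample 0.9 is ignored because a hit was already recorded.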
  8. Thank you very much. I just started with OpenGL, but I wanted to port my old window system class, which I used with my DX9 framework, before getting into OpenGL itself, so I did not know about that. It seems that immediate mode rendering stopped working with 3.2. Creating a render context >= 3.2 causes my screen to stay black.
  9. I currently have the following code:

     int pixelFormatIndex = 0;
     int pixelCount = 0;
     std::vector<int> pixAttribs;

     // specify the important attributes for the pixelformat used by OpenGL
     int standardAttribs[] = {
         WGL_SUPPORT_OPENGL_ARB, 1, // Must support OGL rendering
         WGL_DRAW_TO_WINDOW_ARB, 1, // pf that can run a window
         WGL_ACCELERATION_ARB, WGL_FULL_ACCELERATION_ARB, // must be HW accelerated
         WGL_RED_BITS_ARB, 8,
         WGL_GREEN_BITS_ARB, 8,
         WGL_BLUE_BITS_ARB, 8,
         WGL_ALPHA_BITS_ARB, 8,
         WGL_DEPTH_BITS_ARB, 16, // 16 bits of depth precision for window
         //WGL_STENCIL_BITS_ARB, 8,
         WGL_DOUBLE_BUFFER_ARB, GL_TRUE, // Double buffered context
         WGL_PIXEL_TYPE_ARB, WGL_TYPE_RGBA_ARB, // pf should be RGBA type
         0}; // NULL termination

     pixAttribs.insert(pixAttribs.begin(),standardAttribs,standardAttribs+(sizeof(standardAttribs)/sizeof(int)));

     // specify multisampling mode if it has been set
     if(mMSMode != SLIMGF_MULTISAMPLING_OFF) {
         pixAttribs.push_back(WGL_SAMPLES_ARB);
         pixAttribs.push_back(mMSMode);
     }
     // OpenGL will return the best format for the pixel attributes defined above
     BOOL result = wglChoosePixelFormatARB(mDeviceContext, pixAttribs.data(), NULL, 1, &pixelFormatIndex, (UINT*)&pixelCount);
     ASSERT(result != false, "wglChoosePixelFormatARB() failed");

     The problem is that when I use WGL_ACCELERATION_ARB, WGL_FULL_ACCELERATION_ARB my screen just stays black, and when I don't use it the device I get is version 1.1.   I attached two pictures, with and without the HW support flag (info obtained by wglGetPixelFormatAttribivARB()): [attachment=13575:hwacc.png] [attachment=13576:nohwacc.png]   The format with hardware support has a 24-bit depth buffer, and from some experimenting I thought that may have something to do with the problem, although I don't know why. I'm using the newest version of GLEW, which is successfully initialized beforehand. My GPU is a GTX680.   I currently render with the following test code:

     glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
     glBegin(GL_TRIANGLES);
     glColor3f( 1.0f, 0.0f, 0.0f ); // red (components are in [0, 1])
     glVertex3f(-1.0f, -0.5f, -5.0f);
     glVertex3f(1.0f, -0.5f, -5.0f);
     glVertex3f(0.0f, 0.5f, -5.0f);
     glEnd();

     I was already wondering whether OpenGL 4.2 might cause problems with rendering this old way?   Thanks in advance
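A side note on the attribute list, separate from the black screen: the 0 terminator is part of standardAttribs, so the multisample pair pushed afterwards ends up behind the terminator, where wglChoosePixelFormatARB will presumably never read it. A minimal sketch of building the list with the terminator appended last follows. `buildAttribList` is a hypothetical helper, and the keys are passed in as plain integers so the example does not depend on wglext.h.

```cpp
#include <cassert>
#include <vector>

// Builds a zero-terminated attribute list from key/value pairs.
// base holds pairs WITHOUT a terminator; sampleBuffersKey and samplesKey are
// the integer keys of the multisample attributes.
std::vector<int> buildAttribList(std::vector<int> base,
                                 int sampleBuffersKey, int samplesKey,
                                 int samples) {
    assert(base.size() % 2 == 0);  // key/value pairs only
    if (samples > 0) {
        base.push_back(sampleBuffersKey);
        base.push_back(1);         // request a multisample buffer
        base.push_back(samplesKey);
        base.push_back(samples);
    }
    base.push_back(0);             // the terminator goes last
    return base;
}
```

In real code the two keys would presumably be WGL_SAMPLE_BUFFERS_ARB and WGL_SAMPLES_ARB, and `base` would be the standardAttribs pairs without the trailing 0.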
  10. Thanks again. Sounds logical.
  11. I think he meant that you can just replace every LPDIRECT3DTEXTURE9 in your code with IDirect3DTexture9*.   LPDIRECT3DTEXTURE9 is nothing more than a typedef for IDirect3DTexture9*
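The equivalence can be checked at compile time. The sketch below uses stand-in declarations inside a `sketch` namespace instead of including d3d9.h, so it only illustrates how such a typedef behaves.

```cpp
#include <type_traits>

namespace sketch {
// Stand-in declarations; the real ones come from d3d9.h.
struct IDirect3DTexture9;
typedef IDirect3DTexture9* LPDIRECT3DTEXTURE9;

static_assert(std::is_same<LPDIRECT3DTEXTURE9, IDirect3DTexture9*>::value,
              "the LP typedef and the raw pointer are the same type");

// Both lines declare the same function, so the two spellings can be swapped.
void bindTexture(LPDIRECT3DTEXTURE9 tex);
void bindTexture(IDirect3DTexture9* tex);  // redeclaration, not an overload
}  // namespace sketch
```

Because a typedef introduces an alias and not a new type, replacing one spelling with the other changes nothing for the compiler.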
  12. In the meantime I have already implemented some MSAA modes by resetting the device. This approach was quite easy; the only thing I had some issues with at first was releasing all data so that Reset didn't give me an INVALIDCALL error.   Although this is enough for me for now, I would still appreciate some more info about the approach using multiple render targets, as asked in my post #3 :)
  13. In general you write it like this: UINT numPasses = 0; mFX->Begin(&numPasses, 0); .... Like FLEBlanc said, DirectX will fill the variable you pass in with the number of passes.
  14. You are right. Resetting the device seems fine for my purpose. I just realized that I don't really have to recreate the device, but rather just pass in the new D3D parameters when resetting. I already have some code for this, so the integration shouldn't take much time with this approach either :) I will still wait for a final answer from MJP regarding the render targets, just out of interest. Using the shader for AA looks very interesting as well, but I'm not that comfortable with HLSL so far, so I prefer the solution with resetting/render targets for now.
  15. Thanks for the explanation first. But I'm not sure if I understood this correctly: if I enable MSAA on my backbuffer (I only have one in my current application), I can only choose one MSAA mode and then copy the content to a render target with no MSAA. But I still cannot switch between different MSAA modes, if I understood you correctly. Or do I have to create more than one backbuffer (swap chain) with different MSAA settings? And because you said "render target textures" before: how exactly is a render target handled within DX? I thought about it in a more abstract way, namely that several things could fall under the term "render target" (a backbuffer, but also the top-level surface of a texture, for example), so I was thinking of just a region of memory on the GPU.   Thanks in advance