CrashyCartman

Members
  • Content count

    25
Community Reputation

605 Good

About CrashyCartman

  • Rank
    Member

Personal Information

  • Interests
    Art
    Programming
  1. OpenGL Low framerate during first seconds

    Alright, here are the various causes: I had some extra-large (8k) textures I used to generate small SDF font textures, but I never removed them after SDF generation. Also, my shadow maps are packed into a single big GL_R32F atlas, and the depth buffer of this FBO is as large as the atlas, so it's a huge amount of memory. Correct me if I'm wrong, but I think that if I use GL_DEPTH_COMPONENT32F as the format I'll cut the memory used in half (a sketch of that follows below). Thanks again for pointing me in the right direction.
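    A minimal sketch of that change, with hypothetical names (atlasWidth/atlasHeight): the shadow atlas becomes a single GL_DEPTH_COMPONENT32F texture attached as the FBO's depth attachment, instead of a GL_R32F color texture plus a same-sized depth buffer, which is what roughly halves the memory.

        GLuint shadowAtlasTex = 0, shadowAtlasFbo = 0;

        // One 32-bit float depth texture holds the whole shadow atlas.
        glGenTextures(1, &shadowAtlasTex);
        glBindTexture(GL_TEXTURE_2D, shadowAtlasTex);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT32F,
                     atlasWidth, atlasHeight, 0,
                     GL_DEPTH_COMPONENT, GL_FLOAT, nullptr);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

        // Depth-only FBO: no color attachment at all.
        glGenFramebuffers(1, &shadowAtlasFbo);
        glBindFramebuffer(GL_FRAMEBUFFER, shadowAtlasFbo);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                               GL_TEXTURE_2D, shadowAtlasTex, 0);
        glDrawBuffer(GL_NONE);
        glReadBuffer(GL_NONE);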
  2. OpenGL Low framerate during first seconds

    Running in 720p, and yes, I find it crazy to have so much VRAM used; I should have tracked that before. I use only a few small textures for the moment, so there's no reason to have so much memory in use. Thanks for your help, I'll check what's wrong and post the solution here.
  3. OpenGL Low framerate during first seconds

    Alright, here are the results of my quick benchmarks, made using GPU-Z to track GPU memory. During loading and the first erratic frames, GPU memory is full (1000 MB). After that, it falls to around 800 MB and the framerate is stable but low. If after a few seconds the framerate increases, GPU memory goes up to around 950 MB; if not, it stays at 800 MB / low fps forever. So there is a real correlation between memory usage and performance. I tried changing the format of the few FLOAT16 RGBA FBOs I use to RGBA8 in order to reduce their size (see the allocation sketch below), but it didn't change the memory usage at all, which is strange. System memory shows the same relationship between memory and framerate:
    1: first run, the app runs fine at 70 fps.
    2: second run, the app is stuck at 12 fps.
    3: third run, same as 1.
    As you can see, the difference is small (100 MB) but noticeable. Thanks for trying to help me.
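    For reference, not from the thread itself: the format change described above is just the internalformat passed when the attachment is allocated; GL_RGBA16F is 8 bytes per texel, GL_RGBA8 is 4. A minimal sketch with hypothetical names (postProcessTex, width, height):

        glBindTexture(GL_TEXTURE_2D, postProcessTex);
        // Previously: GL_RGBA16F, 8 bytes per texel.
        // glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, width, height, 0, GL_RGBA, GL_HALF_FLOAT, nullptr);
        // Reduced:    GL_RGBA8, 4 bytes per texel.
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, nullptr);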
  4. OpenGL Low framerate during first seconds

    Thanks for the answer :) I know the driver must be busy after resource loading, and I could accept some long frames at start, but in this case, 33% of the time I get a game that's completely unplayable. I also ran tests removing all of my threads, just to see if it was a concurrency issue, and unfortunately it didn't change anything. Everything is in C++, no managed code. I may add that if I run the exe via "Nsight/Start Graphics Debugging", everything is OK; of course I have some hiccups at start, but then the game runs smoothly every time.
  5. OpenGL Low framerate during first seconds

    Hi gamedev! I'm facing a really strange performance issue occurring during the first seconds of my app. When launching the app, there are three scenarios:
    Some erratic frames at the very beginning, then 70+ fps.
    Some erratic frames at the very beginning, then ~12 fps, then, after a few seconds, BAM, 70+ fps.
    Some erratic frames at the very beginning, then stuck at ~12 fps.
    The camera is static, and so is the scene. I used Nsight to try to figure out what's happening, especially in case 2, where I can see the differences. When it's rendering at ~12 fps, some glDrawRangeElementsBaseVertex/glDrawArrays calls take a lot of time, as do some SwapBuffers calls; after switching to "cruise speed", those calls take a more moderate amount of time. Here is a capture of an Nsight session timeline: 1 -> erratic framerate at the beginning, 2 -> KICKSTART!, 3 -> stable and high framerate.
    Memory usage is the same in all cases. CPU usage is a bit higher when stuck at ~12 fps, but nothing remarkable. My scene is really simple (only a few cubes), but I use a lot of post-processing (SSAO, SSR). Reducing the framebuffer size to a smaller one doesn't change anything (although top speed is faster). I also tried disabling all my post-processing; the issue is still there, but case 1 becomes the most common. It is really similar to what's described in this post, although I'm not using vsync at all (I double-checked the code, and also forced vsync off with the NVIDIA control panel) nor SDL: https://www.opengl.org/discussion_bo...-slow-on-start
    I also tried to put some glFlush()/glFinish() here and there in my code, without success. A few specifications:
    OS: Windows 7
    CPU: Core i7 6700
    GPU: Nvidia 560 Ti
    Drivers: 385.41
    I've been stuck with this issue for months now, and it's driving me crazy. Thanks for any help, advice, questions, or tests to try.
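    Not from the original post, just one way to narrow this down: a GL timer query around the frame's draw calls shows whether the slow phase is actual GPU execution time or driver/CPU stalls around SwapBuffers. A minimal sketch, assuming a hypothetical renderFrame() that issues all the draw calls:

        // Create once at startup.
        GLuint timeQuery = 0;
        glGenQueries(1, &timeQuery);

        // Each frame: measure GPU time spent on this frame's commands.
        glBeginQuery(GL_TIME_ELAPSED, timeQuery);
        renderFrame();                              // hypothetical: all draw calls + post-processing
        glEndQuery(GL_TIME_ELAPSED);

        // Reading the result right away blocks until the GPU is done;
        // in real code you would read it one or two frames later.
        GLuint64 gpuNs = 0;
        glGetQueryObjectui64v(timeQuery, GL_QUERY_RESULT, &gpuNs);
        double gpuMs = gpuNs / 1.0e6;               // compare against the measured frame time

    If the GPU time stays low while the frame takes ~83 ms, the stall is on the CPU/driver side rather than in the draw calls themselves.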
  6. Sharpen shader performance

    Yup, I'm definitely going to integrate it. The readme says it's not 100% compatible with GL 330+, but I'm keeping my fingers crossed. However, I also have the bad habit of trying to understand how things work. :D
  7. Sharpen shader performance

    I tried that too after reading some positive feedback about it, but without success.
    Yeah, this is practically beyond the capacity of an indie studio.
    I'd be interested to know how a compute shader would behave in that case.
    Some news about the issue: using the AMD Shader Analyzer and analyzing both the looped and unrolled versions of the shader, I saw that both were almost identical. One thing though: the unrolled version has only 5 texture lookups instead of 9. Why? Because some of the "kernel" array values are just 0.0f, and the compiler strips the related texture lookups (sketched below). But the assembly code shows that the AMD compiler is able to unroll the loop by itself.
    So, after removing those zero-weight lookups, I ran both versions again to see how they behave:
    - Loop version: 1.7 ms
    - Unrolled version: 0.24 ms
    Conclusion: no real change on nVidia, except that both run a little bit faster. Is there a tool similar to the AMD Shader Analyzer, but for nVidia GPUs?
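    For illustration only, not the exact code from the thread: with the corner weights at 0.0, the unrolled shader effectively reduces to five taps, which is what the AMD assembly shows.

        // Sharpen kernel  0 -1 0 / -1 5 -1 / 0 -1 0: corner taps have zero weight,
        // so only the centre and the four edge neighbours are sampled.
        vec4 result =  5.0 * textureLod(resultSampler, Texcoord, 0.0);
        result     += -1.0 * textureLod(resultSampler, Texcoord + vec2( 0.0, -1.0) * screenSize.zw, 0.0);
        result     += -1.0 * textureLod(resultSampler, Texcoord + vec2(-1.0,  0.0) * screenSize.zw, 0.0);
        result     += -1.0 * textureLod(resultSampler, Texcoord + vec2( 1.0,  0.0) * screenSize.zw, 0.0);
        result     += -1.0 * textureLod(resultSampler, Texcoord + vec2( 0.0,  1.0) * screenSize.zw, 0.0);
        oColor = result;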
  8. Sharpen shader performance

    I thought Cg development was stopped a few years ago; do they still use it as a transition language?!
  9. Sharpen shader performance

    Thanks for the hint, I'm not targeting mobile platforms for now but it's always good to know such things.
  10. Sharpen shader performance

    Yeah, I saw that just after my reply. Too bad that's vendor-specific. Well, it might be the reason why a lot of people tend to use GLSLOptimizer. Thanks.
  11. Sharpen shader performance

    Ok, good news! I've manually unrolled the loop to:

        vec4 result = vec4(0.0);
        vec4 color = textureLod(resultSampler, Texcoord + offset[0]*screenSize.zw, 0);
        result += color * kernel[0];
        color = textureLod(resultSampler, Texcoord + offset[1]*screenSize.zw, 0);
        result += color * kernel[1];
        color = textureLod(resultSampler, Texcoord + offset[2]*screenSize.zw, 0);
        result += color * kernel[2];
        color = textureLod(resultSampler, Texcoord + offset[3]*screenSize.zw, 0);
        result += color * kernel[3];
        color = textureLod(resultSampler, Texcoord + offset[4]*screenSize.zw, 0);
        result += color * kernel[4];
        color = textureLod(resultSampler, Texcoord + offset[5]*screenSize.zw, 0);
        result += color * kernel[5];
        color = textureLod(resultSampler, Texcoord + offset[6]*screenSize.zw, 0);
        result += color * kernel[6];
        color = textureLod(resultSampler, Texcoord + offset[7]*screenSize.zw, 0);
        result += color * kernel[7];
        color = textureLod(resultSampler, Texcoord + offset[8]*screenSize.zw, 0);
        result += color * kernel[8];

    And now the shader execution time is 0.325 ms! Isn't the nVidia GLSL compiler able to unroll this by itself? In HLSL there is the [unroll] hint; is there anything similar in GLSL?
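    A note, not from the post itself: standard GLSL has no equivalent of HLSL's [unroll] attribute. The vendor-specific hint referred to in the reply above is an NVIDIA pragma; as far as I know (treat this as an assumption), it is written like this and only acts as a hint to the NV compiler:

        #version 330
        // NVIDIA-specific; ignored or rejected by other compilers.
        #pragma optionNV(unroll all)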
  12. Sharpen shader performance

    Hi, after running my app through Nsight, I've seen that the image sharpening shader I use takes around 3 ms to process a 1600x1200 texture. The GPU is a nVidia GTX 560 Ti. This seems quite high to me, but I do not know why it takes so much time. The shader code, nothing complicated:

        #version 330
        uniform sampler2D sceneSampler;
        uniform vec4 screenSize;
        in vec2 Texcoord;
        out vec4 oColor;

        float kernel[9] = float[9]
        (
             0.0, -1.0,  0.0,
            -1.0,  5.0, -1.0,
             0.0, -1.0,  0.0
        );

        vec2 offset[9] = vec2[9]
        (
            vec2(-1.0, -1.0),
            vec2( 0.0, -1.0),
            vec2( 1.0, -1.0),
            vec2(-1.0,  0.0),
            vec2( 0.0,  0.0),
            vec2( 1.0,  0.0),
            vec2(-1.0,  1.0),
            vec2( 0.0,  1.0),
            vec2( 1.0,  1.0)
        );

        void main()
        {
            vec4 result = vec4(0.0);
            int i;
            for (i = 0; i < 9; i++)
            {
                vec4 color = textureLod(sceneSampler, Texcoord + offset[i]*screenSize.zw, 0);
                result += color * kernel[i];
            }
            oColor = result;
        }

    One strange thing: if I change the arrays to const, the shader execution time increases to 12 ms! I've read such things about GLSL shaders, but does anybody have an explanation? Thanks for any help / hint.
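    A possible alternative formulation, assuming GLSL 330 and not taken from the thread: textureLodOffset takes a compile-time constant texel offset, so the offset[] array and the runtime indexing disappear, and the zero-weight corner taps can simply be dropped:

        vec4 centre = textureLod(sceneSampler, Texcoord, 0.0);
        vec4 edges  = textureLodOffset(sceneSampler, Texcoord, 0.0, ivec2( 0, -1))
                    + textureLodOffset(sceneSampler, Texcoord, 0.0, ivec2(-1,  0))
                    + textureLodOffset(sceneSampler, Texcoord, 0.0, ivec2( 1,  0))
                    + textureLodOffset(sceneSampler, Texcoord, 0.0, ivec2( 0,  1));
        oColor = 5.0 * centre - edges;      // same 0/-1/0, -1/5/-1, 0/-1/0 kernel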
  13. [solved]Irradiance cube map rendering

    Thank you, I'm reading it !
  14. [solved]Irradiance cube map rendering

    [EDIT] Nevermind, I've found the answer by reading the linked paper one more time.

    Hi Gamedev, I'm currently implementing ambient lighting using spherical harmonics, inspired by this paper, and I'm looking for some details. Basically, I don't understand how they do the real-time relighting described in slides 7 to 10. Right now I'm rendering a bunch of environment cube maps with my scene rendered using only the texture colors, no lighting at all. Then I compute the spherical harmonics for the cube maps, and the final lighting shader looks like this:

        // final lighting code
        float4 finalColor = irradiance * texture + NDotL * shadows * texture;

    This gives nice results, but I don't think this is the right way to do it; I think I should render my scene with full shadowing, using something like:

        // cubemap render code
        float4 color = NDotL * shadows * texture;

    which seems to be a better way, as lit zones will "generate" more irradiance than unlit ones. And then change the final lighting equation to something like:

        // final lighting code
        float4 finalColor = texture + irradiance + NDotL * shadows * texture;

    But this bothers me, because if the light direction changes, I need to recompute all the cube maps, which is not "real-time friendly". So what am I missing? Thank you very much for any help!
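    A GLSL sketch of the first combine above, just to make the terms concrete (the post's snippets use HLSL-style float4; evaluateSH and the variable names here are hypothetical, not from the post):

        vec3  albedo     = texture(albedoSampler, uv).rgb;
        vec3  irradiance = evaluateSH(shCoefficients, normal);   // hypothetical SH evaluation from the cube maps
        float nDotL      = max(dot(normal, lightDir), 0.0);
        vec3  finalColor = irradiance * albedo + nDotL * shadow * albedo;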
  15. Landing gear problem

    Thank you guys, I've changed the force direction, now it is the same as the "contact" normal, and it works like a charm.   The torque thing was already done :)