
OpenGL 300 000 fps


fir    460

I have never run a newer version of OpenGL; back then I used only OpenGL 1 and never got more than about 900 FPS for a simple cube test. Yesterday I tried the same with my new OpenGL/freeglut framework and it seems I am getting 300 000 FPS - at least the successive timer calls in the display method report a delta time of 3 microseconds per display. Is it really flushing 300 thousand screens per second, or am I mistaken about something (maybe something asynchronous is being called), and should I measure it in a different way?

 

haegarr    7372
and should I measure it in a different way?

This in the first place! FPS is a reciprocal measure and as such not very useful once the range goes beyond a few tens of FPS, perhaps up to 100 FPS or so. The 900 is already a less meaningful number. It's better to use a linear measure: compute the mean time per frame over a couple of frames as an absolute measure, and perhaps a percentage between such values for a comparative one.
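For illustration, a minimal sketch of averaging the frame time over a window of frames in a freeglut program; glutGet(GLUT_ELAPSED_TIME) returns milliseconds since glutInit(), the window size of 100 frames is an arbitrary choice, and for very short frames a higher-resolution timer would be preferable:

#include <stdio.h>
#include <GL/freeglut.h>

static int frameCount    = 0;
static int windowStartMs = 0;

// Call this once per rendered frame, e.g. at the end of the display callback.
void reportMeanFrameTime()
{
    const int WINDOW = 100;                       // average over 100 frames
    if (++frameCount < WINDOW)
        return;

    int    nowMs      = glutGet(GLUT_ELAPSED_TIME);
    double msPerFrame = (nowMs - windowStartMs) / (double)WINDOW;
    printf("mean frame time: %.3f ms (%.1f FPS)\n", msPerFrame, 1000.0 / msPerFrame);

    windowStartMs = nowMs;
    frameCount    = 0;
}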

 

Also: a cube is most probably not a meaningful test at all. In the real world you may run into several limits: DMA transfer, texture sampling, pixel fill rate, shader complexity, …; none of them is stressed by a cube example (assuming you don't mean a Koch cube ;)).

 

Regarding the question of the performance boost itself: OpenGL 1 is very, VERY old, and none of the (more or less) modern techniques were supported. Modern OpenGL is much better adapted to current graphics cards, so yes, such a jump is possible in principle.

Edited by haegarr

Waterlimon    4398

Try measuring the FPS across multiple frames; it might be that the timer resolution is not sufficient for the tiny duration of a single frame, so it gives results smaller than reality.

fir    460

Try measuring the FPS across multiple frames; it might be that the timer resolution is not sufficient for the tiny duration of a single frame, so it gives results smaller than reality.

 

No, the timer is good; I think it gives good results. This is what I have:

 

void IdleLoop()
{
    double timeDelta = TakeTime();   // delta since the previous call; reports ~3 microseconds
    displayScene();
}

 

The time delta (3 microseconds) is reported properly, but I am not sure whether displayScene here really does all the work of producing a whole new pixel buffer and showing it, or maybe not?

 

the display code itself is

 

 
void draw()
{
    // Legacy client-side vertex arrays: enable them and point them at the cube data.
    glEnableClientState(GL_NORMAL_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);
    glEnableClientState(GL_VERTEX_ARRAY);
    glNormalPointer(GL_FLOAT, 0, normals2);
    glColorPointer(3, GL_FLOAT, 0, colors2);
    glVertexPointer(3, GL_FLOAT, 0, vertices2);

    glPushMatrix();

    // 36 indices = 12 triangles = one cube.
    glDrawElements(GL_TRIANGLES, 36, GL_UNSIGNED_BYTE, indices);

    glPopMatrix();

    glDisableClientState(GL_VERTEX_ARRAY);
    glDisableClientState(GL_COLOR_ARRAY);
    glDisableClientState(GL_NORMAL_ARRAY);
}

////////////////////////////////////////

void display()
{
    frame++;
    glClear(GL_COLOR_BUFFER_BIT);
    glPushMatrix();
    glRotatef(frame / 1000, 1, 0, 0);   // rotate slowly about the x axis
    draw();
    glPopMatrix();
    glFlush();                          // flush the command queue; no buffer swap (single-buffered context)
}
 

 

Can it be so fast? Does it really do whole-frame generation? As a result I see a rotating rectangle (with some tearing on the surface too), but of course I cannot be sure whether this is 300 thousand frames per second or maybe just 300 or so.


You are measuring the speed at which you can submit render commands, not the speed at which your scene is drawn and displayed. Basically what you measure is the memcpy that OpenGL does on your vertex array (to a vertex buffer that you don't know about) when you call glDrawElements, plus the overhead of a dozen library calls. It's not very surprising that this is fast.

 

You are not swapping buffers, so there is really no notion of a "frame" at all. You do call glFlush, but that isn't the same thing (for the most part, glFlush is pretty useless).
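For reference, a minimal sketch of a double-buffered freeglut setup in which each frame really ends with a buffer swap; draw() stands in for the cube-drawing code posted above, and the window size and title are arbitrary:

#include <GL/freeglut.h>

void draw();   // the cube-drawing function from the post above

void display()
{
    glClear(GL_COLOR_BUFFER_BIT);
    draw();
    glutSwapBuffers();        // present the back buffer; this is what defines a "frame"
    glutPostRedisplay();      // ask for another frame immediately
}

int main(int argc, char **argv)
{
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB);   // request a double-buffered context
    glutInitWindowSize(640, 480);
    glutCreateWindow("cube test");
    glutDisplayFunc(display);
    glutMainLoop();
    return 0;
}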

fir    460

You are measuring the speed at which you can submit render commands, not the speed at which your scene is drawn and displayed. Basically what you measure is the memcpy that OpenGL does on your vertex array (to a vertex buffer that you don't know about) when you call glDrawElements, plus the overhead of a dozen library calls. It's not very surprising that this is fast.

 

You are not swapping buffers, so there is really no notion of a "frame" at all. You do call glFlush, but that isn't the same thing (for the most part, glFlush is pretty useless).

Alright, thanks (I suspected that something related to asynchronicity was going on). So how do I measure the real frame-making speed? I am using OpenGL + freeglut but don't have much experience with it yet.

 

I used glFinish and got only 9000 FPS :C

With glutSwapBuffers() I got 75 FPS (oscillating between 74.7 and 75.3), which is my screen refresh rate setting.

I use an LCD right now; does this monitor have a refresh rate like the CRT ones, and is it better to set 60 or 75?

Edited by fir


glFinish is much closer to what one would want to use since it blocks until all command execution has finished (glFlush doesn't wait for anything). It comes with a serious performance impact, however, since it causes a pipeline stall.

 

glutSwapBuffers, on the other hand, is the real, true thing. It actually swaps buffers, so there is really a notion of "frame". It also blocks, but synchronized to the actual hardware update frequency, and in a somewhat less rigid way (usually drivers will let you pre-render 2 or 3 frames or will only block at the next draw command after swap, or something else).

The reason why you only see 75 fps is that you have vertical sync enabled (in your driver settings). If you can "comfortably" get those 75 fps at all times (i.e. your frame time (worst, not average) is below 13.3 ms), it doesn't really matter how much faster you can render since that's all the monitor will display anyway. Rendering more frames than those displayed is only a waste of energy (and wearing down components due to heat development).

 

Now of course, if you only ever get at most 75 (or 60 on other monitors) frames per second displayed, it seems a bit hard to measure the actual frame time accurately. You might have a frame time of 13.3 ms or 10 ms or 8 ms and it would make no difference, since it all comes out as 75 FPS because the driver syncs to that after finishing your drawing commands.

 

glQueryCounter can be of help here. It lets you get accurate timing without having to stall as when using glFinish. So you can measure the actual time it takes to draw, regardless of how long the driver blocks thereafter to sync.
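As an illustration, a minimal sketch of what a timer-query measurement could look like, assuming an OpenGL 3.3 context (or ARB_timer_query) and an extension loader such as GLEW providing the entry points; drawScene() is a placeholder for your own rendering:

#include <stdio.h>
#include <GL/glew.h>     // provides glQueryCounter, glGetQueryObjectui64v, ...

void drawScene();        // your own drawing code

static GLuint queries[2];

void initTimerQueries()
{
    glGenQueries(2, queries);
}

// Bracket the frame's draw calls with two GPU timestamps...
void drawTimedFrame()
{
    glQueryCounter(queries[0], GL_TIMESTAMP);
    drawScene();
    glQueryCounter(queries[1], GL_TIMESTAMP);
}

// ...and read the result back later (e.g. the next frame) without stalling the pipeline.
void reportGpuTime()
{
    GLint available = 0;
    glGetQueryObjectiv(queries[1], GL_QUERY_RESULT_AVAILABLE, &available);
    if (available)
    {
        GLuint64 t0 = 0, t1 = 0;
        glGetQueryObjectui64v(queries[0], GL_QUERY_RESULT, &t0);
        glGetQueryObjectui64v(queries[1], GL_QUERY_RESULT, &t1);
        printf("GPU time for the frame: %.3f ms\n", (t1 - t0) / 1.0e6);   // nanoseconds -> ms
    }
}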

 

(Another less elegant but nevertheless effective solution would be to disable vertical sync during development.)
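For completeness, a minimal Windows-only sketch of turning vsync off from code via the WGL_EXT_swap_control extension (assuming the extension is supported and a GL context is current; driver control-panel settings can still override this):

#include <windows.h>
#include <GL/gl.h>

typedef BOOL (WINAPI *PFNWGLSWAPINTERVALEXTPROC)(int interval);

void disableVSync()
{
    // Query the extension entry point; it is only available once a context is current.
    PFNWGLSWAPINTERVALEXTPROC wglSwapIntervalEXT =
        (PFNWGLSWAPINTERVALEXTPROC)wglGetProcAddress("wglSwapIntervalEXT");
    if (wglSwapIntervalEXT)
        wglSwapIntervalEXT(0);   // 0 = don't wait for vertical refresh, 1 = vsync on
}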

Edited by samoth

fir    460

glFinish is much closer to what one would want to use since it blocks until all command execution has finished (glFlush doesn't wait for anything). It comes with a serious performance impact, however, since it causes a pipeline stall.

 

glutSwapBuffers, on the other hand, is the real, true thing. It actually swaps buffers, so there is really a notion of "frame". It also blocks, but synchronized to the actual hardware update frequency, and in a somewhat less rigid way (usually drivers will let you pre-render 2 or 3 frames or will only block at the next draw command after swap, or something else).

The reason why you only see 75 fps is that you have vertical sync enabled (in your driver settings). If you can "comfortably" get those 75 fps at all times (i.e. your frame time (worst, not average) is below 13.3 ms), it doesn't really matter how much faster you can render since that's all the monitor will display anyway. Rendering more frames than those displayed is only a waste of energy (and wearing down components due to heat development).

 

Now of course, if you only ever get at most 75 (or 60 on other monitors) frames per second displayed, it seems a bit hard to measure the actual frame time accurately. You might have a frame time of 13.3 ms or 10 ms or 8 ms and it would make no difference, since it all comes out as 75 FPS because the driver syncs to that after finishing your drawing commands.

 

glQueryCounter can be of help here. It lets you get accurate timing without having to stall as when using glFinish. So you can measure the actual time it takes to draw, regardless of how long the driver blocks thereafter to sync.

 

(Another less elegant but nevertheless effective solution would be to disable vertical sync during development.)

Alright, thanks for the explanation. I disabled vsync in the NVIDIA control panel (I don't know why the per-program setting doesn't work while the global disable does) and got about 9000 FPS with the buffer swap, close to the same as with glFinish.

Still one doubt: if glFlush does not draw all the calls, what does it do with them? Skip them or queue them?

mhagain    13430

glFlush is supposed to mean "start processing all pending GL commands now and return immediately". It doesn't wait for the commands to finish processing; it just signals to the driver that it can start processing them. There are actually a lot of implicit glFlush cases in normal code, the most obvious one being when the command buffer is full - the driver must start emptying the buffer before new commands can go in.

 

I see that Carmack has noted on his Twitter that with some drivers glFlush is a nop.  If this is the case, then calling glFlush at the end of a frame (or wherever in the frame) will have no effect and the actual flush won't occur until the command buffer fills.  Depending on how much work you do in a frame, and on how big the command buffer is (that's driver-dependent so don't ask) it means that you may get 10, 20, or even hundreds of frames worth of commands in there before anything actually happens.

 

It's easy to see how this kind of behaviour can seriously mislead you into thinking that you're running crazy-fast.  A large part of the blame here must seriously go to old GLUT tutorials that always create a single-buffered context.  That's just so unrepresentative of how things work in real programs.

Dark Helmet    173

glutSwapBuffers, on the other hand, is the real, true thing. It actually swaps buffers, so there is really a notion of "frame". It also blocks, but synchronized to the actual hardware update frequency, and in a somewhat less rigid way (usually drivers will let you pre-render 2 or 3 frames or will only block at the next draw command after swap, or something else).



When timing with just SwapBuffers, though, be careful. The problem ends up being that the driver typically queues up the request quickly on the CPU and returns immediately (i.e. the CPU does not block), after which it lets you start queuing up render commands for future frames. At some random point in the middle of queuing one of those frames, when the FIFO fills, "then" the CPU blocks, waiting on some VSYNC event in the middle of a frame. This causes really odd timing spikes, leaving you puzzled as to what's going on.

If you want reasonable full-frame timings, after SwapBuffers(), put a glFinish(), and then stop your timer.
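A minimal sketch of that pattern with freeglut and std::chrono; drawScene() again stands in for the actual rendering:

#include <chrono>
#include <stdio.h>
#include <GL/freeglut.h>

void drawScene();          // your own drawing code

void display()
{
    auto start = std::chrono::steady_clock::now();

    drawScene();           // submit the frame's draw calls
    glutSwapBuffers();     // request the swap (may return immediately)
    glFinish();            // block until the GPU has actually finished the frame

    auto end = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(end - start).count();
    printf("full frame: %.3f ms\n", ms);
}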

