About Myownbitch

  1. VAO, VBO speed

    [quote name='Aks9' timestamp='1350220125' post='4990027'] Everything said is correct. Binding the shader program and VAO should be outside of the loop. Binding the VBO is meaningless, since it is part of the VAO (as beans222 said). BUT a performance boost is not assured after all these changes, since drivers optimize most of the operations. I could firmly claim that NV drivers will not update uniforms if the values have not changed from the previous call. So dirty flags and optimized updates are nice but do not necessarily lead to higher performance. On the other hand, the number of driver calls certainly can be a bottleneck. [/quote]

    Uhm... I think I have misunderstood how things work. If you have 10 objects that require 10 different shaders, should I still bind the shader only once, outside the drawing loop?
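One way to read the advice above when objects use different shaders: a single bind outside the loop is not possible, but the draw list can be sorted by shader so that the program is rebound only when it actually changes. The following is a minimal sketch under stated assumptions: `DrawItem` and `drawSorted` are hypothetical names, not part of the engine discussed here, and the real GL calls are left as comments so the snippet compiles without an OpenGL context.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical draw record: which shader program and VAO an object uses.
struct DrawItem {
    unsigned program;  // would be a GL program handle
    unsigned vao;      // would be a GL vertex array handle
};

// Draws all items, switching the program only when it differs from the
// previously bound one. Returns the number of program switches performed.
std::size_t drawSorted(std::vector<DrawItem> items) {
    // Group items that share a program so each program is bound once.
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.program < b.program;
                     });

    std::size_t switches = 0;
    bool haveBound = false;
    unsigned bound = 0;
    for (const DrawItem& item : items) {
        if (!haveBound || item.program != bound) {
            // glUseProgram(item.program);
            bound = item.program;
            haveBound = true;
            ++switches;
        }
        // glBindVertexArray(item.vao);
        // glDrawElements(...);
        (void)item.vao;  // silences the unused-member warning in this sketch
    }
    return switches;
}
```

With 10 objects spread over 10 different shaders this still switches programs 10 times; the saving only shows up when many objects share a few shaders, as with 4000 shapes that all use the same one.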
  2. VAO, VBO speed

    [quote name='Kaptein' timestamp='1350162456' post='4989889'] 1. 4000 x (bindvertexarray(0) + gluseprogram(0)) = un-needed; remove them at the end of the function [/quote]

    Why is this un-needed? I thought you were supposed to do this to clean things up for the next object, which might not have the same setup and shaders?

    [quote name='Kaptein' timestamp='1350162456' post='4989889'] 3. Why are you creating 6 doubles at the start of the function? They're kinda slow compared to regular floats, and unless you can justify their use, replace them with floats for the entire program. Also, remove them from the function and use the position values directly, either in the form of vectors or matrices. Here's the thing: the CPU and GPU like to use their caches. I've been told that the cache is 800 times faster than reading from RAM; in that respect, creating temp variables (the doubles) is fast once they've been created. You are, however, creating 2 matrices for 4000 objects, and that's SLOW. And... on top of that, you are rendering ___4000___ objects, without justification! [/quote]

    Yeah, I agree about the floats, and I should probably not update the MVP unless something has changed. However, I tried disabling all the code up to the shader bind, and that did not improve performance. Still 20 fps, so the bottleneck is not in the matrix math. =/ 4000 objects might seem like a lot without justification... yes. What I am trying to do is convert my engine to OpenGL 3.2. Before this change the fps was 60, with drawing and updating. The drawing was not a bottleneck at all.

    [quote name='Kaptein' timestamp='1350162456' post='4989889'] The GPU likes to render lots of stuff in _one_ call, so if there's any way you can remodel a little and make rendering more composite, i.e. gather some objects near each other and make them into 1 draw call, you could perhaps go from 4000 to 1000, or even better 400 calls. There are lots of options here! Somewhat regretfully, this is the state of the modern-day rasterizer: it's powerful in parallel, but each core is a little slow, and it's very, very bandwidth limited (talk less, render more!). By talking I mean sending commands to the GPU or uploading data (if you upload lots of data in one go, that's usually very fast). [/quote]

    One possible optimization is to draw everything in one grid cell with a single draw call. With luck, the objects are scattered over multiple cells. Thanks for that.
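The "one draw call per grid cell" idea could be sketched on the CPU side as merging the shapes of a cell into one vertex array and one index array. This is only an illustration under assumptions: `Vec2`, `Batch`, and `appendFan` are hypothetical names, positions are baked into the vertices on the CPU (rotation is omitted), and each triangle fan is expanded into a plain triangle list, because several separate fans cannot be issued as one `GL_TRIANGLE_FAN` draw without something like primitive restart.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec2 { float x, y; };

// Hypothetical CPU-side batch for one grid cell: vertices already in world
// space, indices forming a GL_TRIANGLES list.
struct Batch {
    std::vector<Vec2> vertices;
    std::vector<std::uint16_t> indices;
};

// Appends one triangle-fan shape (vertex 0 is the fan centre), translated
// by (dx, dy), and expands the fan into independent triangles.
// Note: 16-bit indices limit a batch to 65536 vertices, so 4000 shapes of
// 21 vertices each would not fit into a single batch of this kind.
void appendFan(Batch& batch, const std::vector<Vec2>& fan, float dx, float dy) {
    if (fan.size() < 3) return;  // not enough vertices for a triangle
    const auto base = static_cast<std::uint16_t>(batch.vertices.size());
    for (const Vec2& v : fan) {
        batch.vertices.push_back({v.x + dx, v.y + dy});
    }
    // Fan triangle i is (centre, i, i + 1), offset by this shape's base index.
    for (std::size_t i = 1; i + 1 < fan.size(); ++i) {
        batch.indices.push_back(base);
        batch.indices.push_back(static_cast<std::uint16_t>(base + i));
        batch.indices.push_back(static_cast<std::uint16_t>(base + i + 1));
    }
}
```

The merged arrays would then be uploaded once per cell and drawn with a single `glDrawElements(GL_TRIANGLES, ...)` call. The trade-off is that moving objects force the batch to be rebuilt and re-uploaded, so whether this beats 4000 small draws depends on how often the shapes change.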
  3. VAO, VBO speed

    Hi, I'm currently having some speed problems when drawing with VAOs and VBOs. This is my drawing method for every shape:

    [source lang="cpp"]
    void Shape::draw()
    {
        // Read the current transform from the drawing state.
        double posX = drawingState_->getPositionX();
        double posY = drawingState_->getPositionY();
        double posZ = drawingState_->getPositionZ();
        double rotX = drawingState_->getRotationX();
        double rotY = drawingState_->getRotationY();
        double rotZ = drawingState_->getRotationZ();

        // Build the model matrix: translate, then rotate about X, Y and Z.
        glm::mat4 model_matrix = glm::translate(glm::mat4(1.0), glm::vec3(posX, posY, posZ));
        model_matrix = glm::rotate(model_matrix, (float)rotX, glm::vec3(1.0,0.0,0.0));
        model_matrix = glm::rotate(model_matrix, (float)rotY, glm::vec3(0.0,1.0,0.0));
        model_matrix = glm::rotate(model_matrix, (float)rotZ, glm::vec3(0.0,0.0,1.0));
        glm::mat4 tempMVP = Camera::projectionViewMatrix() * model_matrix;

        shader_.bind();
        glBindVertexArray(vaoID_);

        // glGetUniformLocation returns a GLint (-1 if the uniform is not found).
        GLint MVP_ID = glGetUniformLocation(shader_.getProgramHandle(), "MVP_matrix");
        glUniformMatrix4fv(MVP_ID, 1, GL_FALSE, glm::value_ptr(tempMVP));

        //glDrawArrays(GL_QUADS, 0, 24);
        //glBindBuffer(GL_ARRAY_BUFFER, vertices_vbo_);
        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo_);
        glDrawElements(GL_TRIANGLE_FAN, nIndices_, GL_UNSIGNED_SHORT, (void*)0);

        glBindVertexArray(0); // Unbind our Vertex Array Object
        shader_.unbind();
    }
    [/source]

    Every shape is just a normal circle with 21 vertices. If I remove the drawing method above and only update the behavior of the shapes, I get 4000 shapes at 60 fps (the max fps). But if I both update and draw with the method above, I get 20 fps. Is drawing with a VAO supposed to be that slow? I'm stuck. Any help appreciated. /F
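The draw method above looks up the `MVP_matrix` uniform by name on every call, which means 4000 string lookups per frame. Whether that is the actual bottleneck here is not established, but caching the location is a cheap thing to try. The sketch below is an assumption-laden illustration: `UniformCache` is a hypothetical helper, and its `Lookup` callback stands in for a call such as `glGetUniformLocation(program, name.c_str())` so that the snippet compiles without an OpenGL context.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Caches uniform locations per name so the string lookup runs once per
// uniform instead of once per draw call.
class UniformCache {
public:
    // Stand-in for the real query, e.g. glGetUniformLocation.
    using Lookup = std::function<int(const std::string&)>;

    explicit UniformCache(Lookup lookup) : lookup_(std::move(lookup)) {}

    // Returns the cached location, querying the driver only on a miss.
    int location(const std::string& name) {
        auto it = cache_.find(name);
        if (it != cache_.end()) return it->second;
        const int loc = lookup_(name);  // only reached on a cache miss
        cache_.emplace(name, loc);      // -1 (not found) is cached as well
        return loc;
    }

private:
    Lookup lookup_;
    std::unordered_map<std::string, int> cache_;
};
```

An even simpler variant would be to query the location once after linking the program and store it in a member next to the program handle.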
  4. Debug faster than release

    Quote:Original post by KulSeran
    Are you using threading? The timing differences that come from release code running faster than debug code can bring out some really strange bugs if you didn't write proper threading code.

    Yes, I'm using pthreads to separate the engine into several concurrent components/threads. That is interesting, and I will look into it (even though I haven't changed anything in the threading)...
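To illustrate the kind of timing-sensitive bug meant here: if several threads update shared data without synchronization, the result depends on how the threads interleave, and that can differ between debug and release builds. The sketch below is a generic example, not code from the engine in question. It uses `std::thread` and `std::atomic` from the C++ standard library in place of raw pthreads, and `countWithThreads` is a hypothetical name.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Increments a shared counter from several threads. std::atomic makes each
// increment indivisible; with a plain long the increments would race and
// the final value would depend on timing.
long countWithThreads(int threads, int incrementsPerThread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(threads));
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, incrementsPerThread] {
            for (int i = 0; i < incrementsPerThread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // Wait for every worker before reading the final value.
    for (std::thread& w : workers) w.join();
    return counter.load();
}
```

With GCC on Linux this is typically built with `-pthread`. A tool such as ThreadSanitizer (`-fsanitize=thread`) can help find unsynchronized accesses in existing code.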
  5. Debug faster than release

    Quote:Original post by Buckeye
    I'm not a Linux or gcc person, so I don't know if it's a possibility, but are you, perhaps, using the debug library for the release exe?

    I would be very happy if it were that simple. I haven't changed anything about how I build or which libraries I use...
  6. Hey, I have usually found answers to my questions on Google, but this problem is so weird that I need some real professional help! I have been developing a concurrent simulation/game engine in C++ on Linux, compiling with gcc. The performance was quite good: ~6000 Boids at 50 fps. A week ago I started a larger renovation, and now the performance is 81 Boids at 30 fps... and the debug build is actually faster than release. My bigger headache is that debug is faster than release. Does anyone have any clue about this?