uselessChiP

OpenGL: My OpenGL program is really slow, what's the problem?


uselessChiP    165

Hi, I'm a beginner with OpenGL. I'm trying to load a few 3D models into my program and use some shaders, but everything is already too slow.

I'm loading 2 models; one of them has 39282 vertices and 69451 faces. I know that's a lot, but I thought that having only 2 models would not be a problem.

Loading the models takes a long time, and then I get an average of 13 fps in the scene (I know fps is not a very accurate measure, but I think it is enough for now).

This is my scene:

[Image: screenshot of the scene, OuTYyO5.jpg]

 

When I load the models, I use assimp to copy the vertices, normals, UVs and faces into some std::vectors (members of my model class). I create the VBOs and VAO for them and store the model information in another vector inside my scene object. Then, when I draw the scene, I loop over all the models I have and use this code for drawing:

(Note: this is the code in my renderer class, so the upper section has the VBO and VAO setup I talked about earlier, and the lower section has the drawing code.)

#include "renderer.h"
#include <math.h>

Renderer::Renderer(Mesh* mesh, Material* material) :
	vao(0),
	vertexBuffer(0),
	uvBuffer(0),
	_mesh(mesh),
	_material(material)
{
	glGenVertexArrays(1, &vao);
	glBindVertexArray(vao);

	//indexBuffer
	glGenBuffers(1, &indexBuffer);
	glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer);
	glBufferData(GL_ELEMENT_ARRAY_BUFFER, mesh->faces().size() * sizeof(unsigned int), &mesh->faces()[0], GL_STATIC_DRAW);

	//vertexbuffer
	glGenBuffers(1, &vertexBuffer);
	glBindBuffer(GL_ARRAY_BUFFER, vertexBuffer);
	glBufferData(GL_ARRAY_BUFFER, mesh->numVertices() * sizeof(glm::vec3), &mesh->vertices()[0], GL_STATIC_DRAW);
	
	glEnableVertexAttribArray(0);

	glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, 0);
	/*
	//uvbuffer
	glGenBuffers(1, &uvBuffer);
	glBindBuffer(GL_ARRAY_BUFFER, uvBuffer);
	glBufferData(GL_ARRAY_BUFFER, mesh->uvs().size() * sizeof(glm::vec2), &mesh->uvs()[0], GL_STATIC_DRAW);
	
	glEnableVertexAttribArray(1);
	glVertexAttribPointer(1, 2, GL_FLOAT, GL_TRUE, 0, 0);
	*/
	//normalbuffer

	glGenBuffers(1, &normalBuffer);
	glBindBuffer(GL_ARRAY_BUFFER, normalBuffer);
	glBufferData(GL_ARRAY_BUFFER, mesh->normals().size() * sizeof(glm::vec3), &mesh->normals()[0], GL_STATIC_DRAW);

	glEnableVertexAttribArray(2);
	glVertexAttribPointer(2, 3, GL_FLOAT, GL_TRUE, 0, 0);

	glBindVertexArray(0);
}



void Renderer::draw(Camera* camera,  glm::mat4 model, Light* light) {
	_material->shader()->use();
	
	_material->shader()->setUniform("camera", camera->matrix());
	_material->shader()->setUniform("model", model);
	//_material->shader()->setUniform("view", camera->view());
	
//	_material->shader()->setUniform("tex", 0); //to use GL_TEXTURE0
	_material->shader()->setUniform("roughness", _material->roughness());
	float Ks = ((_material->roughness()+8)/(8*3.141592));
	_material->shader()->setUniform("Kd", _material->color() / 3.141592f);
	_material->shader()->setUniform("Ks", _material->specularColor() * Ks);

	_material->shader()->setUniform("lightPos", light->_position);
	_material->shader()->setUniform("intensity", light->_intensity * light->_color);
	//_material->shader()->setUniform("light.attenuation", light->_attenuation);
	//_material->shader()->setUniform("light.ambientCoefficient", light->_ambientCoefficient);
	
	_material->shader()->setUniform("cameraPos", camera->position());
	/*
	
	glActiveTexture(GL_TEXTURE0);
	glBindTexture(GL_TEXTURE_2D, _material->texture()->getID());
	*/
	//Bind vao and draw
	glBindVertexArray(vao);
	//glDrawArrays(GL_TRIANGLES, 0, _mesh->numVertices());
	glDrawElements(GL_TRIANGLES, _mesh->faces().size(), GL_UNSIGNED_INT, 0);

	glBindVertexArray(0);
	glBindTexture(GL_TEXTURE_2D, 0);

	_material->shader()->stopUsing();
}

(Some of the code is commented out because I'm running some tests.)

 

Is my rendering code bad? Can you tell me what the problem is and why everything is so slow?


Thanks in advance for the help

 

 

NumberXaero    2624

Going only on what's been posted, it's difficult to tell (what hardware is it running on?), but there are two areas I would look at. 1. Try using unsigned short for the index type if possible. 2. Your shader uniforms are being set by string, and I'm not sure how that's done. Is it doing a lookup every frame? Are you using the string to get the uniform location every frame? If so, try getting the uniform location once, when the shader loads and links.

uselessChiP    165

I think the hardware is fast enough: I have an AMD HD 7950 GPU and an Intel i5 3570K CPU, both overclocked.

1) Thanks for the advice, I will use unsigned short for the indices. In this test my model has more faces than the maximum unsigned short supported, but I think I will have lighter models, so it should work.

2) The drawing method is called once per frame for each model in the scene.

shader->use() is as follows:


void Program::use() const {
	glUseProgram(_ID);
}

 

 

where _ID is a GLuint representing the compiled and linked shader program.

 

This is how setUniform() works:

GLint Program::uniform(const GLchar* uniformName) const {
	if(!uniformName)
		throw std::runtime_error("uniformName is NULL");

	GLint uniform = glGetUniformLocation(_ID, uniformName);
	if(uniform == -1)
		throw std::runtime_error(std::string("Program uniform not found: ") + uniformName);

	return uniform;
}

void Program::setUniform(const GLchar* uniformName, GLint v0) {
	assert(isInUse());
	glUniform1i(uniform(uniformName), v0);
}

void Program::setUniform(const GLchar* uniformName, const glm::mat4& m) {
	assert(isInUse());
	glUniformMatrix4fv(uniform(uniformName), 1, GL_FALSE, glm::value_ptr(m));
}

void Program::setUniform(const GLchar* uniformName, const glm::vec3& v) {
	assert(isInUse());
	glUniform3fv(uniform(uniformName), 1, glm::value_ptr(v));
}

void Program::setUniform(const GLchar* uniformName, GLfloat v0) {
	assert(isInUse());
	glUniform1f(uniform(uniformName), v0);
}

 

and these are the other functions used by those methods:

 

bool Program::isInUse() const {
	GLint currentProgram = 0;
	glGetIntegerv(GL_CURRENT_PROGRAM, &currentProgram);
	return (currentProgram == (GLint)_ID);
}

void Program::stopUsing() const {
	assert(isInUse());
	glUseProgram(0);
}

 

So yes, for every model, every frame, I set and unset the program in use and set the uniform variables. If this is the problem, where is the best place to set the uniforms?

NumberXaero    2624

There have been times I've found glGetUniformLocation being called too often at run time to be a problem. After building the shader, get the uniform location and save it, then look it up locally at run time and bypass calling glGetUniformLocation. Unless the shader changes dynamically, the location shouldn't change.
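
To make the caching idea concrete, here is a minimal sketch, not the poster's actual Program class. UniformCache and its Lookup callback are hypothetical names; the callback stands in for a call such as glGetUniformLocation(_ID, name) so that the sketch compiles without an OpenGL context.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Caches uniform locations by name so the driver is queried at most once per name.
// Lookup is an assumption of this sketch: it stands in for something like
// glGetUniformLocation(programId, name.c_str()).
class UniformCache {
public:
    using Lookup = std::function<int(const std::string&)>;

    explicit UniformCache(Lookup lookup) : lookup_(std::move(lookup)) {}

    // Returns the cached location, calling the lookup only on the first request.
    // A result of -1 is cached as well, so a missing name is not re-queried.
    int location(const std::string& name) {
        auto it = cache_.find(name);
        if (it != cache_.end())
            return it->second;
        int loc = lookup_(name);
        cache_.emplace(name, loc);
        return loc;
    }

    // Number of distinct names that have been looked up so far.
    std::size_t size() const { return cache_.size(); }

private:
    Lookup lookup_;
    std::unordered_map<std::string, int> cache_;
};
```

In a real program the callback would wrap glGetUniformLocation and the cache would live next to the program object, so each name reaches the driver once instead of once per frame per model.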

 

Also, you may not always want to treat -1 locations as errors; a -1 uniform location can mean a few things. If the shader doesn't make use of a uniform, because it was optimized out or its use was commented out while it is still declared, the shader may run fine even with that uniform having a -1 location. It doesn't always mean the uniform name had a typo or wasn't found.
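
One way to tolerate inactive uniforms instead of throwing, sketched under assumptions: UniformReport and checkedLocation are hypothetical names, and the location argument is whatever glGetUniformLocation returned. OpenGL silently ignores glUniform* calls made with location -1, so passing the value through unchanged is safe.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Collects the names of uniforms whose location came back as -1,
// so they can be logged once after linking instead of aborting the program.
struct UniformReport {
    std::vector<std::string> inactive;
};

// Records the name if the location is -1 and returns the location unchanged.
// The location parameter is assumed to come from glGetUniformLocation.
int checkedLocation(const std::string& name, int location, UniformReport& report) {
    if (location == -1)
        report.inactive.push_back(name);
    return location;
}
```

The report can be printed after the shader is built, which still surfaces typos without turning an optimized-out uniform into a fatal error.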

 

my model has more faces than the maximum unsigned short supported

 

The index buffer indexes vertices, not faces. Many faces may share the same vertex, which reduces the vertex count; that is the whole purpose of index buffers. So if the numbers from your first post are correct (39282 vertices), the vertex count easily fits in an unsigned short, whose maximum value is 65535.
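
A small sketch of checking whether a mesh's indices actually fit in 16 bits before switching types. narrowIndices is a hypothetical helper, not part of the posted code; it assumes the indices are stored as 32-bit unsigned integers.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Converts 32-bit indices to 16-bit indices.
// Returns true and fills `out` if every index is <= 65535;
// returns false and leaves `out` empty if any index is too large.
bool narrowIndices(const std::vector<std::uint32_t>& in,
                   std::vector<std::uint16_t>& out) {
    out.clear();
    out.reserve(in.size());
    for (std::uint32_t index : in) {
        if (index > std::numeric_limits<std::uint16_t>::max()) {
            out.clear();
            return false;
        }
        out.push_back(static_cast<std::uint16_t>(index));
    }
    return true;
}
```

What matters is the largest vertex index, not the number of faces: a mesh with 39282 vertices has indices up to 39281, which passes this check even though it has 69451 faces.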

 

 

In general, don't call the glGet*() family of commands at run time, as they can stall rendering to retrieve the state being asked for.

So assert(isInUse()); calls glGetIntegerv(), which is probably the problem; comment all of those out.

uselessChiP    165

Hi, sorry for the late reply.

So I tried changing the code so that glGetUniformLocation is called only once (I was confused about glGetUniformLocation; I did not realize that it actually gives me the uniform's location, so thanks, it's clearer now), in the constructor of my Program class, like this:

Program::Program() : 
	_ID(0)
{
	_ID = LoadShaders("shaders/Blinn-Phong.vs", "shaders/Blinn-Phong.fs");
	_cam = uniform("camera");
	_mod = uniform("model");
	_rough = uniform("roughness");
	_kd = uniform("Kd");
	_ks = uniform("Ks");
	_lightPos = uniform("lightPos");
	_int = uniform("intensity");
	_camPos = uniform("cameraPos");
}

GLint Program::uniform(const GLchar* uniformName) const {
	if(!uniformName)
		throw std::runtime_error("uniformName is NULL");

	GLint uniform = glGetUniformLocation(_ID, uniformName);
	if(uniform == -1)
		throw std::runtime_error(std::string("Program uniform not found: ") + uniformName);

	return uniform;
}

 

(The uniform == -1 check is still there, but I'll change it.)

 

and then I set the uniforms without getting the location again:

GLint Program::uniform(ShProp type) const {
	switch(type) {
	case CAMERA:
		return _cam;
	case MODEL:
		return _mod;
	case ROUGHNESS:
		return _rough;
	case KD:
		return _kd;
	case KS:
		return _ks;
	case LIGHTPOS:
		return _lightPos;
	case INTENSITY:
		return _int;
	case CAMERAPOS:
		return _camPos;
	default: return -1;
	}
}

void Program::setUniform(ShProp type, GLint v0) {
	glUniform1i(uniform(type), v0);
}

void Program::setUniform(ShProp type, const glm::mat4& m) {
	glUniformMatrix4fv(uniform(type), 1, GL_FALSE, glm::value_ptr(m));
}

void Program::setUniform(ShProp type, const glm::vec3& v) {
	glUniform3fv(uniform(type), 1, glm::value_ptr(v));
}

void Program::setUniform(ShProp type, GLfloat v0) {
	glUniform1f(uniform(type), v0);
}

 

(For now the code is like this because I'm testing.)

 

I also removed the asserts with the glGetIntegerv call, but I still get 13 fps no matter where I'm looking; it never goes down or up.

So I have the same problem.

As for the index buffer part, I tried using unsigned shorts, but after that my rabbit mesh was all messed up, so I changed it back to unsigned int.

NumberXaero    2624

If it's not caused by the state queries from glGet*() calls, then it's probably related to the draw calls, unless it's something somewhere else.

 

When you changed to unsigned short, did you also make sure to change every GL_UNSIGNED_INT to GL_UNSIGNED_SHORT and every sizeof(unsigned int) to sizeof(unsigned short) across all the buffer setup calls and draw calls?
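
A sketch of one way to keep those three values in sync, so that changing the index type in one place changes the buffer size and the draw-call type together. IndexTraits, IndexBufferDesc and describeIndices are hypothetical names. The two constants repeat the numeric values of GL_UNSIGNED_SHORT and GL_UNSIGNED_INT only so that the sketch compiles without GL headers; real code would use the GL enums directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for GL_UNSIGNED_SHORT and GL_UNSIGNED_INT (same numeric values).
constexpr unsigned int kGlUnsignedShort = 0x1403;
constexpr unsigned int kGlUnsignedInt   = 0x1405;

// Maps a C++ index type to the matching type enum for glDrawElements.
template <typename Index> struct IndexTraits;
template <> struct IndexTraits<std::uint16_t> {
    static constexpr unsigned int glType = kGlUnsignedShort;
};
template <> struct IndexTraits<std::uint32_t> {
    static constexpr unsigned int glType = kGlUnsignedInt;
};

// The values that must agree between buffer setup and the draw call.
struct IndexBufferDesc {
    std::size_t byteSize;  // size argument for glBufferData
    std::size_t count;     // count argument for glDrawElements
    unsigned int glType;   // type argument for glDrawElements
};

// Derives all three values from the element type of the index vector.
template <typename Index>
IndexBufferDesc describeIndices(const std::vector<Index>& indices) {
    return { indices.size() * sizeof(Index), indices.size(),
             IndexTraits<Index>::glType };
}
```

If the index data is uploaded as 16-bit values while the draw call still says GL_UNSIGNED_INT (or the other way round), the GPU reads the buffer with the wrong stride, which is one plausible cause of a scrambled mesh.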

 

You said in your first post that the fps are not accurate. Is the scene sluggish? Can you profile where the time is being consumed? If not, make sure the fps are correct so you have a solid value to go by.
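
As a starting point for getting a solid number, here is a minimal sketch of a frame-time accumulator. FrameStats is a hypothetical class; it assumes the application already measures each frame's duration in seconds (for example from a timer such as glfwGetTime()) and passes it in. Frame time in milliseconds is often easier to reason about than fps: 13 fps corresponds to roughly 77 ms per frame.

```cpp
#include <cassert>
#include <cstddef>

// Accumulates per-frame durations (in seconds) and reports simple statistics.
class FrameStats {
public:
    // Records the duration of one frame.
    void addFrame(double seconds) {
        total_ += seconds;
        ++frames_;
        if (seconds > worst_)
            worst_ = seconds;
    }

    // Mean frame time in milliseconds, or 0 if no frames were recorded.
    double averageMs() const {
        return frames_ ? (total_ / static_cast<double>(frames_)) * 1000.0 : 0.0;
    }

    // Mean frames per second, or 0 if no time has elapsed.
    double averageFps() const {
        return total_ > 0.0 ? static_cast<double>(frames_) / total_ : 0.0;
    }

    // Longest single frame in milliseconds.
    double worstMs() const { return worst_ * 1000.0; }

private:
    double total_ = 0.0;
    double worst_ = 0.0;
    std::size_t frames_ = 0;
};
```

Timing the update, draw and buffer-swap steps separately with the same class would show which of them accounts for the time, which is a rough substitute for a profiler.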

 

You might try bypassing VAOs and just binding buffers and enabling arrays at draw time. I once had a problem with vertex array objects; it was related to the way the GL functions were being loaded by GLEW, and it required glewExperimental to be set to true before calling glewInit(). I'm not sure if it was on AMD or NVIDIA. I'm not sure how you're loading GL calls, but maybe it's relevant. Valve released a doc, "Porting Source to Linux", where they said VAOs were slower than glVertexAttribPointer on all implementations.

 

Depending on your targeted GL version, you might want to look at the GL_ARB_vertex_attrib_binding extension.

dpadam450    2357

Comment out the draw code except for swapBuffers() or whatever you call when you're done drawing. It sounds like you have a Sleep() call in your application, or you are doing something very stupid each frame, like re-loading your models.

glGet commands are not even close to that slow, and you only have a few models. You should be fine. Is the framerate actually choppy, or are you calculating it yourself? (It could be wrong; try FRAPS.)

uselessChiP    165

@NumberXaero

If by draw calls you mean glDrawElements, I call it like this:

This is the main loop:

double lastTime = glfwGetTime();
	do {

		glClearColor(0.0f, 0.4f, 0.0f,1.0f);
		glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT);

		double thisTime = glfwGetTime();
		float secondElapsed = thisTime - lastTime;
		lastTime = thisTime;

		update(secondElapsed);

		draw();

		glfwSwapBuffers();

	} while(!glfwGetKey(GLFW_KEY_ESC) && glfwGetWindowParam(GLFW_OPENED) && !glfwGetKey('P'));

 

This is the draw() function (it draws every object in the scene, 2 in this case):

void draw() {
	scene->draw();
}

 

and this is the scene->draw() implementation:

void Scene::draw() {
	for(unsigned int i = 0; i < _model.size(); i++) {
		_model[i]->renderer()->draw(_camera[_activeCamera], _model[i]->transform(),  _light[0]);
	}
}

 

renderer()->draw() is the one I posted above:

void Renderer::draw(Camera* camera,  glm::mat4 model, Light* light) {
	_material->shader()->use();

	_material->shader()->setUniform(CAMERA, camera->matrix());
	_material->shader()->setUniform(MODEL, model);
	
//	_material->shader()->setUniform("tex", 0); //to use GL_TEXTURE0

	_material->shader()->setUniform(ROUGHNESS, _material->roughness());
	float Ks = ((_material->roughness()+8)/(8*3.141592));
	_material->shader()->setUniform(KD, _material->color() / 3.141592f);
	_material->shader()->setUniform(KS, _material->specularColor() * Ks);
	
	_material->shader()->setUniform(LIGHTPOS, light->_position);
	_material->shader()->setUniform(INTENSITY, light->_intensity * light->_color);
	//_material->shader()->setUniform("light.attenuation", light->_attenuation);
	//_material->shader()->setUniform("light.ambientCoefficient", light->_ambientCoefficient);
	
	_material->shader()->setUniform(CAMERAPOS, camera->position());
	/*
	
	glActiveTexture(GL_TEXTURE0);
	glBindTexture(GL_TEXTURE_2D, _material->texture()->getID());
	*/
	//Bind vao and draw
	glBindVertexArray(vao);
	//vertexbuffer
	//glDrawArrays(GL_TRIANGLES, 0, _mesh->numVertices());
	glDrawElements(GL_TRIANGLES, _mesh->faces().size(), GL_UNSIGNED_INT, 0);

	//glBindVertexArray(0);
	//glBindTexture(GL_TEXTURE_2D, 0);

	_material->shader()->stopUsing();
}

 

 

As for fps, I'm using FRAPS to measure it, so I think it is accurate, and everything also feels slow when moving around the scene.

As for the last part (the one about bypassing VAOs), I'm not sure I understand what you mean. Also, my target platform for now is Windows, and I'm already using glewExperimental.

 

 

 

@dpadam450

I don't think I have a sleep call. Commenting out the draw code makes the program run fast, but this also happens when I'm not loading the rabbit. As for loading the models every frame, I think I'm loading them only once: I load the vertices, normals and UV coordinates once into some vectors, then I upload the data once to some vertex buffers. In my drawing code I'm only setting the uniforms, binding the VAO and drawing.

Dave Hunt    4872

You are passing the glm::mat4 parameter to your draw method by value. I think that is a bad idea. You should pass it by const reference; otherwise the mat4 object gets copied on every call. Whether that's a major contributor to your performance issues or not, only profiling would say for sure, but it's a good idea in any case.
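
A small sketch of the difference, using a hypothetical Mat4 struct as a stand-in for glm::mat4 so that the copies can be counted. drawByValue and drawByRef are illustrative names, not functions from the posted code. It only shows that the copy is avoided; it does not measure how much time the copy costs.

```cpp
#include <cassert>

// Stand-in for glm::mat4 (16 floats) that counts how often it is copied.
struct Mat4 {
    float m[16] = {};
    static int copies;

    Mat4() = default;
    Mat4(const Mat4& other) {
        for (int i = 0; i < 16; ++i)
            m[i] = other.m[i];
        ++copies;
    }
};
int Mat4::copies = 0;

// Takes the matrix by value: the caller's matrix is copied on every call.
float drawByValue(Mat4 model) { return model.m[0]; }

// Takes the matrix by const reference: no copy is made.
float drawByRef(const Mat4& model) { return model.m[0]; }
```

Applied to the posted code, the signature would become draw(Camera* camera, const glm::mat4& model, Light* light), with no change needed at the call site.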
