# OpenGL 3.0+ And VAOs


I'm finally learning OpenGL above 2.1, which will require some extra driver knowledge on Linux, and a Hail Mary from Apple with regard to OpenGL 4.0 and above. What I'm wondering is: what are the important feature differences between OpenGL 3.x and OpenGL 4.x? My guess is that OpenGL 4.2 (I think) provides geometry shaders, which allow for hardware batching. In other words, if I had a model of a character, I could render dozens of instances of that character in 1 draw call per mesh in that model. Another is compute shaders in OpenGL 4.3, which are a nice replacement for OpenCL, more of an equivalent to DirectX 11, and possibly VERY useful for processing audio samples for interesting DSP effects that'd typically be handled by the motherboard's audio hardware. There are also 3D textures, and better techniques for rendering volumetric clouds, from what I've heard. Are there any other interesting features to look out for while I learn about OpenGL?

Now, how do vertex array objects (VAOs) work, exactly? From what I've read so far, they preserve vertex state, and by vertex state, I think it means the state of which vertex arrays are enabled. For example, say I have a model that is composed of 5 meshes, and the vertex format for all meshes is the same: position, texture coordinate and normal. When setting up my vertex array, I'd generate a VAO, bind it, enable the first 3 vertex attribute arrays, then unbind. Now, when I want to draw the model, I'd just bind that VAO again, bind my VBOs containing the vertex data, and call glDrawElements(). I no longer need to call glEnableVertexAttribArray() or glDisableVertexAttribArray() whenever I draw something because the VAO I've just bound preserves which vertex arrays to enable/disable, effectively batching, or rather, caching those calls into a single gl* call.

Then, there are VBOs... VBOs are completely separate from VAOs. A VBO must be generated per vertex attribute, whether they're separate arrays, or interleaved via structures, blobs, etc. Then, I may have an IBO (index buffer object) if my vertices are indexed, but again, that has nothing to do with VAOs. VAOs only cache which vertex attribute arrays are enabled. Is that correct?

NOTE: If this is correct, would it make sense to not store VAOs on a per-model basis, but on a per-graphics-context basis instead? If I have 5 different models that all happen to have the same number of vertex attribute arrays enabled, then I'd create 1 VAO that enables the first 3 vertex attribute arrays, bind it once, render all instances of all 5 models, then bind another VAO that uses a different number of arrays.

EDIT: I think I just realized something. So, I'd generate a new VAO, then bind it to configure it. At this point, I'd enable all the attribute arrays needed, and then generate, bind and fill my VBOs/IBOs. Then, I'd also set up glVertexAttribPointer() per attribute to specify the starting address for each attribute in the VBO, or VBOs if I'm going the array-per-attribute route. Finally, I'd unbind for safety. Then, when I want to draw something, it's a matter of setting the correct shader, setting the uniforms (probably with UBOs, but I haven't read that far yet), binding the VAO, and then drawing with glDrawArrays() or glDrawElements(). So, VAOs would greatly reduce the number of gl* calls by caching these commands in a VAO, which serves as a kind of mini command buffer that can be modified or invoked on the fly. If this is correct, does binding a VAO introduce any kind of scope for VBO bindings? For example, if I bound a VBO while a VAO was bound, and then bound VAO zero, would the currently bound VBO revert to whatever VBO was bound before I entered the VAO's scope? Does all of this sound about right?

Edited by Vincent_M

##### Share on other sites

You don't have to use UBOs, and in many cases you don't want to. I think the support is still very shaky, but I actually don't know the specifics. I have limited environments to test on.

Here is my VAO implementation:

https://github.com/fwsGonzo/library/blob/master/include/library/opengl/vao.hpp

https://github.com/fwsGonzo/library/blob/master/library/opengl/vao.cpp

Just like you said:

1. Generate the VAO.
2. Bind the VAO. Note that you don't want to set up attribs at this point, because you have no VBO bound. An attrib pointer is tied to the VBO that is bound when you specify it, which means you can in theory have several VBOs with vertex data.
3. Generate the VBO & IBO.
4. Bind the VBO.
5. Enable the attribs and set their pointers (use offsetof(struct, x) for the offsets).
6. (Potentially bind the IBO & upload data.)

Done. No need to unbind anything.

If you are using wrappers for VAOs, FBOs, textures and shaders, these wrappers should manage this for you.

Note that my implementation isn't 100% perfect. I even spotted grey areas just skimming through it right now, e.g. indexes() doesn't do a bind() to guarantee that the IBO bind works correctly. But it will hopefully give you an idea of how it all works.
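The setup order above can be illustrated with a toy model of the state involved. This is not real OpenGL and not the linked implementation: `ToyContext`, `VaoState` and every method name are hypothetical, chosen only to mirror which GL call touches which piece of state. It shows that an attribute pointer records whichever VBO is bound at that moment, that the index buffer binding is stored in the VAO, and that the plain VBO binding is context state that a VAO switch does not touch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using BufferId = std::uint32_t;

// Per-attribute state stored in a VAO.
struct AttribState {
    bool        enabled    = false;
    BufferId    buffer     = 0;  // VBO captured when the pointer was specified
    int         components = 0;
    std::size_t stride     = 0;
    std::size_t offset     = 0;
};

struct VaoState {
    std::array<AttribState, 16> attribs{};
    BufferId elementBuffer = 0;  // the index buffer binding is part of the VAO
};

// Hypothetical stand-in for a GL context; slot 0 stands in for "no user VAO bound".
struct ToyContext {
    std::array<VaoState, 8> vaos{};
    std::size_t boundVao    = 0;
    BufferId    arrayBuffer = 0;  // like GL_ARRAY_BUFFER: context state, not VAO state

    void bindVertexArray(std::size_t vao) { vaos.at(vao); boundVao = vao; }
    void bindArrayBuffer(BufferId b)      { arrayBuffer = b; }
    void bindElementBuffer(BufferId b)    { vaos.at(boundVao).elementBuffer = b; }
    void enableAttrib(std::size_t i)      { vaos.at(boundVao).attribs.at(i).enabled = true; }

    // Mirrors what glVertexAttribPointer does: it snapshots the VBO bound right now.
    void attribPointer(std::size_t i, int components,
                       std::size_t stride, std::size_t offset) {
        AttribState& a = vaos.at(boundVao).attribs.at(i);
        a.buffer     = arrayBuffer;
        a.components = components;
        a.stride     = stride;
        a.offset     = offset;
    }
};
```

In this model, binding another VBO after setup leaves the attribute's recorded buffer untouched, and switching VAOs never restores an earlier VBO binding, which is why there is no "scope" to unwind.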

When you upload data to a VBO you have a choice between GL_STATIC_DRAW and GL_STREAM_DRAW, the former for when your mesh is static and the latter for when you are re-uploading the data frequently. There are other flags, but afaik the drivers don't care.

So, with all that said, here are some tips:

1. You never really disable an attrib array, as you would just instead use a shader that doesn't utilize the specific attribute.

2. You should avoid unbinding anything, unless you absolutely have to.

3. Don't fall into the immediate mode trap for screenspace shaders, as suddenly glEnable(old_shit) matters, like GL_TEXTURE_2D.

I avoided this trap myself by having a very useful createScreenspace() function in my VAO implementation. :) Laziness > all.

Yes, when you unbind a VAO, you are suddenly back in old/VBO territory with gl*Pointer stuff, I guess. If you are in compatibility mode, like most people are.

Edited by Kaptein

##### Share on other sites

> What I'm wondering is: what are the important feature differences between OpenGL 3.x and OpenGL 4.x?

The main differences between the latest OpenGL 3 and 4 versions off the top of my head are...

• Support for 64-bit floats (doubles) in shaders
• Separable shader objects - you essentially mix and match shaders in different parts of the graphics pipeline, similar to D3D.
• Direct state access - no longer have to bind-to-edit
• Indirect rendering
• Immutable buffers and textures

Direct state access is part of OpenGL 4.5, which came out just a few weeks ago, so unless you have a relatively new Nvidia card, it won't be available to you. For rendering multiple instances of the same model, you would use regular instancing (e.g. glDrawArraysInstanced, glDrawElementsInstanced, etc.). With an array of model matrices as uniforms, and the gl_InstanceID variable in your vertex shader, you can then index into the array of matrices to position each instance differently.
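To make the instancing idea concrete, here is a small CPU-side emulation of what an instanced draw does. It is only a sketch: `emulateInstancedDraw` and `Vec2` are hypothetical names, and a per-instance 2D offset stands in for the uniform array of model matrices to keep the example short. The loop variable `instanceId` plays the role of `gl_InstanceID`.

```cpp
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// The same mesh is processed once per instance; the "vertex shader" body
// looks up per-instance data using the instance index.
std::vector<Vec2> emulateInstancedDraw(const std::vector<Vec2>& meshVertices,
                                       const std::vector<Vec2>& perInstanceOffset) {
    std::vector<Vec2> out;
    out.reserve(meshVertices.size() * perInstanceOffset.size());
    for (std::size_t instanceId = 0; instanceId < perInstanceOffset.size(); ++instanceId) {
        const Vec2& offset = perInstanceOffset[instanceId];  // like matrices[gl_InstanceID]
        for (const Vec2& v : meshVertices) {
            out.push_back(Vec2{v.x + offset.x, v.y + offset.y});
        }
    }
    return out;
}
```

On the GPU the outer loop is not written by you; a single glDrawArraysInstanced or glDrawElementsInstanced call supplies the instance count.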

> Now, how do vertex array objects (VAOs) work, exactly? [...] Then, there are VBOs..

Think of VAOs as containers for vertex attributes (and I guess for convenience, an index buffer). Each vertex attribute then describes where to fetch its data from, how much data to read each time, how many bytes to skip between each element, and so forth. And you can have multiple vertex attributes, like your position, texture coordinates, or even arbitrary data that is needed per-vertex (or per-instance*). So the VAO contains all of that information. Every time you bind the VAO, all this information is used in the subsequent draw calls until you bind a different VAO. I have found some drivers are a bit buggy in that they don't keep the index buffer, so you might need to rebind your index buffer every time you bind your VAO as well...

* Vertex attributes can be per-instance by using glVertexAttribDivisor, which tells GL to advance the attribute read-pointer every N instances.
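The divisor rule can be written down as a tiny function. This is a sketch of the indexing rule only, with a hypothetical function name, and it ignores any base-instance offset.

```cpp
#include <cstdint>

// Which element of an attribute array is read for a given vertex and instance.
// divisor == 0: the attribute advances per vertex (the default).
// divisor == N > 0: the attribute advances once every N instances.
std::uint32_t attribElementIndex(std::uint32_t vertexIndex,
                                 std::uint32_t instanceIndex,
                                 std::uint32_t divisor) {
    if (divisor == 0) {
        return vertexIndex;
    }
    return instanceIndex / divisor;  // integer division rounds down
}
```

With a divisor of 1, every instance gets its own element, which is the usual way to feed a per-instance model matrix or color through vertex attributes instead of a uniform array.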

Where do VBOs come into it, you ask? Each vertex attribute has a "data source" which is your VBO, so you can use a single vertex buffer for all your attributes, or use a different vertex buffer for each attribute, or a mixture.

Perhaps beyond the scope of what you need or intend to do (but I'll add it anyway because I think it's something to consider), is a different way of thinking about VAOs which I came across a few months ago [1]. If instead of creating a VAO per-object, you create a VAO per-vertex format, you can reduce the number of glBindVertexArray calls (which in the driver would reduce the number of buffer changes). In order to do this, you would need to create a very large vertex buffer (a few tens of megabytes) and store all your models that share the same vertex format in this vertex buffer. Each model (or model sub-mesh) would then also need a base vertex, which is the "offset" in the VBO to start rendering from. So instead of Bind VAO, Draw, Bind VAO, Draw, Bind VAO, Draw, you now end up with Bind VAO, Draw, Draw, Draw, which not only cuts your GL calls in half pretty much, but also the number of potential buffer switches.
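The bookkeeping for the shared-buffer approach might look like the following sketch. `MeshPool`, `MeshRange` and the `Vertex` layout are hypothetical; the pool only tracks CPU-side arrays, and uploading them to one VBO and one IBO is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed vertex format: position, texture coordinate, normal.
struct Vertex { float position[3]; float texcoord[2]; float normal[3]; };

// Where one mesh lives inside the shared arrays.
struct MeshRange {
    std::int32_t  baseVertex;  // added to every index at draw time
    std::size_t   firstIndex;  // position of the mesh's first index in the shared index array
    std::uint32_t indexCount;
};

class MeshPool {
public:
    // Appends a mesh and returns the range needed to draw it later.
    // Indices stay relative to the mesh's own first vertex.
    MeshRange add(const std::vector<Vertex>& verts,
                  const std::vector<std::uint32_t>& idx) {
        MeshRange range{static_cast<std::int32_t>(vertices_.size()),
                        indices_.size(),
                        static_cast<std::uint32_t>(idx.size())};
        vertices_.insert(vertices_.end(), verts.begin(), verts.end());
        indices_.insert(indices_.end(), idx.begin(), idx.end());
        return range;
    }

    const std::vector<Vertex>&        vertices() const { return vertices_; }
    const std::vector<std::uint32_t>& indices()  const { return indices_; }

private:
    std::vector<Vertex>        vertices_;
    std::vector<std::uint32_t> indices_;
};
```

With the VAO for this vertex format bound once, each `MeshRange` would then map onto one glDrawElementsBaseVertex call, passing `indexCount`, a byte offset of `firstIndex * sizeof(std::uint32_t)`, and `baseVertex`.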

Eventually, you see the same can be applied to UBOs as well. Create a large UBO, and describe each 'chunk' with an offset and size. You can take it even further, and allocate a single large buffer, and use different ranges of it as your VBO, IBO and UBO! At this point, you're basically managing your own GPU buffer memory
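Managing ranges of one large buffer mostly comes down to handing out aligned offsets. Below is a minimal bump-allocator sketch; `LinearBufferAllocator` and `BufferRange` are hypothetical names, and it never frees individual ranges. The alignment matters most for uniform data: the offset passed to glBindBufferRange for a UBO has to be a multiple of the value reported for GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, which this sketch takes as a parameter rather than querying.

```cpp
#include <cstddef>

struct BufferRange {
    std::size_t offset;
    std::size_t size;
};

class LinearBufferAllocator {
public:
    explicit LinearBufferAllocator(std::size_t capacity) : capacity_(capacity) {}

    // Reserves `size` bytes at the next offset that is a multiple of `alignment`.
    // Returns false (and leaves `out` untouched) if alignment is zero
    // or the request does not fit.
    bool allocate(std::size_t size, std::size_t alignment, BufferRange& out) {
        if (alignment == 0) {
            return false;
        }
        const std::size_t aligned = (head_ + alignment - 1) / alignment * alignment;
        if (aligned > capacity_ || size > capacity_ - aligned) {
            return false;
        }
        head_ = aligned + size;
        out = BufferRange{aligned, size};
        return true;
    }

private:
    std::size_t capacity_;
    std::size_t head_ = 0;
};
```

A vertex range can usually get away with a small alignment such as the vertex size, while a uniform chunk would use the queried UBO alignment (256 is used in testing here purely as an example value).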

[1] http://www.ogre3d.org/forums/viewtopic.php?p=506783&sid=f629b3848582844ecb131a120ba21659#p506783 The poster, gsellers, is Graham Sellers from AMD.

Edited by Xycaleth

##### Share on other sites

Alright, thanks guys. I think I'm getting the hang of it. I've been busy the last 2 weeks with work and the gym, so I've rarely had the time to reply back, let alone test it out. I was able to try out VAOs and VBOs yesterday, and things are starting to click.

> So, with all that said, here are some tips:
> 1. You never really disable an attrib array, as you would just instead use a shader that doesn't utilize the specific attribute.
> 2. You should avoid unbinding anything, unless you absolutely have to.
> 3. Don't fall into the immediate mode trap for screenspace shaders, as suddenly glEnable(old_shit) matters, like GL_TEXTURE_2D.
>
> I avoided this trap myself by having a very useful createScreenspace() function in my VAO implementation. Laziness > all.

Thanks for clarifying about the unbinding part --that makes sense. By "screenspace shader", are you talking about post-processing? Also, does the OpenGL 4.x core spec eventually get rid of glEnable()/glDisable() entirely?

@Xycaleth, you bring up a good point on storing everything on a per-format basis. This could reduce the amount of gl* calls, which is always a good thing. These objects might have to be divided up into draw calls due to other factors such as drawing with/without depth, with/without blending, with/without lighting, etc. Btw,

> Perhaps beyond the scope of what you need or intend to do (but I'll add it anyway because I think it's something to consider), is a different way of thinking about VAOs which I came across a few months ago [1]. If instead of creating a VAO per-object, you create a VAO per-vertex format, you can reduce the number of glBindVertexArray calls (which in the driver would reduce the number of buffer changes).

This also brings up another question I was wondering: do VAOs provide more efficiency, or are they there for convenience for programmers? It sounds like VAOs are more of a shortcut for programmers to draw stuff to the screen without having to worry about enabling the correct attribute arrays, setting pointers, binding buffers, etc. Instead, VAOs do that for us, obviously, but under the hood, are VAOs really the equivalent of us doing that ourselves, meaning they increase programmer productivity instead of GPU performance? Or are they caching the commands in a batched way, similar to where GL 4.5's DSA methodology will be taking us?

At this point, I'm all theory though! I've been reading quite a bit online, books and making posts. I really need to make time to sit down, and write code lol.

EDIT: I noticed the Graham Sellers link you posted after writing this, and I'm starting to think that VAOs are in fact what my theory was:

> Traditional APIs which generally have a function call per state change encourage bad behavior as seen by the GPU. Wrapping blobs of state into state objects or pushing the work of building them onto other threads only addresses the CPU side of the problem. The GPU still eats the same work. In some cases, it will eat more - the big, monolithic state object approach is likely to push a lot of redundancy into the pipe because a large number of states will be the same between objects.

I should have mentioned this before, but my theory is that if VAOs are merely there for productivity, then there could be more GPU overhead because now you have the VAO buffer that's eating up precious video memory, and yet another buffer swap to deal with, but it should cut down on CPU-side overhead as less gl* calls are being made. Is this correct?

EDIT 2: Back in my OpenGL ES 2.0 days, I didn't really mess with gl buffers much. Now that I've read the Graham Sellers article, I'm starting to realize that they can be looked at as just another memory map. My uber shader methodology, as nasty as it was, sounds like it's still the fastest alternative. In fact, it sounds like some of OpenGL 4's features don't really make OpenGL 4 much faster in terms of performance except maybe batching... Do OpenGL 4.3's batching features help with that?

Edited by Vincent_M

##### Share on other sites

> Also, does the OpenGL 4.x core spec eventually get rid of glEnable()/glDisable() entirely?

Starting with the introduction of the core spec, some glEnable/glDisable enums are no longer relevant. The reason for this was the move to a programmable pipeline. Take texturing for example. In a fixed function pipeline, you can bind a texture, specify texture coordinates, specify vertex colours, but it's up to you to tell the API whether you want to use texturing by using glEnable(GL_TEXTURE_2D); Compare this with the programmable pipeline: if you don't want to use texturing, then your shaders will not use any texture sampling functions. If you do want to use texturing, then the shaders will use the sampling functions.

> This also brings up another question I was wondering: do VAOs provide more efficiency, or are they there for convenience for programmers?

VAOs are purely a software feature (as far as I've seen), that is, the GPU doesn't have any knowledge of them. They're supposed to cut down on time spent validating the vertex attributes, switching buffers, but YMMV. Here's a good write up on when benefits can be seen or not seen: http://www.openglsuperbible.com/2013/12/09/vertex-array-performance/

> In fact, it sounds like some of OpenGL 4's features don't really make OpenGL 4 much faster in terms of performance except maybe batching... Does OpenGL 4.3's batching features help with that?

If by batching you mean instancing, then this has been available since 3.1. I'm not sure what else you could mean :P

##### Share on other sites

A minor comment I would add:

There are many new things in OpenGL which help you reduce bugs too. The fewer states you have to worry about, the better.

Even so, many of these things were already solved by creating your own wrapper classes that deal with all of this, and that continues to be true now.

The new features in 4.x allow more batching, so you have to investigate whether or not you can rewrite parts of your pipeline to utilize these new features, or whether you should just keep using the old proven way. There are some new ways of batching, though, which I think are easier to leverage in the short term than, say, going for a full AZDO approach.

Look at the AZDO presentation (google) to see which order you should render things in, then figure out which features make sense for you and go from there.

As long as you avoid synchronizing functions (such as glGet*) that stall the entire pipeline, you're going to be fine. AZDO requires GL 4.4 btw. I think.

Advice about minimizing state changes and batching as much as possible is always true, but it's really only to help programmers make good architectural decisions.

##### Share on other sites

> Vincent_M, on 31 Aug 2014 - 3:21 PM, said:
> This also brings up another question I was wondering: do VAOs provide more efficiency, or are they there for convenience for programmers?
>
> VAOs are purely a software feature (as far as I've seen), that is, the GPU doesn't have any knowledge of them. They're supposed to cut down on time spent validating the vertex attributes, switching buffers, but YMMV. Here's a good write up on when benefits can be seen or not seen: http://www.openglsuperbible.com/2013/12/09/vertex-array-performance/

I did see that post, and it looks like there are efficiency benefits for VAOs, but if it's just software, then I find it kind of unnecessary outside of it being forced upon you in OpenGL 4.x. My own state manager was a wrapper for whenever I switched FBOs, shader programs, VBOs, textures, glEnable/Disable, and enabling/disabling vertex arrays. The way the vertex array portion worked was that whenever I swapped my shader, and my GraphicsContext class recognized it as swapping to a different shader than the one currently in use, it'd enable/disable only the vertex arrays that differed from the last bound shader, because GraphicsContext also keeps its own client-side set of bools to track which attribute arrays are currently active.

For example, let's just say my currently-bound shader only requires 1 vertex attribute array enabled, so only array 0 would be activated. Then, let's say later on in the frame I need to activate my lit-and-textured shader that takes 3 attribute arrays. It'd activate arrays 1 and 2 only, since 0 was already activated. Then, when the next frame is drawn, and I need to go back to the single attribute array shader, it'll swap, and deactivate attribute arrays 1 and 2. This is simple for the user drawing something because all they have to do is call GraphicsContext::UseProgram(Shader *shader), and pass in the shader object they require. Now, I'm not sure how efficient the software implementation is, but if my objects are grouped by shader, then by state, etc., you're really not calling glEnableVertexAttribArray()/Disable too much! glVertexAttribPointer() does get called per legit shader swap, however, but there are ways of further optimizing that using the massive VBO mentioned above, and also referenced in Graham Sellers' post above.
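The enable/disable diffing described here could be sketched with a bitmask instead of a set of bools. This is not the actual GraphicsContext code: `AttribArrayCache` is a hypothetical name, and the two callbacks are stand-ins so the sketch runs without a GL context. In a real renderer they would forward to glEnableVertexAttribArray and glDisableVertexAttribArray.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

class AttribArrayCache {
public:
    using Toggle = std::function<void(unsigned)>;

    AttribArrayCache(Toggle enable, Toggle disable)
        : enable_(std::move(enable)), disable_(std::move(disable)) {}

    // `required` has bit i set when the incoming shader needs attribute array i.
    // Only arrays whose state differs from the cached state are touched.
    // Returns the number of enable/disable calls that were issued.
    int apply(std::uint32_t required) {
        const std::uint32_t changed = active_ ^ required;
        int calls = 0;
        for (unsigned i = 0; i < 32; ++i) {
            const std::uint32_t bit = std::uint32_t{1} << i;
            if ((changed & bit) == 0) {
                continue;  // already in the desired state
            }
            if ((required & bit) != 0) {
                enable_(i);
            } else {
                disable_(i);
            }
            ++calls;
        }
        active_ = required;
        return calls;
    }

private:
    Toggle        enable_;
    Toggle        disable_;
    std::uint32_t active_ = 0;  // assumes every array starts disabled
};
```

Going from a 1-attribute shader (mask 0x1) to a 3-attribute shader (mask 0x7) issues two enables, and going back issues two disables, matching the walkthrough above.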

> Look at the AZDO presentation (google) to see which order you should render things in, then figure out which features make sense for you and go from there.
> Short of using any synchronizing functions (such as glGet*) that stalls the entire pipeline, you're going to be fine. AZDO requires GL 4.4 btw. I think.

Ironically, I haven't needed to use any glGet* functions outside of glGetString(GL_VERSION) at startup to print the implementation string for logging purposes. The guys over at Steam mentioned in their video about porting their engine from DirectX to OpenGL that their Source Engine uses glGet* for nearly every state query they need, as they believe that all state systems deviate, at least slightly. I can see how this is true in some cases of the OpenGL state, but when it comes to things such as glEnable/Disable, writing a wrapper for setting/getting has always worked for me. Of course, my engine only assumes single-context rendering...

But yeah, GraphicsContext::SetGLState(unsigned int state, bool enable) -> pass in ANYTHING, and internally, it'll check whether that state's value is already in an STL vector. If enabling, and the state isn't in the vector yet, it calls glEnable and adds it to the vector of states. If disabling, and the state is in the vector, it removes it from the vector and calls glDisable. The method even returns a bool indicating whether the state actually changed. Same with GraphicsContext::UseProgram(Shader *shader) and GraphicsContext::SetActiveTexture(int target, Texture *texture); I have one for FBOs, etc.
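A minimal sketch of that kind of redundancy filter follows. It is not the poster's actual class: `GLStateCache` is a hypothetical name, it swaps the STL vector for a `std::unordered_set` so lookups do not scan, and the enable/disable callbacks are stand-ins (they would wrap glEnable and glDisable in a real renderer) so the sketch runs without a GL context.

```cpp
#include <functional>
#include <unordered_set>
#include <utility>

class GLStateCache {
public:
    using Toggle = std::function<void(unsigned)>;

    GLStateCache(Toggle enable, Toggle disable)
        : enable_(std::move(enable)), disable_(std::move(disable)) {}

    // Returns true when a call was actually forwarded, false when it was redundant.
    bool set(unsigned state, bool enable) {
        if (enable) {
            if (!enabled_.insert(state).second) {
                return false;  // already enabled
            }
            enable_(state);
            return true;
        }
        if (enabled_.erase(state) == 0) {
            return false;  // already disabled
        }
        disable_(state);
        return true;
    }

private:
    Toggle enable_;
    Toggle disable_;
    // Assumes every capability starts disabled. A few GL capabilities
    // (GL_DITHER, for one) start enabled, so a real cache would seed those.
    std::unordered_set<unsigned> enabled_;
};
```

As noted in the post, this kind of shadow state is only trustworthy when all state changes go through the wrapper and a single context is in use.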

This cut down quite a few gl* calls in general on mobile devices using OpenGL ES 2.0, and I'd assume it'll only do more justice on desktop environments with instancing.

Edited by Vincent_M

• ### Similar Content

• Hi all,

I'm trying to generate MIP-maps of a 2D-array texture, but only a limited amount of array layers and MIP-levels.
For instance, to generate only the first 3 MIP-maps of a single array layer of a large 2D-array.

After experimenting with glBlitFramebuffer to generate the MIP-maps manually but still with some sort of hardware acceleration,
I ended up with glTextureView which already works with the limited amount of array layers (I can also verify the result in RenderDoc).
However, glGenerateMipmap (or glGenerateTextureMipmap) always generates the entire MIP-chain for the specified array layer.

Thus, the <numlevels> parameter of glTextureView seems to be ignored in the MIP-map generation process.
I also tried to use glTexParameteri(..., GL_TEXTURE_MAX_LEVEL, 3), but this has the same result.
Can anyone explain me how to solve this?

Here is an example code, how I do it:
void GenerateSubMips( GLuint texID, GLenum texTarget, GLenum internalFormat, GLuint baseMipLevel, GLuint numMipLevels, GLuint baseArrayLayer, GLuint numArrayLayers) { GLuint texViewID = 0; glGenTextures(1, &texViewID); glTextureView( texViewID, texTarget, texID, internalFormat, baseMipLevel, numMipLevels, baseArrayLayer, numArrayLayers ); glGenerateTextureMipmap(texViewID); glDeleteTextures(1, &texViewID); } GenerateSubMips( myTex, GL_TEXTURE_2D_ARRAY, GL_RGBA8, 0, 3, // only the first 3 MIP-maps 4, 1 // only one array layer with index 4 );
Thanks and kind regards,
Lukas
• By mmmax3d
Hi everyone,
I would need some assistance from anyone who has a similar experience
or a nice idea!
I have created a skybox (as cube) and now I need to add a floor/ground.
The skybox is created from cubemap and initially it was infinite.
Now it is finite with a specific size. The floor is a quad in the middle
of the skybox, like a horizon.
I have two problems:
When moving the skybox upwards or downwards, I need to
sample from points even above the horizon while sampling
from the botton at the same time.  I am trying to create a seamless blending of the texture
at the points of the horizon, when the quad is connected
to the skybox. However, I get skew effects. Does anybody has done sth similar?
Is there any good practice?
Thanks everyone!
• By mmmax3d
Hi everyone,
I would need some assistance from anyone who has a similar experience
or a nice idea!
I have created a skybox (as cube) and now I need to add a floor/ground.
The skybox is created from cubemap and initially it was infinite.
Now it is finite with a specific size. The floor is a quad in the middle
of the skybox, like a horizon.
I have two problems:
When moving the skybox upwards or downwards, I need to
sample from points even above the horizon while sampling
from the botton at the same time.  I am trying to create a seamless blending of the texture
at the points of the horizon, when the quad is connected
to the skybox. However, I get skew effects. Does anybody has done sth similar?
Is there any good practice?
Thanks everyone!

• I'm trying to implement PBR into my simple OpenGL renderer and trying to use multiple lighting passes, I'm using one pass per light for rendering as follow:
1- First pass = depth
2- Second pass = ambient
3- [3 .. n] for all the lights in the scene.
I'm using the blending function glBlendFunc(GL_ONE, GL_ONE) for passes [3..n], and i'm doing a Gamma Correction at the end of each fragment shader.
But i still have a problem with the output image it just looks noisy specially when i'm using texture maps.
Is there anything wrong with those steps or is there any improvement to this process?

• Hello Everyone!
I'm learning openGL, and currently i'm making a simple 2D game engine to test what I've learn so far.  In order to not say to much, i made a video in which i'm showing you the behavior of the rendering.
Video:

What i was expecting to happen, was the player moving around. When i render only the player, he moves as i would expect. When i add a second Sprite object, instead of the Player, this new sprite object is moving and finally if i add a third Sprite object the third one is moving. And the weird think is that i'm transforming the Vertices of the Player so why the transformation is being applied somewhere else?

Take a look at my code:
Sprite Class
(You mostly need to see the Constructor, the Render Method and the Move Method)
#include "Brain.h" #include <glm/gtc/matrix_transform.hpp> #include <vector> struct Sprite::Implementation { //Position. struct pos pos; //Tag. std::string tag; //Texture. Texture *texture; //Model matrix. glm::mat4 model; //Vertex Array Object. VertexArray *vao; //Vertex Buffer Object. VertexBuffer *vbo; //Layout. VertexBufferLayout *layout; //Index Buffer Object. IndexBuffer *ibo; //Shader. Shader *program; //Brains. std::vector<Brain *> brains; //Deconstructor. ~Implementation(); }; Sprite::Sprite(std::string image_path, std::string tag, float x, float y) { //Create Pointer To Implementaion. m_Impl = new Implementation(); //Set the Position of the Sprite object. m_Impl->pos.x = x; m_Impl->pos.y = y; //Set the tag. m_Impl->tag = tag; //Create The Texture. m_Impl->texture = new Texture(image_path); //Initialize the model Matrix. m_Impl->model = glm::mat4(1.0f); //Get the Width and the Height of the Texture. int width = m_Impl->texture->GetWidth(); int height = m_Impl->texture->GetHeight(); //Create the Verticies. float verticies[] = { //Positions //Texture Coordinates. x, y, 0.0f, 0.0f, x + width, y, 1.0f, 0.0f, x + width, y + height, 1.0f, 1.0f, x, y + height, 0.0f, 1.0f }; //Create the Indicies. unsigned int indicies[] = { 0, 1, 2, 2, 3, 0 }; //Create Vertex Array. m_Impl->vao = new VertexArray(); //Create the Vertex Buffer. m_Impl->vbo = new VertexBuffer((void *)verticies, sizeof(verticies)); //Create The Layout. m_Impl->layout = new VertexBufferLayout(); m_Impl->layout->PushFloat(2); m_Impl->layout->PushFloat(2); m_Impl->vao->AddBuffer(m_Impl->vbo, m_Impl->layout); //Create the Index Buffer. m_Impl->ibo = new IndexBuffer(indicies, 6); //Create the new shader. m_Impl->program = new Shader("Shaders/SpriteShader.shader"); } //Render. void Sprite::Render(Window * window) { //Create the projection Matrix based on the current window width and height. 
glm::mat4 proj = glm::ortho(0.0f, (float)window->GetWidth(), 0.0f, (float)window->GetHeight(), -1.0f, 1.0f); //Set the MVP Uniform. m_Impl->program->setUniformMat4f("u_MVP", proj * m_Impl->model); //Run All The Brains (Scripts) of this game object (sprite). for (unsigned int i = 0; i < m_Impl->brains.size(); i++) { //Get Current Brain. Brain *brain = m_Impl->brains[i]; //Call the start function only once! if (brain->GetStart()) { brain->SetStart(false); brain->Start(); } //Call the update function every frame. brain->Update(); } //Render. window->GetRenderer()->Draw(m_Impl->vao, m_Impl->ibo, m_Impl->texture, m_Impl->program); } void Sprite::Move(float speed, bool left, bool right, bool up, bool down) { if (left) { m_Impl->pos.x -= speed; m_Impl->model = glm::translate(m_Impl->model, glm::vec3(-speed, 0, 0)); } if (right) { m_Impl->pos.x += speed; m_Impl->model = glm::translate(m_Impl->model, glm::vec3(speed, 0, 0)); } if (up) { m_Impl->pos.y += speed; m_Impl->model = glm::translate(m_Impl->model, glm::vec3(0, speed, 0)); } if (down) { m_Impl->pos.y -= speed; m_Impl->model = glm::translate(m_Impl->model, glm::vec3(0, -speed, 0)); } } void Sprite::AddBrain(Brain * brain) { //Push back the brain object. m_Impl->brains.push_back(brain); } pos *Sprite::GetPos() { return &m_Impl->pos; } std::string Sprite::GetTag() { return m_Impl->tag; } int Sprite::GetWidth() { return m_Impl->texture->GetWidth(); } int Sprite::GetHeight() { return m_Impl->texture->GetHeight(); } Sprite::~Sprite() { delete m_Impl; } //Implementation Deconstructor. Sprite::Implementation::~Implementation() { delete texture; delete vao; delete vbo; delete layout; delete ibo; delete program; }
Renderer Class
#include "Renderer.h" #include "Error.h" Renderer::Renderer() { } Renderer::~Renderer() { } void Renderer::Draw(VertexArray * vao, IndexBuffer * ibo, Texture *texture, Shader * program) { vao->Bind(); ibo->Bind(); program->Bind(); if (texture != NULL) texture->Bind(); GLCall(glDrawElements(GL_TRIANGLES, ibo->GetCount(), GL_UNSIGNED_INT, NULL)); } void Renderer::Clear(float r, float g, float b) { GLCall(glClearColor(r, g, b, 1.0)); GLCall(glClear(GL_COLOR_BUFFER_BIT)); } void Renderer::Update(GLFWwindow *window) { /* Swap front and back buffers */ glfwSwapBuffers(window); /* Poll for and process events */ glfwPollEvents(); }
#shader vertex #version 330 core layout(location = 0) in vec4 aPos; layout(location = 1) in vec2 aTexCoord; out vec2 t_TexCoord; uniform mat4 u_MVP; void main() { gl_Position = u_MVP * aPos; t_TexCoord = aTexCoord; } #shader fragment #version 330 core out vec4 aColor; in vec2 t_TexCoord; uniform sampler2D u_Texture; void main() { aColor = texture(u_Texture, t_TexCoord); } Also i'm pretty sure that every time i'm hitting the up, down, left and right arrows on the keyboard, i'm changing the model Matrix of the Player and not the others.

Window Class:
    #include "Window.h"
    #include <GL/glew.h>
    #include <GLFW/glfw3.h>
    #include "Error.h"
    #include "Renderer.h"
    #include "Scene.h"
    #include "Input.h"

    //Global Variables.
    int screen_width, screen_height;

    //On Window Resize.
    void OnWindowResize(GLFWwindow *window, int width, int height);

    //Implementation Structure.
    struct Window::Implementation
    {
        //GLFW Window.
        GLFWwindow *GLFW_window;
        //Renderer.
        Renderer *renderer;
        //Delta Time.
        double delta_time;
        //Frames Per Second.
        int fps;
        //Scene.
        Scene *scnene;
        //Input.
        Input *input;
        //Deconstructor.
        ~Implementation();
    };

    //Window Constructor.
    Window::Window(std::string title, int width, int height)
    {
        //Initializing width and height.
        screen_width = width;
        screen_height = height;

        //Create Pointer To Implementation.
        m_Impl = new Implementation();

        //Try initializing GLFW.
        if (!glfwInit())
        {
            std::cout << "GLFW could not be initialized!" << std::endl;
            std::cout << "Press ENTER to exit..." << std::endl;
            std::cin.get();
            exit(-1);
        }

        //Setting up OpenGL Version 3.3 Core Profile.
        glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
        glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
        glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);

        /* Create a windowed mode window and its OpenGL context */
        m_Impl->GLFW_window = glfwCreateWindow(width, height, title.c_str(), NULL, NULL);
        if (!m_Impl->GLFW_window)
        {
            std::cout << "GLFW could not create a window!" << std::endl;
            std::cout << "Press ENTER to exit..." << std::endl;
            std::cin.get();
            glfwTerminate();
            exit(-1);
        }

        /* Make the window's context current */
        glfwMakeContextCurrent(m_Impl->GLFW_window);

        //Initialize GLEW.
        if(glewInit() != GLEW_OK)
        {
            std::cout << "GLEW could not be initialized!" << std::endl;
            std::cout << "Press ENTER to exit..." << std::endl;
            std::cin.get();
            glfwTerminate();
            exit(-1);
        }

        //Enabling Blending.
        GLCall(glEnable(GL_BLEND));
        GLCall(glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA));

        //Setting the ViewPort.
        GLCall(glViewport(0, 0, width, height));

        //**********Initializing Implementation**********//
        m_Impl->renderer = new Renderer();
        m_Impl->delta_time = 0.0;
        m_Impl->fps = 0;
        m_Impl->input = new Input(this);
        //**********Initializing Implementation**********//

        //Set Frame Buffer Size Callback.
        glfwSetFramebufferSizeCallback(m_Impl->GLFW_window, OnWindowResize);
    }

    //Window Deconstructor.
    Window::~Window()
    {
        delete m_Impl;
    }

    //Window Main Loop.
    void Window::MainLoop()
    {
        //Time Variables.
        double start_time = 0, end_time = 0, old_time = 0, total_time = 0;

        //Frames Counter.
        int frames = 0;

        /* Loop until the user closes the window */
        while (!glfwWindowShouldClose(m_Impl->GLFW_window))
        {
            old_time = start_time; //Total time of previous frame.
            start_time = glfwGetTime(); //Current frame start time.

            //Calculate the Delta Time.
            m_Impl->delta_time = start_time - old_time;

            //Get Frames Per Second.
            if (total_time >= 1)
            {
                m_Impl->fps = frames;
                total_time = 0;
                frames = 0;
            }

            //Clearing The Screen.
            m_Impl->renderer->Clear(0, 0, 0);

            //Render The Scene.
            if (m_Impl->scnene != NULL)
                m_Impl->scnene->Render(this);

            //Updating the Screen.
            m_Impl->renderer->Update(m_Impl->GLFW_window);

            //Increasing frames counter.
            frames++;

            //End Time.
            end_time = glfwGetTime();

            //Total time after the frame completed.
            total_time += end_time - start_time;
        }

        //Terminate GLFW.
        glfwTerminate();
    }

    //Load Scene.
    void Window::LoadScene(Scene * scene)
    {
        //Set the scene.
        m_Impl->scnene = scene;
    }

    //Get Delta Time.
    double Window::GetDeltaTime()
    {
        return m_Impl->delta_time;
    }

    //Get FPS.
    int Window::GetFPS()
    {
        return m_Impl->fps;
    }

    //Get Width.
    int Window::GetWidth()
    {
        return screen_width;
    }

    //Get Height.
    int Window::GetHeight()
    {
        return screen_height;
    }

    //Get Input.
    Input * Window::GetInput()
    {
        return m_Impl->input;
    }

    Renderer * Window::GetRenderer()
    {
        return m_Impl->renderer;
    }

    GLFWwindow * Window::GetGLFWindow()
    {
        return m_Impl->GLFW_window;
    }

    //Implementation Deconstructor.
    Window::Implementation::~Implementation()
    {
        delete renderer;
        delete input;
    }

    //OnWindowResize
    void OnWindowResize(GLFWwindow *window, int width, int height)
    {
        screen_width = width;
        screen_height = height;

        //Updating the ViewPort.
        GLCall(glViewport(0, 0, width, height));
    }
Brain Class
    #include "Brain.h"
    #include "Sprite.h"
    #include "Window.h"

    struct Brain::Implementation
    {
        //Just A Flag.
        bool started;
        //Window Pointer.
        Window *window;
        //Sprite Pointer.
        Sprite *sprite;
    };

    Brain::Brain(Window *window, Sprite *sprite)
    {
        //Create Pointer To Implementation.
        m_Impl = new Implementation();

        //Initialize Implementation.
        m_Impl->started = true;
        m_Impl->window = window;
        m_Impl->sprite = sprite;
    }

    Brain::~Brain()
    {
        //Delete Pointer To Implementation.
        delete m_Impl;
    }

    void Brain::Start()
    {
    }

    void Brain::Update()
    {
    }

    Window * Brain::GetWindow()
    {
        return m_Impl->window;
    }

    Sprite * Brain::GetSprite()
    {
        return m_Impl->sprite;
    }

    bool Brain::GetStart()
    {
        return m_Impl->started;
    }

    void Brain::SetStart(bool value)
    {
        m_Impl->started = value;
    }

Script Class (it's a Brain subclass!)
    #include "Script.h"
    #include <iostream> // std::cout, std::endl

    Script::Script(Window *window, Sprite *sprite) : Brain(window, sprite)
    {
    }

    Script::~Script()
    {
    }

    void Script::Start()
    {
        std::cout << "Game Started!" << std::endl;
    }

    void Script::Update()
    {
        Input *input = this->GetWindow()->GetInput();
        Sprite *sp = this->GetSprite();

        //Move this sprite.
        sp->Move(200 * this->GetWindow()->GetDeltaTime(),
                 input->GetKeyDown("left"), input->GetKeyDown("right"),
                 input->GetKeyDown("up"), input->GetKeyDown("down"));

        //Print the sprite's position.
        std::cout << sp->GetTag().c_str() << ".x = " << sp->GetPos()->x << ", "
                  << sp->GetTag().c_str() << ".y = " << sp->GetPos()->y << std::endl;
    }
Main:
    #include "SpaceShooterEngine.h"
    #include "Script.h"

    int main()
    {
        Window w("title", 600,600);
        Scene *scene = new Scene();

        Sprite *player = new Sprite("Resources/Images/player.png", "Player", 100,100);
        Sprite *other = new Sprite("Resources/Images/cherno.png", "Other", 400, 100);
        Sprite *other2 = new Sprite("Resources/Images/cherno.png", "Other", 300, 400);

        Brain *brain = new Script(&w, player);
        player->AddBrain(brain);

        scene->AddSprite(player);
        scene->AddSprite(other);
        scene->AddSprite(other2);

        w.LoadScene(scene);
        w.MainLoop();
        return 0;
    }

I literally can't find what is wrong. If you need more code, ask me to post it. I will also attach all the source files.
Brain.cpp
Error.cpp
IndexBuffer.cpp
Input.cpp
Renderer.cpp
Scene.cpp
Sprite.cpp
Texture.cpp
VertexArray.cpp
VertexBuffer.cpp
VertexBufferLayout.cpp
Window.cpp
Brain.h
Error.h
IndexBuffer.h
Input.h
Renderer.h
Scene.h
SpaceShooterEngine.h
Sprite.h
Texture.h
VertexArray.h
VertexBuffer.h
VertexBufferLayout.h
Window.h

• Hello fellow programmers,
A couple of days ago I decided to build my own planet renderer, just to see how floating-point precision (FPP) issues
can be tackled. As you can probably imagine, I quickly ran into FPP issues when trying to render absurdly large planets.

I have used the classical quadtree LOD approach.
I've generated my grids with 33 vertices (x: -1 to 1, y: -1 to 1, z = 0).
Each grid is managed by a TerrainNode class that, depending on the side it represents (top, bottom, left, right, front, back),
creates a special rotation-translation matrix that moves and rotates the grid away from the origin, so that when I finally
normalize all the vertices in my vertex shader I get a perfect sphere.
    T = glm::translate(glm::dmat4(1.0), glm::dvec3(0.0, 0.0, 1.0));
    R = glm::rotate(glm::dmat4(1.0), glm::radians(180.0), glm::dvec3(1.0, 0.0, 0.0));
    sides[0] = new TerrainNode(1.0, radius, T * R, glm::dvec2(0.0, 0.0), new TerrainTile(1.0, SIDE_FRONT));

    T = glm::translate(glm::dmat4(1.0), glm::dvec3(0.0, 0.0, -1.0));
    R = glm::rotate(glm::dmat4(1.0), glm::radians(0.0), glm::dvec3(1.0, 0.0, 0.0));
    sides[1] = new TerrainNode(1.0, radius, R * T, glm::dvec2(0.0, 0.0), new TerrainTile(1.0, SIDE_BACK));
    // So on and so forth for the rest of the sides

As you can see, for the front side grid, I rotate it 180 degrees to make it face the camera and push it towards the eye;
the back side is handled almost the same way, except that I don't need to rotate it, only push it away from the eye.
The same technique is applied to the rest of the faces (obviously, with the proper rotations / translations).
The matrix that results from the multiplication of R and T (in that particular order) is sent to my vertex shader as 'r_Grid'.
    // spherify
    vec3 V = normalize((r_Grid * vec4(r_Vertex, 1.0)).xyz);
    gl_Position = r_ModelViewProjection * vec4(V, 1.0);

The 'r_ModelViewProjection' matrix is generated on the CPU in this manner.

    // Not the most efficient way, but it works.
    glm::dmat4 Camera::getMatrix()
    {
        // Create the view matrix
        // Roll, Yaw and Pitch are all quaternions.
        glm::dmat4 View = glm::toMat4(Roll) * glm::toMat4(Pitch) * glm::toMat4(Yaw);

        // The model matrix is generated by translating in the opposite direction of the camera.
        glm::dmat4 Model = glm::translate(glm::dmat4(1.0), -Position);

        // Projection = glm::perspective(fovY, aspect, zNear, zFar);
        // zNear = 0.1, zFar = 1.0995116e12
        return Projection * View * Model;
    }

I managed to get rid of z-fighting by using a technique called Logarithmic Depth Buffer described in this article; it works amazingly well, with no z-fighting at all, at least none that is visible.
Each frame I render each node by sending the generated matrices this way.

    // set the r_ModelViewProjection uniform
    // Sneak in the mRadiusMatrix which is a matrix that contains the radius of my planet.
    Shader::setUniform(0, Camera::getInstance()->getMatrix() * mRadiusMatrix);
    // set the r_Grid matrix uniform i created earlier.
    Shader::setUniform(1, r_Grid);
    grid->render();

My planet's radius is around 6400000.0 units, absurdly large, but that's what I really want to achieve.
Everything works well and the nodes split and merge as you'd expect. However, whenever I get close to the surface
of the planet, the rounding errors start to kick in, giving me that lovely stairs effect.
I've read that if I could render each grid relative to the camera I could get better precision on the surface, effectively
getting rid of those rounding errors.

My question is: how can I achieve this relative-to-camera rendering in my scenario?
I know that I have to do most of the work on the CPU with doubles, and that's exactly what I'm doing.
I only use doubles on the CPU side, where I also do most of the matrix multiplications.
As you can see from my vertex shader, I only do the usual r_ModelViewProjection * (some vertex coords).
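For reference, the core idea behind camera-relative rendering can be sketched in a few lines. This is only an illustration under assumptions, not a drop-in fix for the code above: `Vec3d`, `Vec3f`, `toCameraRelative` and `naiveRelative` are hypothetical names, and plain structs stand in for the glm types. The point is that the large world-space values are subtracted from each other in double precision, and only the small difference is narrowed to float for the GPU.

```cpp
// Hypothetical minimal types standing in for glm::dvec3 / glm::vec3.
struct Vec3d { double x, y, z; };
struct Vec3f { float x, y, z; };

// Camera-relative: subtract in double first, narrow to float last.
// The result is small near the camera, so float keeps fine detail.
Vec3f toCameraRelative(const Vec3d& world, const Vec3d& camera)
{
    Vec3f out;
    out.x = static_cast<float>(world.x - camera.x);
    out.y = static_cast<float>(world.y - camera.y);
    out.z = static_cast<float>(world.z - camera.z);
    return out;
}

// For comparison: narrow to float first, then subtract.
// At magnitudes around 6.4e6 a float can only step in units of 0.5,
// so sub-unit detail is already lost before the subtraction happens.
Vec3f naiveRelative(const Vec3d& world, const Vec3d& camera)
{
    Vec3f out;
    out.x = static_cast<float>(world.x) - static_cast<float>(camera.x);
    out.y = static_cast<float>(world.y) - static_cast<float>(camera.y);
    out.z = static_cast<float>(world.z) - static_cast<float>(camera.z);
    return out;
}
```

With a camera at x = 6400000.0 and a point at x = 6400000.25, `toCameraRelative` keeps the 0.25 offset while `naiveRelative` loses it. Applied to a planet renderer, this would presumably mean computing each node's origin minus the camera position in double on the CPU and uploading only that small offset, with the view matrix carrying rotation but no translation.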

• By mike44
Hi,
I have a working framebuffer view looking from above. How do I turn it 90° to look at it from the front?
It looks almost right, but the upper colors look as if the camera were right inside them. Those should be blue like the sky.
I draw GL_TRIANGLE_STRIPs colored depending on a height value.
Any ideas on the logic as well? Thanks

• I have a 9-slice shader working mostly nicely:

Here, both the sprites are separate images, so the shader code works well:
    varying vec4 color;
    varying vec2 texCoord;
    uniform sampler2D tex;
    uniform vec2 u_dimensions;
    uniform vec2 u_border;

    float map(float value, float originalMin, float originalMax, float newMin, float newMax) {
        return (value - originalMin) / (originalMax - originalMin) * (newMax - newMin) + newMin;
    }

    // Helper function, because WET code is bad code
    // Takes in the coordinate on the current axis and the borders
    float processAxis(float coord, float textureBorder, float windowBorder) {
        if (coord < windowBorder)
            return map(coord, 0.0, windowBorder, 0.0, textureBorder);
        if (coord < 1.0 - windowBorder)
            return map(coord, windowBorder, 1.0 - windowBorder, textureBorder, 1.0 - textureBorder);
        return map(coord, 1.0 - windowBorder, 1.0, 1.0 - textureBorder, 1.0);
    }

    void main(void) {
        vec2 newUV = vec2(
            processAxis(texCoord.x, u_border.x, u_dimensions.x),
            processAxis(texCoord.y, u_border.y, u_dimensions.y)
        );
        // Output the color
        gl_FragColor = texture2D(tex, newUV);
    }

Outside the shader, I upload vec2(slice/box.w, slice/box.h) into the u_dimensions variable, and vec2(slice/clip.w, slice/clip.h) into u_border. In this scenario, box represents the box dimensions, clip represents the dimensions of the 24x24 image to be 9-sliced, and slice is 8 (the size of each slice in pixels).
This is great and all, but it stops working once I decide to organize the various 9-slice images into a single sprite sheet.

Because OpenGL works between 0.0 and 1.0 instead of true pixel coordinates, and processes the full images rather than just the contents of the clipping rectangles, I'm kind of stumped about how to tell the shader to do what I need it to do. Anyone have pro advice on how to get it to be more sprite-sheet-friendly? Thank you!
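One possible direction, sketched here on the CPU side in C++ so the arithmetic is easy to check: keep the 9-slice math in the local 0..1 space of the sub-image, then remap the result into the sub-image's rectangle inside the atlas as a final step. `Rect`, `Vec2` and `localToAtlasUV` are hypothetical names, and the sketch assumes texCoord still runs from 0 to 1 across the quad rather than already holding atlas coordinates.

```cpp
// Hypothetical helper types; all Rect values are in atlas pixels.
struct Vec2 { float x, y; };
struct Rect { float x, y, w, h; };

// Remap a UV that is local to one sub-image (0..1 across that image)
// into normalized UV coordinates of the whole atlas texture.
Vec2 localToAtlasUV(Vec2 local, Rect clip, Vec2 atlasSize)
{
    Vec2 uv;
    uv.x = (clip.x + local.x * clip.w) / atlasSize.x;
    uv.y = (clip.y + local.y * clip.h) / atlasSize.y;
    return uv;
}
```

In the shader, the same step would presumably be a single multiply-add applied to newUV just before the texture lookup, with the clip offset and clip size uploaded as two extra vec2 uniforms already divided by the atlas size. Those uniforms are an assumption and are not part of the shader above.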
• By hellgasm
Hello,
I have a question about premultiplied alpha images, texture atlas and texture filter (minification with bilinear filter).
(I use correct blending function for PMA and all my images are in premultiplied alpha format.)
Suppose that there are 3 different versions of a plain square image:
1. In a separate file, by itself. No padding (padding means alpha 0 pixels in my case).
2. In an atlas in which the subtexture's top left corner (0, 0) is positioned at the top left corner (0, 0) of the atlas. There is padding on the right and bottom but not on the left and top.
3. In an atlas in which there is padding in each of the 4 directions around the subtexture.
Do these 3 give the same result when the texture is minified using a bilinear filter? If not, I assume this means premultiplied alpha distorts images, since alpha 0 pixels are included in interpolation. If so, why do we use something that distorts our images?
And even if we don't use premultiplied alpha and use "normal" blending, alpha 0 pixels are still used in interpolation. We add bleeding to overcome problems caused by this (I don't know if it gives exact pixel rendering, though). So the question is: does every texture atlas cause distortion in the contained images even if we use bilinear filtering (without mipmaps)? If so, why does everyone use atlases if they are so bad?
I tried to test this with a simple scenario but don't know if my test method is right.
I made a yellow square image with red border. The entire square (border included) is 116x116 px. The entire image is 128x128.
I made two versions of this image. Both images are in premultiplied alpha format.
1st version: square starts at 0, 0 and there are 12 pixels padding on bottom and right.
2nd version: square is centered both horizontally and vertically so there are 6px padding on top, left, bottom and right.
I scaled them to 32x32 (scaled entire image without removing padding) using Bilinear filter. And when rendered, they both give very very different results. One is exact while the other one is blurry. I need to know if this is caused by the problem I mentioned in the question.
Here are the images I used in this test:
Image (topleft):

Image (mid):

Render result:

I use integer x, y, width and height values for the sprite, so subpixel rendering is not intended.
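To reason about the test above, a one-dimensional model of the minification may help. The sketch below is an assumption-laden simplification, not what any particular scaler or GPU is guaranteed to do: it looks at a single row of alpha values only, samples once at each destination pixel centre, and blends the two nearest source texels (bilinear without mipmaps). `makeRow`, `sampleBilinear` and `minify` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// One row of alpha: 1.0 inside [start, start + size), 0.0 (padding) elsewhere.
std::vector<double> makeRow(int width, int start, int size)
{
    std::vector<double> row(width, 0.0);
    for (int i = start; i < start + size && i < width; ++i)
        row[i] = 1.0;
    return row;
}

// Bilinear lookup at position u, measured in source pixels.
// Texel centres sit at i + 0.5; indices are clamped at the edges.
double sampleBilinear(const std::vector<double>& row, double u)
{
    const double x = u - 0.5;
    const int i0 = static_cast<int>(std::floor(x));
    const double f = x - i0;
    const int last = static_cast<int>(row.size()) - 1;
    const double a = row[std::min(std::max(i0, 0), last)];
    const double b = row[std::min(std::max(i0 + 1, 0), last)];
    return a * (1.0 - f) + b * f;
}

// Minify by sampling once at the centre of every destination pixel.
std::vector<double> minify(const std::vector<double>& row, int dstWidth)
{
    const double scale = static_cast<double>(row.size()) / dstWidth;
    std::vector<double> out(dstWidth);
    for (int i = 0; i < dstWidth; ++i)
        out[i] = sampleBilinear(row, (i + 0.5) * scale);
    return out;
}
```

Under this model, `minify(makeRow(128, 0, 116), 32)` contains only 0.0 and 1.0, because both edges of the square (0 and 116) are multiples of the 4x scale factor. `minify(makeRow(128, 6, 116), 32)` produces 0.5 at both edges, because 6 and 122 are not. If that matches the renders, the blur may come from where the edges fall relative to the destination pixel grid rather than from premultiplied alpha or from the atlas itself.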
• By KKTHXBYE
So, algorithm looks like this:
Use fbo
Clear depth and color buffers
Write depth
Stretch fbo depth texture to screen size and compare that with final scene

GLSL algo looks like this:
Project light position and vertex position to screen space coords then move them to 0..1 space
Compute projected vertex to light vector
Then by defined number of samples go from projected vertex position to projected light position:
- Get the depth from depth texture
- unproject this given texture coord and depth value * 2.0 - 1.0 using inverse of (model*view)*projection matrix
Find the closest point on the line (world_vertex_pos, world_light_pos) to the unprojected point; if the distance is less than 0.0001, I say the ray hit something and the original fragment is in shadow

Now, I forgot a few things, so I'll have to ask:
In the vertex shader I do something like this:

    vertexClip.x = dp43(MVP1, Vpos);
    vertexClip.y = dp43(MVP2, Vpos);
    vertexClip.z = dp43(MVP3, Vpos);
    vertexClip.w = dp43(MVP4, Vpos);
    scrcoord = vec3(vertexClip.x, vertexClip.y, vertexClip.z);

where

    float dp43(vec4 matrow, vec3 p)
    {
        return ( (matrow.x*p.x) + (matrow.y*p.y) + (matrow.z*p.z) + matrow.w );
    }

It looks like I don't have to divide scrcoord by the vertexClip.w component when I do something like this at the end of the shader:

    gl_Position = vertexClip;

and fragments are located where they should be...
I pass scrcoord to the fragment shader, then do * 0.5 + 0.5 to know from which position of the depth texture I start to unproject values.
So scrcoord should be in -1..1 space, right?
Another thing is unprojecting a screen coord to a 3D position.
So far I use this formula:
Get the texel depth, do * 2.0 - 1.0 for all xyz components,
then multiply it by the inverse of the (model*view)*projection matrix like this.
Not quite sure if this is even correct:

    vec3 unproject(vec3 op)
    {
        vec3 outpos;
        outpos.x = dp43(imvp1, op);
        outpos.y = dp43(imvp2, op);
        outpos.z = dp43(imvp3, op);
        return outpos;
    }
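One thing that may be worth checking in both directions is the divide by w. With a perspective matrix, clip-space xyz only lands in the -1..1 range after dividing by clip w, and going back through the inverse matrix needs the fourth row as well, followed by a divide by the resulting w. The C++ sketch below illustrates that round trip under assumptions: `Vec3`, `Vec4`, `Mat4`, `transform`, `toNdc` and `unproject` are hypothetical names, and matrices are stored row-major to mirror the row-wise dp43 calls above.

```cpp
// Hypothetical minimal types; Mat4 is row-major: m[row][col].
struct Vec3 { double x, y, z; };
struct Vec4 { double x, y, z, w; };
struct Mat4 { double m[4][4]; };

// Full 4x4 matrix times 4-component vector (all four rows are used).
Vec4 transform(const Mat4& a, const Vec4& v)
{
    Vec4 r;
    r.x = a.m[0][0] * v.x + a.m[0][1] * v.y + a.m[0][2] * v.z + a.m[0][3] * v.w;
    r.y = a.m[1][0] * v.x + a.m[1][1] * v.y + a.m[1][2] * v.z + a.m[1][3] * v.w;
    r.z = a.m[2][0] * v.x + a.m[2][1] * v.y + a.m[2][2] * v.z + a.m[2][3] * v.w;
    r.w = a.m[3][0] * v.x + a.m[3][1] * v.y + a.m[3][2] * v.z + a.m[3][3] * v.w;
    return r;
}

// Forward: position -> clip space -> normalized device coordinates.
Vec3 toNdc(const Mat4& mvp, const Vec3& p)
{
    const Vec4 in = {p.x, p.y, p.z, 1.0};
    const Vec4 clip = transform(mvp, in);
    const Vec3 ndc = {clip.x / clip.w, clip.y / clip.w, clip.z / clip.w};
    return ndc;
}

// Backward: NDC -> homogeneous position -> divide by w.
Vec3 unproject(const Mat4& inverseMvp, const Vec3& ndc)
{
    const Vec4 in = {ndc.x, ndc.y, ndc.z, 1.0};
    const Vec4 h = transform(inverseMvp, in);
    const Vec3 pos = {h.x / h.w, h.y / h.w, h.z / h.w};
    return pos;
}
```

Calling `unproject(inverseMvp, toNdc(mvp, p))` should return p again whenever `inverseMvp` really is the inverse of `mvp`. In the shader this would presumably mean passing the full clip-space vec4 to the fragment shader and dividing there, and adding a fourth dp43 row plus a divide inside unproject.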
And the last question is about ray sampling: I'm pretty sure it will skip some pixels, making shadowed fragments unshadowed. I need to fix that somehow too, but for now I have no clue...

vec3 act_tex_pos = fcoord + projected_ldir * sample_step * float ( i );

I checked the depth texture for values and they're right.
