B_old

OpenGL
Uniform buffer updates

8 posts in this topic

I use uniform buffers to set constants in the shaders.

Currently each uniform block is backed by a uniform buffer of the appropriate size. glBindBufferBase is called once per frame, and glNamedBufferSubDataEXT is called for every object without orphaning.

 

I tried to optimize this by using a larger uniform buffer, calling glBindBufferRange and updating subsequent regions in the buffer, but this turned out to be significantly slower. After looking around I found this and similar threads that talk about the same problem. The suggestion seems to be to use one large uniform buffer for all objects, update it only once with the data for all objects, and call glBindBufferRange for every draw call.

 

Is this the definitive way to go in OpenGL, regardless of whether BufferSubData or MapBufferRange is used? In one place it was suggested that for small amounts of data glUniform*fv is the fastest choice. It would be nice to achieve comparable levels of performance with uniform buffers.

 

What is your experience with updating shader uniforms in OpenGL?


For runtime performance, it would obviously be better to upload only once and bind the range. This takes more memory but runs faster.

 

But of course this wouldn't be viable for truly dynamic data, which can't be precalculated at a single upload stage. That's when runtime BufferSubData calls become viable. If your data isn't truly dynamic like this, then I'd say BufferSubData should only be used at the single upload stage.

 

Regardless, I think direct glUniform* calls have perfectly fine performance. The usage of uniform buffers might be overhyped. Especially when you're using truly dynamic data, the cases in which they improve performance diminish: basically, they help when you use the SAME identical (updated) information multiple times. If you're updating a truly dynamic uniform buffer and only using that information once, then you may as well just use a straight glUniform*fv call.


When using a larger uniform buffer and ranges, there are a lot of things you have to do right. And it won't always give the best performance in every situation. While I can't know if it would provide better performance in your specific circumstance, I can ask a few things to make sure you were setting it up right. Were updates far enough apart to avoid conflicts (can be a perf hit if call A writes to the first half of block A, and call B writes to the second half of block A)? Were you invalidating (orphaning) old data early enough, or were you orphaning just before writing new data to the same location? Were you only orphaning large chunks at a time, preferably chunks that map directly to memory pages? Was your buffer at least twice the size of the chunks you orphan? Were you using unsynchronized mapping when writing new data? Were you ever blocking on your sync? Were you only sync'ing per orphan call instead of per frame? Did you test performance to make sure STREAM is faster than DYNAMIC (buffer hint)?

 

I almost certainly left out some important performance traps and considerations, but that should get you started. If you were already aware of those rules and doing everything right, it's probably wiser to just provide sample code to see if someone catches something you missed. It's always possible your code just won't gain a performance benefit from this stuff anyways, and this is complicated enough it's probably not worth your time unless you need the extra performance.
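One possible reading of the chunk-related questions above (orphan in large chunks, keep the buffer at least twice the chunk size, sync per chunk rather than per frame) is a chunked ring sub-allocator. The sketch below only does the CPU-side bookkeeping; `ChunkRing` and `RingAllocation` are hypothetical names, and the GL calls mentioned in the comments are where fencing or invalidation would plausibly go, not a tested recipe.

```cpp
#include <cassert>
#include <cstddef>

// Result of one sub-allocation from the ring (hypothetical type).
struct RingAllocation {
    std::size_t offset;    // byte offset into the whole buffer
    bool enteredNewChunk;  // true when this is the first allocation in a chunk
};

// Hypothetical helper: splits one buffer into equally sized chunks and hands
// out space linearly, moving to the next chunk (and wrapping) when one is full.
class ChunkRing {
public:
    ChunkRing(std::size_t chunkBytes, std::size_t chunkCount)
        : chunkBytes_(chunkBytes), chunkCount_(chunkCount) {
        assert(chunkBytes > 0);
        assert(chunkCount >= 2);  // buffer is at least twice the chunk size
    }

    // Reserve `bytes`, which must fit inside a single chunk.
    RingAllocation allocate(std::size_t bytes) {
        assert(bytes > 0 && bytes <= chunkBytes_);
        bool entered = !started_;
        started_ = true;
        if (used_ + bytes > chunkBytes_) {
            // Current chunk is full: this is the point where the caller would
            // place a fence for the chunk being left (glFenceSync) and wait on
            // or invalidate the chunk being entered (glClientWaitSync,
            // glInvalidateBufferSubData) before writing into it.
            chunk_ = (chunk_ + 1) % chunkCount_;
            used_ = 0;
            entered = true;
        }
        const std::size_t offset = chunk_ * chunkBytes_ + used_;
        used_ += bytes;
        return {offset, entered};
    }

private:
    std::size_t chunkBytes_;
    std::size_t chunkCount_;
    std::size_t chunk_ = 0;
    std::size_t used_ = 0;
    bool started_ = false;
};
```

Because synchronization only happens when `enteredNewChunk` is true, the number of sync points scales with the number of chunks touched, not with the number of objects.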


I'm the author of the original recommendation, and it came about through considerable experimentation in an attempt to get performance out of GL UBOs comparable to what you can get from D3D cbuffers.

 

The issue was that in D3D the per-object orphan/write/use/orphan/write/use/orphan/write/use pattern (with a single small buffer containing storage for a single object) works: it runs fast and gives robust and consistent behaviour across different hardware vendors. In GL none of this happens.

 

The answer to the question "have you tried <insert tired old standard buffer object recommendation here>?" is "yes - and it didn't work".

 

The only solution that worked robustly across different hardware from different vendors was to iterate over each object twice - this may not be such a big deal as you're probably already doing it anyway (e.g. once to build a list of drawable objects, once to draw them) - with the first iteration choosing a range in a single large UBO (sized large enough for the max number of objects you want to handle in a frame) to use for the object and writing in its data, the second calling glBindBufferRange and drawing.

 

The data was originally written to a system memory intermediate storage area, then loaded (between the two iterations) into the UBO using glBufferSubData. Mapping (even MapBufferRange with the appropriate flags) sometimes worked on one vendor but failed on another and gave no appreciable performance difference anyway, so it wasn't pursued much. glBufferData (NULL) before the glBufferSubData call gave no measurable performance difference. glBufferData on its own gave no measurable performance difference. Different usage hints were a waste of time as drivers ignore these anyway.

 

I'm fully satisfied that all of this is 100% a specification/driver issue, particularly coming from the OpenGL buffer object specification, and poor driver implementations.  After all, the hardware vendors can write a driver where the single-small-buffer and per-object orphan/write/use pattern works and works fast - they've done it in their D3D drivers.  It would be interesting to test if the GL4.4 immutable buffer storage functionality helps any, but in the absence of AMD and Intel implementations I don't see anything meaningful or useful coming from such a test.

 

Finally, by way of comparison with standalone uniforms, the problem there was that in GL standalone uniforms are per-program state and there are no global/shared uniforms, outside of UBOs.

Edited by mhagain

What I currently do:

bindBufferBase(smallUniformBuffer);

for (o : objects)
{
    bufferSubData(smallUniformBuffer, o.transformation);
    draw(o.vertices);
}

What I think I should be doing:

offset = 0;
for (o : objects)
{
    memory[offset] = o.transformation;
    ++offset;
}

bufferData(hugeBuffer, memory);

offset = 0;
for (o : objects)
{
    bindBufferRange(hugeBuffer, offset);
    draw(o.vertices);
    ++offset; // advance to the next object's range
}
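The second pseudocode variant can be fleshed out a little on the CPU side. The following is a minimal sketch, assuming one 4x4 float matrix per object and a caller-supplied `stride` (in practice a multiple of the value queried from GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT). `ObjectConstants`, `UniformRange` and `stageObjects` are hypothetical names, and the GL calls appear only as comments because they need a live context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical per-object constants: one 4x4 float matrix (64 bytes).
struct ObjectConstants {
    float transformation[16];
};

// Byte range inside the large uniform buffer that belongs to one object.
struct UniformRange {
    std::size_t offset;
    std::size_t size;
};

// Pass 1: copy every object's constants into one system-memory staging block
// and record the byte range each object will be bound to later.
// `stride` is the distance in bytes between consecutive objects.
inline std::vector<UniformRange> stageObjects(
    const std::vector<ObjectConstants>& objects,
    std::size_t stride,
    std::vector<unsigned char>& staging)
{
    assert(stride >= sizeof(ObjectConstants));
    staging.assign(stride * objects.size(), 0);

    std::vector<UniformRange> ranges;
    ranges.reserve(objects.size());
    for (std::size_t i = 0; i < objects.size(); ++i) {
        const std::size_t offset = i * stride;
        std::memcpy(staging.data() + offset, &objects[i], sizeof(ObjectConstants));
        ranges.push_back({offset, sizeof(ObjectConstants)});
    }
    return ranges;
}

// Between the passes, a single upload for the whole frame, for example:
//   glBufferSubData(GL_UNIFORM_BUFFER, 0, staging.size(), staging.data());
// Pass 2, once per object r in ranges, for example:
//   glBindBufferRange(GL_UNIFORM_BUFFER, bindingPoint, ubo, r.offset, r.size);
//   ... issue the draw call for that object ...
```

Keeping the ranges in a separate list means the draw loop never touches the staging memory; it only needs an offset and a size per object.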

At first I was a bit frustrated because I am used to the Effects of the D3D-Sdk, but after reading the presentation about batched buffer updates it seems a D3D application can also benefit from doing it this way. So the architecture can be the same for both APIs.

 

@richardurich:

"Were updates far enough apart to avoid conflicts (can be perf hit if call A writes to first half of block A, and call B writes to second half of block A)?"

Can you explain this a bit more? Are you saying that it is not good to write to the first half of the buffer and then to the second half, although the ranges don't intersect?


@mhagain:

Thanks, that answers my question!

Have you tried this architecture that works well for OpenGL with a D3D backend? Will it also work well there? At least that would make it less annoying that this is apparently a driver weakness.

 

I'm indeed already iterating over each object twice, but I'm wondering whether I now have to do it a third time, because the first iteration is followed by a sort which could tell me when I don't need to update the per-material buffers, for instance.

 

I'm also wondering about another thing. Is it very important that the uniform buffer is large enough to fit every single object drawn per frame, or can you achieve good performance with a buffer that is large enough to contain the data for some objects before the data has to be changed? To me it sounds like that should already help, but then again the handling of uniform buffers shouldn't be so hard in the first place.


@B_old:

A different way to look at the concept I'm talking about is don't put uniforms for group A at offset 0 and uniforms for group B at offset 12 (a single vec3). If you're updating an odd amount of data like 1 vec3, pad out to the next 64B boundary for the start of object B. Do not trust my 64 byte number there. I have no idea if 64 bytes is the right value. I feel like a single 4x4 matrix was the right size, but I could easily be remembering wrong and it could have changed anyways.

 

I don't know which calls or vendors took the hit, just the end result was X-byte align each group of uniforms. You'll also have to test that this is better performance since it may hurt you if you are bandwidth starved.

 

I know I've said it before, and I'm sure you already know it, but make sure you are bottlenecking on something this fixes before wasting a lot of time tuning the performance. And make sure you're late enough in development that the perf data you collect is going to be right. Unless you're familiar with coding something for development that also accounts for future optimizations (var for group alignment, one for map alignment, etc.), you're best off delaying this type of coding as long as humanly possible. For example, you want to 1-byte align during development so you catch any bugs where you stomp past your buffer instead of hiding the bug due to padding.
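The "var for group alignment" idea can be sketched as a small layout helper. This is only an illustration under stated assumptions: `layoutGroups` is a hypothetical name, the alignment is a tunable to be measured and not a known-good value, and any range that is actually bound with glBindBufferRange still has to respect GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT regardless of what is chosen here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the start offset of each uniform group when groups are packed in
// order and every group starts on a multiple of groupAlignment.
// groupAlignment == 1 packs tightly (handy for catching overruns during
// development); larger values add padding between groups.
inline std::vector<std::size_t> layoutGroups(
    const std::vector<std::size_t>& groupSizes, std::size_t groupAlignment)
{
    assert(groupAlignment > 0);
    std::vector<std::size_t> offsets;
    offsets.reserve(groupSizes.size());
    std::size_t cursor = 0;
    for (std::size_t size : groupSizes) {
        // Pad the cursor up to the next aligned boundary if needed.
        const std::size_t remainder = cursor % groupAlignment;
        if (remainder != 0) {
            cursor += groupAlignment - remainder;
        }
        offsets.push_back(cursor);
        cursor += size;
    }
    return offsets;
}
```

For group sizes of 12, 64 and 16 bytes, a 64-byte alignment places the groups at offsets 0, 64 and 128, while an alignment of 1 places them at 0, 12 and 76, so switching between the two is a one-variable change.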


@richardurich:

OK, I think I understand. If we are thinking about the same thing, calls to glBindBufferRange actually have to use an offset that is a multiple of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, which on my machine (and I think on many other cards) is 256.
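To make the cost of that alignment concrete, here is a small arithmetic sketch. The 256 used in the usage note is just the value reported above for one machine; the real value would come from `glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, &value)`. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Smallest multiple of alignment that is >= payloadBytes.
inline std::size_t alignedStride(std::size_t payloadBytes, std::size_t alignment) {
    assert(alignment > 0);
    return (payloadBytes + alignment - 1) / alignment * alignment;
}

// True when offset is a multiple of the required offset alignment.
inline bool isAlignedOffset(std::size_t offset, std::size_t alignment) {
    assert(alignment > 0);
    return offset % alignment == 0;
}

// Total buffer size needed when every object gets its own aligned range.
inline std::size_t requiredBufferBytes(std::size_t objectCount,
                                       std::size_t payloadBytes,
                                       std::size_t alignment) {
    return objectCount * alignedStride(payloadBytes, alignment);
}
```

With a 64-byte float4x4 per object and an alignment of 256, each object occupies a 256-byte slot, so 1K draw calls need 262144 bytes and 4K draw calls need 1048576 bytes, three quarters of which is padding.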

 

Just now I tried to evaluate if the uniform updates are a bottleneck in my case. For this test I stripped down the rendering pipeline as much as I could, regarding OpenGL interaction. I simulated the performance of the "optimized" uniform updates by replacing glBufferSubData(float4x4) with glBindBufferRange().

I compared the two approaches with 1K and 4K draw calls for very simple geometry (same vb for every draw call) and could not see any noticeable difference.

 

I concluded that the optimized version could not possibly be faster than just calling glBindBufferRange() for every differently transformed object, which in turn means this is not my bottleneck. 

 

So has the driver situation improved or is my test/conclusion flawed?

Edited by B_old

If you slim down the OpenGL calls in the rendering pipeline so the driver is only busy 20% of the time, the frame rate won't change whether you increase that to 40% (twice as slow) or decrease it to 10% (twice as fast). You basically just guaranteed the driver has plenty of time to do all the memory management required, and that's mostly what you were trying to take off the driver's plate in the first place.

 

It sounds like you do not need to be worrying about this stuff yet, and may never need to worry about it.
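The argument can be illustrated with a deliberately simplified model, which is an assumption and not a measurement: driver work overlaps with whatever else limits the frame, so the frame takes as long as the slower of the two. `frameTimeMicros` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>

// Simplified model (an assumption): the frame time is set by whichever is
// slower, the driver-side work or the other bottleneck it overlaps with.
inline long frameTimeMicros(long driverMicros, long otherBottleneckMicros) {
    assert(driverMicros >= 0 && otherBottleneckMicros >= 0);
    return std::max(driverMicros, otherBottleneckMicros);
}
```

If something else takes 10000 microseconds per frame, driver costs of 1000, 2000 or 4000 microseconds all produce the same frame time under this model; the uniform update strategy only becomes visible once the driver cost exceeds the other bottleneck.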
