Nemo Persona

Member
  • Content Count: 13
  • Community Reputation: 268 Neutral

About Nemo Persona

  • Rank: Member
  1. Nemo Persona

    Researching render loops for OpenGL 4.0+

    No, they have their own buffer and the shaders get read access to it (as a uniform buffer object). After that you pass an id for the material and/or textures to use with every command in the command buffer. The ids are stored in buffers as well and passed along to the shaders as vertex attributes (use a divisor so they only change per command and not per vertex). All those buffers can be bound in the vertex array object, so you don't need to bind/unbind in sub loops.

    The texture size needs to be the same in a default texture array; this can be a problem in some cases (not for me atm). In theory this can be solved using sparse textures (or bindless textures; might need to expand on that and reference the info on them).

    Proof of concept is still in the pipeline, if I find the time to finish it.
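    To make the per-command id idea above concrete, here is a minimal C++/OpenGL sketch of feeding an id buffer to the shader as an integer vertex attribute with a divisor. The buffer, attribute location and GLEW-style loader are illustrative assumptions, not taken from the post.

        #include <GL/glew.h>   // any GL 3.3+ loader works; GLEW is an assumption

        // Sketch: one GLuint material/texture id per draw command lives in its own
        // buffer and reaches the shader as an integer vertex attribute whose divisor
        // makes it advance per instance/command instead of per vertex.
        void setupPerCommandIdAttribute(GLuint vao, GLuint idBuffer, GLuint attribLocation)
        {
            glBindVertexArray(vao);                    // the setup below is recorded in the VAO
            glBindBuffer(GL_ARRAY_BUFFER, idBuffer);   // buffer holding the ids

            glEnableVertexAttribArray(attribLocation);
            glVertexAttribIPointer(attribLocation, 1, GL_UNSIGNED_INT,
                                   sizeof(GLuint), (void*)0);
            glVertexAttribDivisor(attribLocation, 1);  // advance per instance, not per vertex

            glBindVertexArray(0);
        }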
  2. Nemo Persona

    Researching render loops for OpenGL 4.0+

    For a hobby project I had to look at OpenGL render loops, so here is a quick recap of the main things I pulled out of a day of reading and searching. The best hint I found was a talk from GDC 2014, linked down below. Also note that all code here is just pseudo code; I plan to actually implement the loop in the coming days.

    Tutorial Loop

    Most intro tutorials will teach you a render loop like the one below. It is the most basic loop you can have to draw elements to the screen and works well to explain OpenGL behavior, but it is useless in production code. Pseudo code:

        for each (object(s))
        {
            glBind(object.shader);
            glBind(object.textures);
            glBind(object.indexBuffer);
            glBind(object.vertexBuffer);
            glBind(object.normalBuffer);
            glUniform(object.info)
            glDraw(object.data)
        }

    A decent tutorial will at some point replace the individual buffer bindings with a vertex buffer object, removing some of the OpenGL calls in the loop above. But that is still far from optimal; to optimize this loop you have to realize that state changes in OpenGL are slow and need to be minimized. (Note: almost every OpenGL call is a state change or requires a state to be changed while executing.)

    Real render loop

    In production code your render loop will look more like the one below: it groups all OpenGL state changes and executes the tasks that require the same state in a sub loop. This reduces the number of state changes dramatically and leaves you with more time to actually draw objects to the screen. A typical render loop in pseudo code:

        for each (shader)                       // Shader program
            glBind(shader)
            for each (material setup)           // Program configuration:
                glBind(material.textures)       // - set up texture bindings
                glUniforms(material.info)       // - set up program uniforms
                for each (vertex buffer)        // Vertex buffer object
                    glBind(vertex buffer)
                    for each (object(s))        // Objects inside the buffer
                    {
                        glUniform(object.info)  // Set object uniforms
                        glDraw(object.data)     // Draw the object
                    }

    It is nice and looks like the loop most OpenGL programmers have been using for years. But with changes in hardware and the move to concurrent programming this loop needs to change. So let's dive in and look at the changes required!

    The inner loop:

        for each (object(s))
        {
            glUniform(object.info)
            glDraw(object.data)
        }

    This loop interacts with the OpenGL driver at every iteration; these calls are not thread safe and may need to be synchronized by multi threaded drivers. Let's improve this bit and use one or more uniform buffers in combination with a draw command buffer, which allows us to submit multiple draw commands in one draw call. Optimized:

        for each (object(s))
        {
            uniforms = object.info
            commands = object.data
        }
        glUniformBuffer(uniforms)
        glDrawIndirect(commands)

    You now have full control over the inner loop and could split it up into multiple threads without having to worry about driver synchronization. On top of that, the program only interrupts the driver once by submitting all the data in one call.
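    To make the optimized inner loop concrete, here is a minimal C++/OpenGL sketch of batching per-object data into a uniform buffer and submitting all draw commands with a single glMultiDrawElementsIndirect call (OpenGL 4.3). The command struct layout follows the OpenGL spec; the per-object uniform block and buffer names are illustrative assumptions, not code from the post.

        #include <GL/glew.h>
        #include <vector>

        // Layout required by the OpenGL spec for glMultiDrawElementsIndirect.
        struct DrawElementsIndirectCommand
        {
            GLuint count;          // number of indices for this draw
            GLuint instanceCount;  // usually 1
            GLuint firstIndex;     // offset into the index buffer
            GLuint baseVertex;     // offset added to each index
            GLuint baseInstance;   // handy as a per-draw id
        };

        struct ObjectUniforms { float modelMatrix[16]; };  // illustrative per-object data

        // Upload all per-object data and all draw commands, then draw in one call.
        // Assumes the VAO with index/vertex buffers is already bound and the shader
        // declares a uniform block at binding point 0.
        void submitBatch(const std::vector<DrawElementsIndirectCommand>& commands,
                         const std::vector<ObjectUniforms>& uniforms,
                         GLuint uniformBuffer, GLuint commandBuffer)
        {
            glBindBuffer(GL_UNIFORM_BUFFER, uniformBuffer);
            glBufferData(GL_UNIFORM_BUFFER,
                         uniforms.size() * sizeof(ObjectUniforms),
                         uniforms.data(), GL_DYNAMIC_DRAW);
            glBindBufferBase(GL_UNIFORM_BUFFER, 0, uniformBuffer);

            glBindBuffer(GL_DRAW_INDIRECT_BUFFER, commandBuffer);
            glBufferData(GL_DRAW_INDIRECT_BUFFER,
                         commands.size() * sizeof(DrawElementsIndirectCommand),
                         commands.data(), GL_DYNAMIC_DRAW);

            glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT,
                                        nullptr, (GLsizei)commands.size(), 0);
        }

    A real renderer would also have to respect GL_MAX_UNIFORM_BLOCK_SIZE (or use a shader storage buffer instead) and match the block layout declared in the shader.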
    The outer loop:

        for each (material setup)
        {
            glUniforms(material.config)
            glTextures(material.textures)
            ... (draw)
        }

    Again, you interact with the driver for every possible material configuration. Just like the inner loop, we can optimize this part by using buffers to store the configuration and allowing the shader to access those buffers. Of course this requires you to pass an index or handle along with each object that needs to be drawn, so that the shader knows which configuration to use for a given object. So let's optimize again:

        for each (material setup)
        {
            matUniforms = material.config
            matTextures = material.textures
        }
        glUniformBuffer(matUniforms)
        glTextureArray(matTextures)
        ... (draw with id)

    And again you get full control of the loop and remove a lot of interaction with the driver. Also note that this sub part of the loop is no longer nested inside the material loop, giving you the option to create bigger vertex buffers where possible and further reduce the final draw call count.

    Modern render loop

    If you put this together you have the base for a more modern loop that is already a lot better, but we can still dig deeper. Due to the changes in the structure of the loop you can optimize even more. Base:

        for each (shader)
            glBind(shader)
            for each (material setup)
            {
                matUniforms = material.config
                matTextures = material.textures
            }
            glUniformBuffer(matUniforms)
            glTextureArray(matTextures)
            for each (vertex buffer)
                glBind(vertex buffer)
                for each (object(s))
                {
                    uniforms = object.info
                    commands = object.data
                }
                glUniformBuffer(uniforms)
                glDrawIndirect(commands)

    At this point you can pull the buffer updates out of the render loop and use an "is dirty" flag to update them only when needed. (Note: the pseudo code is not using data oriented design to avoid branching in the loops; look that one up.) Optimized:

        for each (shader)
            glBind(shader)
            if (shader.isDirty)
                UpdateMaterialBuffer()
            glUniformBuffer(matUniforms)
            glTextureArray(matTextures)
            for each (vertex buffer)
                glBind(vertex buffer)
                if (vertex buffer.isDirty)
                    UpdateObjectBuffer()
                glUniformBuffer(uniforms)
                glDrawIndirect(commands)

    And last but not least, due to the new structure it now becomes possible to move the buffer bindings into the vertex array object, so that all binds are stored on the driver side and communication with the driver in the render loop is reduced even further. Optimal:

        for each (shader)
            glBind(shader)
            if (shader.isDirty)
                UpdateMaterialBuffer()
            for each (vertex buffer)
                glBind(vertex buffer)
                if (vertex buffer.isDirty)
                    UpdateObjectBuffer()
                glDrawIndirect(commands)

    And all was good ...

    Conclusion

    The end result is a simple loop that requires almost no state changes. When using OpenGL 4.4, persistent mapped buffers mean no state changes are needed to update the buffers, but we do need to be careful and make sure to synchronize when updating. (Double buffering and glSync objects will be needed, especially in a multi threaded environment; more on that later.) A simple fallback from OpenGL 4.4 to 4.0 is possible by issuing single indirect draw calls on the buffers. The performance penalty for this fallback is extremely high (think a 5x to 10x reduction), so the fallback also requires you to not only lower the LOD on objects but also to drop the number of objects being rendered. (Drop target examples: rocks, grass, plants, decorations, particle effects, ...)
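    As a rough illustration of the persistent mapped buffers mentioned in the conclusion (OpenGL 4.4 / ARB_buffer_storage), here is a minimal sketch of creating one and updating it with a fence guarding the write. The single-buffered, blocking scheme shown here is a simplification; as noted above, real code would double buffer.

        #include <GL/glew.h>
        #include <cstring>

        struct PersistentBuffer
        {
            GLuint     id   = 0;
            void*      ptr  = nullptr;
            GLsizeiptr size = 0;
        };

        // Create an immutable buffer and map it once, for good (OpenGL 4.4).
        PersistentBuffer createPersistentBuffer(GLsizeiptr size)
        {
            PersistentBuffer buf;
            buf.size = size;
            glGenBuffers(1, &buf.id);
            glBindBuffer(GL_UNIFORM_BUFFER, buf.id);

            const GLbitfield flags = GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT;
            glBufferStorage(GL_UNIFORM_BUFFER, size, nullptr, flags);
            buf.ptr = glMapBufferRange(GL_UNIFORM_BUFFER, 0, size, flags);
            return buf;
        }

        // Per-frame update: a plain memcpy, no map/unmap calls. The fence makes sure
        // the GPU is done reading before we overwrite the data.
        void updatePersistentBuffer(PersistentBuffer& buf, GLsync& fence,
                                    const void* data, size_t bytes)
        {
            if (fence)
            {
                while (glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT, 1000000) == GL_TIMEOUT_EXPIRED) {}
                glDeleteSync(fence);
                fence = nullptr;
            }
            std::memcpy(buf.ptr, data, bytes);
            // After submitting the draws that read this buffer, place a new fence:
            fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
        }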
    That is it for now, happy hunting ;)

    Techniques to use:

    * Texture arrays (OpenGL 3.0): use texture arrays to pack multiple textures.
    * Multi draw indirect (OpenGL 4.3): use draw commands to pack multiple draw calls. (Fallback: a draw call for every object in a buffer, v4.0.)
    * Persistent mapped buffers (OpenGL 4.4): use persistent mapped buffers to avoid having to map/unmap whenever we add, remove or change an object in the scene graph. (Fallback: use map/unmap and/or SubData to update, v1.2 / v1.5.)

    Ref:
    http://gdcvault.com/play/1020791
    http://www.slideshare.net/CassEveritt/approaching-zero-driver-overhead
    https://www.khronos.org/assets/uploads/developers/library/2014-gdc/Khronos-OpenGL-Efficiency-GDC-Mar14.pdf
    https://www.opengl.org/wiki/GLAPI/glDrawArraysIndirect
    https://www.opengl.org/wiki/GLAPI/glDrawElementsIndirect
    https://www.opengl.org/wiki/GLAPI/glMultiDrawArraysIndirect
    https://www.opengl.org/wiki/GLAPI/glMultiDrawElementsIndirect
  3. Nemo Persona

    The Poor Man's Voxel Engine

    "The voxel format is simply a 3D array represented as an XML string of ASCII 1s and 0s"   So basicly you use xml to store 0s and 1s, why can't I stop smiling ? ..... oh dear old 2010 where people still loved xml I miss you <3
  4. But why not simply use 64-bit integers, where "universe scale" is measured in units of, say, 1,000 km and "solar system scale" is measured in units of nanometers? This is, unlike floating point, perfectly deterministic, with no rounding issues, no surprises, no weird special cases, and uniformly distributed precision. Nanometer resolution should be enough for everybody. Outside a solar system it doesn't matter whether you're off by a few thousand kilometers, since the next closest thing is at about 10^12 kilometers, so there's no observable difference and no meaningful way of travelling other than with some hypothetical faster-than-light speed. Hence a 1,000 km resolution is mighty fine, too.

    I second this, or use a fixed point system if you really must have fractional units.

    I cannot speak for the other guy, but in my case "the stars" you see are not static objects; some (a lot) of them need to be animated. When I tried to do a smooth animation with integers the end result was either too slow and/or had visible stutter depending on frame rate. (I'm sure fixed point math would just be too slow.) For static objects, yes, integers could work, but then you have an extra system for non-static objects and that would just complicate things even more.
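    For what it's worth, a minimal sketch of the 64-bit integer / fixed-point idea being debated; the millimetre unit and the sub-unit accumulator are illustrative assumptions, not something either poster specified.

        #include <cstdint>

        // Sketch: 64-bit integer world position in millimetres (about +/- one
        // light-year of range at 1 mm resolution; the unit is an arbitrary choice).
        struct Position64
        {
            int64_t x_mm, y_mm, z_mm;
        };

        // Smooth movement: accumulate the fractional part separately so slow objects
        // still advance even when the per-frame step is below one unit.
        struct Mover
        {
            Position64 pos{};
            double fracX = 0, fracY = 0, fracZ = 0;  // sub-millimetre remainders

            void advance(double vx_mm_s, double vy_mm_s, double vz_mm_s, double dt)
            {
                fracX += vx_mm_s * dt;  fracY += vy_mm_s * dt;  fracZ += vz_mm_s * dt;
                const int64_t sx = (int64_t)fracX, sy = (int64_t)fracY, sz = (int64_t)fracZ;
                pos.x_mm += sx;  pos.y_mm += sy;  pos.z_mm += sz;
                fracX -= sx;     fracY -= sy;     fracZ -= sz;   // keep only the remainder
            }
        };

    The remainder accumulation is the usual way to keep integer positions moving smoothly when the per-frame step is smaller than one unit, which is the stutter problem described above.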
  5. Nemo Persona

    C# .Net Open Source

      Doubtful. They have their own branch of Mono and it has not been synced with the main branch for years; as a result it is not compatible with up-to-date versions of MS .NET or Mono. If there were an easy fix to get away from the 'stop the world' garbage collector in their version, they would have updated a long time ago ... (Note: I have not looked at Unity 5, so I might be outdated.)

      Not a big fan of Java or C#, but let's hope this means we are one step closer to ditching the security issues caused by Java, or that it forces Oracle to step up and fix their broken platform/patching schedule.
  6. Most important things have been said, but ... you also need to ask yourself: what does the CPU do when a cache miss happens? Short answer: most of them will just spin and wait, which results in wasted CPU cycles. The table below gives you an idea of those cycles and the average time required on most modern desktop systems:

    [attachment=24330:cpu-cach-info.png]
    * Screenshot from an article I'm working on.
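    A minimal sketch of how to make that cost visible: walk the same data sequentially and then in a shuffled order, and compare the timings. The sizes and the timing setup are illustrative assumptions, not taken from the attached table.

        #include <algorithm>
        #include <chrono>
        #include <cstdint>
        #include <cstdio>
        #include <numeric>
        #include <random>
        #include <vector>

        int main()
        {
            const size_t n = 1 << 24;                  // 64 MB of int32, far bigger than any cache
            std::vector<int32_t> data(n, 1);

            std::vector<size_t> order(n);
            std::iota(order.begin(), order.end(), 0);

            auto run = [&](const char* label)
            {
                const auto t0 = std::chrono::steady_clock::now();
                int64_t sum = 0;
                for (size_t i : order) sum += data[i]; // the index pattern decides hits vs. misses
                const auto t1 = std::chrono::steady_clock::now();
                std::printf("%-10s sum=%lld  %lld ms\n", label, (long long)sum,
                            (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count());
            };

            run("sequential");                         // mostly cache hits / hardware prefetch
            std::shuffle(order.begin(), order.end(), std::mt19937_64{42});
            run("random");                             // mostly cache misses
            return 0;
        }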
  7. Nemo Persona

    Bad code or usefull ...

    Thanks for the recap. I kind of did, but I refuse to accept that this is good code unless you perform bounds checking in release code, and then you get the performance loss. (The quote below is a good reply as to why this is acceptable in game libs.)

    I actually have over 10 years of experience as a freelancer in C, C++ and, sadly, C#; work still comes to me and I don't have to look for it, so I probably don't suck too badly at it. Define "non-trivial"? The work is related to old custom DBs and networking tools that are not trivial, mostly small Unix-like tools that could be considered a trivial code base, so you are right to some degree.

    Thanks, I kind of looked over those things; I still have mixed feelings, but it is clear I'm not going to convince people here.
  8. Nemo Persona

    Bad code or usefull ...

      If you are not OK with the idea of writing code that can write to or read from invalid memory, you shouldn't use C++. Given that the language and the standard library are written with the assumption that the programmer knows what he is doing, I don't see a problem with having a vector class that gives you access to coordinates by index: it's perfectly idiomatic and unlikely to be a problem in practice. People have already pointed out situations where you might want to use access by index. If you don't need it, don't provide an operator[]. But please don't make absurd analogies.

      It is not a bold claim, it is the conclusion I drew after reading most of the replies. As for the analogy, sorry if that offended people, but I stand by it. If you don't agree that is fine by me, but just because something is written and designed a certain way does not mean it is the correct way, and telling people to go away or to use something else because they don't share your viewpoint is counterproductive.

      The example is nice, but it does not solve the problem mentioned; it still leaves room for serious bugs in release builds and that is simply unacceptable to me. Prevention is still better than fixing, in my mind. (But that's just my belief.)
  9. Nemo Persona

    Bad code or usefull ...

    And that's why video games don't have bugs or crashes any more. What are you talking about!?! I haven't played a video game since Baldur's Gate that didn't have a bug or crash in some way! Don't even get me started on Skyrim.

    Exactly: why fix when you can prevent the bugs ... I don't have a lot more to add; it seems people are OK with the idea of writing code that can write to or read from invalid memory, and they trust Debug/QA to find all possible edge cases. I guess they're OK with smoking as well; god will protect them and doctors will fix the cancer when god fails. (Sad panda; I was hoping for more examples of bad code in game libs, but we end up with belief in god and promises of an easy cure.)
  10. Nemo Persona

    Bad code or usefull ...

    That is exactly what I suggest. It localizes the safety issues and minimizes the overall risk. If some spatial trees are the only reason to use the [] operator, then you would not really litter the entire code base with random casts; you could even implement a special private struct to make it cleaner. I'm still not convinced the benefits outweigh the risk, or that it is possible to make the operation secure without losing performance.

    You can simply add an 'assert(i < 3);' to the function. No impact on non-debug builds and clean error signaling in debug builds. What about non-debug builds? Are you sure your debug process is going to catch every edge case where the operation is used? I'm sorry, but to me this sounds like raw pointers all over again.
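    For reference, a minimal sketch of the assert-guarded accessor being discussed; the check only exists in debug builds (assert compiles away under NDEBUG), which is exactly the release-build gap raised above.

        #include <cassert>

        struct Vec3
        {
            float x, y, z;

            float& operator[](unsigned int i)
            {
                assert(i < 3 && "Vec3 index out of range");  // checked in debug builds only
                return (&x)[i];  // relies on x, y, z being laid out contiguously
            }
        };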
  11. Nemo Persona

    Bad code or usefull ...

    OK, branching in this case is bad, but for specific cases like this it is still possible to cast locally and avoid the branching. By adding the operator to the class itself you create a potential problem that can be hidden anywhere in the code base (wherever a programmer feels like typing [i + 1] instead of getY()).

    It does not cause a performance hit, but there is no performance gain either; there is just the potential for bugs, or slower and/or more complex code in case you do decide to make it a secure, conforming function.

    Just relying on the fact that no one is supposed to be using the operator "unless in specific cases" sounds bad; code stays around for a long time but programmers come and go. Just like comments or docs saying "// Do not use unless really needed" make no sense at all: "needed" is undefined and/or defined differently by different people.

    And yes, there are some other questionable functions in a vector class, but this one has the highest risk factor of all those I can come up with.
  12. If you know other examples like this related to game libraries, I would love to hear about them.

    So I'm looking over maths libraries, and most of those I have seen implement something like the struct/class below:

        struct Vec3
        {
            float x, y, z;

            float& operator[](unsigned int i) { return (&x)[i]; }
        };

    Note: this has many variations (private members or an array, reinterpret_cast, ..., matrix representations with the same structure, ...), but they all result in the risk of writing or reading memory that does not belong to the class.

    My question is: why would anyone think this is good and acceptable code?

    - It is not faster than direct access or, in the worst case, an inline function call to access a private member.
    - On top of that, possible bugs resulting from the operator[] are hard to identify and/or track down unless you implement bounds checking, but that would slow things down even more. Then we argue bounds checking is debug-only; again, why risk the bugs in production code in the first place?

    Did I miss a memo somewhere? Is the 'operator[]' really that useful?

    Just a semi rant/question ... but I would love some other opinions on this.
  13. In a lot of cases I don't actually call the init method, usually when I know the required values are built in the next code block, thus avoiding an unneeded constructor/method call and/or stack allocation for temp values to pass on to the constructor/method. This could still be achieved with placement new, but then it becomes obfuscated and less obvious when reading through the code months later, so it boils down to personal preference I guess. (POD objects only; for objects with resources that need to be managed, following RAII is a must in my mind and placement new seems like a nice option.)
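    A minimal sketch of the placement-new option mentioned above, using a hypothetical Resource type; the manual destructor call is what RAII wrappers normally hide.

        #include <cstdlib>  // malloc / free
        #include <new>      // placement new

        struct Resource
        {
            explicit Resource(int id) : id(id) {}
            ~Resource() {}
            int id;
        };

        int main()
        {
            void* mem = std::malloc(sizeof(Resource));
            Resource* r = new (mem) Resource(42);  // run the constructor in pre-allocated memory
            r->~Resource();                        // the destructor must be called manually
            std::free(mem);
            return 0;
        }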
  14. My object pool allocates a data block using malloc(sizeof(Type) * count) and then uses the returned pointer as a standard array of objects. Using malloc avoids the constructor calls; since you are working with an object pool you usually implement init/create/delete methods to reset the objects, and that makes the constructor/destructor obsolete in most cases.

    As for invalidating pointers: keeping the memory block coherent does invalidate them, but since I'm using a data oriented design I don't rely on the pointer as an id for an object. To identify specific objects, every object pool has a secondary array with ids that is kept in sync with the data array, kind of like a key-value pair. This works because I almost never access the 'objects' based on their id due to the data oriented design; if you do access objects based on id, the lookups would be too slow.

    As mentioned before, std::vector reallocates objects and invalidates pointers as well; you can get around this by using a vector of pointers to objects. std::vector<Object*> will do the trick and keep things simple.

    ps. damn spell checker, why are you disabled >.<
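    A rough sketch of the kind of pool described above; the layout, names and id scheme are guesses at the described design, not the poster's actual code, and it is restricted to POD / trivially copyable types to match the malloc-based approach.

        #include <cassert>
        #include <cstdint>
        #include <cstdlib>

        // Pool in the style described above: one malloc'd block used as an array of
        // POD objects, plus a parallel id array kept in sync so objects can be
        // identified without relying on their (unstable) addresses.
        template <typename T>   // T must be trivially copyable; no constructors are run
        struct ObjectPool
        {
            T*        data     = nullptr;
            uint32_t* ids      = nullptr;
            size_t    count    = 0;
            size_t    capacity = 0;
            uint32_t  nextId   = 1;

            explicit ObjectPool(size_t cap) : capacity(cap)
            {
                data = static_cast<T*>(std::malloc(sizeof(T) * cap));
                ids  = static_cast<uint32_t*>(std::malloc(sizeof(uint32_t) * cap));
            }
            ~ObjectPool() { std::free(data); std::free(ids); }

            uint32_t create()                  // "init" the next free slot, hand back its id
            {
                assert(count < capacity);
                data[count] = T{};             // reset the slot instead of constructing
                ids[count]  = nextId++;
                return ids[count++];
            }

            void destroy(size_t index)         // keep the block coherent: move the last slot in
            {
                assert(index < count);
                --count;
                data[index] = data[count];
                ids[index]  = ids[count];
            }
        };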