Community Reputation

177 Neutral

About Grumple

  1. Hi,

     I can't find a direct answer to this anywhere, so hopefully I can get one here.

     I've implemented a relatively simple transform feedback shader that reads elements from one VBO and writes them to another. Then I call glGetBufferSubData() to read the results back to client memory.

     Now, I don't seem to have any trouble just executing my transform feedback draw and then reading back the VBO without an explicit glFinish in between, but I'm worried that I'm just getting lucky timing-wise. I don't want to run into issues with reading partially populated feedback buffers, etc.

     Does anyone know for certain whether I need a glFinish between the transform feedback draw call and the glGetBufferSubData() call that reads the output of the transform back to client memory?

     Thanks!
  2. Thanks a lot for this reply. I knew about the old/deprecated fixed-function feedback system, but didn't realize there was an official replacement for the shader world. I'll do some more reading before diving in, but it looks to be a great solution.

     I know the transformation is relatively cheap, but in my current implementation it happens in 6 render stages per model, with potentially thousands of models. I'm also going to be doing something similar for label rendering, but will need to be able to generate the NDC coord buffer and potentially read it back for de-clutter processing on the CPU. Having a shader stage that just populates an NDC coord buffer for readback/post-processing would be awesome.

     Sorry, but I don't quite follow you here...can you describe a bit more, or link some reading material?

     Thanks again!
  3. Hello,

     I am working on a problem where I want to render 3D objects in pseudo-2D by transforming to NDC coordinates in my vertex shader. The models I'm drawing have numerous components rendered in separate stages, but all components of a given model are based on the same single point of origin.

     This all works fine, but each vertex shader for the various stages of the model render redundantly transforms from Cartesian xyz to NDC coordinates prior to performing its work. Instead, I'd like to perform an initial conversion stage, populating a buffer of NDC coordinates, such that all vertex shaders can then just accept the NDC coordinate as input.

     I'm also looking to avoid doing this on the CPU, as I may have many thousands of model instances to work with.

     So, with an input buffer containing thousands of Cartesian positions, and an equal-sized output buffer to receive transformed NDC coordinates, what are my best options for performing the work on the GPU? Is this something I need to look to OpenCL for?

     Being fairly unfamiliar with OpenCL, I was thinking of looking into ways of setting things up so that the first component to be rendered for my models will 'know' it is first, have the vertex shader do the standard transform to NDC, and somehow write the results back to an 'NDC coord buffer'. All subsequent vertex shaders for the various model components would use the NDC coord buffer as input, skipping the redundant conversion effort.

     Is this reasonable?
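The conversion this post wants to do once per model is the usual matrix multiply followed by a perspective divide. A minimal CPU-side sketch of that math, for example to spot-check values read back from a GPU buffer, might look like the following. The names `Mat4`, `Vec3`, and `toNdc` are hypothetical, and the matrix is assumed to be a combined model-view-projection matrix in OpenGL's column-major layout.

```cpp
#include <array>
#include <cassert>

// Column-major 4x4 matrix (OpenGL convention): element (row, col) is m[col * 4 + row].
// Mat4, Vec3 and toNdc are illustrative names, not part of any library.
using Mat4 = std::array<float, 16>;

struct Vec3 { float x, y, z; };

// Multiplies the position (with an implicit w of 1) by the combined
// model-view-projection matrix, then divides by the clip-space w to obtain
// normalized device coordinates.
// Precondition: the resulting clip-space w is non-zero.
Vec3 toNdc(const Mat4& mvp, const Vec3& p) {
    const float cx = mvp[0] * p.x + mvp[4] * p.y + mvp[8]  * p.z + mvp[12];
    const float cy = mvp[1] * p.x + mvp[5] * p.y + mvp[9]  * p.z + mvp[13];
    const float cz = mvp[2] * p.x + mvp[6] * p.y + mvp[10] * p.z + mvp[14];
    const float cw = mvp[3] * p.x + mvp[7] * p.y + mvp[11] * p.z + mvp[15];
    assert(cw != 0.0f);
    return Vec3{cx / cw, cy / cw, cz / cw};
}
```

On the GPU the same two steps would run in a vertex shader whose output is captured into a buffer; this sketch only shows the arithmetic.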
  4. This is probably a silly question, but I've managed to get myself turned around and I'm second-guessing my understanding of instancing.

     I want to implement a label renderer using instanced billboards. I have a VBO of 'label positions', as 2D vertices, one per label. My 'single character billboard' is what I want to instance, and is in its own VBO. Due to engine/architectural reasons, I have an index buffer for the billboard, even though it is not saving me much in terms of vertex counts.

     For various reasons I still want to loop through individual labels for my render, but planned to call glDrawElementsInstanced, specifying my 'billboard model' along with the character count for a label.

     However, I can't see how to tell glDrawElementsInstanced where to start in the other attribute array VBOs for a given label. So, if I am storing a VBO of texture coords for my font, per character, how do I get glDrawElementsInstanced to start at the first texture coord set of the first character of the current label being rendered?

     I see that glDrawElementsInstancedBaseVertex exists, but I'm getting confused about what the base vertex value will do here. If my raw/instanced billboard vertices are at indices 0..3 in their VBO, but the 'unique' attributes of the current label start at element 50 in their VBO, what does a base vertex of 50 do? I was under the impression that it would just cause GL to try to load billboard vertices from index+basevertex in that VBO, which is not what I want.

     I guess to sum my question up: if I have an instanced rendering implementation, with various attribute divisors for different vertex attributes, how can I initiate an instanced render of the base model, but with vertex attributes starting from offsets into the associated VBOs, while abiding by the attribute divisors that have been set up?

     EDIT: I should mention, I've bound all the related VBOs under a Vertex Array Object. By default I wanted all labels to share VBO memory to avoid state changes, etc. It seems like there must be a way to render just N instances of my model starting at some mid-point of my vertex attrib arrays.
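To make the base vertex question concrete, here is a small sketch that models which array element GL fetches for an attribute, following the element-selection rules in the OpenGL 4.2+ specification. The function `fetchedElement` is a hypothetical helper written for illustration only. It assumes a non-negative base vertex (the real parameter is signed) and ignores everything except index selection.

```cpp
#include <cassert>
#include <cstdint>

// Models the element GL reads from an attribute array for one vertex of one
// instance (fetchedElement is an illustrative name, not a GL function).
//   divisor == 0: per-vertex attribute, element = indexValue + baseVertex
//   divisor  > 0: per-instance attribute, element = instance / divisor + baseInstance
// Note that baseVertex never affects attributes with a non-zero divisor.
std::uint32_t fetchedElement(std::uint32_t divisor,
                             std::uint32_t indexValue,   // value read from the index buffer
                             std::uint32_t instance,     // 0-based instance number
                             std::uint32_t baseVertex,   // assumed non-negative here
                             std::uint32_t baseInstance) {
    if (divisor == 0) {
        return indexValue + baseVertex;
    }
    return instance / divisor + baseInstance;
}
```

Under this model a base vertex of 50 shifts only the billboard's own vertices, which matches the impression described in the post. Starting the per-instance attributes at element 50 corresponds to a base instance of 50, which glDrawElementsInstancedBaseInstance exposes (core since OpenGL 4.2, or via ARB_base_instance). Where that is unavailable, one option is to re-specify the per-instance attribute pointers with a byte offset before each draw.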
  5. Update #2: Problem solved!

     For anyone encountering similar issues, it turns out some older ATI cards (and maybe newer ones) do NOT like vertex array attributes that are not aligned to 4-byte boundaries.

     I changed my color array to pass 4 unsigned bytes instead of 3, and updated my shader to accept vec4 instead of vec3 for that attribute, and everything now works as intended.

     Kind of a silly issue...but that is what I get for trying to cut corners on memory bandwidth, etc. =P
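A minimal sketch of the fix described above, assuming the colors start out as tightly packed RGB bytes. `ColorRgba8` and `padRgbToRgba` are hypothetical names, and the opaque alpha value of 255 is an assumption, since the post does not say what the fourth byte holds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One color per vertex, padded to 4 bytes so every element starts on a
// 4-byte boundary (illustrative type, not from the original code).
struct ColorRgba8 { std::uint8_t r, g, b, a; };
static_assert(sizeof(ColorRgba8) == 4, "each color must occupy exactly 4 bytes");

// Expands tightly packed RGB bytes to RGBA. The alpha of 255 is an assumed filler.
std::vector<ColorRgba8> padRgbToRgba(const std::vector<std::uint8_t>& rgb) {
    assert(rgb.size() % 3 == 0);
    std::vector<ColorRgba8> out;
    out.reserve(rgb.size() / 3);
    for (std::size_t i = 0; i < rgb.size(); i += 3) {
        out.push_back(ColorRgba8{rgb[i], rgb[i + 1], rgb[i + 2], 255});
    }
    return out;
}
```

The padded array would then be described to GL as 4 components of GL_UNSIGNED_BYTE with normalization enabled and a stride of `sizeof(ColorRgba8)`.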
  6. Update:

     I still haven't figured out the root of the issue, but as a test I have switched to using floats for my color attribute instead of GL_UNSIGNED_BYTE. My unsigned byte colors were being passed in the range 0..255 with normalized set to GL_TRUE, and the floats are passed in the range 0..1.0 with normalized set to GL_FALSE. Without really changing anything else, the problem goes away completely, so I am really suspicious of the ATI driver...

     Is anyone else seeing issues using multiple glDrawElements calls from a single bound VAO containing unsigned-byte color vertex attributes?
  7. Hello,

     I'm running out of ideas trying to debug an issue with a basic line render in the form of a 'world axis' model.

     The idea is simple:

     • I create a VAO with float line vertices (3 per vertex), int indices (1 per vertex), and unsigned byte colors (3 per vertex).
     • I allow room and pack the arrays such that the first 12 vertices/indices/colors are for uniquely colored lines representing my +/- world axes, followed by a bunch of lines forming a 2D grid across the XZ plane.

     Once the data is loaded, I render by binding my VAO, activating a basic shader, then drawing the model in two stages. One glDrawElements call is made for the axis lines after glLineWidth is set to 2, and the grid lines are drawn through a separate glDrawElements call with thinner lines.

     Whenever I draw this way, the last 6 lines of my grid (i.e. the end of the VAO arrays) show up as random colors. However, the lines themselves are correctly positioned, etc.

     If I just do one glDrawElements call for all lines (i.e. world axis and grid lines at once), then the entire model appears as expected, with correct colors everywhere.

     This is only an issue on some ATI cards (e.g. Radeon Mobility 5650); it works on NVidia with no problem.

     I can't see what I would have done wrong if the lines are appearing fine (i.e. my VAO index counts/offsets must be OK for glDrawElements), and I don't see how I could be packing the data into the VAO wrong if the lines appear correctly via a single glDrawElements call instead of two calls separated by changes to glLineWidth().

     Any suggestions? glGetError, etc. return no problems at all...

     Here is some example render code, although I know it is just a small piece of the overall picture.

     This causes the problem:

         TFloGLSL_IndexArray *tmpIndexArray = m_VAOAttribs->GetIndexArray();

         // The first AXIS_MODEL_BASE_INDEX_COUNT elements are for the base axis; draw these thicker.
         glLineWidth(2.0f);
         glDrawElements(GL_LINES, AXIS_MODEL_BASE_INDEX_COUNT, GL_UNSIGNED_INT,
                        (GLvoid*)tmpIndexArray->m_VidBuffRef.m_GLBufferStartByte);

         // The remaining elements are for the base grid; draw these thin.
         int gridLinesElementCount = m_VAOAttribs->GetIndexCount() - AXIS_MODEL_BASE_INDEX_COUNT;
         if (gridLinesElementCount > 0)
         {
             glLineWidth(1.0f);
             glDrawElements(GL_LINES, gridLinesElementCount, GL_UNSIGNED_INT,
                            (GLvoid*)(tmpIndexArray->m_VidBuffRef.m_GLBufferStartByte + (AXIS_MODEL_BASE_INDEX_COUNT * sizeof(int))));
         }

     This works just fine:

         glDrawElements(GL_LINES, m_VAOAttribs->GetIndexCount(), GL_UNSIGNED_INT,
                        (GLvoid*)tmpIndexArray->m_VidBuffRef.m_GLBufferStartByte);
  8. Your link might be a workable option, but having read a bit more I think I might be confusing people with my description.

     I think the real compatibility issue is more general: accessing a uniform array from within the fragment shader using a non-constant index.

     The index variable I am using is originally received in the vertex shader as an attribute (in) and passed to the fragment shader (out). The fragment shader then uses it to index into the uniform array of texture samplers.

     What I've found in hindsight is that GLSL 330 doesn't like using any form of variable as an index into said uniform array, even though NVidia seems to allow it. =/
  9. Hello,

     After a lot of programming and debugging I feel like a dumb ass.

     I set up an entire billboard shader system around instancing, and as part of that design I was passing a vertex attribute representing the texture unit to sample from for a given billboard instance.

     After setting all of this up I was getting strange artifacts, and found that GLSL 330 only officially supports a constant (compile-time) index into a uniform sampler2D array?

     Is there any nice way around this limitation short of compiling against a newer version of GLSL? Is there at least a way to check, through an extension or something, whether the local driver supports using a vertex attribute as the sampler index? I tested my implementation on an NVidia card and it worked despite the spec, but ATI (as usual) seems more strict.

     For now I have patched the problem by manually branching *shudder* on the index and accessing the equivalent sampler via a constant in my fragment shader.

     For example:

         if(passModelTexIndex == 0)
         {
             fragColor = texture2D(texturesArray_Uni[0], passTexCoords);
         }
         else if(passModelTexIndex == 1)
         {
             fragColor = texture2D(texturesArray_Uni[1], passTexCoords);
         }
         else
         {
             fragColor = vec4(0.0, 0.0, 0.0, 1.0);
         }
  10. Hi MJP,

      Thanks for the response! I see what you are saying and I think it makes complete sense. Being still somewhat new to shaders, I forget that view frustum clipping doesn't happen prior to my vertex shader stage...for some reason I was assuming that GL would throw away my 3D vertices once I had set up an ortho viewport/pixel-based coord system.

      Sometimes it is easy to fall back into a fixed-function mentality. =P

      Thanks!
  11. Hello,

      I've got an old billboard rendering implementation in fixed-function OpenGL.

      The billboards technically represent locations in 3D space, but I want them to render at the same pixel dimensions no matter how far away they are in my 3D (perspective projection) scene.

      The way I handle this in the current (ancient) implementation is as follows:

      • Render my scene in standard 3D perspective projection without my billboards.
      • Switch to orthographic projection.
      • Loop through the billboards client-side, transforming their 3D positions into 2D screen coords.
      • Render all billboards as 2D objects with a constant size in pixels.

      This is horrible for a number of reasons, the first being that it uses the oldest of old GL functionality. I'm switching to shaders and using OpenGL 'instancing' as the basis for the update.

      One of my main goals is to eliminate the client-side step of projecting the billboards into 2D screen space and rendering them via a second pass in an orthographic projection.

      As far as I can tell, the only way to render the billboards within the perspective projection, while having them all maintain a fixed pixel size on screen, is to resize each billboard dynamically from within the vertex shader, based on distance from the eye.

      Is the above assumption correct? Is there any simpler way to go about this that eliminates the client-side transforms/orthographic render stage while maintaining constant pixel dimensions for billboards in a 3D scene?

      For what it's worth, I'm intending to render the shader-based billboards using OpenGL 'instancing', such that I can store a vertex array with each point representing a unique billboard, and my 'instance' data containing the base corner offsets.

      Thanks!
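One common way to get the distance-based resizing described above is to offset the billboard corner in clip space and scale that offset by the clip-space w, so the later perspective divide cancels it out. The sketch below shows only that arithmetic on the CPU. `Vec2`, `Vec4`, `offsetByPixels`, and `clipToWindowX` are hypothetical names, and the viewport is assumed to start at window x = 0.

```cpp
#include <cassert>

struct Vec2 { float x, y; };
struct Vec4 { float x, y, z, w; };

// Offsets a billboard centre (already transformed to clip space) by a corner
// offset given in pixels. NDC spans 2 units across the viewport, so one pixel
// is 2 / viewportSize NDC units; multiplying by w pre-compensates for the
// perspective divide, keeping the on-screen size constant at any depth.
Vec4 offsetByPixels(const Vec4& centerClip, const Vec2& cornerPixels,
                    const Vec2& viewportPixels) {
    assert(viewportPixels.x > 0.0f && viewportPixels.y > 0.0f);
    const float ndcPerPixelX = 2.0f / viewportPixels.x;
    const float ndcPerPixelY = 2.0f / viewportPixels.y;
    return Vec4{centerClip.x + cornerPixels.x * ndcPerPixelX * centerClip.w,
                centerClip.y + cornerPixels.y * ndcPerPixelY * centerClip.w,
                centerClip.z,
                centerClip.w};
}

// Maps a clip-space position to a window-space x coordinate in pixels,
// assuming the viewport origin is at x = 0.
float clipToWindowX(const Vec4& clip, float viewportWidth) {
    assert(clip.w != 0.0f);
    return (clip.x / clip.w * 0.5f + 0.5f) * viewportWidth;
}
```

In a vertex shader the same offset would be added to the projected centre position before it is written out, using the per-instance corner offsets the post mentions.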
  12. No, I don't think I am inadvertently reading from the mapped write buffer, although that is a really good tip. I'll have to remember that for the future.

      I've been thinking about my implementation and might have a few troublemaker candidates.

      The first is that at the end of any given render frame, I map a buffer and allow sub-threads to write to it for (up to) the duration of the frame. Then, at the end of the next render frame, if writing is signaled as complete, I unmap it and immediately call glTexSubImage2D to ask OGL to start transferring its contents to texture(s).

      I wonder if I should be deferring this and allowing some time (i.e. 1 frame) between my unmap and my call to glTexSubImage2D? I had assumed OGL would handle this nicely on its own, but now I'm not so sure how the texture access is handled. This leads me to my next question...

      Has anyone tried creating multiple sets of destination textures and copying to/rendering from a different set each frame? I wonder if I could see an improvement in performance if I mirror my 'back buffer' PBOs with 'back buffer' textures?

      For example:

      Init:
      - Init texture_A through PBO_A

      Frame 1:
      - Render texture_A
      - Map PBO_B, copy data into it, unmap it
      - Init transfer to texture_B

      Frame 2:
      - Render texture_B
      - Map PBO_A, copy data into it, unmap it
      - Init transfer to texture_A

      Frame 3:
      - Repeat Frame 1

      Any thoughts?
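The alternating schedule in the example above reduces to picking a set by frame parity. A minimal sketch of that bookkeeping, with no GL calls, where set 0 stands for texture_A/PBO_A and set 1 for texture_B/PBO_B (`FramePlan` and `planFrame` are hypothetical names):

```cpp
#include <cassert>
#include <cstdint>

// Which set a frame renders from and which set it uploads into
// (0 = texture_A / PBO_A, 1 = texture_B / PBO_B). Illustrative type only.
struct FramePlan {
    int renderSet;  // texture set sampled by this frame's draw calls
    int uploadSet;  // PBO/texture set filled during this frame for the next one
};

// Frames are 1-based to match the schedule in the post.
FramePlan planFrame(std::uint64_t frame) {
    assert(frame >= 1);
    const int renderSet = static_cast<int>((frame - 1) % 2);
    return FramePlan{renderSet, 1 - renderSet};
}
```

With this numbering, frame 1 renders set A while uploading into set B, frame 2 does the reverse, and frame 3 repeats frame 1.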
  13. I agree 100% that I would expect to get better performance passing 32-bit textures, but that just doesn't seem to be the case. Passing 24-bit (it was actually GL_BGR) texture data to glTexSubImage2D was faster than GL_BGRA. vOv

      I will review my texture formats, etc., but I am 99% certain I am using the OpenGL preferred formats everywhere.

      Also, I do allocate a RAM buffer each time I perform a video frame conversion in order to simplify my threading model (new frames are allocated and given to the main thread for processing, such that I never have to mutex access to a shared, persistent buffer). However, this alloc is very cheap, and unrelated to my glMapBuffer performance (PBOs are allocated once and persist). I'm timing my specific calls with a profiler, so I am 100% certain the GL calls are the point of slowdown.

      Related to all this, I did a new test last night where I created a ring buffer of PBOs (I tested creating 2-6 of them), and tried glMapBufferRange, as well as various combinations of the orphaning technique, etc.

      All I've managed to do is make my performance unpredictable. In any given run, glMapBuffer/glMapBufferRange would report taking an average of anywhere from 1 to 14 ms. As far as I can tell, it was random whether it executed quickly or not. I am still seeing stutter in my frame rate though, so at some level the data upload is still taking too long.

      At this point I can't imagine where my stall is coming from. Surely calling glMapBuffer or glMapBufferRange (with flags to discard and access without synchronization) on a PBO that hasn't been used in five frames should execute without a stall?! =P
  14. Hi BornToCode,

      I thought I had replied to this, but apparently it didn't commit for some reason. Thanks for the insight into your approach. I ended up doing the 'two large PBOs' method described above, and it worked reasonably well, but I ran into a bottleneck mapping a PBO for writes that eliminated most of my gains. Here is a new post I made regarding my updated implementation and current glMapBuffer woes:

      http://www.gamedev.net/topic/644241-glmapbuffer-slow-even-after-orphaning-old-buffer/

      I'm about to try a circular buffer of PBOs now to (hopefully) eliminate my stall, and will post my results in case they can help anyone else.
  15. Hi Promit,

      Thanks for the suggestions, but can you elaborate on what you mean by 'three to four' for double buffering using the orphaning approach?

      I'm starting to wonder if maybe I'm better off creating half a dozen PBOs and cycling through them without ever using the orphan call at all. The idea being that I create enough PBOs to guarantee that by the time I need to map PBO N, its contents are 5 frames old and hopefully not in use by the renderer. Any thoughts on this?
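The cycling idea in this post can be sketched as a small ring that hands out one slot per frame and reports how long ago that slot was last written. `PboRing` is a hypothetical helper that tracks only frame numbers. It does not call GL, and it cannot prove the GPU has finished with a buffer; a sync object such as one created with glFenceSync would be needed for that.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bookkeeping for a fixed-size ring of buffers, one slot mapped per frame
// (illustrative class, not part of any library).
class PboRing {
public:
    explicit PboRing(std::size_t slotCount) : lastWritten_(slotCount, -1) {
        assert(slotCount > 0);
    }

    // Slot to map for the given 0-based frame number.
    std::size_t slotFor(std::int64_t frame) const {
        assert(frame >= 0);
        return static_cast<std::size_t>(frame) % lastWritten_.size();
    }

    // Records a write for this frame and returns how many frames have elapsed
    // since the same slot was previously written, or -1 on its first use.
    std::int64_t write(std::int64_t frame) {
        const std::size_t slot = slotFor(frame);
        const std::int64_t previous = lastWritten_[slot];
        lastWritten_[slot] = frame;
        return previous < 0 ? -1 : frame - previous;
    }

private:
    std::vector<std::int64_t> lastWritten_;  // frame of last write per slot, -1 if never
};
```

With six slots, each buffer comes around again every six frames, so five other buffers are mapped in between.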