About kastro

  1. I have been working on implementing a D3D12 backend for my toy/self-education engine and have been reading and re-reading the MSDN programming guide (among other sources).  I understand most of the page that describes using a ring-buffer allocation strategy for managing an upload heap [1].  However, this one line has me scratching my head: "Note that, ring-buffer support is expected to be a popular scenario; however, the heap design does not preclude other usage, such as command list parameterization and re-use."  Can anyone explain what "command list parameterization and re-use" refers to?  Is that just the use of root parameters?  [1] https://msdn.microsoft.com/en-us/library/windows/desktop/dn899125(v=vs.85).aspx
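For what it's worth, a minimal CPU-side sketch of the ring-buffer strategy that page describes, assuming a fixed-size upload heap and plain integer fence values per frame (the class and names here are mine, and all of the actual ID3D12 resource/fence plumbing is omitted):

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <utility>

// CPU-side bookkeeping for ring-buffer suballocation from an upload heap.
// A frame's allocations are retired only once the GPU fence for that frame
// has been signaled, so in-flight data is never overwritten.
class UploadRing {
public:
    explicit UploadRing(uint64_t size) : size_(size) {}

    // Reserve `bytes` from the ring; returns false when the ring is full,
    // in which case the caller would wait on the oldest frame's fence.
    bool Alloc(uint64_t bytes, uint64_t& offset) {
        // If the request doesn't fit before the end of the heap, pad to wrap.
        uint64_t pad = (head_ + bytes > size_) ? (size_ - head_) : 0;
        if (used_ + pad + bytes > size_)
            return false; // would overwrite data the GPU may still be reading
        if (pad) { used_ += pad; frameUsed_ += pad; head_ = 0; }
        offset = head_;
        head_ += bytes;
        used_ += bytes;
        frameUsed_ += bytes;
        return true;
    }

    // Record how much this frame consumed, tagged with its fence value.
    void EndFrame(uint64_t fenceValue) {
        frames_.push_back({fenceValue, frameUsed_});
        frameUsed_ = 0;
    }

    // Release every frame whose fence the GPU has already signaled.
    void Retire(uint64_t completedFenceValue) {
        while (!frames_.empty() && frames_.front().first <= completedFenceValue) {
            used_ -= frames_.front().second;
            frames_.pop_front();
        }
    }

private:
    uint64_t size_, head_ = 0, used_ = 0, frameUsed_ = 0;
    std::deque<std::pair<uint64_t, uint64_t>> frames_; // (fence value, bytes)
};
```

In a real backend the returned offset would index into the persistently mapped upload heap and feed copies such as CopyBufferRegion.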
  2. Vertex buffer efficiency

    Thank you, both of you.  Your notes are very helpful.  That's very good advice.  While I have a very definite problem (I'm creating far too many vertex buffers for my streamed-in levels: one per mesh), I think I've been looking too generically for a solution.  Part of my problem was that I didn't really know my options, since my current approach is so poor and far from a solution.  I think I'm going to go with your approach of using immutable buffers, particularly for the streamed levels that are larger and/or expected to live longer.  For smaller, detail-providing levels, I may try out Matias' suggestion of using one or a few default buffers, mapping the data in via staging buffers.  At least this seems like a reasonable starting point based on both of your suggestions.  I'm trying to stick to what Niklas Frykholm of Bitsquid said here: http://www.gamasutra.com/view/news/172006/Indepth_Read_my_lips__No_more_loading_screens.php   Thanks again.
  3. Vertex buffer efficiency

    Thanks for the explanation.  While I wasn't focusing on that part, the transient buffers make more sense to me now from a synchronization standpoint.  When you talk about creating a default buffer, do you mean I should try to have as much as possible of my non-dynamic streaming data stored within a single buffer, with the pool referring to the staging buffers?  I would think that with level streaming it would be risky to impose a single fixed capacity on the live buffer(s), so pool-type management for the live buffers would be useful too.  For example, when streaming in a new package, the loader knows exactly how much capacity it will require and can grab however many buffers it needs from the pool of unused buffers.  Likewise, as a level is streamed out, its buffers are no longer needed and are returned to the pool of unused buffers.  Maybe I'm overcomplicating things.  I definitely see the benefit of staging buffers for updating dynamic data, but it's not as clear to me for the case of loading in a large amount of streaming data (not behind a loading screen).
  4. In an engine I'm currently working on for self-education purposes, all resources are asynchronously streamed in and out.  Up to now, I've simply kept a one-to-one relationship between index/vertex buffers and each model.  I recently read the GDC 2012 presentation by John McDonald of NVIDIA (https://developer.nvidia.com/sites/default/files/akamai/gamedev/files/gdc12/Efficient_Buffer_Management_McDonald.pdf) and have been looking at how to incorporate its recommendations.  Specifically, I'm working on implementing his "long-lived buffers" that are reused to hold streaming (static) geometry data.  I've been unable to find much information on how best to implement it, however.

    The presentation goes into detail on using "transient" buffers for UI/text, allocated similarly to a block of memory treated as a heap.  Is it advisable to do something similar for the longer-lived buffers?  That is, when streaming in a resource package containing geometry, I'd step through each allocation, first preferring space in the set of existing buffers found via the heap, and, if not enough space is found, create a new buffer to include in the heap (similar to how, in managing system memory, Doug Lea's malloc can be backed by virtual page allocations from the OS via calls to mmap/munmap).  Or is the CPU overhead of that likely to cause hitches?

    The other alternative I've considered is something more static, where individual buffers are assigned all-or-none and the geometry in a resource package is pre-packed in the pipeline to fit buffers of that size.  Upon loading and unloading a streaming package, entire buffers would be marked as used/unused, creating new ones as needed.

    I'm doing this as a learning experience, so I'm trying out each of the above anyway, but any tips from someone more knowledgeable would be greatly appreciated.
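For reference, a minimal sketch of the first alternative (heap-style, first-fit suballocation across a growable set of fixed-size buffers), under the dlmalloc/mmap analogy from the post. All names are hypothetical; actual GPU buffer creation and coalescing of adjacent free blocks are omitted:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Each entry in freeLists_ stands in for one GPU vertex buffer; a new one is
// "created" whenever no existing free block is large enough, just as dlmalloc
// requests fresh pages from the OS via mmap.
struct FreeBlock { size_t offset, size; };

class BufferHeap {
public:
    explicit BufferHeap(size_t bufferSize) : bufferSize_(bufferSize) {}

    // Returns (bufferIndex, offsetWithinBuffer).
    std::pair<size_t, size_t> Alloc(size_t bytes) {
        assert(bytes > 0 && bytes <= bufferSize_);
        for (size_t b = 0; b < freeLists_.size(); ++b) {
            for (auto it = freeLists_[b].begin(); it != freeLists_[b].end(); ++it) {
                if (it->size >= bytes) { // first fit
                    size_t off = it->offset;
                    it->offset += bytes;
                    it->size -= bytes;
                    if (it->size == 0) freeLists_[b].erase(it);
                    return {b, off};
                }
            }
        }
        // No room anywhere: create a new buffer and carve the front off it.
        freeLists_.emplace_back();
        if (bytes < bufferSize_)
            freeLists_.back().push_back({bytes, bufferSize_ - bytes});
        return {freeLists_.size() - 1, 0};
    }

    void Free(size_t buffer, size_t offset, size_t bytes) {
        // A fuller version would merge this block with adjacent free blocks.
        freeLists_[buffer].push_back({offset, bytes});
    }

    size_t BufferCount() const { return freeLists_.size(); }

private:
    size_t bufferSize_;
    std::vector<std::vector<FreeBlock>> freeLists_;
};
```

The second, all-or-none alternative degenerates to the same structure with one free block per buffer, which suggests the two schemes can share bookkeeping.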
  5. Apologies, as this really wasn't a graphics-specific question, despite dealing with vertex and index buffers.  It's more of a basic data structure/algorithm issue.  Doing some more research, I've realized there isn't a good way to avoid multiple nested loops.  This code will only be called outside the runtime, in building resources for the target platform, so it's not a big deal as long as it works, I suppose.  In case anyone is curious, I ended up utilizing some of the responses in this StackOverflow question.  Nothing groundbreaking.
  6. Hello,

    I am currently working on the geometry importer code for a simple engine I'm building purely as a way to teach myself more about game engines and low-level graphics programming.  The file I am reading in segregates and independently indexes each input element's vertex data.  As an example, consider a basic cube containing just position and normal elements.  The position buffer contains eight items (one per corner), and the normal buffer contains six items (one per side).  In triangle-list form, the two index buffers both contain 36 indices (with different values).

    Given that I'm working with DirectX, I need to have the data arranged to be indexed by a single set of indices.  This means that the element data I read in needs to be reordered, and individual elements duplicated, so as to share a common indexing.  Currently, I am using a non-optimal approach of just resizing the vertex buffers to match the triangles one-for-one and restoring the data by stepping through the indices.  Obviously this is not ideal, as the vertex buffers are not reduced to their minimal size (making the benefits of using an index buffer moot).  In the cube example, I should end up with a single index buffer of 36 indices, and position and normal buffers each having 24 items, ordered identically.

    Does anyone have any references or tips on the algorithm (or, better yet, a library implementation) to use to reduce the vertex buffers to their minimal size without loss of information?  I can conceptually see how it's just a sort and merge where duplicates are removed based on a comparison of all elements for an index.  But where my attempts have gotten messy is handling an arbitrary number of input elements and seemingly allocating multiple (potentially massive) blocks of memory just for temporary usage.

    Thanks
  7. Thanks, that was sort of what I assumed the answer would be.  I'll just make sure that the render queue optimizer I'm working on right now orders by vertex stream towards the top of my shader tree.  By any chance, does anyone have a link to an example that makes use of a mixed-format vertex stream?  I won't implement it for what I'm doing now, but since I'm making this engine purely for enjoyment and educational purposes, I'm curious to see how that is done.
  8. I'm working on the GUI subsystem of my engine right now (I'm making the engine as an educational/fun experience), and I'm trying to figure out the best way to render my GUI efficiently.  All of the rendering done for my GUI is either textured quads or simple colored lines/quads.  I've read that switching vertex streams can be a costly render state change (though apparently not so much since DX8), and that mixing multiple vertex formats within a single buffer can be advantageous over having a number of smaller buffers that are swapped in and out, the optimal goal being buffers of around 1000 vertices each.  Is it worthwhile to try to figure out how to do this in DX9?  I use a single dynamic buffer right now for the textured stuff (I don't have anything in yet for untextured stuff).  I'm thinking the easiest approach may be to just have two dynamic buffers, one for the textured vertices and one for simple colored vertices.  Is this probably the best way to go?  One other option I'm considering is to use the single buffer and just have the untextured stuff carry unused values in the texture coordinates, changing the color operation to ignore the texture.  This would waste space, but I'm thinking it'd be pretty fast (I assume changing color-operation state is much faster than switching vertex streams).  Anyone have any suggestions?  Thanks.
  9. I'm working on implementing the 2D graphics needed for rendering a GUI overlay.  I am wondering if anyone can comment on which of the following is the most efficient way to render the non-textured, colored lines and triangles in the GUI:

    1) Use a static vertex buffer containing only non-transformed position data for 4 vertices.  Define an index buffer containing sets of indices that describe a vertical line, a horizontal line, a diagonal line, and two triangles making up a rectangle.  The position data is normalized on (0,0) to (1,1).  To render, the vertex positions are transformed via the world transform, which is changed for every rendering of a line/triangle every frame.  Color data is likewise changed for each call, though the color is applied via diffuse lighting.  The benefit of this approach is that dynamic vertex buffers are not needed and little memory is used.  All of the combinations of colored lines/triangles that need to be drawn can be rendered with the same small vertex/index buffers, using state changes to the world transform and diffuse lighting color.

    2) Use a dynamic vertex buffer containing transformed position and color data.  Each call to draw a specific line/triangle requires modifying the vertex buffer to contain the correct transformed position and color data.  The benefits of this technique are ease of implementation (versus option 1), similarity to how textured elements of the GUI are rendered (option 1 can't be used there, as the texture coordinate possibilities are too great to define a 'generic' line or triangle), and a small memory footprint.

    3) Same as option 2, except that instead of a single dynamic vertex buffer with the same vertices shared by all GUI elements, each GUI element maintains a reference to its own vertex data (either in its own vertex buffer, or in a portion of a shared buffer).  The benefit of this technique versus option 2 is that the number of locks/unlocks done on vertex buffers is dramatically reduced, only requiring changes when an element changes its location or coloring (when a window is dragged, for instance).  The drawback is that this technique uses the most memory to store vertex data.

    Does anyone who has implemented a GUI scheme have any comments on the best way to integrate GUI rendering within the overall rendering of the entire scene?  (The GUI elements in my engine are treated just like 3D objects: they are all just nodes within my scene graph, differing only in the render states they set.)  Comments on the above are greatly appreciated.  Is one technique obviously the best way to go?  Is there some other technique I'm not considering?  Thanks in advance.
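As a point of comparison, a minimal CPU-side sketch of the batching behind option 2, assuming a simple position+color vertex (the names here, and the Lock flags mentioned in comments, are assumptions rather than the poster's code):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// CPU-side staging for option 2: every colored GUI quad is appended to one
// array per frame; the whole array is then copied into a single dynamic
// vertex buffer with one Lock(D3DLOCK_DISCARD)/Unlock and drawn in one call,
// instead of locking the buffer once per element.
struct GuiVertex { float x, y; uint32_t color; };

class GuiBatch {
public:
    void AddQuad(float x, float y, float w, float h, uint32_t color) {
        // Two triangles, six vertices; an index buffer could cut this to four.
        GuiVertex q[6] = {
            {x, y, color},     {x + w, y, color},     {x, y + h, color},
            {x + w, y, color}, {x + w, y + h, color}, {x, y + h, color},
        };
        verts_.insert(verts_.end(), q, q + 6);
    }
    const std::vector<GuiVertex>& Vertices() const { return verts_; }
    void Clear() { verts_.clear(); } // after the copy into the dynamic VB
private:
    std::vector<GuiVertex> verts_;
};
```

Option 3 amounts to keeping a GuiBatch per element and skipping the rebuild when nothing changed, so the two options can share most of this code.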