Propagating data through an engine to constant buffers

Started by
25 comments, last by Seabolt 9 years, 2 months ago

Hi all,

I was still hoping to get a little help with the items mentioned in Post #16 and thought of another issue to think about.

When building render queues, a lot of places I've been looking seem to recommend building out something of a descriptor key for each object - for example, a 64 bit unsigned integer where certain bits are reserved for shader id, texture id, pass type, depth, etc.

What should be responsible for filling this data out? Should the model/terrain/water/whatever be able to fill this information out about themselves (and have to know what a RenderQueueKey is, or at least its format)? When the setup is like my existing implementation and everything's kind of homogeneous (everything's a model) it seems a little clearer, but what's the best way to be able to do this once the render system is taking over after a scene update for everything else (including models, but also extending into terrain and anything else)?

Thanks for your help.


a lot of places I've been looking seem to recommend building out something of a descriptor key for each object - for example, a 64 bit unsigned integer where certain bits are reserved for shader id, texture id, pass type, depth, etc.

This usually comes from this site, and while he has the right general idea (and I have his invaluable book), with all due respect, he is not a graphics programmer.
In practice, you can almost never fit things into a single 64-bit integer. Consider 2 objects with the same shader, vertex buffer, set of textures, etc. (which would make them great candidates for instancing): you still need to decide which of them to draw first, and that decision always comes down to depth (opaque objects should always be drawn front-to-back). Practically speaking, a sort key cannot be fewer than 96 bits.

By now it should be obvious that if you are thinking, “How can I make this work?”, it isn’t because you aren’t getting something; it is because what they’ve described simply doesn’t suit what you need. It’s a fallacy to read a paper or a site and then get muddled in the details wondering why it won’t work for you. It won’t work for you because it wasn’t designed by you for you. Just because he suggested it doesn’t make 64 bits the standard; if you need more bits, use more bits. The idea, however, is still to reduce the bits you need to as few as possible. Don’t feel bad if you can’t get it down to 64; it has to be at least 96 anyway, and I myself always use 128 bits.
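To make the arithmetic concrete, a rough bit budget (the field widths below are only an example, not a recommendation) already blows past 64 bits once depth is included:

// Illustrative bit budget only; pick your own fields and widths.
//   pass/layer           :  4 bits
//   depth                : 32 bits  (front-to-back or back-to-front sorting)
//   shader/technique id  : 16 bits
//   texture id           : 16 bits
//   vertex buffer id     : 16 bits
//                          --------
//                          84 bits  -> does not fit in one 64-bit word
struct SortKey
{
    uint64_t hi; // compared first
    uint64_t lo; // tie-breaker
};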


What should be responsible for filling this data out?

Should the model/terrain/water/whatever be able to fill this information out about themselves (and have to know what a RenderQueueKey is, or at least its format)?

Why not?
The graphics module doesn’t know what a model, terrain, etc. is, and a render-queue doesn’t need to care about those details either; it just needs to provide a structure and sort it.

If the model library, terrain library, etc. are all getting vertex buffers from the graphics library, there’s no reason they can’t also know the details they need to pass into a render-queue, also provided by the graphics library.
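As a rough sketch of that split (all of these names are made up for illustration), the graphics library can own both the item format and the packing helpers, while the model library only fills items in:

#include <algorithm>
#include <cstdint>
#include <vector>

// The graphics library owns the key format and the queue...
struct RenderQueueItem
{
    uint64_t    key;        // packed sort key; its layout is a graphics-library detail
    const void* renderable; // opaque handle back to whatever submitted the item
};

class RenderQueue
{
public:
    // Packing helper provided by the graphics library so callers never need
    // to know which bits mean what.
    static uint64_t MakeOpaqueKey(uint16_t shaderId, uint16_t textureId, uint32_t depthBits)
    {
        return (uint64_t(shaderId) << 48) | (uint64_t(textureId) << 32) | depthBits;
    }
    void Add(const RenderQueueItem& item) { m_items.push_back(item); }
    void Sort()
    {
        std::sort(m_items.begin(), m_items.end(),
                  [](const RenderQueueItem& a, const RenderQueueItem& b) { return a.key < b.key; });
    }
private:
    std::vector<RenderQueueItem> m_items;
};

// ...while the model library (or terrain, water, etc.) knows its own shader and
// texture IDs because it requested those resources from the graphics library.
class Model
{
public:
    void Submit(RenderQueue& queue, uint32_t depthBits) const
    {
        RenderQueueItem item;
        item.key        = RenderQueue::MakeOpaqueKey(m_shaderId, m_textureId, depthBits);
        item.renderable = this;
        queue.Add(item); // the queue only stores and sorts; it never asks "what is a Model?"
    }
private:
    uint16_t m_shaderId  = 0;
    uint16_t m_textureId = 0;
};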


what's the best way to be able to do this once the render system is taking over after a scene

Trick question.
The graphics library is not providing any form of “render system”.
The scene manager orchestrates how objects are rendered. It relies on helpers such as render queues to determine the order in which to render, and each individual render is left up to each individual object, so terrain can perform vastly different operations from what standard models do.
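Continuing the hypothetical sketch above (assume the RenderQueue also exposes a Clear() and an Items() accessor), the division of labour might look something like this:

// The scene manager only orchestrates: cull, submit, sort, then let each
// object render itself however it sees fit.
class Renderable
{
public:
    virtual ~Renderable() = default;
    virtual void Submit(RenderQueue& queue) const = 0; // each object packs its own key(s)
    virtual void Render() const = 0;                   // terrain, models, water all differ here
};

class SceneManager
{
public:
    void RenderScene()
    {
        m_queue.Clear();
        for (const Renderable* obj : m_visible) // m_visible is the post-cull list
            obj->Submit(m_queue);
        m_queue.Sort();
        for (const RenderQueueItem& item : m_queue.Items())
            static_cast<const Renderable*>(item.renderable)->Render();
    }
private:
    std::vector<const Renderable*> m_visible;
    RenderQueue                    m_queue;
};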


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Hi L. Spiro,

Thanks for the great reply. My issue was that I was thinking the RenderQueue was higher up in the hierarchy than it really needed to be. Moving it down into the graphics module makes the picture a lot clearer. It's also good to hear that going up to 128 bits for a key isn't uncommon; that seems to provide a lot more flexibility.

The only question I have left at this point is when it comes to shadow mapping. Basically, should models, terrain, etc., all know how to draw themselves to a shadow map as well? That is, should these objects have both a renderStandard and a renderShadowMap type of function that gets called at the appropriate time? The main reason this is an issue is that, while a simple transform vertex shader and no pixel shader works great for most objects, it obviously doesn't handle cases where features like alpha clip (needs a pixel shader) or tessellation (hull + domain shaders) are required.

I have a method of working around this in my current implementation, but it's kludgy and while I'm redesigning this part of my engine to be more flexible, I'd like to update as much as possible in one go. I think with that question answered, I'll be in good shape.

EDIT: Thought of another question that someone may be able to clear up. With the render queue items example explained thus far - how does the texture ID portion of it adapt to cases where an object uses more than one texture and/or a blend map? Should the render queue item just base its texture ID on the first one used, since that's the most common case, or should it somehow be able to store the IDs of all textures used?

Thanks for your help!

With the render queue items example explained thus far - how does the texture ID portion of it adapt to cases where an object uses more than one texture and/or a blend map? Should the render queue item just base its texture ID on the first one used, since that's the most common case, or should it somehow be able to store the IDs of all textures used?

When you keep in mind that a render-queue is just for optimization, and the worst-case scenario is that things are drawn less optimally than they should be, you are free to make very big assumptions without worrying too much about whether they are correct for every single possible situation (plus you have the luxury of benchmarking to verify your assumptions).

You can safely assume that if any 2 objects are using the same diffuse texture, they are also using the exact same set of accompanying textures, such as normal maps etc. In practice, I’ve never heard of a case where 2 objects use the same diffuse texture but different normal maps, and if such a rare case ever does appear, those objects simply get drawn less than optimally. That’s obviously a better trade than slowing down every render by checking 128 textures on every single object.

In short, it is enough to test only the 1st texture.
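A minimal sketch of that assumption (the names here are only illustrative): only the first texture contributes to the key, and the rest of the material's textures are simply bound at draw time regardless.

#include <cstdint>
#include <vector>

struct Material
{
    std::vector<uint16_t> textureIds; // diffuse first, then normal map, etc.
};

// Objects that share a diffuse texture almost always share the rest, so the
// first ID is a good-enough proxy; a mismatch only costs an extra state change.
inline uint16_t TextureKeyBits(const Material& mat)
{
    return mat.textureIds.empty() ? 0 : mat.textureIds.front();
}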


The only question I have left at this point is when it comes to shadow mapping. Basically, should models, terrain, etc., all know how to draw themselves to a shadow map as well? That is, should these objects have both a renderStandard and a renderShadowMap type of function that gets called at the appropriate time?

The best solution is, as you would expect, the most time-consuming: your graphics engine really should be fully data-driven, allowing shadow-map creation to be just another data-defined pass.
In practice, this is almost never the case because people just don’t have that kind of time to wait to see anything drawn on the screen.

Until you can get things fully data-driven, yes, making a “hard-coded” pass for shadow-map-creation is typical.


And once again objects are deciding for themselves how to draw shadows. Opaque objects just return black, foliage has to do some discarding, and alpha objects need to run full shaders to get color information.
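A hedged sketch of that two-entry-point idea (class and function names are just placeholders): each object type supplies both a standard draw and a shadow-map draw, and uses the cheapest shader set it can get away with.

class SceneObject
{
public:
    virtual ~SceneObject() = default;
    virtual void RenderStandard()  const = 0; // full material shaders
    virtual void RenderShadowMap() const = 0; // as cheap as the material allows
};

class OpaqueModel : public SceneObject
{
    void RenderStandard()  const override { /* bind full VS/PS and draw */ }
    void RenderShadowMap() const override { /* position-only VS, no PS: plain black/depth */ }
};

class Foliage : public SceneObject
{
    void RenderStandard()  const override { /* full shaders with alpha clip */ }
    void RenderShadowMap() const override { /* VS plus a tiny PS that samples alpha and discards */ }
};

class AlphaBlendedObject : public SceneObject
{
    void RenderStandard()  const override { /* full shaders with blending */ }
    void RenderShadowMap() const override { /* full shaders, since color information is needed */ }
};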


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Thanks again for the well-thought-out response. I've already started building this out a little, and came to the same conclusion that you mentioned about texture IDs: if they're really varying after the first one (a very uncommon case), then it'll just have to be a sub-optimal draw.

Here's what I've come up with so far to describe RenderItems - items that act as the keys in the RenderQueue. I've gone with a 128-bit approach like you suggested, and while I currently have a lot of empty space, I think it's good to have that extra room to grow into as the engine matures.


/*
 * HIGHSET - Bits 128 to 65
 *
 * Bits [128 - 125] - 1111
 *   Pass type - Opaque, Transparent, Shadow (Other ?)
 *
 * if Opaque or Shadow
 *   Bits [124 - 109] - 1111 1111 1111 1111
 *     Shader technique ID
 *   Bits [108 - 93] - 1111 1111 1111 1111
 *     Texture ID
 *   Bits [92 - 87] - 1111 11
 *     Subset
 *   Bits [86 - 65] - unused
 *    
 * else if Transparent (these switch because transparency has to be sorted back to front)
 *   Bits [124 - 93] - 1111 1111 1111 1111 1111 1111 1111 1111
 *     Depth
 *   Bits [92 - 77] - 1111 1111 1111 1111
 *     Shader technique ID
 *   Bits [76 - 65] - unused
 *
 * LOWSET - Bits 64 - 1
 * 
 * if Opaque or Shadow
 *   Bits [64 - 33] - 1111 1111 1111 1111 1111 1111 1111 1111
 *     Depth
 *   Bits [32 - 1] - unused
 *
 * else if Transparent
 *   Bits [64 - 49] - 1111 1111 1111 1111
 *     Texture ID
 *   Bits [48 - 43] - 1111 11
 *     Subset
 *   Bits [42 - 1] - unused
 *
 */

class RenderItem
{
public:
    uint64 highSet;
    uint64 lowSet;
};

inline bool operator<(const RenderItem& lhs, const RenderItem& rhs)
{
    // Compare the high sets first; fall back to the low sets on a tie.
    return lhs.highSet != rhs.highSet ? lhs.highSet < rhs.highSet : lhs.lowSet < rhs.lowSet;
}

inline bool operator>(const RenderItem& lhs, const RenderItem& rhs)
{
    return rhs < lhs;
}
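For reference, packing helpers that follow the layout in the comment block might look like this (the helper names are placeholders, and the shift amounts are just 0-based translations of the 1-based bit ranges above):

inline RenderItem MakeOpaqueItem(uint64 passType, uint64 shaderId,
                                 uint64 textureId, uint64 subset, uint64 depth)
{
    RenderItem item;
    item.highSet = (passType  & 0xF)        << 60  // bits 128 - 125
                 | (shaderId  & 0xFFFF)     << 44  // bits 124 - 109
                 | (textureId & 0xFFFF)     << 28  // bits 108 - 93
                 | (subset    & 0x3F)       << 22; // bits 92 - 87
    item.lowSet  = (depth     & 0xFFFFFFFF) << 32; // bits 64 - 33
    return item;
}

inline RenderItem MakeTransparentItem(uint64 passType, uint64 depth,
                                      uint64 shaderId, uint64 textureId, uint64 subset)
{
    RenderItem item;
    // Depth leads here so transparent items sort by depth before anything else
    // (store it inverted, or walk this range of the queue in reverse, for back-to-front order).
    item.highSet = (passType  & 0xF)        << 60  // bits 128 - 125
                 | (depth     & 0xFFFFFFFF) << 28  // bits 124 - 93
                 | (shaderId  & 0xFFFF)     << 12; // bits 92 - 77
    item.lowSet  = (textureId & 0xFFFF)     << 48  // bits 64 - 49
                 | (subset    & 0x3F)       << 42; // bits 48 - 43
    return item;
}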

Is there anything you see that's definitely missing that should be there before going any further? One thing I know you've mentioned in a few posts is having a vertex buffer ID included. Currently my vertex buffer object is just a very thin wrapper around the underlying ID3D11Buffer object holding the actual data. It wouldn't be too much of a hassle to add an ID to each new buffer as they're created, though. Is this something that should definitely be added? Are there any other critiques, as well?

Thanks for all your help on this!

WFP

EDIT: Corrected the math around the bit layouts. More edits for formatting.

Just throwing in another idea that's a slight change on the render key approach that I'm trying recently:

In my hobby engine I maintain a list of sorted material instances - they're sorted by state, shader, texture and hashed constant data. They don't need much sorting after being loaded (changing material params or streaming in new materials requires another small re-sort). I sort these much like you would with a render key and maintain a list of indices into the material instances. Additionally, each material instance also knows its sorted index.

When I render, rather than trying to pack lots of info into the render key, I make use of the fact that the material instances are already sorted and simply sort on the material instance's sorted index + mesh data (vb/ib handles munged together). I can also pack depth in at this stage, since I don't need much space for the material index (14 bits currently, and that's overkill for my needs).
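A rough sketch of what that key could look like (the field widths and names here are arbitrary, just to show the packing):

#include <cstdint>

// Materials are kept sorted elsewhere, so the key only needs the material's
// sorted index plus the mesh handles and depth.
inline uint64_t MakeDrawKey(uint32_t materialSortedIndex, // 14 bits is plenty
                            uint32_t vbHandle, uint32_t ibHandle,
                            uint32_t depthBits)
{
    uint64_t meshBits = (uint64_t(vbHandle & 0x3FF) << 10) | (ibHandle & 0x3FF); // vb/ib munged together
    return (uint64_t(materialSortedIndex & 0x3FFF) << 50)
         | (meshBits << 30)
         | (depthBits & 0x3FFFFFFF); // whatever precision is left over for depth
    // 14 + 20 + 30 = 64 bits
}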

Also, I like to remove the concept of a pass from the lower-level renderer. A render pass should really just be a set of input targets, output targets, and a bunch of information that your objects provide on how to draw themselves; the renderer generally doesn't care about the higher-level construct of passes.

Perception is when one imagination clashes with another

