Propagating data through an engine to constant buffers

Started by
25 comments, last by Seabolt 9 years, 2 months ago

Hi all,

I was still hoping to get a little help with the items mentioned in Post #16 and thought of another issue to think about.

When building render queues, a lot of places I've been looking seem to recommend building out something of a descriptor key for each object - for example, a 64 bit unsigned integer where certain bits are reserved for shader id, texture id, pass type, depth, etc.

What should be responsible for filling this data out? Should the model/terrain/water/whatever be able to fill this information out about themselves (and have to know what a RenderQueueKey is, or at least its format)? When the setup is like my existing implementation and everything's kind of homogeneous (everything's a model) it seems a little clearer, but what's the best way to be able to do this once the render system is taking over after a scene update for everything else (including models, but also extending into terrain and anything else)?

Thanks for your help.


a lot of places I've been looking seem to recommend building out something of a descriptor key for each object - for example, a 64 bit unsigned integer where certain bits are reserved for shader id, texture id, pass type, depth, etc.

This usually comes from this site, and while he has the right general idea (and I have his invaluable book), with all due respect, he is not a graphics programmer.
In practice, you can almost never fit things into a single 64-bit integer. Consider 2 objects with the same shader, vertex buffer, set of textures, etc. (which would make them great candidates for instancing): you still need to decide which of them to draw first, and that decision always comes down to depth (opaque objects should always be drawn front-to-back). Practically speaking, a sort key cannot be fewer than 96 bits.

By now it should be obvious that if you are thinking, “How can I make this work?”, it isn’t because you aren’t getting something; it is because what they’ve described simply doesn’t suit what you need. It’s a fallacy to read a paper or a site and then get muddled in the details wondering why it won’t work for you. It won’t work for you because it wasn’t designed by you for you. Just because he suggested it doesn’t make 64 bits the standard; if you need more bits, use more bits. The idea, however, is still to reduce the bits you need to as few as possible. Don’t feel bad if you can’t get it down to 64; it has to be at least 96 anyway, and I myself always use 128 bits.
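To make the arithmetic concrete, a rough bit budget (the field widths below are only an example, not a recommendation) already blows past 64 bits once depth is included:

// Illustrative bit budget only; pick your own fields and widths.
//   pass/layer           :  4 bits
//   depth                : 32 bits  (front-to-back or back-to-front sorting)
//   shader/technique id  : 16 bits
//   texture id           : 16 bits
//   vertex buffer id     : 16 bits
//                          --------
//                          84 bits  -> does not fit in one 64-bit word
struct SortKey
{
    uint64_t hi; // compared first
    uint64_t lo; // tie-breaker
};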


What should be responsible for filling this data out?

Should the model/terrain/water/whatever be able to fill this information out about themselves (and have to know what a RenderQueueKey is, or at least its format)?

Why not?
The graphics module doesn’t know what a model, terrain, etc. is, and a render-queue doesn’t need to care about those details either; it just needs to provide a structure and sort it.

If the model library, terrain library, etc. are all getting vertex buffers from the graphics library, there’s no reason they can’t also know the details they need to pass into a render-queue, also provided by the graphics library.
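As a rough sketch of that split (all of these names are made up for illustration), the graphics library can own both the item format and the packing helpers, while the model library only fills items in:

#include <algorithm>
#include <cstdint>
#include <vector>

// The graphics library owns the key format and the queue...
struct RenderQueueItem
{
    uint64_t    key;        // packed sort key; its layout is a graphics-library detail
    const void* renderable; // opaque handle back to whatever submitted the item
};

class RenderQueue
{
public:
    // Packing helper provided by the graphics library so callers never need
    // to know which bits mean what.
    static uint64_t MakeOpaqueKey(uint16_t shaderId, uint16_t textureId, uint32_t depthBits)
    {
        return (uint64_t(shaderId) << 48) | (uint64_t(textureId) << 32) | depthBits;
    }
    void Add(const RenderQueueItem& item) { m_items.push_back(item); }
    void Sort()
    {
        std::sort(m_items.begin(), m_items.end(),
                  [](const RenderQueueItem& a, const RenderQueueItem& b) { return a.key < b.key; });
    }
private:
    std::vector<RenderQueueItem> m_items;
};

// ...while the model library (or terrain, water, etc.) knows its own shader and
// texture IDs because it requested those resources from the graphics library.
class Model
{
public:
    void Submit(RenderQueue& queue, uint32_t depthBits) const
    {
        RenderQueueItem item;
        item.key        = RenderQueue::MakeOpaqueKey(m_shaderId, m_textureId, depthBits);
        item.renderable = this;
        queue.Add(item); // the queue only stores and sorts; it never asks "what is a Model?"
    }
private:
    uint16_t m_shaderId  = 0;
    uint16_t m_textureId = 0;
};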


what's the best way to be able to do this once the render system is taking over after a scene

Trick question.
The graphics library is not providing any form of “render system”.
The scene manager orchestrates how objects are rendered. It relies on helpers such as render queues to determine the order in which to render, and each individual render is left up to each individual object, so terrain can perform vastly different operations from what standard models do.
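Continuing the hypothetical sketch above (assume the RenderQueue also exposes a Clear() and an Items() accessor), the division of labour might look something like this:

// The scene manager only orchestrates: cull, submit, sort, then let each
// object render itself however it sees fit.
class Renderable
{
public:
    virtual ~Renderable() = default;
    virtual void Submit(RenderQueue& queue) const = 0; // each object packs its own key(s)
    virtual void Render() const = 0;                   // terrain, models, water all differ here
};

class SceneManager
{
public:
    void RenderScene()
    {
        m_queue.Clear();
        for (const Renderable* obj : m_visible) // m_visible is the post-cull list
            obj->Submit(m_queue);
        m_queue.Sort();
        for (const RenderQueueItem& item : m_queue.Items())
            static_cast<const Renderable*>(item.renderable)->Render();
    }
private:
    std::vector<const Renderable*> m_visible;
    RenderQueue                    m_queue;
};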


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Hi L. Spiro,

Thanks for the great reply. My issue was that I was thinking the RenderQueue was higher up in the hierarchy than it really needed to be. Moving it down into the graphics module makes the picture a lot clearer. It's also good to hear that going up to 128 bits for a key isn't uncommon; that seems to provide a lot more flexibility.

The only question I have left at this point is when it comes to shadow mapping. Basically, should models, terrain, etc., all know how to draw themselves to a shadow map as well? That is, should these objects have both a renderStandard and a renderShadowMap type of function that gets called at the appropriate time? The main reason this is an issue is that, while a simple transform vertex shader and no pixel shader works great for most objects, it obviously doesn't handle cases where features like alpha clip (needs a pixel shader) or tessellation (hull + domain shaders) are required.

I have a method of working around this in my current implementation, but it's kludgy and while I'm redesigning this part of my engine to be more flexible, I'd like to update as much as possible in one go. I think with that question answered, I'll be in good shape.

EDIT: Thought of another question that someone may be able to clear up. With the render queue items example explained thus far - how does the texture ID portion of it adapt to cases where an object uses more than one texture and/or a blend map? Should the render queue item just base its texture ID on the first one used, since that's the most common case, or should it somehow be able to store the IDs of all textures used?

Thanks for your help!

With the render queue items example explained thus far - how does the texture ID portion of it adapt to cases where an object uses more than one texture and/or a blend map? Should the render queue item just base its texture ID on the first one used, since that's the most common case, or should it somehow be able to store the IDs of all textures used?

When you keep in mind that a render-queue is just for optimization, and the worst-case scenario is that things are drawn less optimally than they should be, you are free to make very big assumptions without worrying too much about whether they are correct for every single possible situation (plus you have the luxury of benchmarking to verify your assumptions).

You can safely assume that if any 2 objects are using the same diffuse texture, they are also using the exact same set of accompanying textures, such as normal maps etc. In practice, I’ve never heard of a case where 2 objects use the same diffuse texture but different normal maps, and if such a rare case ever does appear, those objects simply get drawn less than optimally. That’s obviously a better trade than slowing down every render by checking 128 textures on every single object.

In short, it is enough to test only the 1st texture.
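A minimal sketch of that assumption (the names here are only illustrative): only the first texture contributes to the key, and the rest of the material's textures are simply bound at draw time regardless.

#include <cstdint>
#include <vector>

struct Material
{
    std::vector<uint16_t> textureIds; // diffuse first, then normal map, etc.
};

// Objects that share a diffuse texture almost always share the rest, so the
// first ID is a good-enough proxy; a mismatch only costs an extra state change.
inline uint16_t TextureKeyBits(const Material& mat)
{
    return mat.textureIds.empty() ? 0 : mat.textureIds.front();
}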


The only question I have left at this point is when it comes to shadow mapping. Basically, should models, terrain, etc., all know how to draw themselves to a shadow map as well? That is, should these objects have both a renderStandard and a renderShadowMap type of function that gets called at the appropriate time?

The best solution is, as you would expect, the most time-consuming: your graphics engine really should be fully data-driven, allowing shadow-map creation to be just another data-defined pass.
In practice, this is almost never the case because people just don’t have that kind of time to wait to see anything drawn on the screen.

Until you can get things fully data-driven, yes, making a “hard-coded” pass for shadow-map-creation is typical.


And once again objects are deciding for themselves how to draw shadows. Opaque objects just return black, foliage has to do some discarding, and alpha objects need to run full shaders to get color information.
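A hedged sketch of that two-entry-point idea (class and function names are just placeholders): each object type supplies both a standard draw and a shadow-map draw, and uses the cheapest shader set it can get away with.

class SceneObject
{
public:
    virtual ~SceneObject() = default;
    virtual void RenderStandard()  const = 0; // full material shaders
    virtual void RenderShadowMap() const = 0; // as cheap as the material allows
};

class OpaqueModel : public SceneObject
{
    void RenderStandard()  const override { /* bind full VS/PS and draw */ }
    void RenderShadowMap() const override { /* position-only VS, no PS: plain black/depth */ }
};

class Foliage : public SceneObject
{
    void RenderStandard()  const override { /* full shaders with alpha clip */ }
    void RenderShadowMap() const override { /* VS plus a tiny PS that samples alpha and discards */ }
};

class AlphaBlendedObject : public SceneObject
{
    void RenderStandard()  const override { /* full shaders with blending */ }
    void RenderShadowMap() const override { /* full shaders, since color information is needed */ }
};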


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Thanks again for the well-thought-out response. I've already started building this out a little, and came to the same conclusion that you mentioned about texture IDs: if they're really varying after the first one (a very uncommon case), then it'll just have to be a sub-optimal draw.

Here's what I've come up with so far to describe RenderItems - items that act as the keys in the RenderQueue. I've gone with a 128-bit approach like you suggested, and while I currently have a lot of empty space, I think it's good to have that extra room to grow into as the engine matures.


/*
 * HIGHSET - Bits 128 to 65
 *
 * Bits [128 - 125] - 1111
 *   Pass type - Opaque, Transparent, Shadow (Other ?)
 *
 * if Opaque or Shadow
 *   Bits [124 - 109] - 1111 1111 1111 1111
 *     Shader technique ID
 *   Bits [108 - 93] - 1111 1111 1111 1111
 *     Texture ID
 *   Bits [92 - 87] - 1111 11
 *     Subset
 *   Bits [86 - 65] - unused
 *    
 * else if Transparent (these switch because transparency has to be sorted back to front)
 *   Bits [124 - 93] - 1111 1111 1111 1111 1111 1111 1111 1111
 *     Depth
 *   Bits [92 - 77] - 1111 1111 1111 1111
 *     Shader technique ID
 *   Bits [76 - 65] - unused
 *
 * LOWSET - Bits 64 - 1
 * 
 * if Opaque or Shadow
 *   Bits [64 - 33] - 1111 1111 1111 1111 1111 1111 1111 1111
 *     Depth
 *   Bits [32 - 1] - unused
 *
 * else if Transparent
 *   Bits [64 - 49] - 1111 1111 1111 1111
 *     Texture ID
 *   Bits [48 - 43] - 1111 11
 *     Subset
 *   Bits [42 - 1] - unused
 *
 */

class RenderItem
{
public:
    uint64 highSet;
    uint64 lowSet;
};

inline bool operator<(const RenderItem& lhs, const RenderItem& rhs)
{
    // Compare the high sets first; fall back to the low sets on a tie.
    return lhs.highSet != rhs.highSet ? lhs.highSet < rhs.highSet : lhs.lowSet < rhs.lowSet;
}

inline bool operator>(const RenderItem& lhs, const RenderItem& rhs)
{
    return rhs < lhs;
}
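For reference, packing helpers that follow the layout in the comment block might look like this (the helper names are placeholders, and the shift amounts are just 0-based translations of the 1-based bit ranges above):

inline RenderItem MakeOpaqueItem(uint64 passType, uint64 shaderId,
                                 uint64 textureId, uint64 subset, uint64 depth)
{
    RenderItem item;
    item.highSet = (passType  & 0xF)        << 60  // bits 128 - 125
                 | (shaderId  & 0xFFFF)     << 44  // bits 124 - 109
                 | (textureId & 0xFFFF)     << 28  // bits 108 - 93
                 | (subset    & 0x3F)       << 22; // bits 92 - 87
    item.lowSet  = (depth     & 0xFFFFFFFF) << 32; // bits 64 - 33
    return item;
}

inline RenderItem MakeTransparentItem(uint64 passType, uint64 depth,
                                      uint64 shaderId, uint64 textureId, uint64 subset)
{
    RenderItem item;
    // Depth leads here so transparent items sort by depth before anything else
    // (store it inverted, or walk this range of the queue in reverse, for back-to-front order).
    item.highSet = (passType  & 0xF)        << 60  // bits 128 - 125
                 | (depth     & 0xFFFFFFFF) << 28  // bits 124 - 93
                 | (shaderId  & 0xFFFF)     << 12; // bits 92 - 77
    item.lowSet  = (textureId & 0xFFFF)     << 48  // bits 64 - 49
                 | (subset    & 0x3F)       << 42; // bits 48 - 43
    return item;
}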

Is there anything you see that's definitely missing that should be there before going any further? One thing I know you've mentioned in a few posts is having a vertex buffer ID included. Currently my vertex buffer object is just a very thin wrapper around the underlying ID3D11Buffer object holding the actual data. It wouldn't be too much of a hassle to add an ID to each new buffer as they're created, though. Is this something that should definitely be added? Are there any other critiques, as well?

Thanks for all your help on this!

WFP

EDIT: Corrected the math around the bit layouts. More edits for formatting.

Just throwing in another idea that's a slight change on the render key approach that I'm trying recently:

In my hobby engine I maintain a list of sorted material instances - they're sorted by state, shader, texture and hashed constant data. They don't need much sorting after being loaded (changing material params or streaming in new materials requires another small re-sort). I sort these much like you would with a render key and maintain a list of indices into the material instances. Additionally, each material instance also knows its sorted index.

When I render, rather than trying to pack lots of info into the render key, I make use of the fact that the material instances are already sorted and simply sort on the material instance's sorted index + mesh data (vb/ib handles munged together). I can also pack depth in at this stage, since I don't need much space for the material index (14 bits currently, and that's overkill for my needs).
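A rough sketch of what that key could look like (the field widths and names here are arbitrary, just to show the packing):

#include <cstdint>

// Materials are kept sorted elsewhere, so the key only needs the material's
// sorted index plus the mesh handles and depth.
inline uint64_t MakeDrawKey(uint32_t materialSortedIndex, // 14 bits is plenty
                            uint32_t vbHandle, uint32_t ibHandle,
                            uint32_t depthBits)
{
    uint64_t meshBits = (uint64_t(vbHandle & 0x3FF) << 10) | (ibHandle & 0x3FF); // vb/ib munged together
    return (uint64_t(materialSortedIndex & 0x3FFF) << 50)
         | (meshBits << 30)
         | (depthBits & 0x3FFFFFFF); // whatever precision is left over for depth
    // 14 + 20 + 30 = 64 bits
}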

Also, I like to remove the concept of a pass from the lower-level renderer. A render pass should really just be a set of input targets, output targets, and a bunch of information that your objects provide on how to draw themselves; the renderer generally doesn't care about the higher-level construct of passes.

Perception is when one imagination clashes with another

