Advanced Render Queue API

Started by
30 comments, last by melbow 11 years, 3 months ago
Hodgman, wouldn't this violate the strict aliasing rule when you cast a Command reference to a Foo or Bar reference, or vice versa?

Yes. Technically, casting a Foo* to a Command* is undefined behaviour, but in practice, it will work in most situations.

We're never writing to an aliased Command and reading from an aliased Foo (or vice versa) inside the one function, which minimizes the risks.
e.g. this code would be dangerous:


assert( command.id == 0 );//assume the command is actually a "Foo"
command.id = 42;//change the id value
Foo& foo = *(Foo*)&command;
assert( foo.id == 42 );//the id value should be changed on the "Foo" also, but this might fail in optimized builds!

The worst thing in the earlier code is a sub-optimal assertion:


assert( command.id >= Commands::Bar0 && command.id <= Commands::Bar2 );//this will load command.id from RAM
Bar& bar = *(Bar*)&command;
device.SetBarSlot( bar.id - Commands::Bar0, bar.value );//bar.id will generate another "load" instruction here, even though the value was loaded above


Also, the only value that we actually need to "alias" is the first member -- u8 id -- and it doesn't actually need to be aliased as a different type, so it's possible to write this system in a way that doesn't violate strict aliasing if you need to -- e.g.


//Instead of this:
Foo foo = { Commands::Foo, 1337 };
Command* cmd = (Command*)&foo;
SubmitCommand( device, *cmd );

//We could use
Foo foo = { Commands::Foo, 1337 };
u8* cmd = &foo.id;
SubmitCommand( device, cmd );

//with:
inline void SubmitCommand(Device& device, u8* command)
{
	g_CommandTable[*command](device, command);
}
void Submit_Foo(Device& device, u8* command)
{
	assert( *command == Commands::Foo );
	Foo& foo = *(Foo*)(command - offsetof(Foo,id));
	device.DoFoo( foo.value );
}

P.S. u8* (my version of unsigned char*) is allowed to alias any other type (strict aliasing rule doesn't apply to it), but the above version will work even if this wasn't true.
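For anyone who wants to compile the u8* version, here is a minimal self-contained sketch of it. The Device members, the Bar struct layout, the contents of the Commands enum and the DemoSubmit helper are all assumptions made up for illustration; only SubmitCommand, Submit_Foo and g_CommandTable follow the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint8_t u8;

// Hypothetical command ids; the real enum is not shown in the thread.
namespace Commands { enum Type : u8 { Foo, Bar0, Bar1, Bar2, Count }; }

// Every command struct starts with its u8 id.
struct Foo { u8 id; int value; };
struct Bar { u8 id; int value; };

// Stand-in device that just records what it was asked to do.
struct Device {
    int lastFoo = 0;
    int barSlots[3] = {0, 0, 0};
    void DoFoo(int v) { lastFoo = v; }
    void SetBarSlot(int slot, int v) { barSlots[slot] = v; }
};

typedef void (*SubmitFn)(Device&, u8*);

void Submit_Foo(Device& device, u8* command)
{
    assert(*command == Commands::Foo);
    // Step back from the id member to the start of the enclosing Foo.
    Foo& foo = *reinterpret_cast<Foo*>(command - offsetof(Foo, id));
    device.DoFoo(foo.value);
}

void Submit_Bar(Device& device, u8* command)
{
    assert(*command >= Commands::Bar0 && *command <= Commands::Bar2);
    Bar& bar = *reinterpret_cast<Bar*>(command - offsetof(Bar, id));
    device.SetBarSlot(bar.id - Commands::Bar0, bar.value);
}

// One handler per command id, indexed by the id byte.
SubmitFn g_CommandTable[Commands::Count] = {
    Submit_Foo, Submit_Bar, Submit_Bar, Submit_Bar
};

inline void SubmitCommand(Device& device, u8* command)
{
    g_CommandTable[*command](device, command);
}

// Submits one Foo and one Bar and returns the sum of the recorded values.
int DemoSubmit()
{
    Device device;
    Foo foo = { Commands::Foo, 1337 };
    Bar bar = { Commands::Bar1, 7 };
    SubmitCommand(device, &foo.id);
    SubmitCommand(device, &bar.id);
    return device.lastFoo + device.barSlots[1];
}
```

The dispatcher only ever reads the id through a u8*, and each handler turns that pointer back into the struct type the object was created as, so nothing is accessed through an unrelated type.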

Thanks for the example. Would you still need the Command struct with this design? Also, I was wondering whether you think it's worth trying to always avoid breaking the strict aliasing rule, or do you think it's better to just risk the undefined behavior if it's the simplest option?

No, the command struct has been replaced with a pointer to the id's primitive type.

Yes, breaking the strict-aliasing rule can be very bad, because it can cause the compiler to emit code that doesn't do what you intended it to! So it should be avoided.

I've taken this thread off-topic enough already, so I've started a new topic just about the strict aliasing rule.

I would like to add another question in this topic:

How do you handle RenderItems (objects) that require a Texture that is generated by a different RenderStage?

Example:

In a deferred renderer, every light source needs to have its shadow map generated, but you only have one GPU resource to store the shadow map, so you have to:

-Generate Light 1 shadow map;

-Draw Light 1;

-Generate Light 2 shadow map;

-Draw Light 2;

...

Currently I handle this by having a command called ExecuteRenderStage that stops the rendering of the current render stage, executes another stage and then restores the "main" one, but I would like to hear how you do it.

All this talk of unpredictable behavior has me questioning this approach. What if a Command was simply a sort of container, like:
struct Command {
	Commands::Type id;
	union {
		Foo* foo;
		Bar* bar;
	} u;
};

I don't know why I didn't mention it before, but in my own engine I get around the potential aliasing issues with inheritance...


struct Command { Commands::Type id; };
struct Foo : public Command { int value; };

How do you handle RenderItems (objects) that require a Texture that is generated by a different RenderStage

I just submit a series of stages, e.g. the stage to generate a shadow-map, then a stage that draws the light (which is a draw-call whose paired state-group sets the texture generated by the first stage).
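As a rough sketch of what "a series of stages" could look like for the shadow-map case: the Stage struct, the StageKind enum and BuildLightStages below are hypothetical names invented for illustration, not the engine's actual types. The point is only that the stage list interleaves "generate" and "draw" per light, so a single shadow-map texture can be reused.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stage description, for illustration only.
enum StageKind { GenerateShadowMap, DrawLight };

struct Stage {
    StageKind kind;
    int light;          // index of the light this stage belongs to
    int shadowTexture;  // the one shared shadow-map resource
};

// Emits [generate shadow map, draw light] for each light, in order,
// so every light's draw stage runs before the next light overwrites the texture.
std::vector<Stage> BuildLightStages(int lightCount, int sharedShadowTexture)
{
    std::vector<Stage> stages;
    for (int light = 0; light < lightCount; ++light) {
        stages.push_back(Stage{GenerateShadowMap, light, sharedShadowTexture});
        stages.push_back(Stage{DrawLight, light, sharedShadowTexture});
    }
    return stages;
}
```

With this layout no stage has to interrupt another one; the ordering of the list does the work that ExecuteRenderStage was doing.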

[quote name='Hodgman' timestamp='1357784838' post='5019739']
I get around the undefined behaviour with inheritance...
[/quote]

But if you use inheritance, don't the structs become non-POD types? That might create more undefined behavior to deal with -- for example, I was thinking of using memcmp for detecting redundant state-changes in the RenderGroup class, but that would only work if the structs were POD.

I too am puzzled by how redundant state changes are eliminated in this model. Am I correct in that states may be submitted in any order? And if this is the case, then states may be sorted and then linearly compared. However, this seems expensive considering how many states may be set per frame. I'm sure you have a much more clever way of doing this.
But if you use inheritance, don't the structs become non-POD types? That might create more undefined behavior to deal with -- for example, I was thinking of using memcmp for detecting redundant state-changes in the RenderGroup class, but that would only work if the structs were POD.
You've got a good eye for C++ details ;) I should've said inheritance avoids the strict-aliasing issues, but you're right, the standard says that using inheritance like that means they're now non-POD.
However, on the compilers that I support, they still act as if they were POD, so I can still memcmp/memcpy them on these compilers. Relying on compiler details should generally be avoided, but it's something you can choose to do.

Instead of inheritance, I guess I could've used composition to be fully compliant, e.g.
struct Command { int id; };
struct FooCommand { Command baseClass; int fooValue; };
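If you want the compiler to check these layout properties for you, the C++11 type traits can do it. This is a small sketch, not code from the engine: FooDerived, SameCommand and DemoCompare are names made up here, and the memcmp comparison assumes the struct has no padding bytes (true for two ints on common ABIs, but worth verifying for your own structs).

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct Command { int id; };

// Inheritance version: data members in both base and derived class.
struct FooDerived : Command { int value; };

// Composition version: the "base" is just the first member.
struct FooCommand { Command baseClass; int fooValue; };

// The inheritance version can still be copied with memcpy...
static_assert(std::is_trivially_copyable<FooDerived>::value,
              "FooDerived should be trivially copyable");
// ...but it is not standard-layout, so it is not POD.
static_assert(!std::is_standard_layout<FooDerived>::value,
              "FooDerived is not standard-layout");

// The composition version satisfies both properties.
static_assert(std::is_standard_layout<FooCommand>::value,
              "FooCommand should be standard-layout");
static_assert(std::is_trivially_copyable<FooCommand>::value,
              "FooCommand should be trivially copyable");

// Byte-wise comparison; assumes FooCommand contains no padding.
inline bool SameCommand(const FooCommand& a, const FooCommand& b)
{
    return std::memcmp(&a, &b, sizeof(FooCommand)) == 0;
}

// Returns true if equal commands compare equal and different ones do not.
bool DemoCompare()
{
    FooCommand a = { {1}, 42 };
    FooCommand b = { {1}, 42 };
    FooCommand c = { {1}, 43 };
    return SameCommand(a, b) && !SameCommand(a, c);
}
```

If a struct picks up padding later (say, a u8 followed by an int), memcmp can report a difference between two commands whose members are all equal, so a static_assert on sizeof is a cheap guard to add alongside these.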
I too am puzzled by how redundant state changes are eliminated in this model. Am I correct in that states may be submitted in any order? And if this is the case, then states may be sorted and then linearly compared. However, this seems expensive considering how many states may be set per frame. I'm sure you have a much more clever way of doing this.
I haven't really mentioned redundant state removal, except that I do it at the "second level". The 1st level takes a stream of commands, and can't do any redundant state removal besides the traditional technique, which is to check the value of every state before submitting it, something like:
if( 0!=memcmp(&cache[state.id], &state, sizeof(State)) ) { cache[state.id]=state; Apply(state); }

A lot of renderers do do redundant state checking at that level, which pretty much means having an if like the above every time you go to set a state. I do a little bit of this kind of state caching, but try to avoid it.
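For reference, a compilable version of that "traditional" first-level check might look like the sketch below. The State, Device and StateCache types are placeholders invented for this example, and the valid[] flags are an addition so the very first value for each state is always applied.

```cpp
#include <cassert>
#include <cstring>

const int kMaxStates = 4;

// Placeholder state value: two ints, so no padding on common ABIs.
struct State { int id; int value; };

// Stand-in device that counts how many states actually reach it.
struct Device {
    int applyCount = 0;
    void Apply(const State&) { ++applyCount; }
};

struct StateCache {
    State cache[kMaxStates];
    bool  valid[kMaxStates];

    StateCache()
    {
        std::memset(cache, 0, sizeof(cache));
        for (int i = 0; i < kMaxStates; ++i) valid[i] = false;
    }

    // Applies the state unless an identical value is already current.
    // Returns true if the device was touched.
    bool Set(Device& device, const State& state)
    {
        if (valid[state.id] &&
            0 == std::memcmp(&cache[state.id], &state, sizeof(State)))
            return false;  // redundant, skip it
        cache[state.id] = state;
        valid[state.id] = true;
        device.Apply(state);
        return true;
    }
};

// Sets the same state twice, then a different value; returns the apply count.
int DemoStateCache()
{
    Device device;
    StateCache cache;
    State blendOn  = { 1, 10 };
    State blendOff = { 1, 11 };
    cache.Set(device, blendOn);   // applied
    cache.Set(device, blendOn);   // skipped as redundant
    cache.Set(device, blendOff);  // applied
    return device.applyCount;
}
```

The cost being avoided is visible here: every single Set call pays for a memcmp and a branch, whether or not the state changed.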
Instead, I do redundant state checking at the next level up -- the part that generates the sequences of commands in the first place. This part of the code also submits commands to set states back to their default values if a particular draw-call hasn't been paired with any values for that state.
After sorting my render-items, the "2nd layer" which produces the stream of commands for the 1st layer looks like:
defaults[maxStates] = {/*states to apply if a value doesn't exist for them*/}

previousState[maxStates] = {NULL} // a cache of which states are 'current'

nonDefaultState[maxStates] = {true} // which states have a non-default value

for each item in renderItems

 draw = item.draw
 stateGroups = item.stateGroups
 
 statesSet[maxStates] = {false} //which states have been set by this item
 for each group in stateGroups
  for each state in group
   if statesSet[state.id] == false then //this state not set by a previous group in this item
     statesSet[state.id] = true //mark it even if redundant, so it isn't reset to default below
     if previousState[state.id] != state then //skip if a previous item set this value and it's still current
      Submit(state) // add to command buffer, or send to device
      previousState[state.id] = state
     endif
   endif
  endfor
 endfor

 setToDefault = nonDefaultState & ~statesSet
 nonDefaultState = statesSet
 for each id in setToDefault
  Submit(defaults[id]) // add to command buffer, or send to device
  previousState[id] = defaults[id]
 endfor

 Submit(draw) // add to command buffer, or send to device

endfor
Except the actual C++ code uses a lot of bitmasks instead of arrays of bools, and uses pointers to identify state value equality, and everything is tightly laid out to be cache-friendly, etc...
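To make the loop concrete, here is a hedged C++ sketch of the same idea using bitmasks and pointer comparison. It is not the engine's code: State, StateGroup, RenderItem, Cmd, BuildCommands and DemoCommandCount are all invented for illustration. In this sketch a state counts as "set by this item" even when its submit is skipped as redundant, so it is not reset to its default straight afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

const int kMaxStates = 8;

struct State { int id; int value; };            // hypothetical state value
typedef std::vector<const State*> StateGroup;   // states are shared by pointer

struct RenderItem {
    int draw;                                   // stand-in for a draw-call
    std::vector<StateGroup> stateGroups;        // earlier groups take priority
};

struct Cmd { bool isDraw; const State* state; int draw; };

// 'defaults' must point to kMaxStates entries, with defaults[i].id == i.
std::vector<Cmd> BuildCommands(const std::vector<RenderItem>& items,
                               const State* defaults)
{
    std::vector<Cmd> out;
    const State* previous[kMaxStates] = {};               // nothing is current yet
    std::uint32_t nonDefault = (1u << kMaxStates) - 1u;   // treat all as dirty at start

    for (const RenderItem& item : items) {
        std::uint32_t statesSet = 0;                      // states this item provides
        for (const StateGroup& group : item.stateGroups) {
            for (const State* state : group) {
                const std::uint32_t bit = 1u << state->id;
                if (statesSet & bit) continue;            // an earlier group already won
                statesSet |= bit;
                if (previous[state->id] != state) {       // pointer equality check
                    out.push_back(Cmd{false, state, 0});
                    previous[state->id] = state;
                }
            }
        }
        // States left over from the previous item that this item did not provide.
        const std::uint32_t setToDefault = nonDefault & ~statesSet;
        nonDefault = statesSet;
        for (int id = 0; id < kMaxStates; ++id) {
            if (setToDefault & (1u << id)) {
                out.push_back(Cmd{false, &defaults[id], 0});
                previous[id] = &defaults[id];
            }
        }
        out.push_back(Cmd{true, nullptr, item.draw});
    }
    return out;
}

// Three items: {blend, texture}, {texture}, {} and returns the command count.
int DemoCommandCount()
{
    State defaults[kMaxStates];
    for (int i = 0; i < kMaxStates; ++i) defaults[i] = State{i, 0};
    static const State blendOn = {0, 1};
    static const State texA    = {1, 5};

    std::vector<RenderItem> items(3);
    items[0].draw = 100;
    items[0].stateGroups.push_back(StateGroup{&blendOn, &texA});
    items[1].draw = 101;
    items[1].stateGroups.push_back(StateGroup{&texA});
    items[2].draw = 102;
    return static_cast<int>(BuildCommands(items, defaults).size());
}
```

In the demo, the first item emits its two states, six defaults and a draw; the second item skips the texture because the same pointer is still current and only resets the blend state; the third resets the texture. Pointer comparison only works if identical state values are shared objects rather than per-item copies.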
Thanks again Hodgman. I really appreciate how detailed yet concise your responses are. The only thing that is still not completely clear to me is the generation of RenderItems. Are they allocated each frame from a data cache (like what is described here http://docs.madewithmarmalade.com/native/api_reference/iwgxapidocumentation/iwgxapioverview/datacache.html )? And would a higher level object like a GLShader or GeometryPacket class then maintain their respective Commands? I am not seeing a way to check for duplicate states by comparing pointers unless the Commands are maintained by global, shared resources, or I guess if Commands ARE global shared resources, but the first option seems cleaner.

Again, I really appreciate everyone's input on this thread, it has helped me a great deal.

This topic is closed to new replies.
