Just a quick mockup of a renderer::render method. Am I completely on the wrong track?
Yeah that looks similar to what I'm used to. I use something analogous to your "[font="Courier New"]uiUsedCommands[/font]/[font="Courier New"]bApply[/font]" code to ensure commands at the top of the stack take precedence over commands of the same type lower in the stack.
My [font="Courier New"]bApply[/font] test is a bit more complicated though, as it also checks if the command being inspected was already set by the previous render-instance. i.e. if two consecutive render instances use the same material, then all the states from the material's state-group can usually be ignored when drawing the 2nd instance.
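To make the precedence and redundancy checks concrete, here's a rough sketch of how such a loop could look. It is not my actual code: `Command`, `StateGroup`, `DeviceCache` and `resolve` are names made up for illustration, the payload is reduced to a single integer, and it assumes fewer than 64 command IDs so a bitmask can track which ones have been used.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical command: one ID per distinct piece of state, plus a value.
struct Command {
    std::uint32_t id;     // must be < 64 in this sketch (fits a 64-bit mask)
    std::uint32_t value;  // stand-in for the real payload
};

using StateGroup = std::vector<Command>;

// Remembers what the previous render instance left set on the device.
struct DeviceCache {
    std::uint64_t valid = 0;          // bit i set => current[i] is meaningful
    std::uint32_t current[64] = {};
};

// Walks the stack from top (index 0) to bottom and returns only the commands
// that actually need to be submitted for this instance.
std::vector<Command> resolve(const std::vector<const StateGroup*>& stack,
                             DeviceCache& cache) {
    std::vector<Command> out;
    std::uint64_t used = 0;  // IDs already decided by a higher group
    for (const StateGroup* group : stack) {
        for (const Command& c : *group) {
            assert(c.id < 64);
            const std::uint64_t bit = std::uint64_t{1} << c.id;
            if (used & bit) continue;  // a higher group takes precedence
            used |= bit;
            const bool same =
                (cache.valid & bit) && cache.current[c.id] == c.value;
            if (same) continue;        // previous instance already set this
            cache.valid |= bit;
            cache.current[c.id] = c.value;
            out.push_back(c);
        }
    }
    return out;
}
```

Resolving the same stack twice in a row returns an empty list the second time, which is the "same material on consecutive instances" case.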
My "Iterate render instances" loop is also passed a "default" state-group, which is conceptually put at the bottom of every state-stack. If an instance doesn't set a particular state and the default group contains that state, then the default value will be used.
If you don't do this, then you end up with behaviours like -- one object enables alpha blending, and then all following objects also end up being alpha-blended, because they didn't manually specify a "disable alpha blending" command.
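As a small illustration of the default group (a sketch only; `StateId`, `kAlphaBlend`, `kDepthTest` and `effectiveState` are hypothetical names, and values are plain integers), the defaults can simply be appended as the lowest entry of the stack before resolving it:

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical state IDs for this sketch.
enum StateId : std::uint32_t { kAlphaBlend = 0, kDepthTest = 1 };

struct Command {
    StateId id;
    std::uint32_t value;
};

using StateGroup = std::vector<Command>;

// Returns the value each state ends up with for one instance. The stack is
// ordered top (index 0) to bottom; the default group goes underneath it all.
std::map<StateId, std::uint32_t> effectiveState(
        std::vector<const StateGroup*> stack, const StateGroup& defaults) {
    stack.push_back(&defaults);
    std::map<StateId, std::uint32_t> result;
    for (const StateGroup* group : stack) {
        for (const Command& c : *group) {
            // emplace does nothing if the key exists, so the highest group wins.
            result.emplace(c.id, c.value);
        }
    }
    return result;
}
```

With defaults of "alpha blending off", an object that never mentions blending gets it switched off again even if the previous object enabled it.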
Also, with the way your code is at the moment, only a single [font="Courier New"]SetRenderState[/font] command will be applied per instance. If you want to set two different render-states, only the first one will actually be set at the moment (the second will be ignored). For this reason, I have every different render-state as a different command ID.
I'm having a hard time figuring out which commands I could define, as all I could come up with were the 5 I've shown. I'm also a bit in doubt why you would make a separate Drawcall class instead of having it as a command.[/quote]
As above, I've got commands for each different render-state. I've also got commands for each different CBuffer slot and each texture-binding slot (for each type of shader).
I've limited myself to 14 CBuffer slots each for the vertex and pixel shader, so, there's actually 28 different IDs that are associated with the "bind cbuffer" command.
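One way such an ID space could be laid out is sketched below. The three render-state IDs and all of the names (`CommandId`, `ShaderStage`, `bindCBufferId`) are placeholders I've picked for the example; only the 14-slots-per-stage figure comes from the description above.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kCBufferSlotsPerStage = 14;  // 14 each for VS and PS

enum class ShaderStage : std::uint32_t { Vertex = 0, Pixel = 1 };

// Hypothetical layout: one ID per render-state, then a contiguous block of
// "bind cbuffer" IDs (vertex slots first, pixel slots after).
enum CommandId : std::uint32_t {
    kSetBlendEnable = 0,
    kSetDepthTest,
    kSetCullMode,
    kBindCBufferFirst,
    kBindCBufferLast = kBindCBufferFirst + 2 * kCBufferSlotsPerStage - 1,
    kCommandIdCount
};

static_assert(kBindCBufferLast - kBindCBufferFirst + 1 == 28,
              "14 vertex + 14 pixel cbuffer slots");

// Maps (stage, slot) to the unique command ID for that binding point.
inline std::uint32_t bindCBufferId(ShaderStage stage, std::uint32_t slot) {
    assert(slot < kCBufferSlotsPerStage);
    return kBindCBufferFirst +
           static_cast<std::uint32_t>(stage) * kCBufferSlotsPerStage + slot;
}
```

Because every binding point has its own ID, two state-groups that bind different slots never override each other, while two that bind the same slot do.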
My draw-calls are actually a command, just like state-changes. However, I split commands into 3 different categories -- general state-changes, draw-calls, and per-pass state-changes.
State-groups can only contain general state-changes. Actual render-instances must use a draw-call command (not a state-change command).
The 3rd category is stored in something similar to a state-group, which is used to set up an entire "pass" of the rendering pipeline -- commands such as binding render-targets, depth-buffers, viewports, scissor tests, etc. go into this category.
Come command queue execution I have to apply / update these uniforms after having bound the shader program. Would that result in a new command type?[/quote]
There's a bunch of different abstractions for how uniforms are set, depending on your API. GL uses the model you're familiar with: you set the uniforms on the currently bound program. DX9 uses a model where there's a set of ~200 global registers, and any changes made to them persist from one shader to the next. DX10/11 are similar to 9, but you've got a set of bound CBuffers instead of individually bound uniforms.
So, I looked at these abstractions, and decided that the cbuffer approach made the most sense to me. No matter what the back-end rendering API actually is, my renderer deals with cbuffers -- and as above, I've got 14 cbuffer binding slots/commands per shader type.
The way this is generally used is that a "shader" state-group on the bottom of the stack contains commands to bind cbuffers that contain default values. The "material" and "object/instance" state-groups then contain commands to bind their own cbuffers (which override the "default" commands).
On APIs that don't actually use the cbuffer abstraction, then yes, there's a step that looks at the currently bound cbuffers and sets all of the individual uniforms. I do this step prior to every draw call (with a whole bunch of optimisations to skip unnecessary work).
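Roughly, that emulation step could look like the sketch below. Everything here is an assumption made for illustration: the `CBuffer` layout description, the `version` counter used to detect changes, and the `setUniform` callback, which stands in for whatever per-uniform call the back-end API provides. On a GL-style API the cache would also have to be invalidated whenever the bound program changes, which this sketch leaves out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

constexpr std::size_t kSlots = 14;

// Hypothetical description of one uniform inside a cbuffer blob.
struct UniformDesc {
    std::string name;
    std::size_t offset;      // in floats from the start of the blob
    std::size_t floatCount;
};

struct CBuffer {
    std::vector<float> data;          // the blob of uniform values
    std::vector<UniformDesc> layout;
    std::uint64_t version = 0;        // bump whenever data is modified
};

struct UniformEmulator {
    using SetUniform =
        std::function<void(const std::string&, const float*, std::size_t)>;

    std::array<const CBuffer*, kSlots> applied{};
    std::array<std::uint64_t, kSlots> appliedVersion{};

    // Call before each draw: pushes individual uniforms for every bound
    // cbuffer that differs from what was pushed last time.
    void flush(const std::array<const CBuffer*, kSlots>& bound,
               const SetUniform& setUniform) {
        for (std::size_t slot = 0; slot < kSlots; ++slot) {
            const CBuffer* cb = bound[slot];
            if (cb == nullptr) continue;
            if (applied[slot] == cb && appliedVersion[slot] == cb->version) {
                continue;  // nothing changed for this slot, skip the work
            }
            for (const UniformDesc& u : cb->layout) {
                setUniform(u.name, cb->data.data() + u.offset, u.floatCount);
            }
            applied[slot] = cb;
            appliedVersion[slot] = cb->version;
        }
    }
};
```

The pointer-plus-version check is just one simple example of skipping unnecessary work; a real implementation would likely be smarter about it.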
Regarding memory layout, I allocate all my cbuffer blocks (which are blobs containing uniforms) from a separate linear allocator.
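For anyone unfamiliar with the term, a linear (bump) allocator can be as small as the sketch below. This is a generic illustration, not my actual allocator: the class name, the 16-byte default alignment and the `std::vector` backing store are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal bump allocator: hands out consecutive chunks of one big buffer and
// frees everything at once via reset().
class LinearAllocator {
public:
    explicit LinearAllocator(std::size_t capacity)
        : buffer_(capacity), offset_(0) {}

    // Returns nullptr when the request doesn't fit.
    // alignment must be a non-zero power of two.
    void* allocate(std::size_t size, std::size_t alignment = 16) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::uintptr_t base =
            reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
        const std::uintptr_t aligned = (base + offset_ + mask) & ~mask;
        const std::size_t start = static_cast<std::size_t>(aligned - base);
        if (start > buffer_.size() || size > buffer_.size() - start) {
            return nullptr;
        }
        offset_ = start + size;
        return buffer_.data() + start;
    }

    void reset() { offset_ = 0; }               // invalidates all allocations
    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

Allocating cbuffer blocks this way keeps them contiguous in memory, and there is no per-block free to worry about.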