Modern Renderer Design


Hi,

in the current state of my engine, I have an interface called "IRenderable" and a bunch of classes implementing this interface.

The interface (being a pure virtual class) lets every derived class define its own rendering logic. However, I don't like the fact that those classes know how to render themselves. Just imagine porting to another backend. 

 

I am thinking of another approach with cache locality in mind. I've created structures like the following:
 

struct Light{
	enum class LightType{
		SPOT,
		DIRECTIONAL,
		POINT
	} lightType;
	union{
		// ...
	} lightData;
};

struct Mesh{
	glm::mat4	worldMatrix;
	int		vertexBuffer;
	int 		indexBuffer;
};

I want to feed the renderer with this data (which is possibly stored as a chunk of contiguous memory).
But I have no idea how to feed the renderer with this data. My naive approach is/was to have methods which add those structs into a vector and a render method which iterates over them.
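For concreteness, a minimal sketch of what I mean (the Renderer class and its Submit/Render methods are just made-up names, not code I actually have):

#include <vector>

class Renderer {
public:
	void Submit(const Mesh& mesh)   { m_meshes.push_back(mesh); }
	void Submit(const Light& light) { m_lights.push_back(light); }

	void Render() {
		for (const Mesh& mesh : m_meshes) {
			// issue backend-specific draw calls for 'mesh' here
		}
		m_meshes.clear(); // rebuilt every frame
		m_lights.clear();
	}

private:
	std::vector<Mesh>  m_meshes; // contiguous storage, cache-friendly iteration
	std::vector<Light> m_lights;
};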

 

TL;DR, the question:

I have no idea how a modern renderer is designed in C++. I've never touched a C++ engine (yet).

All my projects involving large-scale rendering were done using Unity (drag'n'drop the mesh, done). My 3D demos were all written in C# using the "IRenderable" approach.
Are there any papers, blogs, articles (or whatever else) that can give me an impression of how to design a renderer?

LG Julien
P.S.: Before the "Don't-Make-Engines-Make-Games" people declare jihad against me, I'd like to mention that I am simultaneously working on an RTS game (does anyone remember the "Jungle Troll Mod" for WC3?)


Game Engine Architecture, Second Edition 
by Jason Gregory 
Link: http://amzn.com/1466560010

 

This is most likely your best resource for doing it yourself.  I don't know of any thorough online resource.

 

--Edit--

You may also find this helpful: While we wait – Approaching Zero Driver Overhead  http://cg.alexandra.dk/?p=3778  

Edited by Glass_Knife


 


I have an interface called "IRenderable" and a bunch of classes implementing this interface.
The interface (being a pure virtual class) lets every derived class define its own rendering logic. However, I don't like the fact that those classes know how to render themselves. Just imagine porting to another backend.
Yeah, I hate that design. Different types of "renderables" should not have to write backend-specific code.

In my engine, I've made a base "DrawItem" structure (which is ported to every backend). Different types of "renderables" can then be composed of DrawItems (not inherit from them).

I've made lots of posts about this, so I'll just link one: http://www.gamedev.net/topic/666419-what-are-your-opinions-on-dx12vulkanmantle/#entry5215127
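To illustrate the composition idea (just a sketch, not my actual DrawItem; the fields here are placeholders):

#include <vector>

struct DrawItem {
	int shader;       // placeholder handles; the real fields are whatever
	int vertexBuffer; // the backend needs to issue a single draw
	int indexBuffer;
};

struct Model { // a "renderable" type
	std::vector<DrawItem> drawItems; // composed of DrawItems, e.g. one per sub-mesh
};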

 

 

But in my case my "DrawItem" isn't in API (D3D/GL/etc.) "native form" (not sure if this is better or worse).

So every time I execute a DrawCall (or actually any type of call: ClearBuffer, executing a compute program, etc.) I translate that structure to the native API and then execute it.

struct DrawCall {
    // ResourceProxy is just a pointer...

    ResourceProxy<ShadingProgram> m_shadingProg; // I "link" all the programs into one "shading program" in order to reduce the size of this structure.
    ResourceProxy<BufferResource> m_vertexBuffers[GraphicsCaps::NUM_VERTEX_BUFFER_SLOTS];
    uint32 m_vbOffsets[GraphicsCaps::NUM_VERTEX_BUFFER_SLOTS];
    uint32 m_vbStrides[GraphicsCaps::NUM_VERTEX_BUFFER_SLOTS];
    boost_small_vector<VertexDecl, 3> m_vertDecl;
    PrimitiveTopology::Enum m_primTopology;
    ResourceProxy<BufferResource> m_indexBuffer;
    UniformType::Enum m_indexBufferFormat;
    uint32 m_indexBufferByteOffset;

    // These are basically std::vectors. I really want to avoid all those dynamic allocations, so I use boost::small_vector.
    // But I'm not sure if this is the right thing to do? What is your solution?
    BoundCBuffersContainter m_boundCbuffers;
    BoundTexturesContainter m_boundTextures;
    BoundSamplersContainter m_boundSamplers;

    ResourceProxy<FrameTarget> m_frameTarget; // render targets + depth stencil
    Viewport m_viewport; // Currently I support only one viewport...

    // I'm considering combining these 3 into 1 object in order to shrink the structure a bit.
    ResourceProxy<RasterizerState> m_rasterState;
    ResourceProxy<DepthStencilState> m_depthStencilState;
    ResourceProxy<BlendState> m_blendState;

    DrawExecDesc m_drawExec; // aka Draw/DrawIndexed/DrawIndexedInstanced/etc.
};
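Roughly, the translation step looks like this (an illustrative D3D11 sketch, not my actual code; ToD3D11Topology, ToDXGIFormat and GetNativeBuffer are made-up helpers):

void ExecuteDrawCall(ID3D11DeviceContext* ctx, const DrawCall& dc)
{
	// Resolve the API-agnostic handles/enums into native D3D11 objects.
	ctx->IASetPrimitiveTopology(ToD3D11Topology(dc.m_primTopology));
	ctx->IASetIndexBuffer(GetNativeBuffer(dc.m_indexBuffer),
	                      ToDXGIFormat(dc.m_indexBufferFormat),
	                      dc.m_indexBufferByteOffset);
	// ... bind vertex buffers, shading program, states and targets the same way ...
	// Then issue the draw described by m_drawExec.
}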
Edited by imoogiBG



They achieve nothing unless you need to change the API at run-time, which you never will.

Well, actually I wanted to. Let's say a target machine doesn't support OpenGL 4 due to missing drivers.

I'd fall back to OpenGL 2.
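Something like this, where the capability query and backend classes are hypothetical names:

#include <memory>

std::unique_ptr<IRenderBackend> CreateBackend()
{
	if (DriverSupportsGL4()) // hypothetical capability/driver check
		return std::make_unique<GL4Backend>();
	return std::make_unique<GL2Backend>(); // fallback path
}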

 


Having a centralized location (a renderer module) trying to manage how all of these types of objects render is a gross violation of the single-responsibility principle and invariably leads to monolithic spaghetti code.

That's the problem I am trying to avoid at all costs.

 


The renderer module is low-level.  Everything can access it and do what they want.  Its only job is to provide a universal interface so that models, terrain, etc. don’t have to worry about which API is being used.


So it's better to abstract the rendering API (perhaps into a stateless rendering API; I just stumbled upon this, kind of similar to Hodgman's approach, isn't it)? 


About the sceneManager: Is it just a bunch of classes holding every mesh etc. in a std::vector (or something comparable)? 
And is the sceneManager committing the draw calls?


So, finally: Thanks to all who replied. I've got some awesome content to think about (@L.Spiro, @Hodgman).
LG Julien


in my case my "DrawItem" isn't in API (D3D/GL/etc.) "native form" (not sure if this is better or worse).
So every time I execute a DrawCall (or actually any type of call: ClearBuffer, executing a compute program, etc.) I translate that structure to the native API and then execute it.

If CPU-usage becomes an issue for you, you'll be able to optimize that later to pre-convert from agnostic (platform-independent) data into platform-specific data once in advance, instead of on every draw. Performing this optimization will make dynamic renderables a bit more cumbersome though -- e.g. often UI code, debug visualisations, some special effects, have their DrawItems recreated every frame, which is likely easier in your system.

// These are basically std::vectors. I really want to avoid all those dynamic allocations, so I use boost::small_vector.
// But I'm not sure if this is the right thing to do? What is your solution?

I often use in-place, variable-length arrays, via ugly C-style code, which requires the size of your array to be immutable. IMHO pre-compiled DrawItems should be largely immutable anyway:

struct Widget
{
  uint8_t fooCount;
  uint8_t barCount;
  Foo* FooArray() { return (Foo*)(this+1);  }
  Bar* BarArray() { return (Bar*)(FooArray()+fooCount); }
  size_t SizeOf() const { return sizeof(Widget) + sizeof(Foo)*fooCount + sizeof(Bar)*barCount; }
};
static_assert( alignof(Foo)%alignof(Widget) == 0 || alignof(Widget)%alignof(Foo) == 0, "alignment mismatch" );//assume that it's safe to allocate the arrays end-to-end like this...
static_assert( alignof(Bar)%alignof(Foo) == 0 || alignof(Foo)%alignof(Bar) == 0, "alignment mismatch" );

//Create a nice compact Widget from two std::vectors
Widget* MallocWidget( const std::vector<Foo>& inFoo,  const std::vector<Bar>& inBar )
{
  Widget temp = { (uint8_t)inFoo.size(), (uint8_t)inBar.size() };//init counts (assumes the counts fit into a uint8_t)
  Widget* out = (Widget*)aligned_malloc( temp.SizeOf(), alignof(Widget) );//compute full size; aligned_malloc stands in for your platform's aligned allocator
  *out  = temp;//copy count members
  Foo* outFoo = out->FooArray();
  Bar* outBar = out->BarArray();
  for( size_t i=0, end=inFoo.size(); i!=end; ++i )
    outFoo[i] = inFoo[i];
  for( size_t i=0, end=inBar.size(); i!=end; ++i )
    outBar[i] = inBar[i];
  return out;
}
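Usage then looks like this (aligned_free being whatever matches your aligned allocator; Foo/Bar are assumed trivially copyable):

std::vector<Foo> foos(4);
std::vector<Bar> bars(2);
Widget* w = MallocWidget( foos, bars );
// ... read w->FooArray()[0..3] and w->BarArray()[0..1] ...
aligned_free( w ); // must match the aligned allocation above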

So it's better to abstract the rendering API (perhaps into a stateless rendering API; I just stumbled upon this, kind of similar to Hodgman's approach, isn't it)?

Yep, after using a few stateless rendering APIs, they're now the only choice for me.

 

IMHO it's also a very good idea to have a simple rendering API as the "base level", which doesn't know anything about scenes/etc... all it does is act like D3D/GL, but easier to use, and cross-platform. Your scene-manager(s) are then the next layer that is built upon this simple base API.

Edited by Hodgman


So, I am doing a prototype (a draft) based on the information you gave me.

I have the following classes:

struct Mesh; // Data only
struct Light; // Data only

class  MeshRenderer;
class  LightRenderer;

class  Scene;
class  RenderQueue;
class  Renderer;

The idea is that each renderable entity has its own renderer. The structures are submitted to the "Scene"; the "*Renderer" classes process the data and submit draw calls to the "RenderQueue" using DrawItems. The RenderQueue sorts the draw calls (opacity, hey ho!) and, if it finds identical draw calls (e.g. ones using the same vertex/index buffer handles), batches them into an "InstancedDrawCall". See the sketch below.
Finally, the "Renderer" processes these DrawCalls.

Is this the way to go? I am still not sure how much I should abstract things. Should the renderQueue be aware of updating/creating resources (loading vertices into a VertexBuffer)?

EDIT:
Should the render queue be like the "deferred context" as known from DirectX 11, yet stateless?
 

Edited by MyNameIsJulien


Should the renderQueue be aware of updating/creating resources (loading vertices into a VertexBuffer)?

Absolutely not. The render-queue is nothing but a very small set of integers (or a single 64-bit integer if possible) which contains data needed for sorting. That means a shader ID, texture ID, any small ID numbers that you want to include for sorting, and the fractional bits of a normalized (0-1) float for depth.
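For example, one possible 64-bit packing (the exact bit allocation is a per-engine design choice, not a fixed rule):

#include <cstdint>

uint64_t MakeSortKey( uint32_t shaderId, uint32_t textureId, float depth01 )
{
	// depth01 is assumed already normalized to [0,1]; keep 24 fractional bits.
	uint64_t depthBits = uint64_t(depth01 * 0xFFFFFF) & 0xFFFFFF;
	return (uint64_t(shaderId  & 0xFFFFF) << 44) // 20 bits: shader ID
	     | (uint64_t(textureId & 0xFFFFF) << 24) // 20 bits: texture ID
	     | depthBits;                            // 24 bits: depth
}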

Should the render queue be like the "deferred context" as known from DirectX 11, yet stateless?

See above. A render-queue has no relationship to contexts. It simply sorts draw calls. There is no reason it needs to know about resource creation or contexts or literally anything else but integers.


L. Spiro


I submit render queues into "GpuContexts", and a "GpuContext" is a wrapper around an Immediate Context / Deferred Context / OpenGL Context / Command List/etc...

 

Any thread can build a render-queue without even having a pointer to a GpuDevice/GpuContext, as it's just application data. After that, you can submit the queue into a context that is owned by the current thread.

Edited by Hodgman


 

Should the renderQueue be aware of updating/creating resources (loading vertices into a VertexBuffer)?

Absolutely not. The render-queue is nothing but a very small set of integers (or a single 64-bit integer if possible) which contains data needed for sorting. That means a shader ID, texture ID, any small ID numbers that you want to include for sorting, and the fractional bits of a normalized (0-1) float for depth.

Sorry for hijacking this, but I have a question related to this. First: in my engine, which uses a similar design to imoogiBG's DrawCall structure, my rendering queue is a std::vector holding the DrawCalls pushed for the current frame. These have their sorting key stored inside them and implement the < operator.

 

Now, I thought about putting the key and a pointer to the structure it represents into a pair and using that for sorting, for better cache locality. Since you explicitly say the queue should only store the keys as integers and nothing else, I guess one should go with two lists for each queue, one containing the states and one the keys? After doing a sort that doesn't move the data but instead generates a set of new indices, you could apply those indices to the list of DrawCalls?

 

Then, secondly: I have read a lot about the stateless approach, and everywhere I only saw people talking about storing the ID values of the shaders, buffers or textures. That makes sense for OpenGL, but how would I access the objects for something like D3D? As said, my current structure for D3D11 stores pointers to the different ID3D11* objects a draw call needs. Of course I could put everything in a hashmap, but I don't see why I should do that when I could store the pointers by having a class that understands the current API and pulls the data out of my generic buffer and shader structures.

 

Thanks in advance!

Edited by mind in a box


Now, I thought about putting the key and a pointer to the structure it represents into a pair and using that for sorting, for better cache locality. Since you explicitly say the queue should only store the keys as integers and nothing else, I guess one should go with two lists for each queue, one containing the states and one the keys? After doing a sort that doesn't move the data but instead generates a set of new indices, you could apply those indices to the list of DrawCalls?

I think he doesn't mean "just the key" literally. You need the key and a reference to the data, so swapping 2 units during sorting is relatively cheap (because they're small and of fixed size). The data itself is then stored in a second structure, yes. See the sketch below.
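A sketch of such a two-part queue (the names are illustrative only):

#include <algorithm>
#include <cstdint>
#include <vector>

struct SortEntry {
	uint64_t key;   // packed sort key
	uint32_t index; // index into the frame's DrawCall array
};

void SortQueue( std::vector<SortEntry>& entries )
{
	std::sort( entries.begin(), entries.end(),
	           []( const SortEntry& a, const SortEntry& b ) { return a.key < b.key; } );
	// Afterwards, walk 'entries' in order and execute drawCalls[entries[i].index].
}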

 


Then, secondly: I have read a lot about the stateless approach, and everywhere I only saw people talking about storing the ID values of the shaders, buffers or textures. That makes sense for OpenGL, but how would I access the objects for something like D3D? As said, my current structure for D3D11 stores pointers to the different ID3D11* objects a draw call needs. Of course I could put everything in a hashmap, but I don't see why I should do that when I could store the pointers by having a class that understands the current API and pulls the data out of my generic buffer and shader structures.

My way of handling this is that the said id-values are in fact indices into arrays. For each type of such IDs there is a pair of arrays. One of them holds the engine style / low-level API agnostic variant of objects, while the other (managed by the graphic sub-system's backend) holds the API specific objects. There is obviously a 1:1 relation between these objects, because both sides use the same index to address them. But there is not necessarily a 1:1 correspondence between the content. For example, the OpenGLES 2 specific structure equivalent for a texture sampler does not refer to a texture sampler because OpenGLES 2 does not know such a thing. When a drawcall has a parameter that addresses such an operational resource, the low-level API wrapper then looks into its table, initializes the found structure if necessary by falling back upon the engine's object, but uses its specific object otherwise.
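Schematically (the type names here are only illustrative):

#include <vector>

struct SamplerDesc  { /* engine-side, API-agnostic description      */ };
struct GLES2Sampler { /* backend-side; may be empty/emulated on GLES 2 */ };

std::vector<SamplerDesc>  g_engineSamplers; // owned by the engine
std::vector<GLES2Sampler> g_nativeSamplers; // owned by the backend, same indices

// A drawcall stores only the index; the backend lazily initializes
// g_nativeSamplers[id] from g_engineSamplers[id] on first use.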

 

You could, of course, wrap the IDs into a kind of handle to add some type safety … all the usual things.

 


Well, actually I wanted to. Let's say a target machine doesn't support OpenGL 4 due to missing drivers.
I'd fall back to OpenGL 2.

See above: Splitting the data into 2 halves is one possible way to go.

Edited by haegarr

