Wow, this topic has really exploded while I was away...! Let me throw some ideas around and see if anyone can take something away from it...
As for the actual rendering of geometry chunks (GCs), there are hundreds of places in my engine where this happens. Each geometry chunk can be rendered by itself (if you're just trying to get it on screen), whole meshes can be rendered (with all their geometry chunks rendered in turn), or you can create compiled lists of geometry chunks that are sorted, optimized, and then rendered (plus several other variations).
[EDIT]
I realized some sample code might be helpful to some people, so here's the simplest place my code renders a geometry chunk: in the geometry chunk object itself.
Shader->fillCache( *this, DataUpdate ); // DataUpdate = false;

if( !SortingID )
{
    XSortingID* SortID = new XSortingID;

    // the sorting ID is filled by the shader with all necessary
    // information to sort and render this geometry chunk
    Shader->fillSortingID( *this, *SortID );

    SortingID = SortID;
}

((XVertexBuffer*) SortingID->VertexBufferPtr )->preLoad();
((XVertexBuffer*) SortingID->VertexBufferPtr )->bindBuffer( D3DDevice, SortingID->VertexSize );

((XIndexBuffer*) SortingID->IndexBufferPtr )->preLoad();
((XIndexBuffer*) SortingID->IndexBufferPtr )->bindBuffer( D3DDevice );

Shader->enterShader( this );

for( u32 k = 0; k < SortingID->UsedTextureUnits; k ++ )
{
    D3DDevice.getD3DDevice()->SetTexture( k, ((XTexture*) SortingID->TextureUnits[ k ])->getTextureObj() );
}

if( SortingID->WorldMatrix )
{
    D3DDevice.getD3DDevice()->SetTransform( D3DTS_WORLD, (D3DMATRIX*) SortingID->WorldMatrix );
}

SortingID->RenderStates.applyRenderStates( D3DDevice );
SortingID->TextureStates.applyRenderStates( D3DDevice );

D3DDevice.getD3DDevice()->DrawIndexedPrimitive( D3DPT_TRIANGLELIST, 0, 0, VertexCount,
    ((XIndexBuffer*) SortingID->IndexBufferPtr )->getOffsetIndex(), IndexCount / 3 );

XRenderStates  difRenderStates;
XTextureStates difTextureStates;

difRenderStates = SortingID->RenderStates.findDifference( XRenderStates() );
SortingID->TextureStates.findDifference( XTextureStates(), difTextureStates );

difRenderStates.applyRenderStates( D3DDevice );
difTextureStates.applyRenderStates( D3DDevice );

if( SortingID->WorldMatrix )
{
    D3DDevice.getD3DDevice()->SetTransform( D3DTS_WORLD, NULL );
}

for( u32 k = 0; k < SortingID->UsedTextureUnits; k ++ )
{
    D3DDevice.getD3DDevice()->SetTexture( k, NULL );
}

Shader->exitShader();
There are many more efficient ways of rendering geometry chunks (such as batching by texture, render states, position [front to back], etc.). Here's what my XSortingID and XRenderInfo objects look like, to give you an idea of how the shader passes information back to the engine.
/**
 * XRenderInfo
 *
 * an object to encapsulate all necessary data
 * needed to render a shader
 */
class EngineMode XRenderInfo
{
    public:

        /**
         * default constructor
         */
        XRenderInfo();

        /*! the used number of texture units */
        u32 UsedTextureUnits;

        /*! the texture units */
        u32 TextureUnits[ 8 ];

        /*! the vertex buffer pointer */
        u32 VertexBufferPtr;

        /*! the index buffer pointer */
        u32 IndexBufferPtr;

        /*! the vertex size (in bytes) used by the shader */
        u32 VertexSize;

        /*! supported render states */
        XRenderStates RenderStates;

        /*! supported texture stage states */
        XTextureStates TextureStates;

        /*! the world transformation matrix for this object */
        matrix4x4* WorldMatrix;

        /*! boolean flag indicating whether to render this object separately or not */
        bool DoNotOptimize;
};
/**
 * XSortingID
 *
 * an object to encapsulate all common attributes of
 * geometry chunks in order to minimize state changes
 */
class EngineMode XSortingID : public XRenderInfo
{
    public:

        /**
         * default constructor
         */
        XSortingID();

        /*! the shader ID */
        u32 ShaderID;

        /*! flag indicating this sorting ID has been modified */
        bool DirtyUpdate;

        /**
         * operator overload to test if two sorting
         * ID's are equal to each other in the correct
         * sense of the operation
         *
         * other: the other sorting ID to test for equality
         *
         * /ret: true if the two objects are equal, false otherwise
         */
        bool operator==( const XSortingID& other )
        {
            if( this == &other )
                return true;

            if( WorldMatrix || other.WorldMatrix )
                return false;

            if( UsedTextureUnits != other.UsedTextureUnits )
                return false;

            if( ShaderID != other.ShaderID )
                return false;

            if( RenderStates.getStateUnion() != other.RenderStates.getStateUnion() ||
                RenderStates.getRigidAlphaValue() != other.RenderStates.getRigidAlphaValue() )
                return false;

            if( TextureStates.getUsedTextureUnits() != other.TextureStates.getUsedTextureUnits() )
                return false;

            u32 UsedUnits = other.TextureStates.getUsedTextureUnits();

            for( u32 k = 0; k < UsedUnits; k ++ )
            {
                if( TextureStates.getOperationUnion( k ) != other.TextureStates.getOperationUnion( k ) )
                    return false;

                if( TextureStates.getArgumentsUnion( k ) != other.TextureStates.getArgumentsUnion( k ) )
                    return false;
            }

            for( u32 k = 0; k < UsedTextureUnits; k ++ )
                if( TextureUnits[ k ] != other.TextureUnits[ k ] )
                    return false;

            return true;
        }
};
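To show what a compiled, sorted list buys you, here's a minimal stand-alone sketch of the idea (SortKey and countStateChanges are my own simplified stand-ins for this post, not the engine's actual classes): sort chunks by key so equal states land next to each other, then skip any state change that matches the previous chunk.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for XSortingID: just the fields we sort on.
struct SortKey {
    std::uint32_t shaderID;
    std::uint32_t textureID;
    bool operator==(const SortKey& o) const {
        return shaderID == o.shaderID && textureID == o.textureID;
    }
};

// Sort the chunks so chunks sharing a key become adjacent, then count
// how many device-state changes the render loop would actually issue.
inline std::size_t countStateChanges(std::vector<SortKey>& chunks) {
    std::sort(chunks.begin(), chunks.end(),
              [](const SortKey& a, const SortKey& b) {
                  if (a.shaderID != b.shaderID) return a.shaderID < b.shaderID;
                  return a.textureID < b.textureID;
              });
    std::size_t changes = 0;
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        // Only change state when the key differs from the previous chunk.
        if (i == 0 || !(chunks[i] == chunks[i - 1]))
            ++changes;
    }
    return changes;
}
```

In the real thing you'd compare full sorting IDs with an operator== like the one above, but the counting logic is the same.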
[/EDIT]
Level-of-Detail (LOD) is handled as an external 'add-in' in my engine. This is due to the flexibility that my geometry chunk format gives me (remember: I use 'property lists' or 'attribute lists' as described by poly-gone). I have one attribute named '.position.xyz' which is the stream of vertex positions (not UV coords, normals, or whatever...that's a waste of memory to duplicate). It is this stream of vertex positions that will be rendered. I then have several other properties such as this:
.position.xyz.lod01 - stream of vertex data at LOD level 1
.position.xyz.lod01.loaded - flag indicating whether data is valid (save some memory by unloading it if it's not needed)
.position.xyz.lod02 - stream of vertex data at LOD level 2
....
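If it helps, here's a toy stand-alone sketch of that attribute-list idea (AttributeList and its method names are made up for illustration, not my engine's actual API): the dotted names key into streams, and selecting an LOD just repoints the render stream at an already-generated one.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy attribute list: dotted names map to vertex-position streams.
// ".position.xyz" is the stream that actually gets rendered;
// ".position.xyz.lod01" etc. hold the reduced LOD streams.
class AttributeList {
public:
    void setStream(const std::string& name, std::vector<float> data) {
        streams_[name] = std::move(data);
    }
    bool hasStream(const std::string& name) const {
        return streams_.count(name) != 0;
    }
    // Point the render stream at a previously generated LOD stream.
    void selectLOD(const std::string& lodName) {
        streams_[".position.xyz"] = streams_.at(lodName);
    }
    const std::vector<float>& renderStream() const {
        return streams_.at(".position.xyz");
    }
private:
    std::map<std::string, std::vector<float>> streams_;
};
```

Note that only the position streams are duplicated per LOD level; UVs, normals, and the rest stay shared.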
Then, when the scene graph is being traversed and a mesh has a modifier (an attachable class that modifies a node) that handles LOD (calculates it or whatever), the modifier can calculate the distance to the camera and point the position stream
if( !.position.xyz.lod01.loaded )
{
    generateLODLevel( .position.xyz.original, LOD_Level01 );
    .position.xyz.lod01.loaded = true;
}
.position.xyz = .position.xyz.lod01
to the desired LOD level, and the engine will automatically handle it. Pretty clean and efficient, eh? And the lovely thing is the engine doesn't even need to know about it. The same idea applies to shader LODs (they even have similar names, like .shader.lod01, etc.), and you might even specify parameters for the LOD algorithm in the property list for the modifier to use, such as
.position.xyz.lod01.startDistance = 0
.position.xyz.lod01.endDistance = 1
.position.xyz.lod01.desiredPoly = 400
....
This lends itself really well to a scripting language as well.
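As a rough illustration of how a modifier might use those parameters (LODLevel and pickLOD are made-up names for this post, and a real modifier would also honor desiredPoly), picking a level from the start/end distances could look like this:

```cpp
#include <cstddef>
#include <vector>

// Per-level parameters, as they might be read out of the property list
// (.position.xyz.lodNN.startDistance / .endDistance).
struct LODLevel {
    float startDistance;
    float endDistance;
};

// Pick the first level whose [start, end) range contains the camera
// distance; fall back to the last (coarsest) level beyond all ranges.
inline std::size_t pickLOD(const std::vector<LODLevel>& levels, float distance) {
    for (std::size_t i = 0; i < levels.size(); ++i)
        if (distance >= levels[i].startDistance && distance < levels[i].endDistance)
            return i;
    return levels.size() - 1;
}
```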
To handle lack of hardware support, I use multipass fallback techniques (automatically or manually generated) to attempt to produce the exact same result on lesser hardware. When specifying an effect, you separate it into potential splits in hardware support, such as:
.cubemap.generate - need to generate a cubemap
.texture.unit1 - need a texture in unit 1
.cubemap.reflection - need cubemap reflections
Then, if the .cubemap.generate stage is not supported in hardware (render to texture is not supported, or whatever), the .cubemap.reflection shader would have unresolved inputs, which would cause it to be invalidated, and only the .texture.unit1 shader would be used. This means that on lower hardware, reflections would automatically get dropped. Obviously, this doesn't always work, so you can manually specify which shaders to use on specific hardware, like so:
Hardware_ReallyLow : Hardware_Average
{
.texture.unit1
}
Hardware_AboveAverage : Hardware_ReallyHigh
{
.cubemap.generate
.texture.unit1
.cubemap.reflection
}
and it would produce the same results as the automatically generated solution.
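The automatic side of this can be sketched pretty simply (Stage and resolveStages are made-up illustration names, not my engine's API): walk the stages in declaration order, drop any that aren't supported or whose inputs never got produced, and dependent stages invalidate themselves for free.

```cpp
#include <set>
#include <string>
#include <vector>

// One stage of an effect: the qualifiers it needs and the ones it produces.
struct Stage {
    std::string name;
    std::vector<std::string> inputs;   // must all be resolved to run
    std::vector<std::string> outputs;  // become resolved if the stage runs
    bool supported;                    // does the hardware support it?
};

// Returns the names of the stages that survive: a stage is dropped if the
// hardware can't run it or any of its inputs were never produced, which
// in turn drops everything downstream of it.
inline std::vector<std::string> resolveStages(const std::vector<Stage>& stages) {
    std::set<std::string> resolved;
    std::vector<std::string> active;
    for (const Stage& s : stages) {
        bool ok = s.supported;
        for (const std::string& in : s.inputs)
            if (!resolved.count(in)) ok = false;
        if (!ok) continue;  // unresolved inputs -> stage is invalidated
        for (const std::string& out : s.outputs)
            resolved.insert(out);
        active.push_back(s.name);
    }
    return active;
}
```

Run the cubemap example through it with .cubemap.generate unsupported and you get exactly the fallback described above: only .texture.unit1 survives.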
Say the lower hardware could handle the effect, but not all at once (e.g. a single shader handling all three components in one 'pass'); then you would have to multipass the effect. To do this, you need to describe the relationship between these components, or the engine would be combining them blindly and the results would be wrong. Therefore, at the effect declaration stage (because this takes some preprocessing, though you could do it at the geometry chunk level if you have enough CPU cycles to spare), you would parse a shader sequencing script, which looks like this:
.out.color = add{ .texture.unit1, .cubemap.reflection }
Once you build a tree from this sequence, you separate the terms into local and global terms. Local terms are those handled by a single shader (say the shader could handle both qualifiers .texture.unit1 and .cubemap.reflection; the 'add' command would then be 'local' to that shader), while global terms are handled automatically by the engine by adjusting the blending modes of each multipass chunk. For global terms, the engine would detect:
         add
        /   \
  shaderA   shaderB
and would set the necessary SetSrcBlend, SetDstBlend states. For local ops, the shader would detect:
             add
            /   \
.texture.unit1   .cubemap.reflection
and would set the necessary SetTextureStageState states.
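Here's a tiny stand-alone sketch of that local/global decision (Pass and planAdd are made-up names for this post, with strings standing in for the actual blend-state calls): if one shader covers both terms, the 'add' stays local in a single pass; otherwise the second pass gets blended additively into the framebuffer.

```cpp
#include <string>
#include <vector>

// One multipass chunk: which term(s) it renders and which framebuffer
// blend the engine must set before drawing it (strings stand in for
// the real SetSrcBlend / SetDstBlend state calls).
struct Pass {
    std::string term;
    std::string srcBlend;
    std::string dstBlend;
};

// Local 'add': one shader handles both terms, so it's a single opaque
// pass and the add happens inside the shader (texture stage states).
// Global 'add': two passes, the second accumulated with additive blending.
inline std::vector<Pass> planAdd(const std::string& a, const std::string& b,
                                 bool oneShaderHandlesBoth) {
    if (oneShaderHandlesBoth)
        return {{a + " + " + b, "ONE", "ZERO"}};  // add is local to the shader
    return {{a, "ONE", "ZERO"},
            {b, "ONE", "ONE"}};  // src*ONE + dst*ONE accumulates term b
}
```

Other global ops map the same way; 'mul' would become a modulate blend instead of an additive one, and so on.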
Hope this makes a bit of sense to everyone; I realize I'm not a teacher, just a coder
Chris Pergrossi
My Realm | "Good Morning, Dave"
[edited by - c t o a n on April 10, 2004 10:13:39 PM]