Shader System Questions

quote:Original post by _DarkWIng_
quote:Original post by ffx
Where is the actual rendering code placed, and who owns the GCs?

The rendering code (calls to glDrawElements, etc.) is placed in the Renderer class. In my case the GCs are owned by the Scene class, which holds all world data in either an ABT tree (static) or an entity list (dynamic). But I believe others have better implementations of this part.


My design is the following (still in the design phase):

class GC :
pure geometry
shader state info (or call it effect info)
list of all SPGCs needed to render this GC

class SPGC :
actual vertex and index buffers (ptr to a VRAM slot)
all device changes: textures, shaders, etc.

class CRenderable :
vector< GC* > LOD_levels;

Now each visible entity is derived from CRenderable and placed in the scenegraph. Only after an entity has passed the visibility pipeline is it put in the render chain. But first, its LOD is determined by its distance from the camera (a small sketch of that selection follows the list below).

LOD is a difference in:
1. geometry complexity
2. number of effects applied (no bump mapping or pixel shaders when far away)
3. rendering technique applied (such as switching from a 3D model to a billboard for trees)
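
A minimal sketch of that distance-based selection, assuming LOD_levels is ordered from most to least detailed; the pickLOD helper and the threshold distances are made up for illustration:

// Hypothetical helper: pick a GC from CRenderable::LOD_levels by camera distance.
// Assumes LOD_levels[0] is the most detailed level and the vector is non-empty;
// the threshold values are placeholders.
GC* pickLOD( const CRenderable& renderable, float distanceToCamera )
{
    static const float thresholds[] = { 20.0f, 60.0f, 150.0f };   // example cut-offs
    size_t level = 0;
    while( level + 1 < renderable.LOD_levels.size() &&
           level < 3 && distanceToCamera > thresholds[ level ] )
        ++level;
    return renderable.LOD_levels[ level ];
}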

Now I was thinking of placing the actual rendering directly in the shader classes, because they know how they will prepare for rendering.

How do you guys separate LODs?

[edited by - ffx on April 10, 2004 1:41:26 PM]
So... Muira Yoshimoto sliced off his head, walked 8 miles, and defeated a Mongolian horde... by beating them with his head?

Documentation? "We are writing games, we don't have to document anything".
As of yet, I don't have any LOD. I hadn't even thought about it yet, but it shouldn't be any problem. I add the shaders to the render queue using a state machine approach: first the current GC is set, and that is used for all shaders until it is changed. I would have my scenegraph model class hold multiple indices into the ResourceManager and let it choose which LOD should be used. So it has absolutely nothing to do with the shaders and effects.
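
A rough sketch of that state-machine idea; the RenderCommand/RenderQueue types and the bindGeometry helper are hypothetical, not the poster's actual code:

#include <vector>

void bindGeometry( GC* chunk );   // hypothetical helper: uploads/binds the chunk's geometry

struct RenderCommand { GC* chunk; Shader* shader; };

struct RenderQueue
{
    std::vector<RenderCommand> commands;

    // The current GC acts as state that persists across shader submissions
    // until a command explicitly changes it.
    void flush()
    {
        GC* currentGC = 0;
        for( size_t i = 0; i < commands.size(); ++i )
        {
            if( commands[i].chunk != currentGC )
            {
                currentGC = commands[i].chunk;
                bindGeometry( currentGC );
            }
            commands[i].shader->render( currentGC );  // shader applies its own states
        }
        commands.clear();
    }
};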
Wow, this topic has really exploded while I was away...! Let me throw some ideas around and see if anyone can take something away from it...

To handle the actual rendering of the GC, there are hundreds of places in my engine where this takes place. Each geometry chunk can be rendered by itself (if you're just trying to get it on the screen), whole meshes can be rendered (and all geometry chunks are rendered in turn), or you can create compiled lists of geometry chunks that are sorted, optimized, and then rendered (plus several other places).

[EDIT]
I realized some sample code might be helpful to some people, so here's the simplest place my code renders a geometry chunk: in the geometry chunk object itself.

Shader->fillCache( *this, DataUpdate );  // DataUpdate = false;

if( !SortingID )
{
	XSortingID* SortID = new XSortingID;
	// the sorting ID is filled by the shader with all necessary
	// information to sort and render this geometry chunk
	Shader->fillSortingID( *this, *SortID );
	SortingID = SortID;
}

((XVertexBuffer*) SortingID->VertexBufferPtr )->preLoad();
((XVertexBuffer*) SortingID->VertexBufferPtr )->bindBuffer( D3DDevice, SortingID->VertexSize );

((XIndexBuffer*) SortingID->IndexBufferPtr )->preLoad();
((XIndexBuffer*) SortingID->IndexBufferPtr )->bindBuffer( D3DDevice );

Shader->enterShader( this );

for( u32 k = 0; k < SortingID->UsedTextureUnits; k ++ )
{
	D3DDevice.getD3DDevice()->SetTexture( k, ((XTexture*) SortingID->TextureUnits[ k ])->getTextureObj() );
}

if( SortingID->WorldMatrix )
{
	D3DDevice.getD3DDevice()->SetTransform( D3DTS_WORLD, (D3DMATRIX*) SortingID->WorldMatrix );
}

SortingID->RenderStates.applyRenderStates( D3DDevice );
SortingID->TextureStates.applyRenderStates( D3DDevice );

D3DDevice.getD3DDevice()->DrawIndexedPrimitive( D3DPT_TRIANGLELIST, 0, 0, VertexCount, ((XIndexBuffer*) SortingID->IndexBufferPtr )->getOffsetIndex(), IndexCount / 3 );

XRenderStates  difRenderStates;
XTextureStates difTextureStates;

difRenderStates = SortingID->RenderStates.findDifference( XRenderStates() );

SortingID->TextureStates.findDifference( XTextureStates(), difTextureStates );

difRenderStates.applyRenderStates( D3DDevice );
difTextureStates.applyRenderStates( D3DDevice );

if( SortingID->WorldMatrix )
{
	D3DDevice.getD3DDevice()->SetTransform( D3DTS_WORLD, NULL );
}

for( k = 0; k < SortingID->UsedTextureUnits; k ++ )
{
	D3DDevice.getD3DDevice()->SetTexture( k, NULL );
}

Shader->exitShader();



There are many more efficient ways of rendering geometry chunks (such as batching by texture, render states, position [front to back], etc.). Here's what my XSortingID and XRenderInfo objects look like, to give you an idea of how the shader gives information back to the engine.


/**
 * XRenderInfo
 *
 * an object to encapsulate all necessary data
 * needed to render a shader
 *
 */
class EngineMode XRenderInfo
{
	public:
		/**
		* default constructor
		*
		*/
		XRenderInfo();

		/*! the used number of texture units */
		u32							UsedTextureUnits;

		/*! the texture units */
		u32							TextureUnits[ 8 ];

		/*! the vertex buffer pointer */
		u32							VertexBufferPtr;

		/*! the index buffer pointer */
		u32							IndexBufferPtr;

		/*! the vertex size (in bytes) used by the shader */
		u32							VertexSize;

		/*! supported render states */
		XRenderStates				RenderStates;

		/*! supported texture stage states */
		XTextureStates				TextureStates;

		/*! the world transformation matrix for this object */
		matrix4x4*					WorldMatrix;

		/*! boolean flag indicating whether to render this object separately or not */
		bool						DoNotOptimize;
};


/**
 * XSortingID
 *
 * an object to encapsulate all common attributes of
 * geometry chunks in order to minimize state changes
 *
 */
class EngineMode XSortingID : public XRenderInfo
{
	public:
		/**
		* default constructor
		*
		*/
		XSortingID();

		/*! the shader ID */
		u32							ShaderID;

		/*! flag indicating this sorting ID has been modified */
		bool						DirtyUpdate;

		/**
		* operator overload to test if two sorting
		* ID's are equal to each other in the correct
		* sense of the operation
		*
		*	other:					the other sorting ID to test for equality
		*
		*	/ret:					true if the two objects are equal, false otherwise
		*/
		bool operator==( const XSortingID& other )
		{
			if( this == &other )
				return true;

			if( WorldMatrix || other.WorldMatrix )
				return false;

			if( UsedTextureUnits != other.UsedTextureUnits )
				return false;

			if( ShaderID != other.ShaderID )
				return false;

			if( RenderStates.getStateUnion() != other.RenderStates.getStateUnion() || RenderStates.getRigidAlphaValue() != other.RenderStates.getRigidAlphaValue() )
				return false;

			if( TextureStates.getUsedTextureUnits() != other.TextureStates.getUsedTextureUnits() )
				return false;

			u32 UsedUnits = other.TextureStates.getUsedTextureUnits();
			for( u32 k = 0; k < UsedUnits; k ++ )
			{
				if( TextureStates.getOperationUnion( k ) != other.TextureStates.getOperationUnion( k ) )
					return false;

				if( TextureStates.getArgumentsUnion( k ) != other.TextureStates.getArgumentsUnion( k ) )
					return false;
			}

			for( k = 0; k < UsedTextureUnits; k ++ )
				if( TextureUnits[k] != other.TextureUnits[k] )
					return false;

			return true;
		}
};
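
To illustrate how such a sorting ID could be used when building the sorted, compiled lists mentioned above, here is a speculative fragment (not taken from the engine); consecutive chunks whose IDs compare equal share one state setup:

// Speculative usage: after sorting, chunks with equal sorting IDs can share
// texture/render-state setup and only issue their own draw calls.
for( size_t i = 0; i < sortedChunks.size(); ++i )
{
    bool sameState = ( i > 0 ) && ( *sortedChunks[i]->SortingID == *sortedChunks[i-1]->SortingID );
    if( !sameState )
        applyStates( *sortedChunks[i]->SortingID );   // hypothetical helper
    drawChunk( *sortedChunks[i] );                    // hypothetical helper
}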


[/EDIT]

Level-of-Detail (LOD) is handled as an external 'add-in' in my engine. This is due to the flexibility that my geometry chunk format gives me (remember: I use 'property lists' or 'attribute lists' as described by poly-gone). I have one attribute named '.position.xyz' which is the stream of vertex positions (not UV coords, normals, or whatever...that's a waste of memory to duplicate). It is this stream of vertex positions that will be rendered. I then have several other properties such as this:

.position.xyz.lod01 - stream of vertex data at LOD level 1
.position.xyz.lod01.loaded - flag indicating whether data is valid (save some memory by unloading it if it's not needed)
.position.xyz.lod02 - stream of vertex data at LOD level 2
....

Then, when the scenegraph is being traversed, and a mesh has a modifier (an attachable class that modifies a node) that handles LOD (calculates LOD or whatever), the modifier can calculate the distance to the camera and set this variable

if( !.position.xyz.lod01.loaded )
{
    generateLODLevel( .position.xyz.original, LOD_Level01 );
    .position.xyz.lod01.loaded = true;
}
.position.xyz = .position.xyz.lod01


to the desired LOD level, and the engine will automatically handle it. Pretty clean and efficient, eh? And the lovely thing is the engine doesn't even need to know about it. The same idea applies to shader LODs (they even have similar names, like .shader.lod01, etc.), and you might even specify parameters for the LOD algorithm in the property list so the modifier could use them, such as:

.position.xyz.lod01.startDistance = 0
.position.xyz.lod01.endDistance = 1
.position.xyz.lod01.desiredPoly = 400
....

This lends itself really well to a scripting language as well.
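
As a rough illustration of how such a property/attribute list might be stored (this is a guess at one possible data layout, not the actual engine code), think of a chunk as a dictionary from attribute names to typed values:

#include <map>
#include <string>
#include <vector>

// Speculative layout for an attribute/property list on a geometry chunk.
// Names like ".position.xyz.lod01" map to streams, flags or numbers.
struct Attribute
{
    std::vector<float> stream;   // e.g. ".position.xyz" vertex data
    float              scalar;   // e.g. ".position.xyz.lod01.desiredPoly"
    bool               flag;     // e.g. ".position.xyz.lod01.loaded"
};

typedef std::map<std::string, Attribute> PropertyList;

The LOD modifier then only shuffles names around; the renderer keeps reading ".position.xyz" and never needs to know that LOD exists.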

To handle a lack of hardware support, I use multipass fallback techniques (automatically or manually generated) to attempt to produce the exact same result on lesser hardware. When specifying an effect, you separate the effect into potential splits in hardware support, such as:

.cubemap.generate - need to generate a cubemap
.texture.unit1 - need a texture in unit 1
.cubemap.reflection - need cubemap reflections

Then, if the .cubemap.generate stage is not supported in hardware (render to texture is not supported or whatever), the .cubemap.reflection shader would have unresolved inputs, which would cause it to be invalidated, and only the .texture.unit1 shader would be used. This means on lower hardware, reflections would automatically get taken out. Obviously, this doesn't always work, so you can manually specify the shaders desired for specific hardware, like so:

Hardware_ReallyLow : Hardware_Average
{
.texture.unit1
}

Hardware_AboveAverage : Hardware_ReallyHigh
{
.cubemap.generate
.texture.unit1
.cubemap.reflection
}

and it would produce the same results as the automatically generated solution.
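
One possible reading of that validation step, as a speculative sketch (the ShaderStage structure and field names are invented): each stage names what it needs and what it produces, and anything whose requirements cannot be met drops out of the effect.

#include <set>
#include <string>
#include <vector>

// Speculative sketch: invalidate every stage whose hardware requirement or
// inputs cannot be satisfied; later stages then lose access to its outputs.
struct ShaderStage
{
    bool hardwareOk;                     // e.g. is render-to-texture available?
    std::vector<std::string> inputs;     // e.g. "cubemap" for .cubemap.reflection
    std::vector<std::string> outputs;    // e.g. "cubemap" for .cubemap.generate
    bool valid;
};

void resolveEffect( std::vector<ShaderStage>& stages )
{
    std::set<std::string> available;
    for( size_t i = 0; i < stages.size(); ++i )
    {
        bool inputsOk = true;
        for( size_t j = 0; j < stages[i].inputs.size(); ++j )
            if( !available.count( stages[i].inputs[j] ) )
                inputsOk = false;
        stages[i].valid = stages[i].hardwareOk && inputsOk;
        if( stages[i].valid )
            available.insert( stages[i].outputs.begin(), stages[i].outputs.end() );
    }
}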

Say the lower hardware could handle the effect, but can't do it all at once (e.g. have one shader handle all three components in a single 'pass'); then you would have to multipass the effect. To do this, you need to describe the relationship between these components, or the engine would be doing it blindly and the results would be wrong. Therefore, at the effect declaration stage (because this takes some preprocessing, but you could do it at the geometry chunk level if you have enough CPU cycles to spare), you would parse a shader sequencing script, which looks like this:

.out.color = add{ .texture.unit1, .cubemap.reflection }

Once you build a tree from this sequence, you separate the terms into local and global terms. Local terms are those handled by a single shader (say the shader could handle both of these qualifiers, .texture.unit1 and .cubemap.reflection; the 'add' command would then be 'local' to the shader), while global terms are automatically handled by the engine by adjusting blending modes for each multipass chunk. For global terms, the engine would detect:

add
shaderA shaderB

and would set the necessary SetSrcBlend, SetDstBlend states. For local ops, the shader would detect:

add
.texture.unit1 .cubemap.reflection

and would set the necessary SetTextureStageState states.
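
In raw Direct3D 9 terms (a sketch only, using the plain device calls rather than the engine's state wrappers), the two cases might come out roughly as:

// Global 'add': shaderB's pass is blended additively over shaderA's result.
device->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
device->SetRenderState( D3DRS_SRCBLEND,  D3DBLEND_ONE );
device->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ONE );

// Local 'add': one shader combines both terms in a single pass via the
// fixed-function texture stages (unit 1 adds onto the result of unit 0).
device->SetTextureStageState( 1, D3DTSS_COLOROP,   D3DTOP_ADD );
device->SetTextureStageState( 1, D3DTSS_COLORARG1, D3DTA_TEXTURE );
device->SetTextureStageState( 1, D3DTSS_COLORARG2, D3DTA_CURRENT );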

Hope this makes a bit of sense to everyone; I realize I'm not a teacher, just a coder.

Chris Pergrossi
My Realm | "Good Morning, Dave"

[edited by - c t o a n on April 10, 2004 10:13:39 PM]
quote:Original post by cyansoft
Like many, I've been reading and rereading (and re-rereading) the famous "Materials and Shaders Implementation" thread, where the great Yann L. describes his engine's pluggable effect/shader system.

I think I'm starting to understand how the system works. But in the process of designing my own shader system, I came up with many questions that were never addressed in the original thread, that are needed to provide more insight and ideas, or that I've mistakenly overlooked because of the sheer amount of discussion on this topic.



I'm not sure I'd implement a shader system in exactly that way. More or less, there are two basic ways people construct shader systems.

The first method is to bind shader assets with the material. That is, at author time, bind all resources to a material that are required for it to be rendered, thereby constructing a modular packet. This is the fundamental tenet of the EAGL system (EA's general-purpose graphics engine), and you can read about it in the SIGGRAPH 2002 proceedings. This is a simple and elegant system.

This approach doesn't work well on PCs, however, since the cost of changing shaders is nontrivial (due largely to driver overhead). Almost all (sane) PC games are shader driven. That is, a particular shading technique gets a queue of all the objects that it is to render and then processes them, to avoid costly shader changes. In this model, a shading object has no geometry associated with it.

Note that there is a very careful balance between state sorting and state reprocessing. From my experience, states should only be sorted at a very coarse level (e.g. by which shader they use; always sort by this first). The overhead of sophisticated sorting usually exceeds the benefit.
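
A bare-bones sketch of that shader-driven model (the class and member names here are illustrative, not from any particular engine): each shading technique owns a queue of objects, and the only sort applied is the coarse grouping by technique.

#include <vector>

struct RenderItem { /* per-object data: mesh handle, transform, constants... */ };

struct ShaderTechnique
{
    std::vector<RenderItem> queue;

    void bind()                          { /* set vertex/pixel shaders, shared constants */ }
    void draw( const RenderItem& item )  { /* set per-object constants, issue draw call  */ }
};

// The expensive shader bind happens once per technique per frame.
void renderFrame( std::vector<ShaderTechnique>& techniques )
{
    for( size_t i = 0; i < techniques.size(); ++i )
    {
        techniques[i].bind();
        for( size_t j = 0; j < techniques[i].queue.size(); ++j )
            techniques[i].draw( techniques[i].queue[j] );
        techniques[i].queue.clear();
    }
}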

By far the biggest bottleneck for performance today is the update speed of the shader constants, so careful management here is crucial. Never set them discontiguously, keep the number to a minimum, and try to use a global def bank of literals.
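
For example (a D3D9 sketch; the register layout and variable names are made up): packing per-object constants into one contiguous block lets you upload them with a single call instead of several scattered ones.

// Assumes worldViewProj is a 4x4 float matrix and lightDir a float4,
// uploaded as five contiguous float4 registers starting at c0.
float constants[ 4 * 5 ];
memcpy( constants,      &worldViewProj, 16 * sizeof(float) );
memcpy( constants + 16, &lightDir,       4 * sizeof(float) );
device->SetVertexShaderConstantF( 0, constants, 5 );

// ...rather than two (or more) discontiguous updates:
// device->SetVertexShaderConstantF( 0, (float*)&worldViewProj, 4 );
// device->SetVertexShaderConstantF( 7, (float*)&lightDir,      1 );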

On the DX side of things, the SDK provides a shader management system known as Effects. These encapsulate things like the number of passes, fallbacks, etc. Additionally, the upcoming version (due out any time now) does some nice things like pull static expressions out of shaders and evaluate them on the CPU (known as preshading). Nvidia's FX Composer is based on this system, and I believe ATI's RenderMonkey can export to .fx files. This system is being used in more and more professional PC projects. But it is not a complete solution; you still have to write a fair amount of code.
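
As a rough idea of what the Effects path looks like on the application side (a sketch only: error handling omitted, the file, technique and parameter names are hypothetical, and the Begin/BeginPass calls are as in the later D3DX9 releases):

// Minimal ID3DXEffect usage: load, pick a technique, render its passes.
ID3DXEffect* effect = NULL;
D3DXCreateEffectFromFile( device, "water.fx", NULL, NULL, 0, NULL, &effect, NULL );

effect->SetTechnique( "Default" );
effect->SetMatrix( "WorldViewProj", &worldViewProj );

UINT passes = 0;
effect->Begin( &passes, 0 );
for( UINT p = 0; p < passes; ++p )
{
    effect->BeginPass( p );
    // draw the geometry for this object here
    effect->EndPass();
}
effect->End();
effect->Release();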




[edited by - EvilDecl81 on April 11, 2004 3:48:14 PM]
EvilDecl81
I implemented shaders as a sequence of simple GL commands and a render callback. Multipass shaders call the render callback multiple times. Each shader has some resources and a list of entry points. Shaders can call each other, and shaders can inherit from other shaders. The whole shader/material system is self-describing.

The base material defines only entry points.

material Material
{
[Shadow]
RenderLight:
break;
RenderMaterial:
break;
}

The standard bumpmapping shader:

material diffusespecular extends "stdmat\Material"
{
// resource declarations
TNormalmap bumpmap="";
TTexture2D diffuse="";
TTexture2D specular="";

RenderDiffuseSpecular="stdmat\RenderDiffuseSpecular";

float specular_exp=8;
RenderLight:
RenderDiffuseSpecular(bumpmap,diffuse,specular,specular_exp);
break;
}

shader RenderDiffuseSpecular(TTexture bumpmap,TTexture diffuse,TTexture specular,float exp)
{
TVertexShader vs="shader\diffusespecular.vp";
TFragmentShader fs="shader\diffusespecular.fp";

// link with light shader
LightShader(vs,fs);
glUniform(specular_exp,exp);
glBindTexture(0,diffuse);
glBindTexture(1,bumpmap);
glBindTexture(2,specular);
gluniform1i(diffusemap,0);
gluniform1i(normalmap,1);
gluniform1i(specularmap,2);
glenable(gl_blend);
glblendfunc(gl_one,gl_one);
render; // callback to render this material block
gldisable(gl_blend);
}

A simple material using this shader and overriding resource declarations. It is also allowed to override entry points, because there is no difference between a material and a shader.

material clang_floor2 extends "stdmat\diffusespecular"
{
bumpmap "textures\base_floor\clang_floor_bump.jpg";
diffuse "textures\base_floor\clang_floor2.jpg";
specular "textures\base_floor\clang_floor_gloss.jpg";
specular_exp=16;
}

The engine just executes these scripts, without knowing any details of the material.
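
One way to picture the engine side of this (purely illustrative, not the actual implementation): the engine only knows about entry points and a render callback, and walks a compiled command list.

#include <vector>

// Illustrative interpreter: an entry point is a flat list of commands; the
// 'render' command calls back into the engine to draw the current batch.
enum Opcode { BIND_TEXTURE, SET_UNIFORM, RENDER };
struct Command { Opcode op; int args[4]; };
typedef void (*RenderCallback)();

void executeEntryPoint( const std::vector<Command>& cmds, RenderCallback renderBatch )
{
    for( size_t i = 0; i < cmds.size(); ++i )
    {
        switch( cmds[i].op )
        {
            case BIND_TEXTURE: /* glBindTexture( unit, texture ) */   break;
            case SET_UNIFORM:  /* glUniform...( location, value ) */  break;
            case RENDER:       renderBatch();                         break;
        }
    }
}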

[edited by - LarsMiddendorf on April 12, 2004 6:57:41 AM]
@ctoan: Nice design. What I am trying to do, and what you have obviously done, is to totally abstract the engine's pipeline to the level where you can control it via scripts (something like LarsMiddendorf's system). That gave me ideas, and now I am off to implement it.
So... Muira Yoshimoto sliced off his head, walked 8 miles, and defeated a Mongolian horde... by beating them with his head?

Documentation? "We are writing games, we don't have to document anything".
I've started implementing a similar system to those discussed here, and ran into a glitch while coding a cubemap shader.

The shader is sorted to the start, and is done before any geometry is drawn. I was planning on using pBuffers to do this. Creating and drawing to the pBuffer shouldn't be a problem (haven't tested it yet, though). However, I'm unsure how to do the binding of the texture object. According to the ATI example I need to do:

glBindTexture(GL_TEXTURE_2D, pBufferTexID);
wglBindTexImageARB(hPBuffer, WGL_FRONT_LEFT_ARB);


I could change the texID in my texture structure to pBufferTexID without any trouble. Then, when drawing the geometry associated with the texture in the pBuffer, the first of the two lines above is automatically executed. But how should I make sure the second line is executed? I would have to point back to the cubemap shader in some way, I think.
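
One possible arrangement, sketched under the assumption that shaders own their enter/exit state (as in the designs above) and that the shader object holds the hPBuffer and pBufferTexID handles; this is untested illustration, not a confirmed solution:

// The cubemap shader binds the pBuffer around the geometry that samples it.
void CubemapShader::enterShader()
{
    glBindTexture( GL_TEXTURE_2D, pBufferTexID );           // as in the ATI example
    wglBindTexImageARB( hPBuffer, WGL_FRONT_LEFT_ARB );     // attach pBuffer contents
}

void CubemapShader::exitShader()
{
    wglReleaseTexImageARB( hPBuffer, WGL_FRONT_LEFT_ARB );  // release before rendering to it again
    glBindTexture( GL_TEXTURE_2D, 0 );
}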
quote:Original post by cyansoft
quote:Original post by McZ
Well, if the material has the ambient color set, then the material format is MF_AMBIENT; if it also has the diffuse color, then it has the format MF_AMBIENT | MF_DIFFUSE. This is the way I sort different shaders: the shader knows which formats it can render, and that's how each GeometryChunk gets its shaders.


Interesting idea. So in your system, when parsing the effects, you go beyond just assigning a priority which indicates whether the effect is fully doable or not. By using bit flags, the effects say what they want, and each shader says what it supports. With this, the priority doesn't really need to indicate whether the effect is complete, but rather the visual quality of the effect.



Sorry for my late reply.

I use the bitflags to let the system determine which shaders to use, and the priority is used because two shaders can have the same bitflags set and I need some way to choose one of them. I also build my shaders so they work in pairs or more; e.g. I have one "Texture" shader that sets the textures, which will be used if the geometry chunk's material has a texture.
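
In code, that selection might look something like this (a loose sketch; the Shader interface and any flags beyond those quoted above are invented for illustration):

#include <vector>

enum MaterialFlags { MF_AMBIENT = 1 << 0, MF_DIFFUSE = 1 << 1, MF_TEXTURE = 1 << 2 };

// Pick the highest-priority shader whose supported format covers the material.
Shader* selectShader( unsigned materialFormat, const std::vector<Shader*>& shaders )
{
    Shader* best = 0;
    for( size_t i = 0; i < shaders.size(); ++i )
    {
        bool covers = ( shaders[i]->supportedFormat() & materialFormat ) == materialFormat;
        if( covers && ( !best || shaders[i]->priority() > best->priority() ) )
            best = shaders[i];
    }
    return best;
}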
quote:Original post by poly-gone
Nice discussion going on here...

This is how I've implemented my shader library :-



This is along the same lines as the way I had been doing my system, using .fx files, except that I also found myself making extra classes when I needed new constants set. For example, a water shader that needs the Fresnel term set: I would need to make a class that sets these attributes for the shader.

How do you solve this problem? (If you don't mind me asking.)

Edit: Do you have possible attributes already coded in? For example, if you encounter an attribute named "Fresnel", will your shader base class query the geometry chunk (or whatever format you are using) for its Fresnel value (or just a global value)?

[edited by - RobertC on April 13, 2004 11:07:19 PM]
@RobertC :-

That's simple. Don't link your .fx files in any way with your shader library. Your shader library should be abstract enough to handle any shader. For that, you need that "attributes" class I mentioned earlier. When you export the level from your DCC tool, the exporter grabs all the properties from the HLSL shader, including the "fresnel term", for example, along with the "value" you would have set in the DCC tool, and exports this as an attribute in the "water material". When the "water shader" is used for rendering the "water", the corresponding "water material" is assigned to the shader. When the shader library encounters an attribute like "fresnel bias", for example, it simply assigns the value of "fresnel bias" to the "fresnel bias" constant in the shader.
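
A sketch of what that name-based lookup could amount to with D3DX Effects (the attribute container here is invented; the point is only that exported values are matched to shader constants by name):

#include <map>
#include <string>

// Invented attribute container: name -> float value exported from the DCC tool.
typedef std::map<std::string, float> MaterialAttributes;

// Push every exported attribute into the shader constant of the same name.
void applyMaterial( ID3DXEffect* effect, const MaterialAttributes& attributes )
{
    for( MaterialAttributes::const_iterator it = attributes.begin();
         it != attributes.end(); ++it )
    {
        D3DXHANDLE param = effect->GetParameterByName( NULL, it->first.c_str() );
        if( param )                                  // e.g. "FresnelBias"
            effect->SetFloat( param, it->second );
    }
}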

