Shader System Questions



#1 cyansoft   Members   -  Reputation: 307


Posted 04 April 2004 - 07:22 AM

Like many, I've been reading and rereading (and re-rereading) the famous "Materials and Shaders Implementation" thread, where the great Yann L. describes his engine's pluggable effect/shader system. I think I'm starting to understand how the system works. But in the process of designing my own shader system, I came up with many questions that were never addressed in the original thread, that are needed to provide more insight and ideas, or that I mistakenly overlooked because of the sheer amount of discussion on this topic.

1) In Yann's implementation, the actual shader class does not contain any render method. He says that the rendering occurs outside of the shader. I thought that the purpose of the shader is to implement all or part of an effect. Why keep the API-specific rendering calls (i.e. glDrawElements) outside of the shader? What are the advantages/disadvantages of having the shader render the GC versus having the GC render itself?

2) For those who have implemented a shader system (or are in the process of implementing one), do you have every possible renderable object consist of GCs that are piped into the renderer? Such objects include the HUD, the in-game GUI, the command console, and particle systems. What are the advantages and disadvantages of having *everything* renderable go through the renderer system?

3) If you were supporting multiple APIs/hardware (i.e. OpenGL, DX9, PS2, GameCube, etc.), I assume you would create a different set of shaders for each API, while carefully loading the correct set of shaders for the target API/hardware. This comes back to the first question: why keep the API specifics for rendering outside the shader?

4) When resolving the best shader for an effect based on the hardware capabilities and the user's quality preferences, I assume that when a GC is loaded from disk, it can ignore loading vertex stream data it does not need. For example, if bump mapping is not supported, then the tangents do not need to be in the vertex stream.

5) For those who have implemented effects, what kind of effects did you come up with and how did the implementation work out? What absolute minimum effects should a system provide? Here are the very basics others have mentioned in other threads:

a) Diffused Texture
   .vertexstreamdata XYZ | UV(diffused)
   .gcdata TextureID(diffused)

b) Lightmapped Diffused Texture
   .vertexstreamdata XYZ | UV(diffused) | UV(lightmap)
   .gcdata TextureID(diffused) | TextureID(lightmap)

c) Gouraud Shaded
   .vertexstreamdata XYZ | RGBA | NxNyNz
   .gcdata LightID

6) How liberal are you about creating a whole new effect versus adding an additional parameter to an existing effect? What is your criterion for deciding? For example, say you want to render some geometry in wireframe mode. Would you create a new effect for this or add an additional parameter to an existing effect?

7) How do you provide parameters and vertex stream data to the shader via shader_params? Do you pass an object with methods that return pointers to the param data in the GC? Or do you store a single buffer in the GC, with the params and vertex stream data in a predefined order based on the required format for the effect? What are the advantages and disadvantages of each method?

8) How do you store your vertex data stream and the parameters in the GC? Do you use a single buffer as discussed in question 7, or do you use a specific GC class, inherited from a base GC, for each specific effect?

9) If you store your vertex data stream outside of the GC, in a buffer (i.e. OpenGL's VAR/VBO/vertex array or DX9's VB), do you have one buffer per GC (which seems very inefficient), one per shader, one for the entire scene, or multiple buffers tuned for performance (I believe Yann once said that 1000 to 4000 triangles per buffer is optimal)? If you do use multiple buffers, how do you minimize switching buffers?

10) For level-of-detail purposes, imposters may need to be created. How would you design an effect that renders a group of GCs to a texture in a preprocess pass, then renders the generated texture on a quad in the main pass? Would this be as simple as passing a list of GCs as a parameter to the effect?

11) How do you resolve using multiple shaders for multi-pass rendering? More specifically, how does the shader tell the GC that it can only do part of the effect? And more importantly, how does the shader know it can only do part of the effect? Yann didn't go into details on how the whole messaging system would work. Any ideas on how to implement this in a clean and efficient manner?

12) Besides the obvious SPGC sorting criteria such as render target, render pass, shader, texture(s), and depth from camera (transparent pass only), what other factors/state changes should be considered and sorted by?

13) How can a shader be implemented to take advantage of OpenGL's display lists? For obvious performance/memory-usage reasons, display lists are perfect for static geometry that is repeated dozens or more times per frame. Compiling and executing a display list cannot happen every frame, since compilation would outweigh any speed benefit.

Wow! That was a lot of questions. I look forward to reading what others have done and what suggestions they have for implementing such a complex system.

Bob


#2 jamessharpe   Members   -  Reputation: 497


Posted 04 April 2004 - 11:57 AM

I can provide answers to some of your questions, although not all - some of these are still issues in my own design that I need to resolve.

1) The advantage of having the actual render call outside the shader is that you are guaranteed to render something; it would be all too easy to forget to put in the render call. The purpose of the shader is to set up all the necessary states required to make the geometry look a certain way (an effect).
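For illustration, a minimal sketch of that split, with all class and method names assumed rather than taken from anyone's engine: the shader only brackets the draw with state setup and teardown, while the renderer owns the single API-specific draw call, so it can never be forgotten.

#include <vector>

struct GeometryChunk { /* vertex/index data lives here */ };

class IShader {
public:
    virtual ~IShader() {}
    virtual void enterShader(const GeometryChunk& gc) = 0;  // bind textures, set states
    virtual void exitShader() = 0;                          // restore/clean up states
};

class Renderer {
public:
    // the draw call lives here, once, outside every shader
    void renderWithShader(IShader& shader, const std::vector<GeometryChunk*>& chunks) {
        for (GeometryChunk* gc : chunks) {
            shader.enterShader(*gc);
            drawChunk(*gc);          // API-specific call (e.g. glDrawElements) hidden here
            shader.exitShader();
        }
    }
private:
    void drawChunk(const GeometryChunk& gc) { /* implemented per API backend */ }
};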

2) Yes, I put everything through the same pipeline. My reasoning is that I didn't want any special-purpose code; the 2D elements are the same as the 3D elements but with a constant depth, which is easily achieved with a matrix. The disadvantage is that it becomes harder to use SDK examples directly in your code, and you need to adapt some of the techniques to be able to use the system.

3) Remember that the shaders aren't the only part of the render system - there's also stuff like texture/vertex buffer management and window management that is specific to an API and isn't covered by the shaders. What a different API boils down to is a different set of concrete classes behind the abstract interfaces you have defined, be that for the shaders, the texture manager or the vertex buffer system. Basically, the shaders are the only place where you have the freedom to make calls to (almost) any API function. Everywhere else outside the shader you have a predefined function that needs to be implemented, e.g. loading a texture, and you are restricted in the API calls that can be made, meaning you are bound by the interface provided by the abstract renderer. This is why the shader system is so powerful.

4) That's correct, although you only need to ignore the streams when copying the GCs into vertex buffers. Trying to drop streams in the GCs (which should be loaded independently from the renderer) would require the engine to have knowledge of which shaders are linked to each effect and how they work; in other words, you lose the layer between the renderer and the engine. So the answer is that the GCs provide the maximum data that could be required by a shader, and the shader then selectively copies what it needs into the vertex buffer for rendering.

5) What are you aiming to achieve? There are many different possible effects; you could implement one for each and every example program in the many SDKs for OpenGL/DirectX. Ultimately, if you need a new effect in your engine and you can't produce it with your current feature set, then add it - this is the point where you do the learning. Use the initial simple shaders (diffuse, Gouraud, etc.) as practice in implementing shaders. Then pick something a little more challenging, say a dynamic per-pixel specular effect, and try to implement that. Every once in a while, review your effects to make sure you aren't duplicating anything that could be achieved by simplifying or generalising one of the other shaders.

6) I would say that if the effect gives a different overall outward appearance (e.g. a wireframe vs. a diffuse shader), then create a new effect. Use the meta-bouncing idea as a criterion for deciding, even if you haven't implemented it in your engine. If the overall appearance is different, or the effect takes a different set of input streams (that can't be derived from another), then you will probably need a new effect/shader combination.

9) The GCs are not at all aware of API-specific implementations such as VAR, VBO, etc. The GC is simply an array of data that the renderer can access. It is the shaders that then access this data and copy it into VRAM, i.e. via VAR, VBO, VAO, etc. The way to go about this is to have a manager that is responsible for allocating and sharing buffers between VRAM structures, so you can have multiple VRAM entries in a single buffer. The VRAM structure controls access to the range of data in the specific buffer it can access.
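For illustration, a rough sketch of such a manager, with hypothetical VRAMManager/VRAMSlot names (not James's actual code, although he uses a VRAMSlot class later in the thread) and the real API buffer creation stubbed out: several slots share one large buffer, and each slot only covers its own byte range.

#include <cstddef>
#include <vector>

struct VRAMSlot {
    unsigned bufferId;   // which underlying VBO/VAR block this slot lives in
    size_t   offset;     // byte offset of the slot inside that buffer
    size_t   size;       // byte size reserved for this slot
};

class VRAMManager {
public:
    explicit VRAMManager(size_t bufferSize = 1 << 20) : bufferSize_(bufferSize) {}

    // hand out a slot, opening a new shared buffer when the current one is full
    VRAMSlot allocate(size_t bytes) {
        if (buffers_.empty() || bufferSize_ - buffers_.back().used < bytes) {
            buffers_.push_back(Buffer{createApiBuffer(bufferSize_), 0});
        }
        Buffer& b = buffers_.back();
        VRAMSlot slot{b.id, b.used, bytes};
        b.used += bytes;
        return slot;
    }

private:
    struct Buffer { unsigned id; size_t used; };

    // in a real engine this would create a VBO/VAR/VB through the current API
    unsigned createApiBuffer(size_t /*bytes*/) { return static_cast<unsigned>(buffers_.size() + 1); }

    size_t bufferSize_;
    std::vector<Buffer> buffers_;
};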

Wow, that's quite long already, and I'm pretty tired. I will finish answering the questions when I have some more time.

James

#3 c t o a n   Members   -  Reputation: 163


Posted 04 April 2004 - 03:15 PM

I've been working on a system like this in my own engine for a while, though its design differs quite radically from YannL's. Where you came to these forums to ask your questions, I sat and thought about it... Anyway, if you would like some potential ideas, check this page out.

7 + 8) I answered these questions with some sample code of mine that handles geometry chunks a while back. Here's the link.

10) Well, what I did was create a 'render list' class that can manage the rendering (and shader management) of a whole lot of geometry chunks (it will also optimize their rendering, but that's not important here). To keep the solution clean and conforming to our shader model, I built a shader that creates/destroys a render surface when the shader is first created. Then you can pass it a geometry object or whatever; it will set the render target in its 'enterShader' function (and clear it in 'exitShader'), and the engine will handle the rendering of the polygonal data. Then, you create another geometry chunk and set the diffuse texture to the output of the other geometry chunk (the rendered 'imposter' map that the shader stores in the geometry chunk). Therefore, the input to this billboarded quad is the imposter texture created by the imposter shader, but the billboarded quad can use an entirely different (and generic) shader.

11) The setup phase in my engine is REALLY complex. I wrote it blindly the first time through (just trying to get it to work) and it's really hard to get it to choose the right shader combinations. Be sure to clearly define what you want your system to do; for example, if you want your system to use as few passes as possible (duh...), make sure to use a heuristic that estimates how many passes a shader combination will take. What I ended up doing was building a 'potential shader run' for each combination of shaders that would not break the effect (so the result might not be 100% valid, but it wouldn't, say, have texturing if it's a plain diffuse effect), then looping through each of these 'potential' shader runs and building a heuristic out of information from the run (number of shaders, number of invalidities, number of local/global ops [I'll explain in a second], etc.). This heuristic is then used to sort the runs, and I use the best one as the final effect shader list.

Local/Global Blending Operations
When I was designing my system, I realized that I had no control over fallback multi-pass techniques, and that my control over shaders that handled multiple visual qualifiers was very limited; i.e. I couldn't say "multiply the diffuse color by the first texture, then add the second texture to that" and expect the effect to work whether I use a single-pass or a multi-pass implementation. So I made the concept of blending ops part of this effect/shader system. When declaring an effect, you specify what blending ops you want for the color (and, if supported, the alpha). A blending script looks something like this:

effect tex.diffuse[0]/tex.diffuse[1]/color.diffuse
{
    .description "two diffuse textures and diffuse vertex color"

    .visual
    {
        .color.diffuse
        .texture.diffuse(unit0)
        .texture.diffuse(unit1)
    }

    // this is the color blending script
    // the effective result is:
    // (tex.unit0 * tex.unit1) + color.diffuse
    //
    .out.color = add{ mul{ .texture.diffuse(unit0), .texture.diffuse(unit1) }, .color.diffuse }
}

These are parsed using standard script parsing (I wrote it from scratch and it wasn't too hard), and the ops are classified as local ops or global ops. Local ops are performed by a single shader; I have a shader that can handle two diffuse textures, so the term in the script that says (tex.unit0 * tex.unit1) would be passed to that shader as a local op and the shader would be told to interpret the op on its own. Global ops are multi-pass operations, and can easily be translated into the engine setting alpha-blending render states. NOTE: if you use this system, even just for ideas, you must remember: you cannot allow a single shader to be stretched across a global operation (if it's 'instanced', that's OK), as that is not possible with a single render target (the math just doesn't allow it most of the time, so I just called it invalid and said DON'T DO IT). It's not a big deal and it's easy to get around, but DON'T FORGET or you'll hit a brick wall and won't know what to do!
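For illustration, a tiny OpenGL sketch of how a global op can fall back to multi-pass (the GL calls are real OpenGL; the pass structure itself is an assumption): the local mul{} is evaluated inside one shader's pass, and the script's top-level add{} becomes additive framebuffer blending for the second pass.

#include <GL/gl.h>

// pass 1: a shader evaluates the local op (tex.unit0 * tex.unit1) on its own
// pass 2: the global add{} becomes additive blending of the diffuse-colour
//         pass on top of pass 1
void setupGlobalAddPass()
{
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE);   // out = pass1 + pass2, i.e. the script's top-level add{}
}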

Hope this helps answer a few questions for ya and gives you some good ideas. Ciao,

EDIT: spelled 'ciao' wrong...

Chris Pergrossi
My Realm | "Good Morning, Dave"

[edited by - c t o a n on April 4, 2004 10:16:48 PM]

#4 McZ   Members   -  Reputation: 139


Posted 04 April 2004 - 10:50 PM

My render system uses a somewhat different setup. I discarded the effect files and the effect stuff, using only a shader and an ID of different material properties:


// different material properties so far
enum eMaterialFormat {
    MF_AMBIENT   = 1 << 0,
    MF_DIFFUSE   = 1 << 1,
    MF_SPECULAR  = 1 << 2,
    MF_SHININESS = 1 << 3,
    MF_TEXTURE   = 1 << 4
};

// material class
class cMaterial {

    int m_iFormat;   // combination of the above flags describing this material

    cColor4f Ambient;
    cColor4f Diffuse;
    cColor4f Specular;
    float    Shininess;

    // an array of texture handles...
    cTextureHandle TextureHandles[MAX_TEXTURES];

public:

    // each setter also raises the matching MF_* flag in m_iFormat
    void SetAmbient( cColor4f a );
    void SetDiffuse( cColor4f d );
    void SetSpecular( cColor4f s );
    void SetShininess( float s );

    cColor4f GetAmbient( void );
    cColor4f GetDiffuse( void );
    cColor4f GetSpecular( void );
    float    GetShininess( void );

    // and so on...
};


Well, if the material has the ambient color set, then the material format is MF_AMBIENT; if it also has the diffuse color, then it has the format MF_AMBIENT | MF_DIFFUSE. This is how I sort the different shaders: each shader knows which formats it can render, and that's how each GeometryChunk gets its shaders.

Choosing a shader, e.g.:

const int MAX_PASSES = 8;   // safety limit so the loop can't get stuck

// greedy selection: keep picking the shader that covers the most of the
// remaining material format until the whole format is handled
int mf = gc->GetMaterial()->GetFormat();
int passes = 0;
do {
    cShader* best = 0;

    for( cShader* s : shaderList )
    {
        if( s->Supports( mf ) )
        {
            // if it does, check whether the priority is higher than the
            // last found shader's, and whether this shader covers more of
            // the material than the last one; if so, set it as the best shader
            if( best == 0 || s->Coverage( mf ) > best->Coverage( mf ) )
                best = s;
        }
    }

    if( best )
    {
        gc->AddShader( best );

        // remove the best shader's supported material format from mf
        // and continue searching for a shader for the rest of the format
        mf &= ~best->GetFormat();
    }

    // this loops so that several shaders can be chosen to display one
    // material (the pass counter keeps it from getting stuck)
} while( mf != 0 && ++passes < MAX_PASSES );

Explanation: the way I code shaders is that I first have a shader that sets the color, then a shader that sets one or several textures. If the material needs both shaders, then both are chosen, because the material format has the correct flags for them to be chosen.

for example:

if I have these shaders with these formats:
1) MF_AMBIENT | MF_DIFFUSE
2) MF_SPECULAR | MF_SHININESS
3) MF_AMBIENT | MF_DIFFUSE | MF_SPECULAR | MF_SHININESS
4) MF_TEXTURE

and I have a GC with the format MF_AMBIENT | MF_DIFFUSE | MF_TEXTURE, then it will choose shaders 1 and 4. If I add MF_SPECULAR | MF_SHININESS to the GC's material format, then it will choose the third and fourth shaders, because shader 3 covers more of the material than shaders 1 and 2 do, so it will be picked, unless it has a lower priority level.


#5 _DarkWIng_   Members   -  Reputation: 602


Posted 05 April 2004 - 05:04 AM

I'll just answer a few.

9) I use a single VBO for each GC. A VBO bind is not much of a speed loss with the latest drivers.

11) The shader doesn't have to know that it doesn't render the whole effect. And message passing between 2 shaders is not such a problem (at least for now); I just bind the same shared object to both shaders. For example, a water effect is made from 2 shaders: one to render the scene to a texture and another to render the water. I just bind the same "reflection" texture to both shaders and that's it.

12) Those are pretty much all you need. If you want to take it another step further, you could use some kind of dynamic sorting criteria - something to dynamically balance the influence of the individual parts depending on the current scene.

13) Display lists. The shader doesn't have to care about how the GC data is submitted to the GPU; DLs are just another way of doing it, like VA, VBO, ... Where can they come in useful in shaders? You might gain something if you put a batch of state changes (like a very long NV_RC setup) in a DL. Other than that, you don't have much use for them in the shaders themselves.
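To illustrate that last point, a minimal OpenGL sketch (not code from this thread; the specific state calls just stand in for a long combiner/texenv setup): compile the batch of state changes once, then replay it with a single call from the shader's setup function.

#include <GL/gl.h>

GLuint compileStateBlock()
{
    GLuint list = glGenLists(1);
    glNewList(list, GL_COMPILE);
        // ... a long run of state changes, e.g. register combiner / texenv setup
        glEnable(GL_TEXTURE_2D);
        glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
        glEnable(GL_BLEND);
        glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    glEndList();
    return list;
}

// inside the shader's state setup, one call replays the whole batch:
//     glCallList(stateBlockList);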

You should never let your fears become the boundaries of your dreams.

[edited by - _DarkWIng_ on April 5, 2004 12:06:51 PM]

#6 cyansoft   Members   -  Reputation: 307


Posted 06 April 2004 - 06:38 AM

quote:
Original post by jamessharpe
The purpose of the shader is to set up all the necessary states required to make the geometry look a certain way (an effect).



So a shader doesn't render an effect, but rather acts as a state-change micromanager for an effect.

quote:
Original post by jamessharpe
the GCs provide the maximum data that could be required by a shader, and the shader then selectively copies what it needs into the vertex buffer for rendering.



Since the shader copies the data it needs from the GC to the hardware, I could store the GC's data in any way I want, regardless of the underlying API the shader uses. Doing this, I could load the GCs right off the disk with no conversion needed. I could even use a compressed format, as long as my shader knows how to decompress it.

quote:
Original post by jamessharpe
It is the shaders that then access this data and copy it into VRAM, i.e. via VAR, VBO, VAO, etc. The way to go about this is to have a manager that is responsible for allocating and sharing buffers between VRAM structures.



This is where I start to worry. To support the multitude of vertex drawing methods that OpenGL provides (VAR, VBO, regular vertex arrays, compiled vertex arrays, etc.), it seems I would have to create an additional shader per method and an additional "vertex drawing manager" to implement each method.


#7 cyansoft   Members   -  Reputation: 307


Posted 06 April 2004 - 06:53 AM

quote:
Original post by c t o a n
7 + 8) I answered these questions with some sample code of mine that handles geometry chunks a while back. Here's the link.



I like the idea of using a property manager for the GC's data instead of using an array. It would make adding new data formats a lot easier.

quote:
Original post by c t o a n
10) Then, you create another geometry chunk and set the diffuse texture to the output of the other geometry chunk (the rendered 'imposter' map that the shader stores in the geometry chunk). Therefore, the input to this billboarded quad is the imposter texture created by the imposter shader, but the billboarded quad can use an entirely different (and generic) shader.



Again, a property manager like you mentioned would make this fairly simple.

#8 cyansoft   Members   -  Reputation: 307


Posted 06 April 2004 - 07:17 AM

quote:
Original post by McZ
Well, if the material has the ambient color set, then the material format is MF_AMBIENT; if it also has the diffuse color, then it has the format MF_AMBIENT | MF_DIFFUSE. This is how I sort the different shaders: each shader knows which formats it can render, and that's how each GeometryChunk gets its shaders.



Interesting idea. So in your system, when parsing the effects, you go beyond just assigning a priority that indicates whether the effect is fully doable or not. By using bit flags, the effects say what they want, and each shader says what it supports. With this, the priority really doesn't need to indicate whether the effect is complete, but rather the visual quality of the effect.



#9 jamessharpe   Members   -  Reputation: 497


Posted 06 April 2004 - 07:30 AM

quote:
Original post by cyansoft

Since the shader copies the data it needs from the GC to the hardware, I could store the GC's data in any way I want, regardless of the underlying API the shader uses. Doing this, I could load the GCs right off the disk with no conversion needed. I could even use a compressed format, as long as my shader knows how to decompress it.




Exactly - that's where you gain flexibility.

quote:

This is where I start to worry. To support the multitude of vertex drawing methods that OpenGL provides (VAR, VBO, regular vertex arrays, compiled vertex arrays, etc.), it seems I would have to create an additional shader per method and an additional "vertex drawing manager" to implement each method.



No, you don't need any extra shaders. What you have is a structure - a VRAMSlot - that simply has a method returning a pointer to a stream that you can write to in order to fill in the vertex data. Using a Lock/Unlock pair of calls you can do this quite easily, almost directly, with most of the extensions. You may eventually need some kind of vertex management system, since the VRAM available is finite and varies across different cards; you're going to hit the limit sooner or later and need some way of paging the data in and out. Search for 'sliding slot' to see Yann explain his vertex cache management system.

James

#10 cyansoft   Members   -  Reputation: 307


Posted 06 April 2004 - 07:33 AM

quote:
Original post by _DarkWIng_
12) Those are pretty much all you need. If you want to take it another step further, you could use some kind of dynamic sorting criteria - something to dynamically balance the influence of the individual parts depending on the current scene.



This can be an interesting optimization.

quote:
Original post by _DarkWIng_
The shader doesn't have to care about how the GC data is submitted to the GPU.



If so, then why did Yann have a fill_cache method in his shader system? jamessharpe also mentions the shader being used to fill VRAM.



#11 jamessharpe   Members   -  Reputation: 497


Posted 06 April 2004 - 07:45 AM

quote:
Original post by cyansoft
quote:
Original post by _DarkWIng_
The shader doesn't have to care about how the GC data is submitted to the GPU.



If so, then why did Yann have a fill_cache method in his shader system? jamessharpe also mentions the shader being used to fill VRAM.




If the data in the GCs is always in the format required for rendering from the shaders, i.e. you don't need to transform it, then you can get away with using vertex array pointers straight into the GC data. I know that DarkWing uses VBOs directly as the internal format for the GC, which is fine if you don't need API independence; but if you do, then you need an additional translation layer somewhere, whether it be a polymorphic GeometryChunk or an extra level of indirection via a software buffer (Yann's method). The extra level of indirection can also become useful if you are trying to adapt the system for streaming over a network.

Here's an example of one of my fill_shader_cache functions:

void SimpleGouraud::FillShaderCache(ShaderPass &pass)
{
    struct Vertex
    {
        float position[3];
        unsigned char colour[4];
    };

    // get a VRAM slot
    VRAMSlot * slot = tools.GetVRAMSlot(sizeof(Vertex) * pass.geometry->GetNumVerts());
    pass.SetVRAM(slot);

    // copy vertex and colour data from the data stream via the effect class
    Effect * e = tools.GetEffect(pass.effectID);
    unsigned short vertexOffset = e->VOffsetTable[OT_VERTEX];
    unsigned short diffuseOffset = e->VOffsetTable[OT_DIFFUSE];
    unsigned short stride = e->stride;
    const unsigned char * src = pass.geometry->GetVertexStream();
    int verts = pass.geometry->GetNumVerts();

    Vertex * dest = (Vertex *)slot->Lock();
    for(int i = 0; i < verts; ++i)
    {
        memcpy(&dest->position, src + vertexOffset, sizeof(float) * 3);
        memcpy(&dest->colour, src + diffuseOffset, sizeof(unsigned char) * 4);
        dest += 1;
        src += stride;
    }

    slot->Unlock();
}





#12 c t o a n   Members   -  Reputation: 163


Posted 06 April 2004 - 03:33 PM

Yeah, jamessharpe's system sounds more similar to mine than _DarkWIng_'s, and here's my fillCache function from my Gouraud shader as well:


u32 ChunkName = Chunk.ChunkName.getValue();
u32 vertexSize = sizeof( vector3 ) + sizeof( u32 );
u32 vertexCount = Chunk.VertexCount;
u32 indexCount = Chunk.IndexCount;

XVertexBufferMgr* VertexBufferMgr = (XVertexBufferMgr*) g_UtilityObj->VertexCache;
XIndexBufferMgr* IndexBufferMgr = (XIndexBufferMgr*) g_UtilityObj->IndexCache;

XVertexBuffer* vbuffer = NULL;

// if we've not created the vertex buffer yet...
if( !Chunk.propertyExists( "vertexBuffer" ) )
{
    vbuffer = g_UtilityObj->allocateVertexBuffer( VertexBufferMgr );
    vbuffer->grab();

    //
    // this vertex buffer is basically a pointer to a chunk
    // of AGP memory that can be used by whichever API is
    // being used. the shader merely fills in the necessary data
    //
    vbuffer->createBuffer( ChunkName + XHashString( "vertexBuffer" ).getValue() + XHashString( "diffuseShader" ).getValue(), vertexSize * vertexCount );

    XAccessor myAccessor( vbuffer->accessBuffer(), vertexSize * vertexCount );

    vector3* positions = (vector3*) Chunk.getProperty( ".position.xyz" );
    u32* diffuse = (u32*) Chunk.getProperty( ".color.diffuse" );

    for( u32 k = 0; k < vertexCount; k ++ )
    {
        myAccessor.addComponent( positions[k] );
        myAccessor.addComponent( diffuse[k] );
    }

    Chunk.setProperty( "vertexBuffer", (u32) vbuffer, Property_SharedObj );
}
else
{
    vbuffer = (XVertexBuffer*) Chunk.getProperty( "vertexBuffer" );

    //
    // the vertex buffer is already created; now we need to check
    // if the vertex buffer has been moved out of the cache (which
    // is how I handle memory), or if the geometry chunk data has
    // been updated
    //
    if( !vbuffer->isCached() || DirtyUpdate )
    {
        XAccessor myAccessor( vbuffer->accessBuffer(), vertexSize * vertexCount );

        vector3* positions = (vector3*) Chunk.getProperty( ".position.xyz" );
        u32* diffuse = (u32*) Chunk.getProperty( ".color.diffuse" );

        for( u32 k = 0; k < vertexCount; k ++ )
        {
            myAccessor.addComponent( positions[k] );
            myAccessor.addComponent( diffuse[k] );
        }
    }
}

XIndexBuffer* ibuffer = NULL;

if( !Chunk.propertyExists( "indexBuffer" ) )
{
    ibuffer = g_UtilityObj->allocateIndexBuffer( IndexBufferMgr );
    ibuffer->grab();

    //
    // this index buffer is basically a pointer to a chunk
    // of AGP memory that can be used by whichever API is
    // being used. the shader merely fills in the necessary data
    //
    ibuffer->createBuffer( ChunkName + XHashString( "indexBuffer" ).getValue() + XHashString( "diffuseShader" ).getValue(), indexCount );

    XAccessor myAccessor( ibuffer->accessBuffer(), indexCount * IndexBufferMgr->getIndexSize() );

    u16* indices = (u16*) Chunk.getProperty( ".indices" );

    for( u32 k = 0; k < indexCount; k ++ )
    {
        myAccessor.addComponent( indices[k] );
    }

    Chunk.setProperty( "indexBuffer", (u32) ibuffer, Property_SharedObj );
}
else
{
    ibuffer = (XIndexBuffer*) Chunk.getProperty( "indexBuffer" );

    if( !ibuffer->isCached() || DirtyUpdate )
    {
        XAccessor myAccessor( ibuffer->accessBuffer(), indexCount * IndexBufferMgr->getIndexSize() );

        u16* indices = (u16*) Chunk.getProperty( ".indices" );

        for( u32 k = 0; k < indexCount; k ++ )
        {
            myAccessor.addComponent( indices[k] );
        }
    }
}


Unlike YannL's implementation (as far as I know), I've also added different functions to each shader to perform tasks I thought would be most efficient at the shader level. Here's a list of the functions each shader performs:


/**
* initializes the shader. called when the engine is starting up
*
* Utility: the utility object the shader should use
* Device: the device the shader should use
*/

void initShader( XShaderUtility* Utility, XD3DDevice& Device );

/**
* destroys the shader. called when the engine is closing down
*
*/

void destroyShader();

/**
* queries for hardware support
*
*
* /ret: true if supported by hardware, false otherwise
*/

bool queryHardwareSupport();

/**
* fills a geometry chunk sorting ID with valid data for
* this shader
*
* Chunk: the geometry chunk to update
* SortingID: the sorting ID to fill
*/

void fillSortingID( XGeometryChunk& Chunk, XSortingID& SortingID );

/**
* called to fill all necessary caches. all caches should be
* stored in the geometry chunk
*
* Chunk: the geometry chunk to fill
* DirtyUpdate: should the buffers be forced to update (has the vertex data changed?)
*/

void fillCache( XGeometryChunk& Chunk, bool DirtyUpdate );

/**
* sets all necessary shader data, such as shader declarations,
* binding shader constants, etc. all device data have undefined
* state coming into this function
*
* Chunk: one of the chunks being rendered (all chunks have the same basic properties as this one)
*/

void enterShader( XGeometryChunk* Chunk );

/**
* does whatever the shader needs to do to clean up execution. the
* shader is guaranteed to have been rendered when this function
* is called. the shader does not need to worry about engine-controlled
* data, such as render states, textures, and vertex/index buffers
*
*/

void exitShader();


This 'Materials and Shaders Implementation' thread has grown into many, many similar but unique systems for everyone who's tried to write one. I think that's awesome!

Chris Pergrossi
My Realm | "Good Morning, Dave"

#13 _DarkWIng_   Members   -  Reputation: 602


Posted 07 April 2004 - 06:03 AM

Yeah, jamessharpe is right about my approach. I just use all the data in the GC as a VBO. It works OK for me, as I don't have that huge scenes (500k tris max) and because I don't care about other APIs (I have no wish to move to DX any time soon). If I get more time I'll implement a system similar to jamessharpe's (I have it in detailed pseudocode, like some other "to be added" features).

I have 2 (similar) questions of my own regarding this shader system.

14) How do you handle shaders/effects that require recursive rendering of the scene, for example shadow mapping or dynamic cubemaps? Do you create shaders that don't need any GC bound to them (since they don't need any geometry of their own)?

15) How do you handle shaders/effects for stuff like fullscreen glare (bloom filters) or similar, as they don't belong to any specific object?

You should never let your fears become the boundaries of your dreams.

[edited by - _DarkWIng_ on April 7, 2004 1:05:55 PM]

#14 c t o a n   Members   -  Reputation: 163


Posted 07 April 2004 - 07:40 AM

14) Well, let me describe a few effects and how they work, and perhaps you can get the idea. Remember: the key selling point of this system is consistency (i.e. no special code paths for any effects).

   Stencil Shadows
To do stencil shadows, you must remember that a shadow requires a shadow caster, which means you can attach an effect that renders shadow volumes to a mesh. This effect would take as input the mesh to 'shadow volume', any light sources (however many you think the effect needs), and that's about it. The effect can build an edge list from the mesh and the lights and use a vertex shader to extrude the edges - just standard shadow volume stuff. The inputs to an effect are not limited by any means, and can be anything you can think of.

   Dynamic Cubemaps
The way I handle cube maps in my engine is to place specific markers where I want cube maps to be generated (a la Half-Life 2), and these cube maps can be recalculated every few frames or whatever. That's up to the effect (it can have a frame counter or a timer, for example). When a cube map is being updated, the effect queries the scene graph for geometry chunks within a certain distance of the marker (and perhaps of a certain type - e.g. a cube map doesn't need bird/fish meshes) and renders all these chunks after setting the camera to the 6 different sides of the cube map. It's even possible to space these calculations out over the course of many, many frames so as to minimize the performance hit and to provide constantly updating reflections (which is pretty cool IMHO).

[edited by - c t o a n on April 7, 2004 2:42:54 PM]

#15 jamessharpe   Members   -  Reputation: 497


Posted 07 April 2004 - 08:39 AM

14) The way I currently plan on doing this is to pass a pointer to a renderable object as a parameter. Somewhere along the chain, one of the GCs has triggered the need for a cubemap etc. to be generated; the problem is how you keep track of this object so that you don't render it - and the problem is compounded by the depth of recursion allowed, since each time you have to remove an extra GC from the possible render set. I'm thinking of changing from a direct renderable base class to something similar to Ogre's SceneQuery interface. This would be a good move since it unifies the interface for retrieving information, whilst allowing specialisations for particular circumstances: e.g. for the main render, occlusion culling can be done, but for, say, shadow mapping, since we don't have an occlusion map from the light's POV, we can skip this step - it's probably not worth generating one for a single render that can then be cached.

15) I think that anything fullscreen would have to be done separately through a separate interface - perhaps a fullscreen shader setup. The problem is that something like fullscreen glare should only occur on the framebuffer - or perhaps on a video screen that is linked to a camera - and you wouldn't want it to occur on surfaces that you used as temporary scratchpads (pbuffers). Perhaps you can use a system where you link a post-processing shader to a specific render target; this way, if you needed a fullscreen effect on a render texture, the shader could attach the fullscreen effect to the render target before rendering. I've not looked into the details of implementing this - it's not particularly near the top of my current list of priorities - but thanks for bringing the issue up; I'd not thought about it before and it's something to bear in mind (especially as you may want to do fullscreen transitions between scenes or between game and cut-scene).

#16 _DarkWIng_   Members   -  Reputation: 602


Posted 07 April 2004 - 08:03 PM

quote:
Original post by jamessharpe
14) ...the problem is compounded by the depth of recursion allowed, since each time you have to remove an extra GC from the possible render set. I'm thinking of changing from a direct renderable base class to something similar to Ogre's SceneQuery interface...


Good to hear from someone with the same trouble. Currently I have a flag in the GC that tells the renderer to skip it (not even add it to the list). I'll have a look at how Ogre does it.

quote:
Original post by jamessharpe
15) I think that anything fullscreen would have to be done separately through a separate interface...

I thought so. Right now I'm experimenting with something like this: my scene class (which holds everything in the world) has a list of post-processing effects to be done, and a flag to tell it whether they need to be used. By default they are only used when rendering is called from the main loop. I'll probably make them share the same interface as the other shaders, since they don't differ that much.

quote:
Original post by c t o a n
The way I handle cube maps in my engine is to place specific markers where I want cube maps to be generated (a la Half-Life 2), and these cube maps can be recalculated every few frames or whatever. .... It's even possible to space these calculations out over the course of many, many frames so as to minimize the performance hit and to provide constantly updating reflections

I don't really like the idea of special markers for cubemaps. But it's an interesting point about generating textures every n frames. I had a somewhat different idea of how to do it (implemented only on paper so far). I would assign a priority to each to-be-rendered texture. At the start they would all be the same (1.0). Then, each frame, I would have a certain amount of "time" to spend on this: I would take the x textures with the highest priorities and render them. (To prevent starvation I would also render all textures with a priority over some threshold.) After rendering, a texture's priority drops to 0. The textures that don't get updated this frame get a boost in their priority, based on a few things like the distance of the GC from the camera and a few others.
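For illustration, a minimal sketch of that priority scheme, assuming hypothetical DynamicTexture/updateDynamicTextures names and a simple constant boost (a real engine would factor in GC distance and so on, as described above):

#include <algorithm>
#include <cstddef>
#include <vector>

struct DynamicTexture {
    float priority = 1.0f;                  // all textures start equal
    // ... render-target handle, owning GC, etc.
};

void updateDynamicTextures(std::vector<DynamicTexture*>& textures,
                           std::size_t budgetPerFrame, float starvationThreshold)
{
    // highest priority first
    std::sort(textures.begin(), textures.end(),
              [](const DynamicTexture* a, const DynamicTexture* b) {
                  return a->priority > b->priority;
              });

    for (std::size_t i = 0; i < textures.size(); ++i) {
        DynamicTexture& t = *textures[i];
        bool inBudget = i < budgetPerFrame;
        bool starving = t.priority > starvationThreshold;

        if (inBudget || starving) {
            // renderTextureNow(t);        // re-render this cubemap/reflection
            t.priority = 0.0f;             // just updated
        } else {
            // boost the ones that were skipped this frame
            t.priority += 0.1f;
        }
    }
}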

You should never let your fears become the boundaries of your dreams.

#17 rick_appleton   Members   -  Reputation: 857


Posted 07 April 2004 - 10:12 PM

You guys have a very interesting discussion going on here. I myself haven't implemented or designed anything like this, but it's been going around my head for some time now.

About the post-processing stuff: I think it would be possible to do it in the normal framework. The only caveat is that post-processing normally requires the entire scene to be rendered to a texture, right? Then you draw a screen-size quad with that texture and the post-process shader.

By using a stack of render contexts you could push and pop RCs. To do a post-screen effect you'd need a combination of two shaders. One is sorted to the beginning of the queue; it sets the RC to a texture and doesn't do anything else (which is not what would normally happen with a general shader, since they would pop their own RC before finishing). The second one is sorted to the end; it pops the RC (render to texture) from the stack, then draws the quad with the entire scene (and whatever other effects it needs to do).
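For illustration, a minimal sketch of such a render-context stack (RenderContext, RenderContextStack and the integer target handle are assumptions, not code from this thread): pushing binds the new target, popping rebinds whatever lies underneath, so the "begin" shader can deliberately leave its texture target on the stack for the "end" shader to pop.

#include <stack>

struct RenderContext {
    unsigned renderTarget;   // 0 = framebuffer, otherwise a texture/pbuffer id
};

class RenderContextStack {
public:
    void push(const RenderContext& rc) { stack_.push(rc); apply(rc); }

    void pop() {
        stack_.pop();
        // rebind whatever is now on top, or the framebuffer if nothing is
        apply(stack_.empty() ? RenderContext{0} : stack_.top());
    }

private:
    void apply(const RenderContext& rc) {
        // bind rc.renderTarget through the current API (FBO/pbuffer/backbuffer)
    }

    std::stack<RenderContext> stack_;
};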

Cube maps would be generated without problem, since the shader responsible for them would just push and pop an additional RC to/from the stack.

The only thing that needs some looking into is the recursive stuff. I was thinking of having three render lists instead of one: pre-process stuff (generating textures, cubemaps, etc.), geometry, and post-process. This way you could just run all the pre-process stuff first (some of these might require the system to render all the geometry: cubemaps, reflections), so they wouldn't hinder each other. If you save the maps and textures from previous frames and use those for objects with dynamic textures, you get infinite recursion for free (albeit with a few frames of lag).

Then you would call the main render list yourself, to really draw the geometry to the framebuffer (or to the 'incorrectly' not-popped pbuffer for post-processing), and then you go through the list of post-process stuff (usually very few items here).

What do you guys think of this?

[edited by - rick_appleton on April 8, 2004 5:16:01 AM]

#18 poly-gone   Members   -  Reputation: 148


Posted 08 April 2004 - 06:24 AM

Nice discussion going on here...

This is how I've implemented my shader library:

One of the major problems with implementing a shader library is that you don't know what shaders you will be adding during the course of development, so usually you end up writing a "base shader class", like Yann, from which all the other shaders can be derived. But the problem with such a system is that every time you need to create a new shader, you'll need to write a separate class for it.

I've taken my shader system a couple of steps further, abstracting it to such a high degree that the shader system doesn't even need to know what shaders it will "encounter" during rendering or what parameters it will need to set. To accomplish this, the shader system has an "attribute" class that can store all the general types of variables, like float, int, string, vector3 and so on, along with their values. When the shader system is asked to render a piece of geometry with a shader, it just goes through all the shader "attributes", setting the shader constants automatically.

Thus, you could render any geometry in the scene with any shader without changing a single line of code in the engine. A small problem with this system is stream mapping, since different shaders use different stream mappings. To solve this, the shader system checks the shader for the stream mapping(s) it uses and automatically sets up the FVF.
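For illustration, a rough sketch of such an "attribute" store (the Attribute/AttributeSet names, the Vector3 type, and the shader.setConstant call are assumptions, not poly-gone's code): the shader system just walks the list and pushes each value to the shader by name, without knowing anything about the specific shader.

#include <string>
#include <vector>

struct Vector3 { float x, y, z; };

struct Attribute {
    enum Type { Float, Int, String, Vec3 } type;
    std::string name;          // shader constant / annotation name
    float       f = 0.0f;
    int         i = 0;
    std::string s;
    Vector3     v{0, 0, 0};
};

class AttributeSet {
public:
    void set(const std::string& name, float value) {
        Attribute a; a.type = Attribute::Float; a.name = name; a.f = value;
        attributes_.push_back(a);
    }
    // ... overloads for int, string and Vector3 work the same way

    // called by the shader system before drawing: push every attribute to the
    // shader as a constant, keyed by name
    template <class Shader>
    void apply(Shader& shader) const {
        for (const Attribute& a : attributes_) {
            switch (a.type) {
                case Attribute::Float:  shader.setConstant(a.name, a.f); break;
                case Attribute::Int:    shader.setConstant(a.name, a.i); break;
                case Attribute::Vec3:   shader.setConstant(a.name, a.v); break;
                case Attribute::String: /* e.g. resolve a texture by name */ break;
            }
        }
    }

private:
    std::vector<Attribute> attributes_;
};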

By using HLSL, integration of shaders into the DCC tool is easy. This has the added advantage that the artist can tune the scene to get the desired "look". When the scene is exported from the DCC tool, the shader parameters are embedded into the materials as "attributes" (explained earlier), which can then be "set" by the shader system while rendering the scene.

#19 c t o a n   Members   -  Reputation: 163


Posted 08 April 2004 - 06:29 AM

@rick:
that's currently almost exactly how my engine handles it. To solve the problem of render order (including global stages such as transparent objects, pre-process, main, and post-process, as well as relative ordering such as "render chunk B before chunk C and after chunk A"), I have an additional sorting run in my render lists.

The sorting of geometry chunks is really expensive in my engine (I perform 8 sorting passes) because my engine also optimizes the geometry chunks into perfect batches, so I only sort my render lists once in a blue moon (a few specific events trigger a re-sort).

I add a u8 value to each geometry chunk specifying its stage, then define a few specific stages at varying points throughout the range of a u8 value (0 - 255). Here's what my engine uses currently:


/**
* ERenderStage
*
* a list of possible rendering stages that allow primitives to
* be rendered in groups, with optional sorting per-group
*
* NOTE:
* all chunks are sorted using this stage number. therefore, if you want
* shader 1 to be rendered a) before everything else, and b) after shader 2,
* render shader 2 with stage set to RenderStage_PreRender, and render
* shader 1 with stage set to RenderStage_PreRender + 1 (and recast to ERenderStage).
*
* TRANSPARENCY:
* to use transparency in the Spex Engine automatically, you must specify the
* RenderStage_Transparent value. all transparent objects will then be rendered
* seperately with automatic depth sorting (using the '.centroid' property in a
* geometry chunk).
*
*/

enum ERenderStage
{
/*! rendered before everything else (should be used by 'utility' shaders) */
RenderStage_Utility = 0,

/*! rendered before everything else */
RenderStage_PreRender = 64,

/*! rendered normally */
RenderStage_Main = 128,

/*! rendered after everything else */
RenderStage_PostRender = 192,

/*! rendered after everything else and loosely sorted according to depth */
RenderStage_Transparent = 256,

/*! forces this enumerated type to 32-bits [not used] */
RenderStage_ForceDword = 0xAFFFFFFF
};


For the transparent stage, chunks are sorted based on depth from the camera (obtained automatically by projecting the centroid of the chunk onto the normalized camera direction), and for the rest of the stages all chunks are sorted based on this u8 value, which is basically free anyway (I use a modified radix sort).
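For illustration, a small sketch of how such a sort key can be packed (the field widths and names are assumptions): the stage byte sits in the most significant bits, so a single integer sort - e.g. a radix sort - orders chunks by stage first and by shader/material after, while transparent chunks use the projected depth instead.

#include <cstdint>

inline uint32_t makeSortKey(uint8_t stage, uint16_t shaderId, uint8_t materialId)
{
    // stage in the top byte, then shader, then material
    return (uint32_t(stage) << 24) | (uint32_t(shaderId) << 8) | materialId;
}

// transparent chunks are sorted by depth instead: the chunk centroid projected
// onto the normalized camera direction, i.e. a dot product
inline float transparentSortKey(const float centroid[3], const float camDir[3])
{
    return centroid[0] * camDir[0] + centroid[1] * camDir[1] + centroid[2] * camDir[2];
}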

@poly-gone:
Your method of implementing 'attributes' (I call them 'properties') is exactly the same as mine, though I also abstract the exact details of a shader away from the engine. The engine merely tells the shader to set itself up to be rendered; whether that involves setting vertex/pixel shaders, setting render targets, loading data from disk, calculating an updated normal map, or whatever, is up to the shader. The engine then renders the chunk data outside of the shader. The engine also handles the setup of vertex/index buffers (my system will automatically concatenate similar geometry chunks so as to minimize draw calls), but that's it. Using this system, shaders are very flexible and easy to work out, and you can verify whether all of a shader's input requirements are being met by looping through all required properties and making sure they exist in the geometry chunk... Awesome to hear someone thinking along the same lines

[edited by - c t o a n on April 8, 2004 1:37:54 PM]

#20 poly-gone   Members   -  Reputation: 148


Posted 08 April 2004 - 06:43 AM

Coming to multi-pass rendering and post-processing: any shader that needs multi-pass rendering or does post-processing has a list of scene commands written in it. When the shader system encounters such a shader, it goes through those commands, performing the specified actions, like "SetRenderTarget" or "RenderToScreenQuad".
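For illustration, a minimal sketch of executing such a per-shader command list (the command names are taken from the post above; everything else is an assumption):

#include <string>
#include <vector>

struct SceneCommand {
    std::string name;        // e.g. "SetRenderTarget", "RenderToScreenQuad"
    std::string argument;    // e.g. the render target's name
};

void runSceneCommands(const std::vector<SceneCommand>& commands)
{
    for (const SceneCommand& cmd : commands) {
        if (cmd.name == "SetRenderTarget") {
            // bind the named render target before the next pass
        } else if (cmd.name == "RenderToScreenQuad") {
            // draw a fullscreen quad sampling the previous pass's output
        }
        // further commands (clear, restore target, ...) would go here
    }
}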

Thus, both multipass and post-processing issues are solved.



