"Sorting out" render order


32 replies to this topic

#21 Niello   Members   -  Reputation: 130

Posted 16 January 2013 - 03:25 PM

Hi again!

 

First, you didn't take into account ID3DXEffect calls like BeginPass, where SetVertexShader and SetPixelShader are called. That may not matter when you use one technique for everything, but that isn't practical: any game will use more, and if yours doesn't, don't think too much about the renderer at all.

 

Second. Since SetIndices is 900-5600, you can't just substitute 900 and make any assumptions. Why not, say, 4200? Or even 5600? It greatly changes things, doesn't it? :) The answer is easy. Profile it yourself. Hardware changes, many other circumstances change, and more or less accurate profiling results can be gathered only on your target platform.

 

But my most important advice remains the same: write new features, expand your scene's quality and complexity, and start optimizing only when it becomes necessary. Profiling has no real meaning in a synthetic environment. You should profile the things your user will receive, or special test scenes where some bottlenecks are reproduced (like a scene with lots of different particles to optimize particle systems).



#22 cozzie   Members   -  Reputation: 1420


Posted 18 January 2013 - 04:13 AM

Hi Niello.
Just started working on all the changes.

Using a shared parameter and an ID3DXEffectPool for my 'view projection matrix' is working (leaving the world matrix calculation as a separate parameter for now, looking at future plans with per-pixel lighting). What is maybe strange is that it works both with "float4x4 viewProj;" and with "shared float4x4 viewProj;".

I simply created an LPD3DXEFFECTPOOL and set the viewProj matrix only once per frame; the result is fine (with and without 'shared' in the FX file/shader).
For now I'll keep it in, although I don't understand why it works without.

Short version of the code:

// changed part of shader/effect creation function at startup

		D3DXCreateEffectPool(&mEffectPool);
			if(D3D_OK != D3DXCreateEffectFromFileA(pD3ddev, pScene->mEffectFilenames[ec].c_str(), NULL, NULL, 0, mEffectPool, &mEffect[ec], &errorBuffer))

// new function that now only sets technique, instead of also viewProj matrix

bool CD3d::SetShaderTechnique(CD3dscene *pD3dscene, int pEffectIndex, char *pTechnique)
{
	if(D3DERR_INVALIDCALL == pD3dscene->mEffect[pEffectIndex]->SetTechnique(pTechnique)) return false;
	return true;
}

// new part of render function

	// SHADER rendering
	// Set shared parameters first
	if(D3DERR_INVALIDCALL == pD3dscene->mEffect[0]->SetMatrix("ViewProj", &pCam->mMatViewProjection)) return false;	// SHARED PARAMETER IN POOL
	
	if(!RenderScene(pD3dscene, pCam, "OpaqueShader", pD3dscene->mMeshIndexOpaque, pD3dscene->mNrD3dMeshesOpaque)) return false;
	
	if(!pD3dscene->SortBlendedMeshes(pCam->mPosition)) return false;
	if(!RenderScene(pD3dscene, pCam, "BlendingShader", pD3dscene->mMeshIndexBlended, pD3dscene->mNrD3dMeshesBlended)) return false;
	
	if(pD3dscene->mSkyBoxInScene) if(!pD3dscene->mSkyBox.Render(pCam->mPosition, pCam, mD3ddev)) return false;


 

Will go into splitting my mesh class into a 'real' mesh class and new meshinstance class (including all changes necessary with this).



#23 cozzie   Members   -  Reputation: 1420


Posted 18 January 2013 - 06:59 AM

Just a short update;
Going from meshes to mesh and meshinstance is quite a job, but a big improvement for sure.
When I think about it, I had about 20 or so tree meshes eating memory and buffers, while all being the same.

Short update;
Rough implementation done; a nice side effect is that loading time has decreased by a couple of thousand % :)
Next step is clean indices..

Will keep you posted.

Edited by cozzie, 18 January 2013 - 10:02 AM.


#24 cozzie   Members   -  Reputation: 1420


Posted 19 January 2013 - 09:08 AM

@Niello; still there?

In the middle of next steps for using mesh instances instead of full mesh for every object right now.
I'm now starting to take the following approach:

- create index table in mesh class containing a list with ID's of the instances of that mesh
(or maybe do this in my scene class, like 2 dimensional array, not sure if this works memory allocation wise?)
Do this trick 2 times, one for blended and one for opaque instances
- create index table in scene class with array per material, containing the mesh ID's of meshes using this material
(create at startup for all static objects, no solution yet for dynamic objects)

- at rendertime I split rendering into a few main steps:

1a. culling; loop through all mesh instances and check against frustum. Mark with bool visible true/false
(in the future I could add binary space checking, trees, portals or whatever in this step)
1b. sort blended meshinstances

2. main rendering loop:

a* loop through all materials
b* select material (state changes)
c* loop through mesh index that contains which meshes contain active material
d* select mesh (state changes, set buffers)
e* for each mesh, loop through the meshinstances index
f* if meshinstance visible true/false
h* if in frustum select meshinstance (state changes)
i* for each submesh of meshinstance do 'live' check boundingsphere in frustum
j* do draw call
... till end of scene

All steps above 2x, one for opaque and one for blended.
On state changes I will definitely save quite a few SetStreamSource/SetIndices calls.
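Step 1b above (sorting blended mesh instances) could be sketched roughly as follows. This is a minimal standalone sketch, not the poster's code: `Vec3`, `BlendedInstance` and `SortBackToFront` are hypothetical names, and it assumes the distance from the instance's world position to the camera is a good enough sort criterion.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical per-instance record holding only what the sort needs.
struct BlendedInstance {
    int   id;        // index into the scene's mesh instance array
    Vec3  worldPos;  // instance position in world space
    float distSq;    // squared distance to the camera, refreshed each frame
};

inline float DistanceSquared(const Vec3& a, const Vec3& b)
{
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Refresh the distances, then order the instances farthest-first so that
// alpha-blended geometry is drawn back to front.
inline void SortBackToFront(std::vector<BlendedInstance>& instances, const Vec3& camPos)
{
    for (BlendedInstance& inst : instances)
        inst.distSq = DistanceSquared(inst.worldPos, camPos);

    std::sort(instances.begin(), instances.end(),
              [](const BlendedInstance& a, const BlendedInstance& b) {
                  return a.distSq > b.distSq;
              });
}
```

Squared distances are compared because the ordering is the same as with real distances and the square root can be skipped.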

I'll also do some profiling on the number of batches/ draw calls I do per frame and how many triangles they include.

What's your advice on this, am I shooting myself in the foot for expansions in the future? (i.e. combining buffers, binary space positioning etc.).
Also curious what you think about the 'shared float' thingie above.

update 21-1;
still working on it and making nice steps; I just decided I want a renderqueue class to handle all this, so that I have a flexible 'render bucket'. In the class I'll have all indices for meshes, materials, submeshes, saved depths, sorting functions etc.

Still curious though on your thoughts/questions on the last updates

Edited by cozzie, 20 January 2013 - 05:17 PM.


#25 Niello   Members   -  Reputation: 130


Posted 22 January 2013 - 12:16 AM

Hi. Here I am again. Btw, happy birthday to both you and me)

 

Shared params in effects are shared between different effects. While you use 1 effect you won't see any difference, but when there are different ID3DXEffect objects that were created with the same pool, setting a shared variable on one of them sets it in all of them.
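The sharing behaviour described here can be illustrated with a toy model. To be clear, this is not the D3DX API: `ToyPool` and `ToyEffect` are made-up stand-ins, and parameter values are reduced to a single float. The only point is that a parameter declared shared lives in the pool, while any other parameter lives in the individual effect.

```cpp
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Toy stand-in for an effect pool: storage for shared parameter values.
struct ToyPool {
    std::map<std::string, float> values;
};

// Toy stand-in for an effect created against a pool.
class ToyEffect {
public:
    explicit ToyEffect(std::shared_ptr<ToyPool> pool) : pool_(std::move(pool)) {}

    // Mirrors a parameter declared with the 'shared' keyword in the FX file.
    void DeclareShared(const std::string& name)
    {
        shared_.insert(name);
        pool_->values.emplace(name, 0.0f);  // keeps an already pooled value
    }

    // Mirrors a parameter declared without 'shared'.
    void DeclareLocal(const std::string& name) { local_[name] = 0.0f; }

    void Set(const std::string& name, float value)
    {
        if (shared_.count(name)) pool_->values[name] = value;  // visible to all effects on the pool
        else                     local_[name] = value;         // visible to this effect only
    }

    float Get(const std::string& name) const
    {
        if (shared_.count(name)) return pool_->values.at(name);
        auto it = local_.find(name);
        return it != local_.end() ? it->second : 0.0f;
    }

private:
    std::shared_ptr<ToyPool>     pool_;
    std::set<std::string>        shared_;
    std::map<std::string, float> local_;
};
```

With two `ToyEffect` objects on one pool, setting a shared name through the first changes what the second reads, while a local name stays per effect. With a single effect the two cases are indistinguishable, which matches the observation earlier in the thread.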

 

Your mesh refactoring is good news. Also, if you use .x mesh files in ASCII format, moving to binary files will result in another big loading time win. And the third could be using precompiled .fx shaders.

 

As for your indexing system, I prefer sorting each frame. My advice on it all: download a couple of popular 3D engines and explore them. There are different advanced techniques that have proven their efficiency. My teacher, for example, is The Nebula Device, versions 2 and 3, but I don't recommend copy-pasting it; gather ideas from it instead. In the end I faced the need to reimplement the whole Nebula scene graph and renderer. Irrlicht or Ogre are also a good starting point; not sure about the architecture, but for render techniques, definitely.
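One common way to "sort each frame" is to collect the visible draw items into a flat list and sort them by a packed key, so that items sharing an effect, material and mesh end up adjacent. The sketch below is only an illustration of that idea under assumed names (`DrawItem`, `MakeSortKey`, `CountStateChanges` are hypothetical), not what Nebula or any other engine actually does.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// One entry per draw call that survived culling (hypothetical layout).
struct DrawItem {
    std::uint16_t effectId;
    std::uint16_t materialId;
    std::uint16_t meshId;
    std::uint32_t instanceId;
};

// Most expensive state in the highest bits: effect, then material, then mesh.
inline std::uint64_t MakeSortKey(const DrawItem& d)
{
    return (std::uint64_t(d.effectId)   << 32) |
           (std::uint64_t(d.materialId) << 16) |
            std::uint64_t(d.meshId);
}

inline void SortQueue(std::vector<DrawItem>& queue)
{
    std::sort(queue.begin(), queue.end(),
              [](const DrawItem& a, const DrawItem& b) {
                  return MakeSortKey(a) < MakeSortKey(b);
              });
}

struct StateChanges { int effects; int materials; int meshes; };

// Counts how often each id differs from the previous item while walking the
// queue in order; the first item always counts as a change.
inline StateChanges CountStateChanges(const std::vector<DrawItem>& queue)
{
    StateChanges c = {0, 0, 0};
    for (std::size_t i = 0; i < queue.size(); ++i) {
        const bool first = (i == 0);
        if (first || queue[i].effectId   != queue[i - 1].effectId)   ++c.effects;
        if (first || queue[i].materialId != queue[i - 1].materialId) ++c.materials;
        if (first || queue[i].meshId     != queue[i - 1].meshId)     ++c.meshes;
    }
    return c;
}
```

Calling `CountStateChanges` before and after `SortQueue` gives a rough feel for how many sets the sort saves. Whether the sort pays for itself depends on the scene, so it is worth profiling on the target platform.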



#26 cozzie   Members   -  Reputation: 1420


Posted 22 January 2013 - 03:06 PM

Hi, good to hear from you.

Unfortunately I made a mistake in my profile; my birthday is a month later, the 19th of February ;)

Happy birthday to you though! :)

 

Thanks for your remarks. I'm learning more than one thing from setting up this mesh/render queue foundation: one being that I can really use feedback like your input, and the other being that I should just do and try things and not ask everything beforehand.

 

I'll keep you posted upcoming days after I finish the next steps and will show you the result (and get your comments :))

 

I will be sorting each frame, depending on which index we talk about. Specific static things like the mesh/material index won't change, so I won't sort those. What might be worth a try is sorting the mesh instance index after culling, based on visible yes/no; this would then have to be done each frame. Is this what you mean?

(I'm not sure if sorting only visible instances is worth it versus checking if visible or not in the render loop)
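A cheaper alternative to sorting on the visible flag is to build a compact list of only the visible instance ids right after culling, so the render loop never touches culled instances. A minimal sketch, assuming one flag per instance and a per-mesh list of instance ids (`CollectVisible` is a hypothetical helper):

```cpp
#include <cstddef>
#include <vector>

// Returns the ids from 'instanceIds' whose visibility flag is set, in their
// original order. Ids outside the flag array are treated as not visible.
inline std::vector<int> CollectVisible(const std::vector<int>&  instanceIds,
                                       const std::vector<bool>& visible)
{
    std::vector<int> out;
    out.reserve(instanceIds.size());
    for (int id : instanceIds) {
        if (id >= 0 && static_cast<std::size_t>(id) < visible.size() && visible[id])
            out.push_back(id);
    }
    return out;
}
```

This is a single linear pass per frame, so it costs about as much as the `if(visible)` test it replaces; it mostly helps when the instance list is walked more than once per frame, for example once per pass.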

 

Another/last thing is that I now use (unsigned) int arrays for the indices.

What might bring a little is using (multi)maps instead of separate int arrays, but personally I think this would be micro optimization (not necessary).

 

For another optimization that could bring something, I could check for redundant state setting like you mentioned earlier, and maybe do some profiling with PIX.

After that, back to introducing new and nice goodies in the engine, which by then will have a solid foundation and structure for future expansions / changes.



#27 cozzie   Members   -  Reputation: 1420


Posted 26 January 2013 - 12:29 PM

Good news, the basics are set up and working nicely..

Created a renderqueue class taking care of things and split up updating the scene from rendering (might be useful if I need multithreading in the future).

 

What I don't get yet, is how to implement a check for redundant setting of a vertexbuffer or indices.
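One simple way to catch redundant buffer sets is to remember what was bound last and skip the call when nothing changes. The sketch below is standalone and uses an opaque `BufferHandle` instead of the real Direct3D interface pointers; `BufferStateCache` and its methods are hypothetical names. In real code the caller would issue SetStreamSource or SetIndices only when the matching method returns true.

```cpp
// Opaque stand-in for a vertex or index buffer pointer.
using BufferHandle = const void*;

class BufferStateCache {
public:
    // True when the vertex buffer (or its stride) differs from the last one
    // bound, meaning the caller has to issue the actual set call.
    bool NeedsVertexBuffer(BufferHandle vb, unsigned stride)
    {
        if (haveVb_ && vb == vb_ && stride == stride_) { ++skipped_; return false; }
        vb_ = vb; stride_ = stride; haveVb_ = true;
        return true;
    }

    // Same idea for the index buffer.
    bool NeedsIndexBuffer(BufferHandle ib)
    {
        if (haveIb_ && ib == ib_) { ++skipped_; return false; }
        ib_ = ib; haveIb_ = true;
        return true;
    }

    // Forget everything, e.g. after a device reset or when other code has
    // bound buffers without going through this cache.
    void Invalidate() { haveVb_ = false; haveIb_ = false; }

    // Number of set calls avoided so far (handy next to a PIX capture).
    unsigned SkippedCalls() const { return skipped_; }

private:
    BufferHandle vb_ = nullptr;
    BufferHandle ib_ = nullptr;
    unsigned stride_  = 0;
    unsigned skipped_ = 0;
    bool haveVb_ = false;
    bool haveIb_ = false;
};
```

The cache is only correct if every buffer bind goes through it; anything that binds buffers behind its back (a skybox renderer, a D3DX helper) needs an `Invalidate()` afterwards.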

Next steps are;

- add an index to sort per shader

- add a form of culling 'areas' (quadtree or something, I'd rather think of something myself :))

- after that no more optimizing, just add lots of new goodies

 

Here are the results (code), please shoot :) really like to hear your suggestions

 

// RenderFrame function

bool CD3d::RenderFrame(CD3dscene *pD3dscene, CD3dcam *pCam)
{
	if(!CheckDevice()) { mDeviceLost = true; return true; }
	mDrawCallsPerFrame = 0;			mDrawTriPerFrame = 0;

	pCam->Update();

	/** CULLING AND SORTING	**/
	
	if(!UpdateScene(pD3dscene, pCam)) return false;

	mD3ddev->Clear(0, NULL, D3DCLEAR_TARGET | D3DCLEAR_ZBUFFER, D3DCOLOR_XRGB(0, 0, 0), 1.0f, 0);
	mD3ddev->BeginScene();

	/** SET SHARED FX/ SHADER PARAMETERS **/
	
	if(D3DERR_INVALIDCALL == pD3dscene->mEffect[0]->SetMatrix("ViewProj", &pCam->mMatViewProjection)) return false;	// SHARED PARAMETER IN POOL
	
	/** RENDER SCENE USING FX/ SHADER WITH SPECIFIC TECHNIQUE **/
	
	if(!RenderScene(pD3dscene, pCam, "OpaqueShader", _OPAQUE)) return false;
	if(!RenderScene(pD3dscene, pCam, "BlendingShader", _BLENDED)) return false;

	if(pD3dscene->mSkyBoxInScene) if(!pD3dscene->mSkyBox.Render(pCam->mPosition, pCam, mD3ddev)) return false;

	/** FFP RENDERING, I.E. SCENE STATISTICS **/
	if(!SetDefaultRenderStates()) return false;
	PrintSceneInfo(pCam, pD3dscene->mNrMaterials);		

	/** PRESENT THE FINAL RENDERED SCENE FROM BACKBUFFER **/
	mD3ddev->EndScene();
	HRESULT hr = mD3ddev->Present(NULL, NULL, NULL, NULL); 
	return true;
}

// Render a scene with specific technique

bool CD3d::RenderScene(CD3dscene *pD3dscene, CD3dcam *pCam, char *pTechnique, int mattype)
{
	for(fx=0;fx<mRenderQueue.mNrEffects;++fx)		
	{
		if(!SetShaderTechnique(pD3dscene, fx, pTechnique)) return false;							// 1x SetTechnique, 1x SetPixelShader/ SetVertexShader?	
		pD3dscene->mEffect[fx]->Begin(&pD3dscene->mEffectNumPasses[fx], D3DXFX_DONOTSAVESTATE);		// 'x' RenderStates, based on FX/shader content

		for(_i=0;_i<pD3dscene->mEffectNumPasses[fx];++_i)
		{
			pD3dscene->mEffect[fx]->BeginPass(_i);
			for(mat=0;mat<mRenderQueue.mNrMaterials;++mat)
			{
				if(!pD3dscene->PreSelectMaterial(mat, fx)) return false;							// 2x SetFloatArray, 1x SetTexture									   							
				for(m=0;m<mRenderQueue.mMaterialData[mat].nrMeshes;++m)		
				{
					mesh = mRenderQueue.mMaterialData[mat].meshIds[m];
					if(!pD3dscene->mMeshes[mesh].SetBuffers(mD3ddev)) return false;					// SetStreamSource, SetIndices
					
					for(mi=0;mi<mRenderQueue.GetNrInstances(mesh, mattype);++mi) 
					{
						instance = mRenderQueue.GetInstance(mesh, mi, mattype);
						if(mRenderQueue.mMeshInstData[instance].effectId == fx)						// INDEX NEEDED TO?
						{
							if(mRenderQueue.mMeshInstData[instance].visible)						// (MICRO-OPT) optimization? Sort index per frame
							{
								if(!pD3dscene->PreSelectMeshInst(instance, mD3ddev)) return false;	// 2x SetMatrix (World/WorldInvTransp)	
								pD3dscene->mEffect[fx]->CommitChanges();
						
								for(subm=0;subm<mRenderQueue.mMaterialData[mat].meshSubMeshes[m].nrSubMeshes;++subm) 
								{
									submesh = mRenderQueue.mMaterialData[mat].meshSubMeshes[m].subMeshes[subm];
									pD3dscene->mMeshes[mesh].RenderSubMesh(mD3ddev, submesh, LIST); 
								}
							}
						}
					}
				}
			}
			pD3dscene->mEffect[fx]->EndPass();
		}
		pD3dscene->mEffect[fx]->End();
	}
	return true;
}

// Update scene function

bool CD3d::UpdateScene(CD3dscene *pD3dscene, CD3dcam *pCam)
{
	// TODO here; introduce tree - spatial culling

	/** UPDATE DISTANCE TO CAM FOR BLENDED MESH INSTANCES				**/
	for(m=0;m<mRenderQueue.mNrMeshes;++m)
		for(mi=0;mi<mRenderQueue.mMeshData[m].nrInstancesBlended;++mi)
			pD3dscene->mMeshInstances[mRenderQueue.mMeshData[m].instancesBlended[mi]].UpdateDistToCam(pCam->mPosition);

	/** SORT BLENDED MESH INSTANCES, BACK TO FRONT						**/
	if(!mRenderQueue.SortBlendedMeshes(pD3dscene)) return false;

	/** UPDATE WORLD MATRIX, FOR DYNAMIC MESH INSTANCES ONLY			**/
	for(mi=0;mi<mRenderQueue.mNrMeshInstDynamic;++mi)
		pD3dscene->mMeshInstances[mRenderQueue.mDynamicMeshInstIndex[mi]].UpdateWorldMatrix();

	/** CULL MESH INSTANCES AGAINST FRUSTUM, VISIBLE YES/NO				**/
	for(mi=0;mi<mRenderQueue.mNrMeshInst;++mi)
	{
		if(pCam->SphereInFrustum(&pD3dscene->mMeshInstances[mi].mWorldPos, pD3dscene->mMeshInstances[mi].mBoundingRadius))
			mRenderQueue.mMeshInstData[mi].visible = true;
		else mRenderQueue.mMeshInstData[mi].visible = false;
	}
	return true;
}

// the small functions which do the actual parameter changes

bool CD3d::SetShaderTechnique(CD3dscene *pD3dscene, int pEffectIndex, char *pTechnique)
{
	if(D3DERR_INVALIDCALL == pD3dscene->mEffect[pEffectIndex]->SetTechnique(pTechnique)) return false;
	return true;
}

bool CD3dscene::PreSelectMeshInst(int pMeshInstId, LPDIRECT3DDEVICE9 pD3ddev)
{
	if(D3DERR_INVALIDCALL == mEffect[mMeshInstances[pMeshInstId].mEffectIndex]->SetMatrix("World", &mMeshInstances[pMeshInstId].mMatWorld)) return false;
	if(D3DERR_INVALIDCALL == mEffect[mMeshInstances[pMeshInstId].mEffectIndex]->SetMatrix("WorldInvTransp", &mMeshInstances[pMeshInstId].mMatWorldInvTransp)) return false; 
//	OR normalize in Shader for lighting

	return true;
}

bool CD3dscene::PreSelectMaterial(DWORD pMatId, int pEffectIndex)
{
	if(D3DERR_INVALIDCALL == mEffect[pEffectIndex]->SetFloatArray("MatAmb", mMaterials[pMatId].Ambient, 4)) return false;
	if(D3DERR_INVALIDCALL == mEffect[pEffectIndex]->SetFloatArray("MatDiff", mMaterials[pMatId].Diffuse, 4)) return false;
	if(mTextures[pMatId] != NULL) 
		if(D3DERR_INVALIDCALL == mEffect[pEffectIndex]->SetTexture("Tex0", mTextures[pMatId])) return false;
	return true;
}



#28 cozzie   Members   -  Reputation: 1420


Posted 26 January 2013 - 12:32 PM

Addition;

I also profiled/measured the number of draw calls/triangles per frame, just to know how many batches I use and how big they are.

A few numbers:

 

Draw calls in frame: 399

Triangles in frame: 194616

Average tri per call: 487

D3D Renderframe: present successfull

 

Draw calls in frame: 399

Triangles in frame: 194616

Average tri per call: 487

D3D Renderframe: present successfull

 

Draw calls in frame: 381

Triangles in frame: 184848

Average tri per call: 485


 



#29 cozzie   Members   -  Reputation: 1420


Posted 27 January 2013 - 10:35 AM

Hi Niello,
This thread is becoming a blog/book on render queues... nevertheless... :)

Profiling with PIX works great; I just froze a frame and compared the results with what I expected from my render function.
I see that I've gained quite a lot with my renderqueue and not looping through unnecessary stuff (materials, meshes etc.).

I also see that adding an index per effect would probably be more than a micro optimization.
In the case of my current testscene, I have the following 'unneeded sets' in one frame because of no FX/shader index per mesh/material:

- 20x SetFloatArray
- 10x SetTexture
- 12x SetStreamSource
- 12x SetIndices

I also noticed that my FX/ shader doesn't set renderstates at all, as shown in pix.
Per frame I measured the following number of setting render states:

- None during going through the effects/shaders (the sampler and render states from HLSL/ FX files not found in PIX output)
- For skybox rendering (after effects/shaders):

* ZWRITEENABLE, false
* CULLMODE, D3DCULL_CW

<render skybox>

* ZWRITEENABLE, true
* CULLMODE, D3DCULL_CCW
* ZENABLE, true
* ZWRITEENABLE, true (redundant!!)
* CULLMODE, D3DCULL_CCW (redundant!!)
* LIGHTING, false
* STENCILENABLE, false
* FILLMODE, solid

I see that after skybox rendering I change back cullmode and zwriteenable, which I also do in a set of default renderstates at the end of the frame.
That is not necessary either. I think I'll have to decide what to do with this: make state blocks or do it all in shaders (something for later).

I could definitely use an index for effects/shaders to reduce the unneeded 'sets', which will get more and more important as I enlarge my scene and increase the number of different shaders/FX's.

For now I did a 80/20 quick implementation like this:
- when looping through both meshes and materials (1 time per frame), I check a generated bool table, giving back if the material / mesh combination uses the effect. This way I can early reject based on material and save a lot of sets.
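The generated bool table could look something like the sketch below. It is simplified to an effect x material lookup stored as one flat array; `UsageTable`, `Mark` and `Uses` are hypothetical names, and the table is assumed to be filled once at load time for static objects.

```cpp
#include <cstddef>
#include <vector>

// Flat lookup table: does effect 'fx' have anything to draw with material 'mat'?
class UsageTable {
public:
    UsageTable(std::size_t nrEffects, std::size_t nrMaterials)
        : nrMaterials_(nrMaterials), used_(nrEffects * nrMaterials, 0) {}

    // Called while building the render queue, once per material that is
    // actually rendered with the given effect.
    void Mark(std::size_t fx, std::size_t mat) { used_[fx * nrMaterials_ + mat] = 1; }

    // Called in the render loop for an early reject before any Set* call.
    bool Uses(std::size_t fx, std::size_t mat) const
    {
        return used_[fx * nrMaterials_ + mat] != 0;
    }

private:
    std::size_t nrMaterials_;
    std::vector<unsigned char> used_;  // one byte per (effect, material) pair
};
```

In the material loop the check would sit directly before the material is selected, so a rejected combination costs one array read instead of two SetFloatArray calls and a SetTexture.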

Any ideas/ hints on all this ? :)

Edited by cozzie, 27 January 2013 - 12:53 PM.


#30 Niello   Members   -  Reputation: 130


Posted 01 February 2013 - 05:59 PM

Hi.

I was working hard this week, so there was no time to post.

 

Now you are at the point where I can't see obvious problems in your code. Yes, it isn't perfect and may cause problems in the future, and, moreover, I would write (and actually did write) the whole scene graph + renderer differently. You are encouraged to dig into my code (there were links) if you want to know what I prefer :) I see no point in copying the same renderer in all projects around the world, and it is good that you try to architect your own by yourself.

 

And, definitely, implement spatial culling!

 

Hope to hear from you when you begin to implement new features. That always makes you rethink and improve the rendering codebase.



#31 cozzie   Members   -  Reputation: 1420


Posted 10 February 2013 - 08:25 AM

Hi Niello.

Thanks again for your feedback on the approach.

 

I actually completely rewrote my renderqueue when I introduced 'indices'/scene graph for each effect as well, with the opaque/blended split.

All working great now, really happy with it.  Profiling with PIX tells me that I eliminated ALL unnecessary setting of materials, states, buffers etc.

 

Next step; lighting system

Probably combined with introducing spatial culling, to prevent limiting myself to 8 point lights in the 'world'.

 

Here's the latest version of my demo, if you'd like to take a peek: www.sierracosworth.nl/gamedev/2013-02-10-demo.zip

(controls: W/S/A/D, PG UP/PG DWN)

 

I've posted just the 3d class rendering code, to give you an impression. Input and feedback always welcome :)

(I didn't post all renderqueue creation code, would be a waste)

 

/**************************************************************************************/
/***							RENDERFRAME											***/
/*** ==> usage: in main loop, for each frame; RenderQueue added 24-1-2013			***/
/*** ==> render a frame with 3d scene; 1. update cam and scene 2. render fx			***/
/**************************************************************************************/

bool CD3d::RenderFrame(CD3dscene *pD3dscene, CD3dcam *pCam)
{
	if(!CheckDevice()) { mDeviceLost = true; return true; }
	mDrawCallsPerFrame = 0;			mDrawTriPerFrame = 0;

	pCam->Update();

	/** CULLING AND SORTING	**/
	if(!UpdateScene(pD3dscene, pCam)) return false;

	mD3ddev->Clear(0, NULL, D3DCLEAR_TARGET | D3DCLEAR_ZBUFFER, D3DCOLOR_XRGB(0, 0, 0), 1.0f, 0);
	mD3ddev->BeginScene();

	/** SET SHARED FX/ SHADER PARAMETERS **/
	if(D3DERR_INVALIDCALL == pD3dscene->mEffect[0]->SetMatrix("ViewProj", &pCam->mMatViewProjection)) return false;	// SHARED PARAMETER IN POOL
	
	/** RENDER SCENE USING FX/ SHADER WITH SPECIFIC TECHNIQUE **/
	if(!RenderSceneOpaque(pD3dscene, pCam, "OpaqueShader")) return false;
	if(!RenderSceneBlended(pD3dscene, pCam, "BlendingShader")) return false;

	if(pD3dscene->mSkyBoxInScene) if(!pD3dscene->mSkyBox.Render(pCam->mPosition, pCam, mD3ddev)) return false;

	/** FFP RENDERING, I.E. SCENE STATISTICS **/		/** not used: SetDefaultRenderStates() doesn't do anything now **/
	PrintSceneInfo(pCam, pD3dscene->mNrMaterials);		

	/** PRESENT THE FINAL RENDERED SCENE FROM BACKBUFFER **/
	mD3ddev->EndScene();
	HRESULT hr = mD3ddev->Present(NULL, NULL, NULL, NULL); 
	OutputDebugRenderInfo(hr);

	return true;
}

/**************************************************************************************/
/***							UPDATE SCENE										***/
/*** ==> usage: within renderframe, before draw calls etc.							***/
/*** ==> checks all meshes against space positioning, frustum etc.					***/
/**************************************************************************************/

bool CD3d::UpdateScene(CD3dscene *pD3dscene, CD3dcam *pCam)
{
	/** UPDATE DISTANCE TO CAM FOR BLENDED MESH INSTANCES				**/													// TODO
	for(fx=0;fx<mRenderQueue.mNrEffects;++fx)
	{
		for(m=0;m<mRenderQueue.mEffect[fx].nrMeshesBlended;++m)
		{
			for(mi=0;mi<mRenderQueue.mEffect[fx].meshInstBlended[m].nrInstances;++mi)
				pD3dscene->mMeshInstances[mRenderQueue.mEffect[fx].meshInstBlended[m].instances[mi]].UpdateDistToCam(pCam->mPosition);
		}
	}

	/** SORT BLENDED MESH INSTANCES, BACK TO FRONT						**/
	if(!mRenderQueue.SortBlendedMeshes(pD3dscene)) return false;

	/** UPDATE WORLD MATRIX, FOR DYNAMIC MESH INSTANCES ONLY			**/
	for(mi=0;mi<mRenderQueue.mNrMeshInstDynamic;++mi)
		pD3dscene->mMeshInstances[mRenderQueue.mDynamicMeshInstIndex[mi]].UpdateWorldMatrix();

	// TODO here; introduce tree - spatial culling
	
	/** CULL MESH INSTANCES AGAINST FRUSTUM, VISIBLE YES/NO				**/
	for(mi=0;mi<mRenderQueue.mNrMeshInst;++mi)
	{
		if(pCam->SphereInFrustum(&pD3dscene->mMeshInstances[mi].mWorldPos, pD3dscene->mMeshInstances[mi].mBoundingRadius))
			mRenderQueue.mMeshInst[mi].visible = true;
		else mRenderQueue.mMeshInst[mi].visible = false;
	}
	return true;
}

/**************************************************************************************/
/***							RENDER SCENE - OPAQUE								***/
/*** ==> usage: within renderframe, to render specific technique of shader			***/
/*** ==> renders all opaque mesh instances through renderqueue for all effects		***/
/**************************************************************************************/

bool CD3d::RenderSceneOpaque(CD3dscene *pD3dscene, CD3dcam *pCam, char *pTechnique)
{
	for(fx=0;fx<mRenderQueue.mNrEffects;++fx)		
	{
		if(!SetShaderTechnique(pD3dscene, fx, pTechnique)) return false;	
		pD3dscene->mEffect[fx]->Begin(&pD3dscene->mEffectNumPasses[fx], D3DXFX_DONOTSAVESTATE);		

		for(_i=0;_i<pD3dscene->mEffectNumPasses[fx];++_i)
		{
			pD3dscene->mEffect[fx]->BeginPass(_i);
			for(mat=0;mat<mRenderQueue.mEffect[fx].nrMaterialsOpaque;++mat)									
			{
				matid = mRenderQueue.mEffect[fx].materialsOpaque[mat];
				if(!pD3dscene->PreSelectMaterial(matid, fx)) return false;	
				for(m=0;m<mRenderQueue.mEffect[fx].nrMeshesOpaque;++m)											
				{
					mesh = mRenderQueue.mEffect[fx].meshOpaque[m];
					if(!pD3dscene->mMeshes[mesh].SetBuffers(mD3ddev)) return false;					
					for(mi=0;mi<mRenderQueue.mEffect[fx].meshInstOpaque[m].nrInstances;++mi)				
					{
						instance = mRenderQueue.mEffect[fx].meshInstOpaque[m].instances[mi];				
						if(mRenderQueue.mMeshInst[instance].visible)	// (MICRO)optimization? Sort per frame
						{
							if(!pD3dscene->PreSelectMeshInst(instance, mD3ddev)) return false;
							pD3dscene->mEffect[fx]->CommitChanges();

							RenderSubMeshes(pD3dscene, matid, mesh);					
						}
					}
				}
			}
			pD3dscene->mEffect[fx]->EndPass();
		}
		pD3dscene->mEffect[fx]->End();
	}
	return true;
}

/**************************************************************************************/
/***							RENDER SCENE - BLENDED								***/
/*** ==> usage: within renderframe, to render specific technique of shader			***/
/*** ==> renders all blended mesh instances through renderqueue for all effects		***/
/**************************************************************************************/

bool CD3d::RenderSceneBlended(CD3dscene *pD3dscene, CD3dcam *pCam, char *pTechnique)
{
	for(fx=0;fx<mRenderQueue.mNrEffects;++fx)		
	{
		if(!SetShaderTechnique(pD3dscene, fx, pTechnique)) return false;	
		pD3dscene->mEffect[fx]->Begin(&pD3dscene->mEffectNumPasses[fx], D3DXFX_DONOTSAVESTATE);		

		for(_i=0;_i<pD3dscene->mEffectNumPasses[fx];++_i)
		{
			pD3dscene->mEffect[fx]->BeginPass(_i);
			for(mat=0;mat<mRenderQueue.mEffect[fx].nrMaterialsBlended;++mat)									
			{
				matid = mRenderQueue.mEffect[fx].materialsBlended[mat];
				if(!pD3dscene->PreSelectMaterial(matid, fx)) return false;	
				for(m=0;m<mRenderQueue.mEffect[fx].nrMeshesBlended;++m)											
				{
					mesh = mRenderQueue.mEffect[fx].meshBlended[m];
					if(!pD3dscene->mMeshes[mesh].SetBuffers(mD3ddev)) return false;					
					for(mi=0;mi<mRenderQueue.mEffect[fx].meshInstBlended[m].nrInstances;++mi)				
					{
						instance = mRenderQueue.mEffect[fx].meshInstBlended[m].instances[mi];				
						if(mRenderQueue.mMeshInst[instance].visible)	// (MICRO)optimization? Sort per frame
						{
							if(!pD3dscene->PreSelectMeshInst(instance, mD3ddev)) return false;
							pD3dscene->mEffect[fx]->CommitChanges();

							RenderSubMeshes(pD3dscene, matid, mesh);					
						}
					}
				}
			}
			pD3dscene->mEffect[fx]->EndPass();
		}
		pD3dscene->mEffect[fx]->End();
	}
	return true;
}

/**************************************************************************************/
/***							RENDER SUBMESHES									***/
/*** ==> usage: within renderscene, to render submeshes with specific material		***/
/*** ==> renders all submeshes of given mesh with corresponding material			***/
/**************************************************************************************/
	
void CD3d::RenderSubMeshes(CD3dscene *pD3dscene, int pMatId, int pMeshId)
{
	// Use the function parameters instead of relying on the matid/mesh members set by the caller
	for(sub=0;sub<mRenderQueue.mMaterials[pMatId].meshes[pMeshId].nrSubMeshes;++sub)		// NOT OPAQUE/BLENDED SPECIFIC
	{
		submesh = mRenderQueue.mMaterials[pMatId].meshes[pMeshId].submeshes[sub];
		pD3dscene->mMeshes[pMeshId].RenderSubMesh(mD3ddev, submesh, LIST); 

		#ifdef _DEBUG 
		++mDrawCallsPerFrame;
		mDrawTriPerFrame += pD3dscene->mMeshes[pMeshId].mSubMeshTable[submesh].FaceCount;
		#endif
	}
}



#32 cozzie   Members   -  Reputation: 1420


Posted 11 February 2013 - 04:02 PM

Hi Niello,

I've overlooked something and would like your advice..

 

What I do now is, for split-up opaque or blended:

- loop through all materials

- loop through all opaque/ or blended meshes

- loop through instances

- if visible continue and do the draw call

 

This works 80/20 (80% ok), because I now loop through meshes which don't use that material.

I've been thinking about it and started to make an index per material, for meshes that use that material.

 

This might work, but on second thought it will still make me loop through meshes and instances way more than necessary.

So I thought of the following solution:

 

Keep the loop the same but do a check if the mesh uses the material:

- loop through all materials

- loop through all opaque/ or blended meshes

- NEW: only select mesh (buffers) and continue to instances if mesh uses the material

- loop through instances

- if visible continue and do the draw call

 

This costs 'number of meshes' * 'number of materials' if statements on the CPU, but potentially saves hundreds to thousands of unnecessary sets of vertex/index buffers and instances (world/worldinvtransp matrices).

 

So definitely a big win compared to the one extra if statement for each mesh/ material combination.

 

What would you advise, any easier/more profitable solutions?

 

Update;

I definitely need to use my notepad more often to post replies to myself :)
Fixed it already, made a mesh instance list per mesh blended and per mesh opaque.


Edited by cozzie, 11 February 2013 - 05:18 PM.


#33 cozzie   Members   -  Reputation: 1420


Posted 14 February 2013 - 01:34 PM

Short update;

- implemented point lights with success

- changed from boundingsphere culling (Frustum) to boundingbox, big improvement

(will go to flexible later based on length/width of the mesh)

 

Here's a nice screenshot, with all types of light sources in action (ambient, diffuse and point):

 

lightingcombi.jpg

 

Next steps;

- implementing a lighting system and spatial culling/ areas.

Will go for this with my own ideas instead of tutorials and see what comes out.

Dividing the scene/world into (static) cubes and doing a bounding box check on those cubes. All meshes and lights will be part of a 'cube' and rejected early if the cube is culled. Maybe go 1 or 2 levels 'deeper' to have smaller cubes and earlier rejections. Is this the same principle a quadtree follows?
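The cube idea is close in spirit to a quadtree or octree, which subdivide space recursively (in 2D and 3D respectively) instead of using one fixed cell size. A flat, single-level version of the cube scheme might be sketched like this. All names are hypothetical, and the frustum is assumed to be given as planes whose normals point inward, so a point p is inside a plane when dot(n, p) + d >= 0.

```cpp
#include <vector>

struct Vec3  { float x, y, z; };
struct Plane { Vec3 n; float d; };   // inside when dot(n, p) + d >= 0
struct Aabb  { Vec3 lo, hi; };       // axis-aligned box: min and max corner

// One static cube of the world with the ids of the objects it contains.
struct Cell {
    Aabb             bounds;
    std::vector<int> objectIds;
};

// True when the whole box lies on the outside of the plane. Only the corner
// farthest along the plane normal needs to be tested.
inline bool AabbOutsidePlane(const Aabb& b, const Plane& p)
{
    const Vec3 v = { p.n.x >= 0.0f ? b.hi.x : b.lo.x,
                     p.n.y >= 0.0f ? b.hi.y : b.lo.y,
                     p.n.z >= 0.0f ? b.hi.z : b.lo.z };
    return p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f;
}

// Conservative test: may keep a box that is slightly outside near a frustum
// corner, but never rejects a box that is actually visible.
inline bool AabbIntersectsFrustum(const Aabb& b, const std::vector<Plane>& planes)
{
    for (const Plane& p : planes)
        if (AabbOutsidePlane(b, p)) return false;
    return true;
}

// Objects in culled cells are rejected without being tested individually.
inline std::vector<int> GatherFromVisibleCells(const std::vector<Cell>& cells,
                                               const std::vector<Plane>& frustum)
{
    std::vector<int> candidates;
    for (const Cell& cell : cells) {
        if (!AabbIntersectsFrustum(cell.bounds, frustum)) continue;
        candidates.insert(candidates.end(), cell.objectIds.begin(), cell.objectIds.end());
    }
    return candidates;
}
```

The returned ids are only candidates; a per-object sphere or box test can still follow for objects in cells that straddle the frustum. Going "1 or 2 levels deeper" would mean giving each cell child cells and repeating the same test on them.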

 

Another 'challenge' is to find out if it's possible to tell my effect/ shader how many point lights I have and process those

(a sort of dynamic array and for loops in the effect). Not sure though if this is possible with VS/PS 2.0 (taking a maximum of 8 lights into account).


Edited by cozzie, 14 February 2013 - 01:36 PM.




