# new to HLSL - A few questions


## Recommended Posts

Been poring over DirectX SDK examples and online tutorials. Looks cool, doable, but scary at the same time. Here are my questions.

1) If you have more than one effect you want to apply, would you write separate .fx files, load them up as separate effects, and run them one after the other, or would you combine all your effects into one .fx file and run it as one?

2) Is it faster to use shaders to handle lighting instead of the fixed-function lighting built into DirectX? One benefit I see is the ability to define all the lights I need, but I don't know what the speed hit would be compared to fixed-function.

3) It seems like shaders are being used for more than just visual effects these days, from calculating shadow volumes to skeletal animation. Does it make sense to offload all these tasks onto the GPU?

##### Share on other sites
1) .fx files support includes, so you can keep things ordered even if you have multiple shaders in one file. I used to have 10 to 20 shaders per .fx.
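As a sketch of how several shaders can live in one .fx file, here is a minimal effect with two techniques sharing one vertex shader (all names here are made up for illustration; a real file would hold your actual shaders):

```hlsl
// Shared state for every technique in this file.
float4x4 gWorldViewProj;

// #include "lighting.fxh"   // helpers can be pulled in from other files

struct VS_OUT { float4 pos : POSITION; };

VS_OUT BasicVS(float3 pos : POSITION)
{
    VS_OUT output;
    output.pos = mul(float4(pos, 1.0f), gWorldViewProj);
    return output;
}

float4 RedPS()   : COLOR { return float4(1, 0, 0, 1); }
float4 GreenPS() : COLOR { return float4(0, 1, 0, 1); }

// Each technique is selected at runtime with ID3DXEffect::SetTechnique.
technique RenderRed
{
    pass P0
    {
        VertexShader = compile vs_2_0 BasicVS();
        PixelShader  = compile ps_2_0 RedPS();
    }
}

technique RenderGreen
{
    pass P0
    {
        VertexShader = compile vs_2_0 BasicVS();
        PixelShader  = compile ps_2_0 GreenPS();
    }
}
```

At draw time you pick a technique by name, so one file can cover many materials without reloading effects.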

2) I never measured the difference. In any case, shaders are the future.

3) It makes total sense. The latest GPUs have over 20x the floating-point performance of a CPU, achieved through a highly parallel pipeline. But this highly parallel architecture has downsides: it uses SIMD (Single Instruction, Multiple Data) cores, which makes it very fast for stream processing but worse at branching.

##### Share on other sites
To expand on 2 and 3:

2: Any modern GPU (sold in the past 5 years) emulates the fixed-function pipeline with shaders anyway. That said, shader performance depends on the complexity of the program, and you can write shaders far more complex than anything the fixed pipeline offers. Then again, simple shaders are faster.
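To get a feel for what that emulation looks like, here is a rough sketch of a vertex shader reproducing the fixed-function transform plus a single directional diffuse light (the constant names and single-light setup are assumptions for illustration, not the driver's actual code):

```hlsl
float4x4 gWorldViewProj;
float4x4 gWorld;
float3   gLightDir;     // normalized light direction, world space
float4   gLightColor;
float4   gAmbient;

struct VS_OUT
{
    float4 pos   : POSITION;
    float4 color : COLOR0;
};

VS_OUT FixedFunctionVS(float3 pos : POSITION, float3 normal : NORMAL)
{
    VS_OUT output;
    output.pos = mul(float4(pos, 1.0f), gWorldViewProj);

    // Per-vertex diffuse, like D3DLIGHT_DIRECTIONAL in the fixed pipeline.
    float3 n = normalize(mul(normal, (float3x3)gWorld));
    float  diffuse = saturate(dot(n, -gLightDir));
    output.color = gAmbient + gLightColor * diffuse;
    return output;
}
```

A shader doing only this is about as cheap as the fixed path it replaces; the cost grows with whatever extra work you add.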

3: It makes sense if your bottleneck is the CPU. A GPU usually has greater raw parallel performance, so it makes sense to offload highly repetitive and parallelizable tasks (such as rasterizing and geometry setup) to it by default. A good counterexample might be vertex shaders, which could run faster on the CPU in isolation; in practice, however, the CPU usually has other things to do that are better suited to it.

##### Share on other sites
Quote:
Original post by OctavianTheFirst
3) It makes total sense. Latest GPUs have over 20x the floating point performance of a CPU. It is achieved using a highly parallel pipeline. But this highly parallel architecture has downsides: it uses SIMD (Single Instruction Multiple Data) cores, making it very fast for stream processing but worse at branching.

It's not just the floating-point performance. Ideally you'd want your game to make use of every drop of power available on both the CPU and GPU, but in practice that's not really feasible on the PC, since you have too much overhead from communicating with the GPU. Either you suffer overhead from driver transitions, from transferring data to GPU memory, or from needing to sync the two processors. Thus it's usually ideal to just do as much as you can on the GPU... in the PC space, anyway. Consoles are a different story.

##### Share on other sites
More on #3:

If you do skeletal animation on the CPU, you've got to send the new data (i.e. transformed vertex positions) down to the GPU every time it animates. If you do it in your vertex shader, only the animation parameters are sent, and the vertex positions are calculated locally.
By offloading this work to the GPU, you're cutting down on the amount of data sent over the PCI bus ;)
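A minimal palette-skinning vertex shader along those lines might look like this (the constant names and two-bones-per-vertex limit are assumptions for brevity):

```hlsl
float4x4 gViewProj;
float4x4 gBones[64];   // bone palette: the only per-frame upload

struct VS_IN
{
    float3 pos     : POSITION;
    float2 weights : BLENDWEIGHT;
    float2 indices : BLENDINDICES;  // used as integers below
};

float4 SkinVS(VS_IN input) : POSITION
{
    // Blend the two bone transforms; the vertex data itself never
    // leaves GPU memory, only gBones changes each frame.
    float4 p = float4(input.pos, 1.0f);
    float3 skinned = input.weights.x * mul(p, gBones[(int)input.indices.x]).xyz
                   + input.weights.y * mul(p, gBones[(int)input.indices.y]).xyz;
    return mul(float4(skinned, 1.0f), gViewProj);
}
```

Per frame you upload a handful of matrices instead of every transformed vertex, which is the bandwidth saving described above.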

##### Share on other sites
Quote:
Original post by Hodgman
More on #3:
If you do skeletal animation on the CPU, you've got to send the new data (i.e. transformed vertex positions) down to the GPU every time it animates. If you do it in your vertex shader, only the animation parameters are sent, and the vertex positions are calculated locally.
By offloading this work to the GPU, you're cutting down on the amount of data sent over the PCI bus ;)

Yeah, but it only helps if the bus is a bottleneck :D

##### Share on other sites
Quote:
Original post by Hodgman
More on #3:
If you do skeletal animation on the CPU, you've got to send the new data (i.e. transformed vertex positions) down to the GPU every time it animates. If you do it in your vertex shader, only the animation parameters are sent, and the vertex positions are calculated locally.
By offloading this work to the GPU, you're cutting down on the amount of data sent over the PCI bus ;)

Thank you so much guys. This has really helped. This is really an exciting time in game development.

One thing I'm confused about. Below is how I send my primitives to DirectX.

```cpp
void CXMesh::Render()
{
    for( int i = 0; i < faceLstCount; i++ )
    {
        // Select this material if not already selected
        CMaterialMgr::Instance().SelectTexture( pXVertBuf[i].pMaterial );

        // Set the material property
        CXWindow::Instance().GetXDevice()->SetMaterial( &pXVertBuf[i].materialProp );

        CXWindow::Instance().GetXDevice()->SetStreamSource( 0, pXVertBuf[i].xVertBuf, 0, sizeof(CVertex) );
        CXWindow::Instance().GetXDevice()->SetFVF( D3DFVF_XYZ | D3DFVF_NORMAL | D3DFVF_TEX1 );
        CXWindow::Instance().GetXDevice()->DrawPrimitive( D3DPT_TRIANGLELIST, 0, pXVertBuf[i].fcount );
    }
}   // Render
```

I need to do this every frame even though the geometry hasn't changed. Your comment suggests my geometry is somehow stored on the video card. How is that the case? It would really help not to have to send static geometry over the bus all the time. How would you do that?

##### Share on other sites
The proper way to upload a mesh to the GPU is to use vertex buffers (and index buffers) that store a lot of triangles. If you use the managed or default pool, the data resides in GPU memory and doesn't need to be uploaded again and again. Note that you still need to call SetStreamSource and SetIndices when changing the geometry source.

Your code seems to draw one triangle at a time (or what does faceLstCount denote?), which is very inefficient. Draw calls themselves are expensive, so you should draw as much geometry as possible with as few device calls as possible.
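As a sketch of the load-once pattern (D3D9, error handling omitted; `device`, `vertices`, `vertexCount`, and `triangleCount` are placeholders, and `CVertex` is from your earlier post):

```cpp
// Done once at load time, not per frame. The managed pool keeps a
// system-memory copy and uploads to GPU memory as needed.
IDirect3DVertexBuffer9* vb = NULL;
device->CreateVertexBuffer(
    vertexCount * sizeof(CVertex),
    D3DUSAGE_WRITEONLY,
    D3DFVF_XYZ | D3DFVF_NORMAL | D3DFVF_TEX1,
    D3DPOOL_MANAGED,
    &vb, NULL );

// Fill the buffer once with your static geometry.
void* data = NULL;
vb->Lock( 0, 0, &data, 0 );
memcpy( data, vertices, vertexCount * sizeof(CVertex) );
vb->Unlock();

// Per frame: just bind and draw; no vertex data crosses the bus.
device->SetStreamSource( 0, vb, 0, sizeof(CVertex) );
device->SetFVF( D3DFVF_XYZ | D3DFVF_NORMAL | D3DFVF_TEX1 );
device->DrawPrimitive( D3DPT_TRIANGLELIST, 0, triangleCount );
```

The key difference from per-frame uploading is that Lock/memcpy happens once at load, while the render loop only issues state changes and the draw call.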

##### Share on other sites
Quote:
Original post by Nik02
Your code seems to draw one triangle at a time (or what does faceLstCount denote?)...
faceLstCount is the number of face lists. The lists are grouped by texture, so for a mesh of 1,000 faces that all share one texture, the loop iterates once.

##### Share on other sites
Okay, things aren't as bad as I thought :)

Anyway, default- and managed-pool buffers go to GPU memory if applicable. Unless you lock and update them, they stay there. Managed resources might be paged out of GPU memory if space is scarce, but this is transparent to the app (apart from the performance impact).

