Archived

This topic is now archived and is closed to further replies.

Supernat02

Hardware/Software Rendering Mixed?

Recommended Posts

I read the MaxPointSize value from the D3DCAPS structure on my computer, and it reads 64, yet the point sprites don't appear. They are merely pixels. I'm doing everything as properly as I know how, because it works on my other computer, which I know supports point sprites for sure. I read that 0 or 1 would be returned if they weren't supported. Anyway, if anyone has seen that before or knows why, please let me know.

Otherwise, to my real question: I can render the point sprites with software vertex processing and they work. With hardware or mixed, they do not. Like I said before, only on this one machine. I would LOVE to use hardware rendering for the terrain stuff I'm doing and software for the point sprites. I was pleasantly surprised by the awesome speed it ran at with the point sprites. But anyway, is there a way to change the processing mode in real time? Is this a big deal? Does it slow the game down tremendously?

EDIT: Oh, by the way, I'm getting a significantly higher frame rate with software vertex processing than with hardware or mixed. What's the reason for that?

Thanks for the help,
Chris

[edited by - Supernat02 on January 22, 2004 12:19:16 AM]

Create a device in MIXED mode, rather than HARDWARE or SOFTWARE, in your CreateDevice call.

To switch: in DX8 there was a render state, D3DRS_SOFTWAREVERTEXPROCESSING. In DX9 there is a function, SetSoftwareVertexProcessing(). It's a bit slow, but doing it once or twice a frame won't hurt. Just don't overdo it.
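The pattern looks something like this (a minimal C++ sketch; MockDevice is a hypothetical stand-in for IDirect3DDevice9, which in real code you would create with D3DCREATE_MIXED_VERTEXPROCESSING, just to show the once-per-frame toggle):

```cpp
#include <cassert>

// Hypothetical stand-in for IDirect3DDevice9, created with
// D3DCREATE_MIXED_VERTEXPROCESSING in real code.
struct MockDevice {
    bool softwareVP = false;
    int  switches   = 0;   // how many times we toggled this frame
    void SetSoftwareVertexProcessing(bool enable) {
        if (softwareVP != enable) ++switches;
        softwareVP = enable;
    }
};

// Draw the hardware-friendly batch first, then switch once for the
// software-processed point sprites. One switch per frame is cheap.
void RenderFrame(MockDevice& dev) {
    dev.SetSoftwareVertexProcessing(false);  // terrain: hardware T&L
    // ... DrawPrimitive calls for terrain here ...
    dev.SetSoftwareVertexProcessing(true);   // sprites: software T&L
    // ... DrawPrimitive calls for point sprites here ...
}
```

Batching all the hardware-processed draws together, then all the software-processed ones, keeps the number of switches constant per frame no matter how many draw calls you make.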

Depending on your CPU, GPU, and what your app is doing, yes, you may see a faster frame rate with software processing, especially if your app does nothing more than draw things.

Typically you also have enemy AI, networking, collision detection, etc. You get the benefit of the GPU drawing while the CPU processes these other things. It doesn't matter if the GPU is actually slower at it, because they are working at the same time.

Consider the following (simplified) cases:

Let's say your GPU takes 6ms to transform your geometry.
Let's say your GPU takes 10ms to fill your geometry.
Let's say your CPU takes 3ms to transform your geometry.
Let's say your CPU uses 14ms to calculate AI, collisions, etc.


software processing
     0ms        3ms                               17ms
CPU  [geometry][-------- AI & collisions --------]
GPU            [---- filling (3ms to 13ms) ----]
frame ends at 17ms (limited by the CPU)

hardware processing
     0ms              6ms                    16ms
CPU  [------ AI & collisions (0ms to 14ms) ------]
GPU  [--- geometry ---][------- filling -------]
frame ends at 16ms (limited by the GPU)


Note that even though the GPU is slower at transforming vertex data, the frame still ends up 1ms faster because they work in parallel. Also note that I made software take 17ms, while hardware took 16ms. On a console this would mean the hardware solution is tightly tuned to run in 60Hz, while the software solution takes just a little too long, and would require dropping to 30Hz. Ouch. (60Hz = 16.6666ms per frame).
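The arithmetic behind those timelines can be written out directly (a small sketch; the function names are mine, and the millisecond figures are the ones from the example above):

```cpp
#include <algorithm>
#include <cassert>

// Software processing: the GPU cannot start filling until the CPU has
// transformed the geometry, and the CPU still has AI/collisions to do.
// With the numbers above: CPU finishes at 3 + 14 = 17ms, GPU at 3 + 10 = 13ms.
double SoftwareFrameMs(double cpuGeom, double cpuOther, double gpuFill) {
    double cpuDone = cpuGeom + cpuOther;
    double gpuDone = cpuGeom + gpuFill;   // filling starts after CPU transform
    return std::max(cpuDone, gpuDone);
}

// Hardware processing: CPU and GPU run fully in parallel.
// With the numbers above: max(14, 6 + 10) = 16ms.
double HardwareFrameMs(double gpuGeom, double gpuFill, double cpuOther) {
    return std::max(cpuOther, gpuGeom + gpuFill);
}
```

Plugging in the example numbers gives 17ms for software and 16ms for hardware, matching the timelines.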

Thanks very much. That really makes perfect sense! I just didn't know that software could actually render faster than hardware. That's news to me. I'll try SetSoftwareVertexProcessing and see how that works for me.

I've been trying different things out, and it seems that I can use hardware processing if I don't use a per-vertex size, but use SetRenderState instead to set the size. So, for one video card, size in the vertex structure works, and for another it doesn't. Is this perhaps because one card supports per-vertex size and the other doesn't? Is it linked to the version of DirectX the card supports? I figured that if it supports different sizes through SetRenderState(D3DRS_POINTSIZE), it would also support per-vertex sizes. Is this incorrect?

Thanks!
Chris

quote:
Original post by Supernat02
Thanks very much. That really makes perfect sense! I just didn't know that software could actually render faster than hardware. That's news to me. I'll try SetSoftwareVertexProcessing and see how that works for me.

I've been trying different things out, and it seems that I can use hardware processing if I don't use a per-vertex size, but use SetRenderState instead to set the size. So, for one video card, size in the vertex structure works, and for another it doesn't. Is this perhaps because one card supports per-vertex size and the other doesn't? Is it linked to the version of DirectX the card supports? I figured that if it supports different sizes through SetRenderState(D3DRS_POINTSIZE), it would also support per-vertex sizes. Is this incorrect?


Vertex processing with SSE is pretty close to what the shader does, but at a faster clock rate. It's about the only part of rendering where software can be faster... but for data that isn't dynamic, you also have the overhead of sending it to the card each frame rather than just once. Software vertex processing can also be useful for using vertex shader 2.0 features on a vertex shader 1.1 card. So there are reasons for and against using software processing. Basically, use it if you have to, but don't make it a bad habit.

For pixel processing, though, you'll need a very fast CPU with a very, very fast bus to top your GPU. Your GPU typically whips your CPU in this area... and it's done in parallel.

Have you set all the point-related render states? To what values? Does your FVF contain PSize, or does your shader output PSize? If the point sprite demo works on both machines, it's likely you have a bug. If the demo doesn't work correctly on one machine, it's likely a driver bug. The behaviour in the SDK seems fairly well defined.

Thanks for the response. Yes, the demo works on both machines. However, the demo doesn't use per-vertex sizes; it sets the size manually. I find that odd in itself... I agree that the SDK is well defined on point sprites, except when creating the vertex buffer with D3DUSAGE_POINTS set, but here are parts of my code...


#define PE_POINT_SPRITE_FVF (D3DFVF_XYZ | D3DFVF_PSIZE | D3DFVF_DIFFUSE)

typedef struct
{
    D3DXVECTOR3 Position;   // FVF order: position, point size, diffuse
    float       Size;
    DWORD       Diffuse;
} TPointSpriteParticleVertex;


pD3DDevice->CreateVertexBuffer(ParticleCount * sizeof(TPointSpriteParticleVertex),
                               D3DUSAGE_WRITEONLY | D3DUSAGE_POINTS,
                               PE_POINT_SPRITE_FVF,
                               D3DPOOL_DEFAULT, &pVertexBuffer, 0);

pVertexBuffer->Lock(0, 0, (void**)&Vertex, 0);
// Update all the particles here or remove them, then copy memory
pVertexBuffer->Unlock();

// .......

// Alpha test and blending
pD3DDevice->SetRenderState(D3DRS_ALPHATESTENABLE, TRUE);
pD3DDevice->SetRenderState(D3DRS_ALPHAREF, 0x08);
pD3DDevice->SetRenderState(D3DRS_ALPHAFUNC, D3DCMP_GREATEREQUAL);

pD3DDevice->SetRenderState(D3DRS_ALPHABLENDENABLE, TRUE);
pD3DDevice->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_SRCCOLOR);
pD3DDevice->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_DESTCOLOR);

// Modulate the texture with the per-vertex diffuse color
pD3DDevice->SetTextureStageState(0, D3DTSS_COLOROP,   D3DTOP_MODULATE);
pD3DDevice->SetTextureStageState(0, D3DTSS_COLORARG1, D3DTA_TEXTURE);
pD3DDevice->SetTextureStageState(0, D3DTSS_COLORARG2, D3DTA_DIFFUSE);

pD3DDevice->SetRenderState(D3DRS_DIFFUSEMATERIALSOURCE, D3DMCS_COLOR1);
pD3DDevice->SetRenderState(D3DRS_LIGHTING, FALSE);

pD3DDevice->SetTexture(0, pTexture);

// Point sprite states
pD3DDevice->SetRenderState(D3DRS_POINTSCALEENABLE, TRUE);
pD3DDevice->SetRenderState(D3DRS_POINTSPRITEENABLE, TRUE);
pD3DDevice->SetRenderState(D3DRS_POINTSIZE,     FLOAT2DWORD(0.25f));
pD3DDevice->SetRenderState(D3DRS_POINTSIZE_MIN, FLOAT2DWORD(0.0f));
pD3DDevice->SetRenderState(D3DRS_POINTSCALE_A,  FLOAT2DWORD(0.0f));
pD3DDevice->SetRenderState(D3DRS_POINTSCALE_B,  FLOAT2DWORD(0.0f));
pD3DDevice->SetRenderState(D3DRS_POINTSCALE_C,  FLOAT2DWORD(1.0f));

pD3DDevice->SetVertexShader(NULL);
pD3DDevice->SetFVF(PE_POINT_SPRITE_FVF);
pD3DDevice->SetStreamSource(0, pVertexBuffer, 0, sizeof(TPointSpriteParticleVertex));

pD3DDevice->DrawPrimitive(D3DPT_POINTLIST, 0, ParticleCount);




So, to recap: this always works on one of my machines, no matter what the vertex processing is. On the other machine, with pure hardware OR mixed vertex processing, if I remove the per-vertex size and call SetRenderState(D3DRS_POINTSIZE, 0.25f); instead, it works on that machine, all the way up to 64 actually, which is what I'm reading as the max point size from the D3DCAPS. So both machines support point sprites; one just doesn't support per-vertex size... unless I am doing something wrong above. Please let me know if I am.

I appreciate all the help,
Chris

[edited by - Supernat02 on January 24, 2004 12:05:35 PM]

I just found something interesting in the SDK.

It says:

The default value is the value a driver returns. If a driver returns 0 or 1, the default value is 64, which allows software point size emulation.

In another section:

The application can specify point size either as per-vertex or by setting D3DRS_POINTSIZE, which applies to points without a per-vertex size.

In that same section:

A hardware device that does vertex processing and supports point sprites—MaxPointSize set to greater than 1.0f—is required to perform the size computation for nontransformed sprites and is required to properly set the per-vertex or D3DRS_POINTSIZE for TL vertices.

Now I'm REALLY confused. This says a default value of 64 is used when the driver returns 0 or 1. I'm getting 64. But at the same time, setting D3DRS_POINTSIZE works and per-vertex sizes don't! I think I'm just going to stick to software vertex processing and get over it.
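The substitution rule in that first quote can be written down as a one-liner (a sketch; the helper name is mine, the behaviour is just a paraphrase of the SDK text above):

```cpp
#include <cassert>

// Paraphrase of the SDK quote above: if the driver reports 0 or 1 for
// MaxPointSize, the runtime substitutes 64 so point size can be emulated
// in software. So a reading of 64 is ambiguous: it could be the driver's
// real limit, or the runtime's stand-in for "no hardware support".
float EffectiveMaxPointSize(float driverValue) {
    return (driverValue <= 1.0f) ? 64.0f : driverValue;
}
```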

Chris

Okay, I think I found the solution. The SDK makes it sound like if a video card supports point size in the vertex buffer, it will support setting the render state as well. However, looking through the D3DCAPS9 structure, I found that D3DFVFCAPS_PSIZE must be set in the FVFCaps DWORD for per-vertex size to be supported. So I can only assume per-vertex size may be unsupported even while SetRenderState works. Now I know how to detect what's supported and can move on. Yay!
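That caps check could look something like this (a sketch; the D3DFVFCAPS_PSIZE value is copied from d3d9caps.h so the snippet stands alone, and in real code you would test the FVFCaps member of the D3DCAPS9 you got from GetDeviceCaps):

```cpp
#include <cassert>

typedef unsigned long DWORD;

// Value copied from d3d9caps.h so this snippet is self-contained.
const DWORD D3DFVFCAPS_PSIZE = 0x00100000;

// Per-vertex point size (D3DFVF_PSIZE) only works with hardware vertex
// processing when the device sets this bit in D3DCAPS9::FVFCaps. If the
// bit is clear, fall back to D3DRS_POINTSIZE or software processing.
bool SupportsPerVertexPointSize(DWORD fvfCaps) {
    return (fvfCaps & D3DFVFCAPS_PSIZE) != 0;
}
```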

By the way, I tried using SetSoftwareVertexProcessing and I get a read of address 0 error when it goes to draw the primitive. The device is still valid, though, because I can create a vertex buffer with it and do many other things; it just won't draw the primitive. Ever seen that before?

EDIT: Never mind, I figured out that I had to add | D3DUSAGE_SOFTWAREPROCESSING when creating the vertex buffers. All good to go!
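For reference, the usage-flag logic can be sketched like this (constant values copied from d3d9.h to make the snippet self-contained; double-check them against your own SDK headers):

```cpp
#include <cassert>

typedef unsigned long DWORD;

// Values copied from d3d9.h so this snippet is self-contained.
const DWORD D3DUSAGE_WRITEONLY          = 0x00000008;
const DWORD D3DUSAGE_SOFTWAREPROCESSING = 0x00000010;
const DWORD D3DUSAGE_POINTS             = 0x00000040;

// Buffers that will be drawn with SetSoftwareVertexProcessing(TRUE) on a
// mixed-mode device need D3DUSAGE_SOFTWAREPROCESSING at creation time,
// otherwise DrawPrimitive can crash exactly as described above.
DWORD PointSpriteBufferUsage(bool softwareVP) {
    DWORD usage = D3DUSAGE_WRITEONLY | D3DUSAGE_POINTS;
    if (softwareVP)
        usage |= D3DUSAGE_SOFTWAREPROCESSING;
    return usage;
}
```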

Chris


[edited by - Supernat02 on January 24, 2004 11:25:34 PM]

[edited by - Supernat02 on January 25, 2004 1:24:32 AM]
