D3DCREATE_HARDWARE_VERTEXPROCESSING

Can anyone shed some light on this flag for me? The SDK documentation just says that specifying it will increase performance, but it doesn't say at what cost, how to use it effectively, or even how to detect whether the video card supports it. The biggest questions I have are:

1) In what areas should it increase the performance of my app when it's supported by the hardware, and how does it actually increase performance?

2) How can I tell from the device caps whether this flag is supported in hardware?

3) How will this affect my code? Will I have to do more coding later to actually make use of hardware vertex processing, or is it just one of those 'flip this switch on and DX handles the rest' kind of things?

Thanks in advance-
~Vendayan
"Never have a battle of wits with an unarmed man. He will surely attempt to disarm you as well"~Vendayan
This flag basically requests that your vertices be transformed and lit by the GPU instead of the CPU. It should increase the performance of pretty much everything related to D3D.

From what I understand from the SDK samples, the D3DDEVCAPS_HWTRANSFORMANDLIGHT cap specifies whether hardware T&L is available. Use IDirect3D8::GetDeviceCaps to retrieve the caps.
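Something like this minimal sketch (untested, and assuming you already have an initialized IDirect3D8* called pD3D):

D3DCAPS8 caps;
if (SUCCEEDED( pD3D->GetDeviceCaps( D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, &caps ) ))
{
    if (caps.DevCaps & D3DDEVCAPS_HWTRANSFORMANDLIGHT)
    {
        // hardware T&L is available - the device can be created with
        // D3DCREATE_HARDWARE_VERTEXPROCESSING
    }
    else
    {
        // no hardware T&L - fall back to D3DCREATE_SOFTWARE_VERTEXPROCESSING
    }
}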

You don't need any code to use hardware T&L if it's available.
---visit #directxdev on afternet <- not just for directx, despite the name
Yes you do!

You need to write proper VertexBuffer code! Of course that's how most people do it anyway, but you don't HAVE to.

You also have to care about which flags you give the vertex buffers when you create them, and which flags you use when updating dynamic data (see the sketch below).
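For example, a rough sketch of what that means in practice (assuming an initialized IDirect3DDevice8* pDevice; MyVertex is a made-up struct matching the FVF):

struct MyVertex { float x, y, z; DWORD colour; };   // D3DFVF_XYZ | D3DFVF_DIFFUSE

// static buffer: filled once, never touched again
IDirect3DVertexBuffer8* pStaticVB = NULL;
pDevice->CreateVertexBuffer( 1000 * sizeof(MyVertex),
                             D3DUSAGE_WRITEONLY,
                             D3DFVF_XYZ | D3DFVF_DIFFUSE,
                             D3DPOOL_MANAGED, &pStaticVB );

// dynamic buffer: updated every frame, so tell the driver up front
IDirect3DVertexBuffer8* pDynamicVB = NULL;
pDevice->CreateVertexBuffer( 1000 * sizeof(MyVertex),
                             D3DUSAGE_DYNAMIC | D3DUSAGE_WRITEONLY,
                             D3DFVF_XYZ | D3DFVF_DIFFUSE,
                             D3DPOOL_DEFAULT, &pDynamicVB );

// when updating, lock with D3DLOCK_DISCARD so the driver can hand back
// fresh memory instead of stalling until the GPU is done with the old data
BYTE* pData = NULL;
if (SUCCEEDED( pDynamicVB->Lock( 0, 0, &pData, D3DLOCK_DISCARD ) ))
{
    // ... write vertices into pData ...
    pDynamicVB->Unlock();
}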

Hmm, also: the D3DCREATE_HARDWARE_VERTEXPROCESSING flag doesn't always work, while D3DCREATE_MIXED_VERTEXPROCESSING does.


On my GeForce2 GTS with the latest released nVidia Detonator drivers I only get mixed supported in windowed mode and hardware only in fullscreen - but then again the nVidia driver people must have a lot of cheap booze :D

Anyway, there's lots of stuff to think about. Since there aren't many card manufacturers around that support hardware T&L nowadays, you should really check their pages (ati.com, nvidia.com) etc.

+ Don't forget: never use more than 8 lights, or you will drop down to software processing on nVidia cards! (See the sketch below.)
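Rather than hard-coding 8, you can read the limit from the caps - a sketch, assuming caps is a D3DCAPS8 filled in by GetDeviceCaps, and sceneLights/dwNumSceneLights are made-up names for your own scene data:

// MaxActiveLights says how many fixed function lights the hardware can
// handle at once; enable no more than that to stay on the hardware path
DWORD dwMax = caps.MaxActiveLights;
for (DWORD i = 0; i < dwNumSceneLights && i < dwMax; ++i)
{
    pDevice->SetLight( i, &sceneLights[i] );   // sceneLights is a D3DLIGHT8 array
    pDevice->LightEnable( i, TRUE );
}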
--mega!--
So can anyone show me a couple of short code samples, showing how to detect the capabilities and how to do the vertex buffer stuff?

~Vendayan
"Never have a battle of wits with an unarmed man. He will surely attempt to disarm you as well"~Vendayan
quote:Original post by stefandxm
Yes you do!

You need to write proper VertexBuffer code!

What exactly do you mean by that?

To see a detection sample, open d3dapp.cpp in the SDK samples and search for D3DDEVCAPS_HWTRANSFORMANDLIGHT. The lines around that are what you need.
---visit #directxdev on afternet <- not just for directx, despite the name
1)
a. As mentioned above, HARDWARE_VERTEX_PROCESSING shifts the vertex processing work (transformation & lighting) from the CPU onto the GPU **if the hardware supports it**.

b. A first generation hardware T&L card like a GeForce256/GeForce2 *ONLY* accelerates the fixed function vertex processing when HARDWARE_VERTEX_PROCESSING is enabled. Any attempt to use shaders on that type of card with that type of processing *WON'T* work.

c. Current generation hardware T&L cards/chips support *both* fixed function and shader vertex processing with HARDWARE_VERTEX_PROCESSING.

d. Non-T&L cards such as the Kyro series and older cards won't do any vertex processing in hardware. For those you *MUST* use SOFTWARE_VERTEX_PROCESSING.

e. For first generation cards where you need to use vertex shaders, you should either use SOFTWARE_VERTEX_PROCESSING only, or use MIXED_VERTEX_PROCESSING and switch the processing mode when necessary using the D3DRS_SOFTWAREVERTEXPROCESSING renderstate (see the sketch after this list).
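A minimal sketch of that switching (assuming a device created with D3DCREATE_MIXED_VERTEXPROCESSING and an initialized IDirect3DDevice8* pDevice; dwShaderHandle would come from an earlier CreateVertexShader call):

// fixed function geometry: let the hardware T&L unit do the work
pDevice->SetRenderState( D3DRS_SOFTWAREVERTEXPROCESSING, FALSE );
pDevice->SetVertexShader( D3DFVF_XYZ | D3DFVF_DIFFUSE );
// ... DrawPrimitive calls for fixed function geometry ...

// shader geometry: a 1st gen T&L chip can't run this in hardware,
// so switch to software vertex processing first
pDevice->SetRenderState( D3DRS_SOFTWAREVERTEXPROCESSING, TRUE );
pDevice->SetVertexShader( dwShaderHandle );
// ... DrawPrimitive calls for shader geometry ...

Note that buffers drawn while software processing is active need to have been created with D3DUSAGE_SOFTWAREPROCESSING - that's what the dwShaderVPUsage value in the snippet under point 2 takes care of.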


2) Some code from our engine - I've stripped it down and removed a lot of stuff to simplify it:
typedef struct
{
    DWORD dwCreationFlags;
    DWORD dwFixedVPUsage;
    DWORD dwShaderVPUsage;
} ENGINEDEVICEINFO;

ENGINEDEVICEINFO CheckDeviceCaps( UINT uiAdapter )
{
    ENGINEDEVICEINFO info;
    D3DCAPS8 caps;

    if (FAILED(m_pD3D->GetDeviceCaps( uiAdapter, D3DDEVTYPE_HAL, &caps )))
    {
        // error handling
    }

    // work out how much vertex processing can be done in hardware
    if (caps.DevCaps & D3DDEVCAPS_HWTRANSFORMANDLIGHT)
    {
        // if we get to here, the chip can do some form of T&L

        // test to see which version vertex shader is supported in hardware
        // you'd set whatever version you need below - our app needs 1.1.
        if ( caps.VertexShaderVersion < D3DVS_VERSION(1,1) )
        {
            // if we get here it's a 1st gen T&L chip which can't do programmable shaders
            info.dwCreationFlags = D3DCREATE_MIXED_VERTEXPROCESSING;
            info.dwFixedVPUsage  = 0;    // [0==hardware]
            info.dwShaderVPUsage = D3DUSAGE_SOFTWAREPROCESSING;
        }
        else
        {
            // if we get here, chip is shader capable AND T&L capable
            info.dwCreationFlags = D3DCREATE_HARDWARE_VERTEXPROCESSING;
            info.dwFixedVPUsage  = 0;    // [0==hardware]
            info.dwShaderVPUsage = 0;    // [0==hardware]

            if (caps.DevCaps & D3DDEVCAPS_PUREDEVICE)
            {
                // in the engine this snippet is from, we don't need to read any
                // state back from the device so we set the pure device flag to
                // get a *potential* performance increase on some cards/drivers
                info.dwCreationFlags |= D3DCREATE_PUREDEVICE;
            }
        }
    }
    else
    {
        // if we get here, the card is either a new one without T&L (e.g. Kyro)
        // or an older pre-T&L card. So all processing is in software.
        info.dwCreationFlags = D3DCREATE_SOFTWARE_VERTEXPROCESSING;
        info.dwFixedVPUsage  = D3DUSAGE_SOFTWAREPROCESSING;
        info.dwShaderVPUsage = D3DUSAGE_SOFTWAREPROCESSING;
    }

    ...

    return info;
}



3) The above code snippet sets up 3 DWORDs in an info structure (the real code sets many others which let us decide things like which screen mode to start in etc). Those DWORDs are used as follows - a combined usage sketch follows the list:

a. dwCreationFlags are what gets passed as the BehaviourFlags parameter to IDirect3D8::CreateDevice

b. dwFixedVPUsage is what gets passed as the Usage parameter for any IDirect3DDevice8::CreateVertexBuffer or IDirect3DDevice8::CreateIndexBuffer calls which are to be used with FIXED FUNCTION (FVF) vertex processing.

c. dwShaderVPUsage is what gets passed as the Usage parameter for IDirect3DDevice8::CreateVertexBuffer or IDirect3DDevice8::CreateIndexBuffer calls which are to be used with VERTEX SHADER/programmable vertex processing.
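Putting the three together, a hedged usage sketch (pD3D, hWnd, d3dpp and uiSizeInBytes are assumed to exist already; the names here are illustrative, not from the engine):

ENGINEDEVICEINFO info = CheckDeviceCaps( D3DADAPTER_DEFAULT );

// a. creation flags become the BehaviourFlags parameter of CreateDevice
IDirect3DDevice8* pDevice = NULL;
pD3D->CreateDevice( D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hWnd,
                    info.dwCreationFlags, &d3dpp, &pDevice );

// b. fixed function buffers get dwFixedVPUsage OR'd into their Usage
IDirect3DVertexBuffer8* pFixedVB = NULL;
pDevice->CreateVertexBuffer( uiSizeInBytes,
                             D3DUSAGE_WRITEONLY | info.dwFixedVPUsage,
                             D3DFVF_XYZ | D3DFVF_DIFFUSE,
                             D3DPOOL_MANAGED, &pFixedVB );

// c. buffers used with vertex shaders get dwShaderVPUsage instead
IDirect3DVertexBuffer8* pShaderVB = NULL;
pDevice->CreateVertexBuffer( uiSizeInBytes,
                             D3DUSAGE_WRITEONLY | info.dwShaderVPUsage,
                             0,   // no FVF - the layout comes from the shader declaration
                             D3DPOOL_MANAGED, &pShaderVB );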


There are other subtle things to think about with T&L versus software (buffer sizes, buffer locking flags, etc.), but the above is the gist of how to use it properly.

--
Simon O'Connor
Creative Asylum Ltd
www.creative-asylum.com


Hey that's some pretty good stuff S1CA. It cleared up a bunch of issues I was having, thanks a lot.

I have a question in relation to this thread. Is it always better to use hardware T&L when it is available? I have a GeForce 2 MX 200 and I'm thinking that my 1.5GHz processor may be able to process vertices & lights at a faster rate than the GPU, even with sound and AI included.

So what's everyone's opinion on the viability of HW T&L in every situation?

------------------------------------------------------------

Stupid divide by zero error. I should be allowed to tear a hole in the fabric of time and space if I want!
If you can do hardware T&L, do it. The CPU may be able to do it faster than the GPU, but firstly it's a hassle to determine which is faster, and secondly if it's done on the GPU, it can be done in parallel.

2p
Steve
Member of the Unban Mindwipe Society (UMWS)
As Steve said, if there's a T&L processor there, then you may as well use it - it'd be a waste of the user's money if they bought a fancy gfx card which never got used.

However, if the CPU is sat idle most of the time (usually it isn't!) - then that's a waste too. Depending on your app and depending on how much time you're willing to spend on scalability, it may be worth trying to achieve some sort of balance between the CPU and GPU.

Most apps aren't taxing modern GPUs in terms of vertex processing at all, according to the chip manufacturers. There's always more you can be shoving to the GPU - the shoving and the 'more' take CPU time. Some ideas for a machine which has some spare CPU time where you also want to maximise GPU use:

1. Procedural modeling and texturing - dynamically generate more polygons and more texture detail for super-rich scenes. Spread the work over a few frames if necessary.

2. Do skinning into world/object space on the CPU at a low frequency in such a way that its output is two morph-target frames of animation which can be transformed and lit by the GPU.

3. Mix lighting between hardware and software - using multiple vertex streams, have one static stream for what gets hardware lit and a dynamic stream of vertex colours from CPU calculations (rough sketch after this list).
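For idea 3, a rough sketch of the two-stream setup using DX8 declaration tokens (pStaticVB, pColourVB and dwTwoStreamShader are assumed to exist; dwTwoStreamShader would be created from dwDecl via CreateVertexShader):

// declaration binding stream 0 (static position + normal) and
// stream 1 (dynamic CPU-computed diffuse colour)
DWORD dwDecl[] =
{
    D3DVSD_STREAM( 0 ),
    D3DVSD_REG( D3DVSDE_POSITION, D3DVSDT_FLOAT3 ),
    D3DVSD_REG( D3DVSDE_NORMAL,   D3DVSDT_FLOAT3 ),
    D3DVSD_STREAM( 1 ),
    D3DVSD_REG( D3DVSDE_DIFFUSE,  D3DVSDT_D3DCOLOR ),
    D3DVSD_END()
};

// bind the two buffers to their streams each frame
pDevice->SetStreamSource( 0, pStaticVB, sizeof(float) * 6 );   // xyz + normal
pDevice->SetStreamSource( 1, pColourVB, sizeof(DWORD) );       // D3DCOLOR
pDevice->SetVertexShader( dwTwoStreamShader );   // created earlier from dwDecl
// ... draw calls ...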


Doing more interesting stuff with sound and AI would be appealing too... I'd imagine some softsynth stuff and DSP effects would burn loads of CPU cycles for you.

--
Simon O'Connor
Creative Asylum Ltd
www.creative-asylum.com


