Archived

This topic is now archived and is closed to further replies.

Facehat

Hardware T&L in Direct3D

Recommended Posts

Josh Neta    122
You have to use vertex buffers to take advantage of hardware T&L. What you have to do is enumerate a hardware T&L HAL device (much like you would a normal HAL device). Once you have that, all DrawPrimitive/DrawIndexedPrimitive calls that use vertex buffers should be done by the hardware. I'm not sure of the GeForce's abilities (send me one and I'll tell you), but from what I've seen and heard it will give you a significant boost if you use it properly.
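A rough pseudocode sketch of that flow under DirectX 7, written from memory and untested, so treat the exact signatures and flags as assumptions:

```
// Sketch only: setup and error handling omitted.
LPDIRECT3D7             d3d;     // from IDirectDraw7::QueryInterface
LPDIRECT3DDEVICE7       device;
LPDIRECT3DVERTEXBUFFER7 vb;

// Ask for the hardware T&L HAL device; fall back to the plain HAL.
if (FAILED(d3d->CreateDevice(IID_IDirect3DTnLHalDevice, backBuffer, &device)))
    d3d->CreateDevice(IID_IDirect3DHALDevice, backBuffer, &device);

// Create the vertex buffer through IDirect3D7 with *untransformed*
// vertices (D3DFVF_XYZ...), so the card does the transform and lighting.
D3DVERTEXBUFFERDESC desc;
ZeroMemory(&desc, sizeof(desc));
desc.dwSize        = sizeof(desc);
desc.dwFVF         = D3DFVF_XYZ | D3DFVF_NORMAL | D3DFVF_TEX1;
desc.dwNumVertices = numVerts;
d3d->CreateVertexBuffer(&desc, &vb, 0);

// Lock(), fill, Unlock(), then draw straight from the buffer:
device->DrawIndexedPrimitiveVB(D3DPT_TRIANGLELIST, vb,
                               0, numVerts, indices, numIndices, 0);
```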

Josh

JD    208
You can do HW T&L using the DrawIndexedPrimitive call, but the vertices go into system memory and are then copied to the video card's memory; with vertex buffers, the data is sent across the AGP bus straight to the card's memory by way of Direct Memory Access. NVIDIA recommends that you batch, or organize, the primitives to be drawn by texture, to minimize texture state changes, since a texture change is a long operation. Also, a vertex buffer can hold a maximum of 2 to the 16th power of vertices, i.e. 65536, since an index is a 16-bit value. Your vertex buffers should hold around 50 to 100 vertices for optimum HW T&L. Also, use indexed lists and not strips or fans, since the GeForce 256 doesn't process them optimally in HW. All this info can be found in the DirectX FAQ list. Hope this helps
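To illustrate the batching advice, here's a small self-contained C++ sketch (hypothetical names, no actual D3D calls) that sorts draw batches by texture so each texture is bound only once per frame, and uses 16-bit indices to respect the 65536-vertex limit:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical batching sketch: group draw calls by texture id so the
// expensive SetTexture state change happens once per texture, not per mesh.
struct Batch {
    int                   textureId; // state that is expensive to change
    std::vector<uint16_t> indices;   // 16-bit indices: at most 65536 vertices
};

int CountTextureChanges(std::vector<Batch>& batches) {
    // Sort so batches sharing a texture are adjacent.
    std::sort(batches.begin(), batches.end(),
              [](const Batch& a, const Batch& b) { return a.textureId < b.textureId; });
    int changes = 0, current = -1;
    for (const Batch& b : batches) {
        if (b.textureId != current) { ++changes; current = b.textureId; }
        // device->SetTexture(0, textures[b.textureId]);  // state change here
        // device->DrawIndexedPrimitiveVB(...);           // then draw the batch
    }
    return changes;
}
```

With four batches using two textures in alternating order, sorting first cuts four texture changes down to two.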

Jerry D.

[This message has been edited by JD (edited December 04, 1999).]

Guest Anonymous Poster
The main disadvantage of the GeForce's design is that querying T&L'd vertex data is very costly, since it resides in the card-side cache, and we all know that reading from memory on the card back to system memory is slow. So operations that need to retrieve the vertex-buffer data will take a significant speed hit.
Another thing is that, while triangle throughput has increased, the actual pixel fill rate is still relatively low. This explains why games like Quake 3 don't show a significant speed increase when using T&L.

-----
Willem H. de Boer
Programmer, Davilex Game Dev. Department
www.davilex.com

mhkrause    122
Reading the post-TnL data from the GeForce isn't just costly; it's impossible.

There is no way to read the transformed data back from the GeForce. (This is why the ProcessVertices() method in Direct3D is always done on the CPU.)

The reason for this is that the TnL is pipelined with the rendering. Attempting to read data back from an intermediate stage would cause a pipeline stall.

If you need the post-TnL data, a decent solution is to use a low-detail model and transform it on the CPU, while sending a high-detail model to the card for rendering.
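A minimal C++ sketch of that split (hypothetical names; the D3D side is omitted): the cheap low-detail "proxy" mesh is transformed on the CPU for whatever needs the transformed positions, and nothing is ever read back from the card.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical proxy-mesh sketch: transform a low-detail copy on the CPU
// while the full-detail mesh is sent to the card for hardware T&L.
struct Vec3 { float x, y, z; };
using Mat4 = std::array<float, 16>; // row-major 4x4; w assumed to be 1

Vec3 Transform(const Mat4& m, const Vec3& v) {
    return {
        m[0] * v.x + m[1] * v.y + m[2]  * v.z + m[3],
        m[4] * v.x + m[5] * v.y + m[6]  * v.z + m[7],
        m[8] * v.x + m[9] * v.y + m[10] * v.z + m[11],
    };
}

// Transform only the cheap proxy on the CPU; never read back from the card.
std::vector<Vec3> TransformProxy(const Mat4& world, const std::vector<Vec3>& proxy) {
    std::vector<Vec3> out;
    out.reserve(proxy.size());
    for (const Vec3& v : proxy) out.push_back(Transform(world, v));
    return out;
}
```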

Another reason why Quake doesn't perform that much better with a GeForce than with a TNT2 is that the CPU was already spending very little time on anything besides rendering; with T&L offloaded to the card, it simply spends a lot of cycles idle.

Facehat    696
What limitations does hardware T&L currently have in Direct3D? For example, do you have to use vertex buffers to take advantage of hardware T&L? Will Direct3D automatically use hardware transform and lighting for everything it can?

Is hardware Transform and Lighting support in the GeForce good enough that it could actually boost framerates quite a bit if taken advantage of?

Any comments would be appreciated.

--TheGoop
