Archived

This topic is now archived and is closed to further replies.

simon_brown75

Achievable polygon counts?

Recommended Posts

Hi, I'm writing a D3D engine for a simple first-person shooter game using DX8, and I have a question about what sort of polygon counts should be possible. My system is a GeForce 2 GTS / P3-650. I'm using hardware T&L with my vertex buffer in video memory, and I'm sorting my polygons by texture (at run time, not per frame) and sending them in batches. I'm also using 256*256 textures, which is pretty reasonable, and I'm using mip-mapping. I mention all this because, as far as I can see, there doesn't seem to be anything else beyond this to speed up rendering in D3D.

With a polygon count of 40,000 I get roughly 30 fps at 1024*768 in 16-bit colour, yet when I hear professional 3D engine programmers talking about their engines they mention having a million polygons *per scene* with D3D. Do they mean a million polygons before their hidden-surface-removal code runs? Even if they do, they are still getting many times the performance from D3D and 3D hardware that I seem to get, yet I'm using the same compiler (VC++ 6 Pro), the same DX libraries, and coding the way MS recommends in the docs. Is it possible to send, say, half a million polygons to the video card and still get decent frame rates?

Thanks, Simon.

1.) There will always be someone who pushes more polys than you ;-)

I suggest you go to www.crytek.com and download the X-Isle demo. It features complex scenes and a polygon counter, which should give you a good idea of what polygon rates you can get while having multiple textures and various special effects active. The demo is really worth the download.

Also, on nVidia.com there is the OpenGL performance FAQ plus various documents about the NV_fence and NV_vertex_array_range extensions. wglAllocateMemoryNV and CVAs are also documented. Check those papers out to maximize your vertex array performance. If I remember right, their demo pushes 5.7 M triangles/s with CEM enabled.

My terrain engine does a bit above 500,000 triangles/s while displaying complex scenes on a GeForce1 DDR.

Tim

--------------------------
glvelocity.gamedev.net
www.gamedev.net/hosted/glvelocity

Edited by - tcs on January 25, 2001 8:53:47 PM

Oh, sorry ;-) I just realized that you use D3D, not OpenGL. While many people would say I'm talking crap, OpenGL IS faster on nVidia cards because there are a few extensions that have no D3D8 counterpart. But you might still check nVidia.com, because they also have a few papers on D3D vertex buffers.

btw: The polycounts in the X-Isle demo refer to the rendered ones...

Tim


Thanks for the replies so far guys, and keep them coming if anyone has anything to add.

Nebob/
40,000 * 30 fps = 1.2 million tris per second, not per scene. I've definitely heard from reliable sources of D3D7 engines which can process >8 million tris/sec. Such an engine could run at 60 fps with 133,333 polys, or at 30 fps with 266k polys, over 6.5 times faster than my code.

tcs/
Thanks for the links; X-Isle looks awesome from the shots. It'll take a while to download, but it looks worth it. Also, I'll see what advice nVidia offers.

1.2 million? Are you using a lot of textures? I can achieve up to 10 million in 32-bit colour with a GeForce2 MX and a C333 MHz CPU.
You can test how much time the textures take by turning them off.

Simon and 24hCoder:
Can't you guys upload those demos to some webspace so that we mortals can download and try them out?

24hCoder/
No, I'm only using a couple of 256*256 textures for the whole map at the moment (no multi-texturing). I've checked the memory usage, and it should be only about 8 MB with the VB (2.4 MB), all surfaces (5 MB) and textures, and the card has 32 MB, so there shouldn't be a problem there.

I'm not doing hidden surface removal, though. Do you mean you're actually getting 10 M tris/sec through the video card, or is that how many your app is handling, with the HSR code meaning maybe only 20% reach the video card?

Jesava/
Which demos do you mean, X-Isle or my prog?

The polygon throughput goes up from 1.1 M tris/sec to 1.6 M when texture mapping is turned off, which is about what I'd expect, so it doesn't look like that's the problem.

If anyone wants to have a look at my game so far you can download it at-

http://www.sbdev.pwp.blueyonder.co.uk/D3DMazeNew.zip

(cut and paste the address). The download is only 418k. The keys are in the readme.txt in the help folder.

The polygon count here is only about 4,000 as I'm redoing the map, hence the better frame rates (see original post). If it doesn't work, try disabling Environmental Mapping.

Any feedback, bug reports, frame rate info or compatibility (or otherwise) reports would be highly appreciated.

Edited by - simon_brown75 on January 26, 2001 11:55:44 PM

Looking good. ~60 fps @ 800x600x32 on my Celeron 466, 128 MB RAM, Rage Fury Pro 32 MB video card.

Never cross the thin line between bravery and stupidity.

You can actually get to and over 10 M tris/sec. But you need to draw a bit more than 4k tris, and optimize them a bit so that the vertex cache comes in.

If I switch off D3D lighting (or remove all lights except ambient) I can get up to 2.6 M tris/sec. Also, if I set the AGP aperture size to 32 MB (the same as VRAM) instead of 64, 128 or 256, this goes up to 3.1 M tris/sec, but that is still extremely low, particularly as I can only reach this number at 640*480.

With a polygon count of 90,000 the best frame rate my engine will do is 35 fps.

I've tried everything: different-sized vertex buffers, sending different-sized batches of primitives. I've downloaded all the documentation from nVidia's developer resource centre and tried everything they suggest. I've also tried indexed primitives using index buffers, but again this made no difference.

My PC is definitely running OK; 3DMark2K score: 5491.

Any ideas?

Edited by - simon_brown75 on January 29, 2001 12:45:59 AM

Do you have Win2k?
If so, try installing Service Pack 1, DirectX 8.0 and the Detonator 6.31 drivers (if you want really high poly counts in DX 8.0 then you can try the 7.xx drivers, but they are a bit unstable). And even if you use DX 7.0 on Win2k, still install DirectX 8.0.

Sentinel/
No, I'm running Win98 (rel 1) / DirectX 8 / Detonator 3 (6.34) drivers at the moment. I'll send the source code to your e-mail address, thanks.

Can triangle throughputs of 10 M tris/sec be achieved in *real* programs? nVidia have programs on their developer page that can indeed manage 14 M tris/sec, but they're not exactly comparable to a real game engine.

Interestingly, the Boxster and Firetruck demos, also from nVidia and using h/w T&L, run at around 3-4 M tris/sec (100,000 polys at ~30 fps), which isn't far off what I get.

Edited by - simon_brown75 on January 30, 2001 9:18:13 PM

Nobody (I think) seems to have noticed the terminology in the first post: "scene". When a professional mentions a scene they are normally referring to a "scene graph", which is a hierarchy used to store the objects in your world. Having 10 million polygons in a scene is not uncommon, and is generally limited only by the memory you want to allocate to your triangle structures etc. A 10 million polygon scene does not imply they are all viewable at once. Most professionals seem to use a yon clipping plane hidden by fog to cull the depth of the frustum; others use the more complex "curve of the earth" to remove distant objects from view. (In an indoor engine this normally comes down to level design.)

Of course, occlusion culling is something many professionals use to control the fill rate of a given view of the world (which is a frequent choke point). The problem with occlusion culling (besides the fact that it's tricky to implement) is that you need to balance the time spent by the CPU culling against the time spent rendering triangles.

The current trend (generically speaking) seems to be away from occlusion culling, as GPU speeds are growing faster than CPU speeds. This means the balance between culling and rendering is tipping towards rendering meshes you can't see, simply because that's faster than working out whether you can see them.

How do I make a scene graph with 10 million polygons? A good start is Harmless's review on Flipcode of what's out there. There are a few other good systems he doesn't mention, like Zhang's thesis, which uses something akin to an alpha quadtree to measure visibility.

Just remember what the pros say: you'll get a bigger boost in speed by choosing the correct algorithm than by optimising the hell out of a bad algorithm.

How are you currently culling your scene, frustum culling? Perhaps from an octree? If you're not, investigate that method; it'll provide a good base to move on from. Have a look at the Octant project, or perhaps tcs's terrain demo, for frustum/octree use.

HTH

Chris

Actually, as I mentioned, I'm sending every single triangle to the video card at the moment. I should have stated that in the original post. That's really what I was trying to find out: whether the *engine* was handling 10 M tris/sec or the *video card*.

Looks like it's time to learn about BSPs. Harmless Algorithms looks like a good place to start, thanks for the link.

So if we're talking about the triangle throughput the actual video card can cope with (as opposed to the engine), then 50k-100k tris at 60 fps is about the best that can be hoped for?
