Blips

Rendering 65,000 polygons


I'm working on an engine in DirectX. I have a terrain setup consisting of approximately 65,000 polygons that is causing a bit of a performance problem. Is there any way to speed up the rendering? Each polygon is being rendered with the draw index primitive method.

Are you drawing all 65,000 polys with one draw call? Because if you are, then it shouldn't give you any performance problems unless you have a really old 3D card.

I would assume that you have a batching problem, i.e. you have too many state changes or draw calls in your rendering process. If that's not the case, try only rendering the parts of the terrain that are inside the view frustum using something like a quadtree to create a hierarchy of terrain tiles.
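As a rough illustration of the quadtree idea, here is a minimal sketch that works in the 2D ground plane only. The names `HalfPlane`, `Rect`, `mayBeVisible` and `collectVisible` are made up for this example, and the view volume is simplified to a set of half-planes (a real frustum test would use six 3D planes and the tiles' height bounds).

```cpp
#include <cassert>
#include <vector>

// Half-plane in the XZ ground plane: points with a*x + b*z + c >= 0 are inside.
// A top-down view of the frustum can be approximated by a few of these (assumption).
struct HalfPlane { float a, b, c; };

// Axis-aligned square covered by one quadtree node.
struct Rect { float minX, minZ, maxX, maxZ; };

// Conservative test: returns false only when the rect lies entirely
// outside at least one half-plane.
bool mayBeVisible(const Rect& r, const std::vector<HalfPlane>& view) {
    for (const HalfPlane& p : view) {
        // Pick the corner that lies farthest along the plane normal.
        const float x = (p.a >= 0.0f) ? r.maxX : r.minX;
        const float z = (p.b >= 0.0f) ? r.maxZ : r.minZ;
        if (p.a * x + p.b * z + p.c < 0.0f) return false;
    }
    return true;
}

// Recursively collects the leaf tiles (edge length <= tileSize) that may be visible.
// Rejecting a node skips its whole subtree, which is where the savings come from.
void collectVisible(const Rect& node, float tileSize,
                    const std::vector<HalfPlane>& view, std::vector<Rect>& out) {
    assert(tileSize > 0.0f);
    if (!mayBeVisible(node, view)) return;
    if (node.maxX - node.minX <= tileSize) { out.push_back(node); return; }
    const float midX = 0.5f * (node.minX + node.maxX);
    const float midZ = 0.5f * (node.minZ + node.maxZ);
    collectVisible(Rect{node.minX, node.minZ, midX, midZ}, tileSize, view, out);
    collectVisible(Rect{midX, node.minZ, node.maxX, midZ}, tileSize, view, out);
    collectVisible(Rect{node.minX, midZ, midX, node.maxZ}, tileSize, view, out);
    collectVisible(Rect{midX, midZ, node.maxX, node.maxZ}, tileSize, view, out);
}
```

Each tile returned in `out` would then be drawn with its own draw call, so the tile size should stay large enough that the number of calls per frame remains small.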

You can definitely use a terrain Level of Detail (LOD) technique or two. Rendering the whole thing at once is not a very good idea when it comes to performance.
There are many algorithms; the most important are the ones targeting the vertex/polygon count. One possible approach is Continuous Level of Detail (CLOD), which tessellates the mesh based on distance from the camera, i.e. the terrain data that you keep in memory to draw is dynamic, like a particle system.
There is also a technique called geo-mipmapping, which renders terrain blocks that are distant from the camera with less detailed meshes, much like mipmapping does for textures.
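To make the distance-based selection concrete, here is a small sketch of how a per-block LOD level might be chosen. The doubling rule, the function names and the parameters are assumptions for illustration only; published geo-mipmapping implementations typically use a screen-space error metric instead.

```cpp
#include <cassert>

// Hypothetical rule: full detail (level 0) closer than 2 * baseDistance,
// then one coarser level every time the distance doubles, clamped to maxLevel.
int lodLevel(float distance, float baseDistance, int maxLevel) {
    assert(baseDistance > 0.0f && maxLevel >= 0);
    int level = 0;
    float threshold = baseDistance * 2.0f;
    while (level < maxLevel && distance >= threshold) {
        ++level;
        threshold *= 2.0f;
    }
    return level;
}

// Triangle count of a square block with quadsPerSide quads per edge at level 0,
// where every level skips every other vertex of the previous one.
int trianglesAtLevel(int quadsPerSide, int level) {
    assert(level >= 0 && (quadsPerSide >> level) >= 1);
    const int side = quadsPerSide >> level;
    return 2 * side * side;
}
```

With 64 quads per side, a block drops from 8192 triangles at level 0 to 512 at level 2. Each level is a precomputed index buffer over the same vertices, so nothing has to be streamed to the GPU per frame.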

I looked around a little and found these two articles on terrain LOD:
- Continuous LOD Terrain Meshing Using Adaptive Quadtrees
- Real-Time Dynamic Level of Detail Terrain Rendering with ROAM

The "Terrain Geomorphing in the Vertex Shader" article on GameDev.net also has useful information.

Well, I hope I helped :)

Sure, terrain LOD makes a lot of sense. But 65,000 polys shouldn't give you any problems even when rendered with brute force. Crysis has an average of 1.2 million polys visible at any time and it runs okay-ish...

Quote:
Each polygon is being rendered with the draw index primitive method.


You should really clarify what you mean by this. Are you making 65,000 draw calls (very bad), or are you rendering all the polygons at once with a single draw call (using vertex and index buffers)? The second approach should be very fast.

I don't really know anything about 3D hardware but 65000 is suspiciously close to 2^16. Any chance that switching from a 16 to a 32-bit index buffer could be a performance issue?

Quote:
Original post by Blips
Each polygon is being rendered with the draw index primitive method.
You don't really mean that you render each poly separately, do you?
Quote:
Original post by Harry Hunt
Are you drawing all 65,000 polys with one draw call? Because if you are, then it shouldn't give you any performance problems unless you have a really old 3D card.
I agree 100%. In this case the notion of "very old card" refers to pre-GeForce4 cards or maybe even pre-GeForce.
This obviously assumes the vertex stage is actually the bottleneck, which might not be the case given how little information we have.

Please leave ROAM aside and, in general, avoid anything that is CLOD. Those algorithms were designed for a kind of hardware that hasn't existed for a long time. Streaming vertex data to GPU buffers nowadays, although not so bad for small streams, is generally far more expensive than throwing an extra 3000 triangles at the card.

EDIT: Hmm, beaten to it on exactly the same points.
I suppose what he really means is 64*1024.

Well, I suggested those techniques because Blips stated that he is working on a DirectX engine. I doubt that he wants his engine to be limited to drawing only one scene setting involving a terrain.
Drawing a one million polygon terrain without any LOD is... well, my information may be outdated when it comes to terrain rendering, but I can't imagine how it's a better idea.

Thanks for all the responses. I believe the terrain is being rendered in a loop that draws each primitive individually with the draw indexed primitive method. I'm pretty new to DirectX, so I'm not too aware of any alternatives.

Quote:
Original post by Blips
Thanks for all the responses. I believe the terrain is being rendered in a loop that draws each primitive individually with the draw indexed primitive method. I'm pretty new to DirectX, so I'm not too aware of any alternatives.
Yeah, that'll be extremely slow. You should never have more than 500 - 1000 DrawPrimitive calls per frame; any more than that and you're spending most of your time doing nothing but submitting batches to the driver.

The best way to do it would be to create a dynamic vertex buffer, lock it at the start of your terrain rendering, and then add in triangles as you "render". Then unlock the vertex buffer and render all triangles in one go. That should be substantially faster.
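The difference between the two approaches can be sketched without any graphics API. In the example below, `MockDevice` is a stand-in that only counts submissions, and the `std::vector` plays the role that a locked dynamic vertex buffer would play; all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };
struct Triangle { Vertex a, b, c; };

// Stand-in for the graphics device: it only counts what is submitted.
struct MockDevice {
    std::size_t drawCalls = 0;
    std::size_t trianglesDrawn = 0;
    void draw(const Vertex* vertices, std::size_t triangleCount) {
        (void)vertices;  // a real device would read the vertex data here
        ++drawCalls;
        trianglesDrawn += triangleCount;
    }
};

// Slow path: one submission per triangle, as in the original loop.
void renderPerTriangle(MockDevice& dev, const std::vector<Triangle>& tris) {
    for (const Triangle& t : tris) dev.draw(&t.a, 1);
}

// Batched path: append every triangle to one contiguous buffer
// ("lock, fill, unlock"), then submit the whole buffer once.
void renderBatched(MockDevice& dev, const std::vector<Triangle>& tris) {
    std::vector<Vertex> batch;
    batch.reserve(tris.size() * 3);
    for (const Triangle& t : tris) {
        batch.push_back(t.a);
        batch.push_back(t.b);
        batch.push_back(t.c);
    }
    if (!batch.empty()) dev.draw(batch.data(), batch.size() / 3);
}
```

For 65,000 triangles the first function issues 65,000 submissions and the second issues one. In Direct3D 9, for example, the batch would live in a vertex buffer created with `D3DUSAGE_DYNAMIC` and filled between `Lock` and `Unlock`. If the terrain never changes, a static buffer filled once at load time avoids even the per-frame copy.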

Quote:
Original post by SAL1
rendering the whole thing at once is not a very good idea when it comes to performance.
Actually, with only 65k tris it is a good idea, even if he's running one of the prehistoric cards like a GF2. He can still get a very smooth framerate with a scene consisting of 65k tris.

Quote:
Original post by implicit
Any chance that switching from a 16 to a 32-bit index buffer could be a performance issue?
Close to none. Not even on the ancient cards that were the first to support 32-bit indices. I made the comparison a long time ago, and the difference was almost zero.

Quote:
Original post by krohm
In this case the notion of "very old card" refers to pre-GeForce4 cards
We can be more specific here: it's actually pre-GeForce2 cards. When I dabbled with my first terrain renderer on a GF2 and a 700 MHz Duron, it had 131k tris and it ran smoothly. Even a GF1 can do well with 65k tris. So that leaves us with TNT-class cards, which are 10 years old. That is the kind of card that may struggle with 65k tris, provided it still runs after 10 years of service (including the 266 MHz computer it was bought for).

Quote:
Original post by SAL1
Drawing a one million polygon terrain without any LOD is... well
Is OK, if you're doing it on non-antique HW. Hell, even a GF3 could do a scene with 1M polys and still run smoothly. Any card after that (maybe with the exception of the 64-bit versions) should handle a 1M poly terrain with ease. Just for the record, on a 7950GT you can play with terrains that have 2-10M tris with no huge impact on the framerate, though when you get over 7M it starts slowing down, but it's still smooth.

Blips: try to render the whole thing in one DIP call and let us know the FPS you get. And don't forget to disable VSYNC. BTW, what CPU/GPU do you have?

Quote:
Original post by VladR
Quote:
Original post by implicit
Any chance that switching from a 16 to a 32-bit index buffer could be a performance issue?
Close to none. Not even on the ancient cards that were the first to support 32-bit indices. I made the comparison a long time ago, and the difference was almost zero.
Nitpick: It'll make a difference if you're bandwidth limited (i.e. you have a lot of textures and suchlike that you're constantly uploading to the card). Although even then I doubt it'll make much of a difference. An index buffer of 65536 indices is 128K as 16-bit indices; going to 32-bit indices just adds another 128K, the same as a 128x256 texture.
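The sizes quoted above are easy to verify. The sketch below assumes 4 bytes per texel for the texture comparison, which the post does not state; the helper names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Size in bytes of an index buffer with the given index width (2 or 4 bytes).
std::size_t indexBufferBytes(std::size_t indexCount, std::size_t bytesPerIndex) {
    assert(bytesPerIndex == 2 || bytesPerIndex == 4);
    return indexCount * bytesPerIndex;
}

// Size in bytes of an uncompressed texture without mip levels.
// bytesPerTexel = 4 (32-bit color) is an assumption, not something the thread states.
std::size_t textureBytes(std::size_t width, std::size_t height, std::size_t bytesPerTexel) {
    return width * height * bytesPerTexel;
}
```

With 65536 indices, the 16-bit buffer is 131072 bytes (128K) and the 32-bit buffer is 262144 bytes, so the extra cost is 128K. That matches `textureBytes(128, 256, 4)`.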

Sure, it might be visible, but only if your current FPS was hovering around 2000-4000 and you switched the IB with some key; then you might notice a slight change. But if you're already at or under 60 fps, I highly doubt you can see a difference of 0.1 fps, if at all.
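One way to see why the same cost is visible at thousands of fps but not at normal frame rates is to convert frame rates to frame times. The added cost of 0.01 ms used below is an invented number purely for illustration, not a measurement of 32-bit indices.

```cpp
#include <cassert>

// Time spent on one frame, in milliseconds.
double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Frame rate after adding a fixed per-frame cost (in milliseconds).
double fpsAfterAddedCost(double fps, double addedMs) {
    assert(addedMs >= 0.0);
    return 1000.0 / (frameTimeMs(fps) + addedMs);
}
```

At 4000 fps a frame takes 0.25 ms, so an extra 0.01 ms drops the rate to roughly 3846 fps. At 100 fps a frame takes 10 ms, and the same extra cost only lowers the rate to roughly 99.9 fps. Comparing frame times in milliseconds rather than fps makes such measurements easier to interpret.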

This got me thinking. I might test this today at home if I don't forget, since I've got a flag which creates either 16-bit or 32-bit indices for legacy HW. I could make a quick hack, create another IB, and switch between them with the press of a key.

I'll do this with just 1 chunk (256x256), which renders at about 4000 fps, so if it isn't noticeable at 4000 fps, it isn't visible at all.

Quote:
Original post by VladR
I'll do this with just 1 chunk (256x256), which renders at about 4000 fps, so if it isn't noticeable at 4000 fps, it isn't visible at all.
That's not really a valid test. At that frame rate, there's too much noise for it to be noticeable. What you really want is ~100 FPS (no v-sync), rendering a much larger amount of data.

But I do agree, we're going to be talking very little difference here [smile]

Well, if you disable everything else in the engine/game, the fps is at a constant 4000, oscillating very little (just give or take a few frames). But yes, it would be a better idea to render more chunks with it; that way it should be more visible.

I could load up a 15x15 terrain; that should be enough chunks (225), though at that count the CPU is becoming flooded, so the results might be skewed.

Any other ideas on how to test it more precisely?
