BlueChip

Problem: Textures kill my FPS... it's bad


Hi =) I thought that a very high triangle count was the main cause of low FPS, and that large textures were a lesser cause. Instead, I get low performance with small textures... I don't use any particular filter (only the linear filter)... perhaps I should use more filters? My textures are 100x100 bitmaps... they aren't big. Can you give me some advice? PS. I render meshes grouped by type (i.e. first all trees, then all houses, then all rocks... etc.) Thanks!!!

First off, you should make your texture sizes a power of 2 (i.e. 64x64, 128x128, 256x256 etc). At the very least that would make memory management more efficient, though I doubt it is the cause of your performance hit.

What kind of hardware are you running, and how many objects do you actually render? And is it smooth without textures?

MenTaL

A few more things:

1) Group your rendering by texture so that you avoid calling SetTexture more than once for the same texture if at all possible.

2) Use mip mapping; this helps a lot if the texture has to be minified.

3) Don't render too few triangles per Draw* call - if each Draw*Primitive call renders fewer than, say, 100 polygons, it'll hurt your performance.
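
As a rough illustration of point 1, here is a minimal Python sketch (not Direct3D code) that counts how many texture switches a draw list would need before and after grouping by texture. The mesh and texture names are made up for the example, and reordering like this is only safe for opaque geometry drawn with a Z-buffer, where draw order does not affect the result.

```python
# Hypothetical draw list: (mesh name, texture name) pairs in submission order.
draw_list = [
    ("tree_01", "bark.bmp"),
    ("rock_01", "stone.bmp"),
    ("tree_02", "bark.bmp"),
    ("rock_02", "stone.bmp"),
    ("tree_03", "bark.bmp"),
]


def count_texture_switches(items):
    """Count how often the bound texture changes while walking the list in order."""
    switches = 0
    current = None
    for _mesh, texture in items:
        if texture != current:
            switches += 1  # this is where a SetTexture call would be issued
            current = texture
    return switches


def group_by_texture(items):
    """Return the items reordered so that meshes sharing a texture are adjacent."""
    return sorted(items, key=lambda item: item[1])


unsorted_switches = count_texture_switches(draw_list)
sorted_switches = count_texture_switches(group_by_texture(draw_list))
print(unsorted_switches, sorted_switches)
```

With this made-up list, the interleaved order needs a switch for every mesh, while the grouped order needs one switch per distinct texture.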

--
Simon O'Connor
Creative Asylum Ltd
www.creative-asylum.com

Can you elaborate on why calling Draw* with fewer than 100 polys would hurt performance? And how bad a hit is it to use vertex buffers (since you only draw 2 triangles per quad) for drawing UIs, for example?

[edited by - Oordeel on March 24, 2003 10:55:41 AM]

quote:
Can you elaborate on why calling Draw* with fewer than 100 polys would hurt performance? And how bad a hit is it to use vertex buffers (since you only draw 2 triangles per quad) for drawing UIs, for example?


1) There is a CPU cost in calling the API, internal processing in the D3D runtime, device driver interaction etc. If you're passing less than 100 polys per call, then the CPU "setup" costs can be much higher than the time it takes for the hardware to actually render those polygons.

2) Each draw call gets turned into a "command" ("draw these, using this"), and these go into a list, usually a FIFO. The graphics chip sucks these commands out of the FIFO for processing. The size and nature of the FIFO depend on the driver and chip, but they are often fixed in length. Once the FIFO is full, no new commands can be added until the first in the list has been processed - i.e. read by the chip.

3) The combination of the above two things means that the CPU stalls until there is room in the FIFO (effectively killing parallelism - which is key to good performance!), and the GPU is also starved of data, so it renders much less per frame than it could.

4) For some things, small batches may be unavoidable. However, for your UI, say, many things can still be batched together into a single draw call - for example, all characters using the same font texture could be rendered with one call. Using and recycling one dynamic VB for all UI elements of the same vertex format is good too.


The above assumes, of course, that you're drawing more than, say, 100 polys per frame in total - you only see the difference when you're shifting decent amounts of polys per frame. For example, 100,000 polygons rendered in 20 batches of 5,000 polygons should give you better performance than the same 100,000 polygons rendered in 1,000 batches of 100 polygons.

IIRC there's an old Excel spreadsheet on the nVidia website (developer.nvidia.com) from the GeForce2 days where they've profiled a) batch sizes and b) vertex formats, and plotted the performance effects.
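
Point 4 (batching UI quads that share a texture) could be sketched roughly like this in Python. It only builds the vertex and index lists that a single indexed draw call would consume; texture coordinates are left out for brevity, and the glyph rectangles and the `build_quad_batch` helper are hypothetical names chosen for the example.

```python
def build_quad_batch(quads):
    """Pack axis-aligned quads (x, y, w, h) into one vertex list and one index list.

    Each quad contributes 4 vertices and 6 indices (two triangles), so the whole
    batch could be submitted with a single indexed draw call.
    """
    vertices = []
    indices = []
    for x, y, w, h in quads:
        base = len(vertices)  # index of this quad's first vertex
        vertices.extend([
            (x, y),          # top-left
            (x + w, y),      # top-right
            (x + w, y + h),  # bottom-right
            (x, y + h),      # bottom-left
        ])
        # Two triangles sharing the top-left / bottom-right diagonal.
        indices.extend([base, base + 1, base + 2, base, base + 2, base + 3])
    return vertices, indices


# Hypothetical glyph rectangles for a three-character string in one font texture.
glyphs = [(0, 0, 8, 16), (8, 0, 8, 16), (16, 0, 8, 16)]
vertices, indices = build_quad_batch(glyphs)
triangle_count = len(indices) // 3
print(len(vertices), len(indices), triangle_count)
```

In a real engine the two lists would be copied into a dynamic vertex buffer and an index buffer once per frame, so a whole string costs one draw call instead of one call per character.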


--
Simon O'Connor
Creative Asylum Ltd
www.creative-asylum.com

[edited by - S1CA on March 24, 2003 11:51:46 AM]

oordeel, I am in the same boat as you. I have written a rather sophisticated 2D engine, but it does not use batching (drawing multiple quads at once).

One of the main reasons I have not used batching in my engine is that many of the games I use have quads that need to be sorted and drawn without the ZBuffer (they have alpha and color keys). Since they are sorted by position rather than by texture, I have to constantly reset the texture. This kills batching, because you have to stop, reset the texture, then draw the next set of sprites, and so on. For 2D, not batching is still somewhat acceptable; however, I anticipate a lot of people's opinions contradicting mine.

That said, if you can sort by texture to limit the number of times you have to interrupt your DrawPrimitive calls, your performance will increase greatly. I believe the optimum is between 1,000 and 2,000 triangles per batch on newer cards.
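
When sprites must stay in back-to-front order, one possible compromise is to merge only consecutive sprites that happen to share a texture. The sketch below is an assumption about how that could look, not something the engine described above does; the sprite data is made up.

```python
from itertools import groupby

# Hypothetical sprites already sorted back to front: (depth, texture name).
sprites = [
    (9.0, "grass.png"),
    (8.5, "grass.png"),
    (7.0, "hero.png"),
    (6.0, "grass.png"),
    (5.5, "fx.png"),
    (5.0, "fx.png"),
]


def batch_runs(sorted_sprites):
    """Merge consecutive sprites that share a texture, preserving draw order."""
    return [
        (texture, len(list(run)))
        for texture, run in groupby(sorted_sprites, key=lambda s: s[1])
    ]


batches = batch_runs(sprites)
print(batches)
```

Here six sprites collapse into four batches without changing the draw order. How much this helps depends entirely on how often neighbouring sprites share a texture, which is one reason packing several images into a single larger texture is a common companion technique.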

Bluechip, let me see if I can give you suggestions, or at least things to think about:

How large the texture is drawn on the screen can often be as important as the size of the texture itself. If I have 100 sprites in the distance with small textures, the framerate will drop significantly when I move closer (or make the sprites larger). The reason is that if you are not using a ZBuffer (where each pixel is tested against what has been drawn so far to decide whether it should be drawn), then pixels will be drawn and redrawn for each sprite. So even though you are using 100x100 textures, if they are being redrawn across the entire screen, your graphics card will choke.
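
To put rough numbers on the overdraw effect described above, here is a small Python calculation. The 640x480 resolution and the 16x16 on-screen size for distant sprites are assumptions chosen for illustration; only the count of 100 sprites comes from the post.

```python
def pixels_filled(sprite_count, sprite_w, sprite_h):
    """Pixels written per frame when every sprite is drawn in full (no depth rejection)."""
    return sprite_count * sprite_w * sprite_h


SCREEN_W, SCREEN_H = 640, 480  # assumed resolution, not taken from the thread

far = pixels_filled(100, 16, 16)               # 100 sprites, each small on screen
near = pixels_filled(100, SCREEN_W, SCREEN_H)  # same sprites, each covering the screen

# How many times each screen pixel gets written in the close-up case.
overdraw_near = near // (SCREEN_W * SCREEN_H)
print(far, near, overdraw_near)
```

The texture is the same size in both cases; only the number of screen pixels that have to be filled changes, which is why the frame rate depends on how large the sprites appear.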

Also, how many sprites are you drawing? Try playing with different zoom levels and numbers of sprites. Also try textured and untextured. Perhaps your graphics card isn't that great, and it's simply to blame. There are so many problems that can be caused by both how you code and your computer's hardware. Hope this all helps,

--Vic--

Guest Anonymous Poster
If you check the latest GDC 2003 papers at nVidia's developer site, you'll see one called Batching something.

Basically, they've really really really analysed the performance hit of drawing only a few polys at a time.

The results vary depending on CPU, but a loss of 100-300 times in performance is expected when using very small numbers of polygons.

They then go on to say you should aim for approx. 300 DrawIndexedPrimitive() calls per frame so as not to waste all your CPU on initializing draw calls.

They also mention that 25K DrawIndexedPrimitive() calls per second will completely saturate a 1 GHz CPU, such that it's doing nothing except tightly looping and issuing batches to draw... no animation, no AI, just setup of draw calls. If you assume a frame rate of 60 Hz, 25000/60 = 416... so if you have no AI or gameplay at all, and a 1 GHz CPU, expect 416 to be your maximum number of draw calls per frame to maintain your framerate.
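
The budget arithmetic above can be written as a small Python helper. The 25,000 calls per second figure and the 300 calls per frame target are the ones quoted in the post; treating CPU cost as strictly linear in the number of calls is a simplifying assumption made here.

```python
CALLS_PER_SECOND_AT_SATURATION = 25_000  # figure quoted above for a 1 GHz CPU


def max_draw_calls_per_frame(frames_per_second):
    """Draw calls per frame that would use the whole CPU at the given frame rate."""
    return CALLS_PER_SECOND_AT_SATURATION // frames_per_second


def cpu_share(calls_per_frame, frames_per_second):
    """Fraction of the CPU spent issuing draw calls, assuming cost is linear in calls."""
    return calls_per_frame * frames_per_second / CALLS_PER_SECOND_AT_SATURATION


budget_60 = max_draw_calls_per_frame(60)
budget_30 = max_draw_calls_per_frame(30)
share_300_at_60 = cpu_share(300, 60)
print(budget_60, budget_30, share_300_at_60)
```

Under this linear model, halving the target frame rate doubles the per-frame budget, and even 300 calls per frame at 60 Hz would take a large share of such a CPU.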


The number one thing you can do for your game is to go to nVidia's site and start reading whitepapers, so you'll know your technical limitations.

Hi... thanks very much for these answers.
Here is some more information...

******* bergfald ***********
In my 3D environment I use:

100 trees - 230 triangles each - 2 BMP textures [64x64, 100x100]
25 terrain squares - 2 triangles each - 1 JPG texture [756x512]
1 house - 450 triangles - 5 BMP textures [all about 256x256]
20 walls - 12 triangles each - 1 JPG texture [756x512]
1 skydome - 64 triangles - 1 JPG texture [256x256]
1 carot - 6000 triangles - 2 BMP textures [all 100x100]

I don't think that is too many triangles [about 29804]...
Is that right?
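
For reference, the totals for this scene can be checked with a few lines of Python. The triangle total follows directly from the list above. The draw-call estimate rests on an assumption that is not stated in the post: one draw call per texture per object (one mesh subset per texture).

```python
# (name, object count, triangles per object, textures per object), from the list above
scene = [
    ("trees", 100, 230, 2),
    ("terrain squares", 25, 2, 1),
    ("house", 1, 450, 5),
    ("walls", 20, 12, 1),
    ("skydome", 1, 64, 1),
    ("carot", 1, 6000, 2),
]

total_triangles = sum(count * tris for _name, count, tris, _tex in scene)

# Assumption: one draw call per texture per object (one mesh subset per texture).
estimated_draw_calls = sum(count * tex for _name, count, _tris, tex in scene)

average_triangles_per_call = total_triangles // estimated_draw_calls
print(total_triangles, estimated_draw_calls, average_triangles_per_call)
```

If that assumption holds, the average batch is only a little over 100 triangles, so the number and size of draw calls may matter more here than the triangle total.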

I have a Celeron II 566@892, 256 MB of RAM, and a GeForce DDR, and I get from 20 to 30 FPS.

***************** S1CA **************************
I use X files... so I put all identical meshes in a list, and for each element in the list I call the render method.
Then I take the next list and do it again...
Is that a silly approach?
Is there a better way?


**************** Roof Top Pew Wee ********************
At the moment I don't use sprites (you mean 2D objects, right?) and I always use a Z-buffer (the only exception is the skydome).



Thanks again folks ^___^

Guest Anonymous Poster
Hmm.. some of your textures are quite big and not a power of 2. [756x512] is especially bad! I think any API has to create a texture of size [1024x1024] to store that.. and I don't think that's good for your texture cache. Use the nVidia Stats driver with the capture tool (in the download section of the registered member area) to find out more about your problems. Have a look at the docs there to figure out how to detect CPU-GPU stalls and critical function calls in your code.
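
A quick Python sketch of the power-of-2 rounding being discussed. Rounding each dimension up independently turns 756x512 into 1024x512; the 1024x1024 figure would apply only if the hardware also required square textures, which is modelled here by the hypothetical `square` flag. Whether a given loader pads or rescales the image is not covered.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n, for n >= 1."""
    power = 1
    while power < n:
        power *= 2
    return power


def padded_size(width, height, square=False):
    """Texture size after rounding each dimension up to a power of two.

    With square=True, both dimensions are raised to the larger of the two.
    """
    w, h = next_power_of_two(width), next_power_of_two(height)
    if square:
        w = h = max(w, h)
    return w, h


def wasted_fraction(width, height, square=False):
    """Share of the padded texture that holds no image data."""
    w, h = padded_size(width, height, square)
    return 1 - (width * height) / (w * h)


print(padded_size(756, 512), padded_size(756, 512, square=True), padded_size(100, 100))
print(round(wasted_fraction(100, 100), 3))
```

For the 100x100 bitmaps from the first post, this gives 128x128, with well over a third of the padded texture unused.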

Good luck,
narbo

Interesting thread. Let me ask you this. Hypothetically, say you have 60,000 vertices and your hardware caps show that your maximum count is 65535 (GF2). Would it be better to send them all in one DrawIndexedPrimitive call, or would it be better to make multiple calls using 2000-3000 per call? For the sake of argument, assume that they are all being rendered using the same texture.
