EternityZA

My engine performs badly. WHY!?

Hey. I've been working on my engine for a while now and everything's coming together nicely. It's all shader based, and it implements multipass rendering and hierarchical frustum culling. But even if I take out every pretty effect and simply render a large number of stationary textured models in a single rendering pass, my framerate still doesn't reach 60 fps on my reasonably high-spec PC. (Because everything is stationary, my engine calculates the world transformations of everything once and then never again.)

I sort my draw calls by shader, then by VBO, and then by texture. In the scenario above there's only one shader, since only one pass is needed, and all models are placed in the same VBO and all textures in the same texture atlas, so it never needs to rebind anything. To render the scene, glDrawArrays gets called about 6000 times (once for every static mesh).

This is probably not enough information for you to pinpoint my problem, but I was hoping for some suggestions on what you think it could be, or advice on how I should proceed in finding it, because it feels like I've hit a brick wall on this.

Thanks in advance!

Hey, thanks for the quick reply.

Just one thing:

I know nothing about profilers. I'm still kind of new to all of this, and 'profiler' is a magic word I've only seen a couple of times in random threads here on gamedev. So can you recommend something specific? Does the language matter? My engine is written in Java and uses the LWJGL OpenGL binding.

Thanks!

Two things you might want to consider. First, use glDrawElements instead of glDrawArrays; it may be significantly faster on newer hardware.

The other thing to consider is your sort. If you have 6000 elements and you are sorting them every frame before drawing, this racks up time quickly. To give you an example, I wrote a program that creates a vector of 6000 floats, initializes them randomly, and then sorts them 6000 times (note that they remain sorted after the first sort, but the std::sort algorithm must still check on each call that they are in order). One call to sort on a vector of 6k elements takes about 0.00888545 s on my 3.0 GHz Core 2 machine. This means that 60 calls to sort, one per frame at 60 fps, would take about 0.5 s out of every second! That's way too much.

I don't know if you are sorting your meshes every frame, but if so you may want to consider a better strategy. Usually, breaking them up hierarchically so you have less to sort, and possibly keeping a flag that tells you whether the order has changed, may help a lot. However, like SiCrane said, a profiler will help identify your problem specifically.
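To make the flag idea concrete, here is a minimal sketch in Java (since the engine is written in it). It is not the OP's sorting code: `DrawItem`, `DrawQueue` and the integer ids are hypothetical names made up for illustration, and `sortCount` exists only so the effect can be observed. The queue re-sorts by shader, then VBO, then texture only after something has been added.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** State keys for one draw call (hypothetical integer ids, not LWJGL handles). */
final class DrawItem {
    final int shaderId;
    final int vboId;
    final int textureId;

    DrawItem(int shaderId, int vboId, int textureId) {
        this.shaderId = shaderId;
        this.vboId = vboId;
        this.textureId = textureId;
    }
}

/** Keeps draw items sorted by state and re-sorts only after the list has changed. */
final class DrawQueue {
    // Sort by shader first, then VBO, then texture.
    private static final Comparator<DrawItem> BY_STATE =
            Comparator.comparingInt((DrawItem d) -> d.shaderId)
                      .thenComparingInt(d -> d.vboId)
                      .thenComparingInt(d -> d.textureId);

    private final List<DrawItem> items = new ArrayList<>();
    private boolean dirty = false;
    private int sortCount = 0; // instrumentation only, to show how often we sort

    void add(DrawItem item) {
        items.add(item);
        dirty = true; // the stored order may no longer be valid
    }

    /** Returns a read-only view in sorted order, sorting only if something changed. */
    List<DrawItem> sortedItems() {
        if (dirty) {
            items.sort(BY_STATE);
            dirty = false;
            sortCount++;
        }
        return Collections.unmodifiableList(items);
    }

    int sortCount() {
        return sortCount;
    }
}
```

Calling `sortedItems()` once per frame on a static scene then costs one flag check per frame instead of one sort per frame. Removing or re-keying an item would need to set the flag as well.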

Well, I'm not going to try to explain my method of sorting since it'll take too much time, but it does involve a 'clean' flag and it is 'hierarchical' in a certain sense, and the problem does not lie there (at least in the scenario in my OP, sorting only happens once and never again).

I can try using glDrawElements, but even if that makes it any better I don't think it's going to make the huge difference I'm looking for (46 fps -> 60 fps).

I'll check out that list of profilers (and I am using Eclipse :P).

Thanks!

As has been mentioned, find the bottleneck.

A/ Make the screen size smaller. Does it improve fps much?
B/ Comment out the glDrawArrays call (so you won't see much). Does it improve fps much?

zedz's suggestions are good ones. The FIRST thing you need to do is figure out whether your program is CPU-bound or GPU-bound. The optimization tricks you use will be almost entirely different.
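One way to read the results of tests A and B, as a rough sketch in Java. Everything here is hypothetical: `BottleneckCheck`, the verdict names and the threshold are made up for illustration, and the interpretation is a rule of thumb, not a substitute for a profiler. It also assumes vsync is off, since vsync would cap the framerate and hide any gains.

```java
/** Hypothetical helper for interpreting the two experiments (A: small window, B: no draw calls). */
final class BottleneckCheck {
    enum Verdict {
        FILL_BOUND,                 // smaller window helped: per-pixel work on the GPU dominates
        DRAW_CALLS_OR_VERTEX_WORK,  // skipping the draw calls helped: submission or vertex work dominates
        ELSEWHERE_ON_CPU            // neither helped: look at culling, sorting, game logic, etc.
    }

    /** Converts per-frame durations in nanoseconds to an average in milliseconds. */
    static double averageMs(long[] frameNanos) {
        long sum = 0;
        for (long n : frameNanos) {
            sum += n;
        }
        return sum / (double) frameNanos.length / 1_000_000.0;
    }

    /**
     * @param baselineMs    average frame time of the normal scene
     * @param smallWindowMs average frame time with a much smaller window (test A)
     * @param noDrawMs      average frame time with the draw call commented out (test B)
     * @param threshold     relative improvement treated as significant, e.g. 0.2 for 20%
     */
    static Verdict classify(double baselineMs, double smallWindowMs, double noDrawMs, double threshold) {
        double gainSmallWindow = (baselineMs - smallWindowMs) / baselineMs;
        double gainNoDraw = (baselineMs - noDrawMs) / baselineMs;
        if (gainSmallWindow >= threshold) {
            return Verdict.FILL_BOUND;
        }
        if (gainNoDraw >= threshold) {
            return Verdict.DRAW_CALLS_OR_VERTEX_WORK;
        }
        return Verdict.ELSEWHERE_ON_CPU;
    }
}
```

The frame durations would come from wrapping each frame in two `System.nanoTime()` calls and collecting a few hundred samples per experiment. Comparing frame times in milliseconds rather than fps keeps the arithmetic linear (46 fps is about 21.7 ms per frame, 60 fps about 16.7 ms).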

Quote:
Original post by EternityZA
glDrawArrays gets called about 6000 times


This sounds way too high to me. Try making your models 6 times as complex and render 1/6th as many and see if you get a boost. You may need to look into batching.

There is a very methodical way you should go about testing though, as suggested, to find your actual bottleneck.

I'm used to D3D, but if 6000 calls to glDrawArrays() is anything like 6000 calls to DrawPrimitive(), that would be a definite problem. GPUs can handle hundreds of DrawPrimitive() calls per frame, but not thousands because of the overhead a single call inflicts. To work around this, you'd send geometry to the GPU in batches (e.g. by combining static geometry into a single Vertex Buffer/VBO and drawing it all in one go).
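A sketch of what combining static geometry could look like on the Java side, assuming each mesh is a plain array of x,y,z positions and each world matrix is a 16-float column-major array (the OpenGL convention). `StaticBatcher` and `merge` are hypothetical names, and this only handles positions; normals would additionally need the rotation part of the matrix applied.

```java
import java.util.List;

/** Hypothetical helper that bakes world transforms into static meshes and concatenates them. */
final class StaticBatcher {
    /**
     * @param meshPositions per-mesh model-space positions, packed as x,y,z triples
     * @param worldMatrices per-mesh 4x4 world matrices, column-major, 16 floats each
     * @return one array of world-space x,y,z triples covering all meshes, in input order
     */
    static float[] merge(List<float[]> meshPositions, List<float[]> worldMatrices) {
        if (meshPositions.size() != worldMatrices.size()) {
            throw new IllegalArgumentException("need exactly one matrix per mesh");
        }
        int total = 0;
        for (float[] p : meshPositions) {
            if (p.length % 3 != 0) {
                throw new IllegalArgumentException("positions must be x,y,z triples");
            }
            total += p.length;
        }
        float[] out = new float[total];
        int o = 0;
        for (int i = 0; i < meshPositions.size(); i++) {
            float[] p = meshPositions.get(i);
            float[] m = worldMatrices.get(i);
            if (m.length != 16) {
                throw new IllegalArgumentException("matrix must have 16 floats");
            }
            for (int v = 0; v < p.length; v += 3) {
                float x = p[v], y = p[v + 1], z = p[v + 2];
                // Column-major multiply of (x, y, z, 1) by the world matrix.
                out[o++] = m[0] * x + m[4] * y + m[8] * z + m[12];
                out[o++] = m[1] * x + m[5] * y + m[9] * z + m[13];
                out[o++] = m[2] * x + m[6] * y + m[10] * z + m[14];
            }
        }
        return out;
    }
}
```

The merged array would be uploaded once into a single VBO and drawn with one call, with an identity world matrix in the shader. The trade-off is that per-mesh frustum culling is lost inside a batch, so it probably makes more sense to build one batch per spatial cell of the culling hierarchy than one batch for the whole world.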

Quote:
Original post by Cygon
I'm used to D3D, but if 6000 calls to glDrawArrays() is anything like 6000 calls to DrawPrimitive(), that would be a definite problem. GPUs can handle hundreds of DrawPrimitive() calls per frame, but not thousands because of the overhead a single call inflicts. To work around this, you'd send geometry to the GPU in batches (e.g. by combining static geometry into a single Vertex Buffer/VBO and drawing it all in one go).


No, GL doesn't have that issue. GL is implemented in user space while D3D drivers are in kernel space (WinXP).
On Vista, they supposedly changed things for D3D.
