Rendering more gives higher FPS

5 comments, last by RobM 11 years, 2 months ago
The way I currently draw text in my engine (for stats, etc) is to draw a screen-aligned 2-triangle quad poly per line. I have a 256x12 texture which has my characters laid out in a grid format. My text shader takes as input a string of ASCII numbers (an int array currently) and per pixel, calculates which character in the string it's on and then calculates the correct pixel to take from the font texture. All works beautifully, or so I thought...
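For anyone following along, here is a rough CPU-side sketch of the per-pixel lookup described above, written in C++ rather than shader code. The atlas layout (fixed-size cells, row-major, starting at a given ASCII code) and every name in it (`FontAtlas`, `Texel`, `fontTexelForPixel`) are assumptions for illustration, not the actual shader.

```cpp
#include <cassert>
#include <string>

// Hypothetical atlas layout: fixed-size glyph cells stored row-major,
// with cell 0 holding the character code `firstChar`.
struct FontAtlas {
    int cellWidth;   // glyph cell width in texels
    int cellHeight;  // glyph cell height in texels
    int columns;     // number of cells per atlas row
    int firstChar;   // ASCII code stored in cell 0
};

struct Texel {
    int x;
    int y;
};

// Maps a pixel inside a one-line text quad (px, py measured from the quad's
// top-left corner) to the texel that should be sampled from the atlas.
Texel fontTexelForPixel(const FontAtlas& atlas, const std::string& text,
                        int px, int py) {
    assert(px >= 0 && py >= 0 && py < atlas.cellHeight);

    // Which character of the string this pixel falls on.
    const int charIndex = px / atlas.cellWidth;
    assert(charIndex < static_cast<int>(text.size()));

    // Which atlas cell holds that character.
    const int cell = static_cast<unsigned char>(text[charIndex]) - atlas.firstChar;
    assert(cell >= 0);
    const int cellX = cell % atlas.columns;
    const int cellY = cell / atlas.columns;

    // Cell origin plus the pixel's offset inside the cell.
    return Texel{cellX * atlas.cellWidth + px % atlas.cellWidth,
                 cellY * atlas.cellHeight + py};
}
```

In a pixel shader the same integer math would run per fragment, with the string passed in as shader constants.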

My current test scene is very simple: just a ground quad with a simple diffuse shader, but the camera is angled so that the ground quad covers most of the screen.

If I draw one line of text I get 160 FPS, but if I add another line of text, I get 260 FPS. I know FPS isn't linear in relation to frame time, but this is a massive difference. If I add more lines of text (i.e. more DrawIndexedPrimitive calls) it gradually comes down again, but only by a small fraction (around 1 FPS). If I move my camera up so the ground quad isn't covering most of the screen, my FPS goes up beyond 1700 FPS, as expected.
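To put those numbers in frame-time terms: 160 FPS is 6.25 ms per frame and 260 FPS is about 3.85 ms, so the unexplained gap is roughly 2.4 ms per frame. A minimal sketch of the conversion (the function names are just illustrative):

```cpp
#include <cassert>

// Converts a frame rate into the time spent on one frame, in milliseconds.
double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Milliseconds saved per frame when the rate changes from fpsBefore to fpsAfter.
double frameTimeDeltaMs(double fpsBefore, double fpsAfter) {
    return frameTimeMs(fpsBefore) - frameTimeMs(fpsAfter);
}
```

Comparing milliseconds rather than FPS makes it easier to judge how much work actually appeared or disappeared between two runs.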

I am literally only rendering 2 things, and my FPS calc is very simple, so I doubt that would be why. In my render loop (based on Gaffer's "Fix Your Timestep") I increase my rendered frames count and then just divide that by the game time. It's an average over the whole game time, which perhaps isn't quite as accurate as one that counts frames on a second-by-second basis, but each time I start the app, the FPS is immediately low or high (depending on how many lines of text I draw).
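For comparison, a second-by-second counter could look roughly like the sketch below. It is not the code from my engine; `WindowedFpsCounter` and its members are made-up names, and the current time is passed in as a parameter so it can be fed from any timer.

```cpp
#include <cassert>

// Reports FPS over fixed time windows instead of averaging over the whole run.
class WindowedFpsCounter {
public:
    explicit WindowedFpsCounter(double windowSeconds) : window_(windowSeconds) {
        assert(windowSeconds > 0.0);
    }

    // Call once per rendered frame with the current time in seconds.
    // Returns true when a window has just closed and fps() was refreshed.
    bool onFrame(double nowSeconds) {
        if (!started_) {
            // The first call only marks the start of the first window.
            windowStart_ = nowSeconds;
            started_ = true;
            return false;
        }
        ++frames_;
        const double elapsed = nowSeconds - windowStart_;
        if (elapsed < window_) {
            return false;
        }
        fps_ = frames_ / elapsed;
        frames_ = 0;
        windowStart_ = nowSeconds;
        return true;
    }

    // FPS measured over the most recently completed window (0 until one closes).
    double fps() const { return fps_; }

private:
    double window_;
    double windowStart_ = 0.0;
    double fps_ = 0.0;
    int frames_ = 0;
    bool started_ = false;
};
```

A windowed value reacts to changes within a second or so, whereas a whole-run average hides them more and more the longer the app has been running.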

Would this have anything to do with context switching on the GPU? Normally I wouldn't bother with profiling at this stage, but the fact that the FPS went up when it should have gone down, and by this much, is very confusing.

Incidentally, if I fill my screen with animated models and other meshes, the FPS goes down as expected so it's almost certainly not that. My render loop is pretty solid as far as I can see, all my animations play in perfect time and my physics stepping works correctly too.

Any thoughts? (I'm using DX9 with C++, by the way.)

By the way, my text renderer will eventually render multiple lines of text in one quad rather than a quad per line - I just did it with one line to do a proof of concept.

are you rendering the text first and then the ground quad? does the text output depth? what is your platform? some mobile PowerVR?

can you try to render at a much smaller resolution, if this is reproducible?

I suspect your ground quad might be very slow, and covering some of its pixels speeds rendering up. maybe no mipmaps in your texture?

screenshot ;)

Do you have a dual-GPU setup (Intel + other) with smart GPU selection?

Thanks Krypt0n. The ground quad is definitely rendered first and I am rendering depth with the text - that's not something I had considered.

It's currently running on my laptop machine on Windows 7, Intel i7 2.2GHz, nVidia GTX560M (1.5GB).

I can try at a smaller resolution when I get home later (it's precisely reproducible). I'm pretty sure the ground texture has mipmaps, but I'll check that later too - all great tips, thanks.
I've tried this at different resolutions, and FPS is still much higher with 2+ lines of text. I've tried switching off depth output in the font shader (ZEnable = false) and it made no difference. It also makes no difference whether the ground object is rendered or not or, if it is, whether the ground is beneath the text or not. I also confirmed that the ground texture has mipmaps.

This is very confusing. My render loop does something like this:

[source]
while (true)
{
    // usual fixing time step stuff (world updates, etc)
    ...

    // time only the RenderFrame call
    renderTime = timer->Stop();
    renderingEngine->RenderFrame(renderComponentQueue);
    renderTime = timer->Stop() - renderTime;  // time spent inside RenderFrame
    renderedFrames++;
}
[/source]

A lot of code has obviously been removed for brevity but the point I wanted to make was that the renderTime timer actually takes a lot longer inside the RenderFrame(...) method if there is only one text object being rendered than it does if there are 2+. By a lot longer I mean having just one text line takes 2-6ms (obviously wrong), but 2 or more makes it around 0.3ms (much more like it).

It's worth noting that at the moment, each of the two lines of text uses its own instance of the font shader rather than one shared shader. This is just because I haven't got round to changing it yet. That being said, I'd have thought the RenderFrame method would take even longer with 2+ objects, as it has to swap shaders.

Any more ideas guys?

I've tried this at different resolutions, FPS is still much higher

is it still 160 vs 260 fps? half resolution is way faster, I would expect it's not related to context switching.

how does your timer work? is it some highly accurate timer or something like GetTickCount?

I would suggest to not take the render time around 'renderframe', but start the time at frame 0, stop it at frame 100, divide 100 by the seconds you got.

I don't know what else you're doing besides 'renderframe', but there is a chance your drawcalls have not even been sent out to the gpu when this function returns, and in that case, while the actual rendering is going on, you are doing 'world update' etc.

it's quite common that drivers queue up to 5 frames, so to really get the average fps, you need to measure across several frames.

if some results are fishy, it's common the measurement is buggy :), so no offense, but that's something to verify.
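a minimal sketch of what I mean by measuring across a batch of frames, using std::chrono instead of your timer class. `averageFpsOverFrames` and the `renderFrame` callback are hypothetical names; in your app the callback would have to cover everything a frame does, including the present call, and the driver may still be running a few frames ahead, so use a large enough frame count.

```cpp
#include <cassert>
#include <chrono>

// Runs renderFrame() frameCount times and returns the average frames per
// second over the whole batch, rather than timing any single call.
template <typename RenderFn>
double averageFpsOverFrames(RenderFn&& renderFrame, int frameCount) {
    assert(frameCount > 0);
    using Clock = std::chrono::steady_clock;

    const Clock::time_point start = Clock::now();
    for (int i = 0; i < frameCount; ++i) {
        renderFrame();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;

    return frameCount / elapsed.count();
}
```

timing a whole batch averages out whatever the driver has queued, which timing around a single call cannot do.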

is it still 160 vs 260 fps? half resolution is way faster, I would expect it's not related to context switching.


No, but the difference is roughly the same.

The timer is the normal QueryPerformanceCounter.

I've tried it over several frames, doing an overall FPS, an FPS each second and over 5 and 10 seconds. It's still the same.

It's interesting that you pointed out that the GPU will queue calls. If this was the case, wouldn't rendering 2 text lines be slower than rendering one?

This topic is closed to new replies.
