SpriteBatch billboards in a 3D world slow on mobile device


7 replies to this topic

#1 OpaqueEncounter   Members   -  Reputation: 155


Posted 25 May 2014 - 11:27 AM

I used this method http://blogs.msdn.com/b/shawnhar/archive/2011/01/12/spritebatch-billboards-in-a-3d-world.aspx to create a 3D billboard renderer using SpriteBatch. It works perfectly as described, and on a modest desktop (with Intel HD graphics) it can easily render tens of thousands of billboards or particles.

 

On a mobile device (Windows Phone) the framerate drops sharply past a certain, not so large, point. My test (on all devices) is this:

 

- Render a primitive (sphere, cube, etc.) into a render target

- Pass the render target to the method above

- Start increasing the number of billboards until the framerate drops.

 

On an x86 desktop or an ARM tablet (Surface) the framerate holds with billboard counts well into the thousands. On the phone, it instantly drops from 60 to 30 (it looks like disabling VSync has no effect on that device) as soon as you pass a certain point (~200?). The funny thing is that I can get the framerate back up to 60 by making the billboards half the size. The same happens when I make the render target half the size.

 

Using a stopwatch, I determined that the time spent on the CPU is nowhere near the 16.67ms threshold. VS2013's frame analysis is unavailable on the Windows Phone, so that's useless.

 

Can anyone explain what is going on here? Is this simply the limitation of a low-power GPU (the Adreno 225 in this case)? If so, what exactly is bogging it down? The fill rate? The blending? (I tried all blend states from Opaque to NonPremultiplied, with no effect on performance.)


Edited by OpaqueEncounter, 25 May 2014 - 11:28 AM.



#2 C0lumbo   Crossbones+   -  Reputation: 2496


Posted 25 May 2014 - 12:43 PM

I would think that most likely you are fill-rate bound. In the absence of a GPU profiler, the easiest way to confirm whether or not you are fill-rate bound is to set up a scissor rectangle so that only a small area of the screen is visible. For your particular simple case, maybe just make the particles smaller instead of adding a scissor rectangle.

 

If it's not the fill rate, maybe it's the cost of the vertex processing.
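
To put rough numbers on the fill-rate idea, here is a minimal sketch of the arithmetic. The billboard count, billboard size, and screen resolution are hypothetical values picked for illustration, not measurements from the device in question, and the function name is made up for this example. It assumes every billboard is fully on screen and that no pixels are rejected by the depth test.

```python
def overdraw_factor(count, billboard_w, billboard_h, screen_w, screen_h):
    """Average number of times each screen pixel is shaded per frame.

    Assumes every billboard is fully visible and nothing is depth-rejected,
    so this is an upper-bound style estimate, not a measurement.
    """
    shaded_pixels = count * billboard_w * billboard_h
    return shaded_pixels / (screen_w * screen_h)


# Hypothetical scene: 200 billboards on an 800x480 screen.
full = overdraw_factor(200, 128, 128, 800, 480)  # 128x128 pixel billboards
half = overdraw_factor(200, 64, 64, 800, 480)    # same billboards at half size

print(f"full size: each pixel shaded about {full:.2f} times per frame")
print(f"half size: each pixel shaded about {half:.2f} times per frame")
print(f"ratio: {full / half:.1f}x fewer pixels at half size")
```

Halving the width and height of each billboard quarters the number of shaded pixels, which fits the observation that half-size billboards bring the framerate back to 60.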



#3 Adam_42   Crossbones+   -  Reputation: 2617


Posted 25 May 2014 - 05:17 PM

It sounds very much like you're pixel bound if reducing the size improves the performance. Phone GPUs are horribly slow compared to even a basic PC GPU. Your options are:

 

1. Simplify the pixel shader. Ideally it'd be a single line of code doing a texture fetch for billboards.

2. Render at a reduced screen resolution, with MSAA on.

3. Render fewer pixels. For example, use extra polys (e.g. octagons instead of quads) to render fewer transparent pixels. For circles this saves up to about 20%.

 

Check out http://aras-p.info/texts/files/FastMobileShaders_siggraph2011.pdf for some more info on how phone GPUs perform.
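
The "up to about 20%" figure in option 3 can be checked with a little geometry. The sketch below assumes the sprite is a circle and that the quad or octagon is the smallest regular polygon that fully contains it (edges tangent to the circle). The function name is made up for this example.

```python
import math


def circumscribed_polygon_area(sides, radius=1.0):
    """Area of a regular polygon whose edges are tangent to a circle of `radius`."""
    return sides * radius * radius * math.tan(math.pi / sides)


quad = circumscribed_polygon_area(4)     # the usual billboard quad
octagon = circumscribed_polygon_area(8)  # octagon hugging the same circle
circle = math.pi                         # area of the unit circle itself

saving = 1 - octagon / quad        # fraction of pixels an octagon avoids
upper_bound = 1 - circle / quad    # the most any tight-fitting shape could avoid

print(f"octagon saves {saving:.1%} of the quad's pixels")
print(f"a perfect circle outline would save {upper_bound:.1%}")
```

Under these assumptions the octagon avoids roughly 17% of the quad's pixels, and about 21% is the ceiling for a circular sprite, so the quoted figure is in the right range.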



#4 OpaqueEncounter   Members   -  Reputation: 155


Posted 25 May 2014 - 06:31 PM

C0lumbo, on 25 May 2014 - 12:43 PM, said:

"I would think that most likely you are fill-rate bound. In the absence of a GPU profiler, the easiest way to confirm whether or not you are fill-rate bound is to set up a scissor rectangle so that only a small area of the screen is visible. For your particular simple case, maybe just make the particles smaller instead of adding a scissor rectangle.

If it's not the fill rate, maybe it's the cost of the vertex processing."

 

 

It's not vertex processing for sure, since the aforementioned method does all of that on the CPU. I did manage to measure that part, and it is not a bottleneck. And yes, reducing what is being drawn on screen increases the framerate.

 

Adam_42, on 25 May 2014 - 05:17 PM, said:

"It sounds very much like you're pixel bound if reducing the size improves the performance. Phone GPUs are horribly slow compared to even a basic PC GPU. Your options are:

1. Simplify the pixel shader. Ideally it'd be a single line of code doing a texture fetch for billboards.

2. Render at a reduced screen resolution, with MSAA on.

3. Render fewer pixels. For example, use extra polys (e.g. octagons instead of quads) to render fewer transparent pixels. For circles this saves up to about 20%.

Check out http://aras-p.info/texts/files/FastMobileShaders_siggraph2011.pdf for some more info on how phone GPUs perform."

 

The shader used in that method is BasicEffect, in which I disabled absolutely everything (even vertex color). I am already running at the lowest resolution feasible.

 

To render fewer pixels, I also tried replacing BasicEffect with AlphaTestEffect.

 

It seems that if this is a fill-rate issue, the only thing really left is to skip drawing some of those billboards. Luckily, quite a few of them happen to be blocked from view most of the time. I am not really sure where to start if this is the solution. Frustum culling is not really the answer here, and occlusion queries are unavailable on GPUs like the Adreno 225 and below, which I plan on targeting.

 

Any suggestions?



#5 C0lumbo   Crossbones+   -  Reputation: 2496


Posted 25 May 2014 - 11:16 PM

Other than Adam's suggestions #2 and #3 there isn't really anywhere else to go other than trying to achieve the same effect with fewer, more opaque particles.

 

Unlikely, but is there scope for improving your texture at all? e.g. If it's a large 8888 non-mipmapped texture, then you would see gains from switching to a smaller mipmapped compressed texture.



#6 OpaqueEncounter   Members   -  Reputation: 155


Posted 26 May 2014 - 09:06 AM

C0lumbo, on 25 May 2014 - 11:16 PM, said:

"Other than Adam's suggestions #2 and #3 there isn't really anywhere else to go other than trying to achieve the same effect with fewer, more opaque particles.

Unlikely, but is there scope for improving your texture at all? e.g. If it's a large 8888 non-mipmapped texture, then you would see gains from switching to a smaller mipmapped compressed texture."

 

I actually generate the texture as I described above (rendering models into a render target). I'll play around with a lower-quality pixel format, but I guess if there are no other suggestions then I'm stuck with it.

 

The only thing I don't understand is, why is it that reducing the render target size helps if this is a fillrate issue? Or is fillrate a bit more broad than I assume it to be? (Sampling a larger image contributes as well?)



#7 Adam_42   Crossbones+   -  Reputation: 2617


Posted 26 May 2014 - 02:15 PM

The limitation could be memory bandwidth, which is also really low on mobile platforms.

 

You should also try generating mip maps for your texture; not having them can hurt performance significantly.
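
As a rough illustration of why mip maps matter here, the sketch below lists the levels of a mip chain and estimates which level would be sampled when a texture is drawn much smaller than its native size. The 512x512 texture and 64-pixel on-screen size are hypothetical, the helper names are made up for this example, and the level selection is a simplification: real GPUs pick the level per pixel from texture-coordinate derivatives and may blend between levels.

```python
import math


def mip_chain(width, height):
    """Return the (width, height) of every mip level down to 1x1."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        width, height = max(1, width // 2), max(1, height // 2)
        levels.append((width, height))
    return levels


def approx_mip_level(texture_size, on_screen_size, level_count):
    """Simplified level choice: log2 of texels per screen pixel, clamped."""
    ratio = texture_size / on_screen_size
    level = max(0.0, math.log2(ratio))
    return min(int(level), level_count - 1)


chain = mip_chain(512, 512)
base_texels = 512 * 512
total_texels = sum(w * h for w, h in chain)
overhead = total_texels / base_texels - 1  # extra memory for the whole chain

# Hypothetical case: the 512x512 render target drawn as a 64-pixel billboard.
level = approx_mip_level(512, 64, len(chain))

print(f"{len(chain)} levels, about {overhead:.0%} extra memory")
print(f"a 64-pixel billboard would sample level {level}: {chain[level]}")
```

Without mip maps, every billboard samples the full-size texture with large strides between neighbouring pixels, which is hard on the texture cache and on memory bandwidth. With them, the small billboard reads from a level close to its on-screen size at a cost of about a third more texture memory.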



#8 OpaqueEncounter   Members   -  Reputation: 155


Posted 26 May 2014 - 03:44 PM

Adam_42, on 26 May 2014 - 02:15 PM, said:

"The limitation could be memory bandwidth, which is also really low on mobile platforms.

You should also try generating mip maps for your texture; not having them can hurt performance significantly."

 

EDIT: Well, I actually did try running GenerateMipMaps every frame and the framerate did go up, so that's that. :)


Edited by OpaqueEncounter, 27 May 2014 - 09:15 AM.


