Sprite batch rendering

7 comments, last by aganm 6 years, 4 months ago

I have this 2D game which currently eats up to 200k draw calls per frame. The performance is acceptable, but I want a lot more than that. I need to batch my sprite drawing, but I'm not sure what the best way is in OpenGL 3.3 (which I'm targeting to keep compatibility with older machines).

Each individual sprite moves independently almost every frame, and there is a variety of textures and animations. What's the fastest way to render a lot of dynamic sprites? Should I map all my data to the GPU and update it all the time? Should I set up my data in RAM and send it to the GPU all at once? Should I use one draw call per sprite and let the matrices apply the transformations, or should I compute the transformations into a world VBO on the CPU so that they can be rendered by a single draw call?

On 12/1/2017 at 8:24 AM, Michael Aganier said:

Each individual sprite moves independently almost every frame, and there is a variety of textures and animations. What's the fastest way to render a lot of dynamic sprites? Should I map all my data to the GPU and update it all the time? Should I set up my data in RAM and send it to the GPU all at once?

While I have not worked in OpenGL in a while, I know you can do this in a couple of ways.

The best way is definitely to batch your sprites together by texture, shader, and other state changes. Mapping directly into the buffer will also help: there is no reason to place all your sprite data into some intermediate structure like a vector just to copy it all out into your vertex buffer later, if you can help it.
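As a rough illustration of grouping by state, here is a minimal CPU-side sketch. The `Sprite` and `Batch` structs and the `buildBatches` function are hypothetical names made up for this example, not part of OpenGL or of any existing engine. It sorts sprites by (shader, texture) and reports each run that could share one draw call.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sprite record; field names are illustrative only.
struct Sprite {
    std::uint32_t shader;   // shader program id
    std::uint32_t texture;  // texture id
    float x, y;             // world position
};

// A run of consecutive sprites that share shader and texture,
// and so could be submitted with a single draw call.
struct Batch {
    std::uint32_t shader;
    std::uint32_t texture;
    std::size_t first;  // index of the first sprite in the sorted array
    std::size_t count;  // number of sprites in the run
};

// Sorts sprites by (shader, texture) and returns one Batch per run.
std::vector<Batch> buildBatches(std::vector<Sprite>& sprites) {
    std::stable_sort(sprites.begin(), sprites.end(),
        [](const Sprite& a, const Sprite& b) {
            if (a.shader != b.shader) return a.shader < b.shader;
            return a.texture < b.texture;
        });

    std::vector<Batch> batches;
    for (std::size_t i = 0; i < sprites.size(); ++i) {
        const Sprite& s = sprites[i];
        if (!batches.empty() && batches.back().shader == s.shader &&
            batches.back().texture == s.texture) {
            ++batches.back().count;  // extend the current run
        } else {
            batches.push_back({s.shader, s.texture, i, 1});  // start a new run
        }
    }
    return batches;
}
```

Note that sorting changes draw order. If overlapping sprites rely on being drawn back to front, a layer or depth value would probably need to be part of the sort key as well.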

Since your sprites are dynamic you could do one of the following:
1. You could ping-pong two buffers. Basically, you create two vertex buffers and swap between them: one is used as the drawing buffer, and while it is in use you place your sprite data in the other one. Breakdown: draw using Buffer A; while the GPU is drawing from Buffer A, fill Buffer B with new sprite data. When you need to draw the data in Buffer B, swap the buffers (Buffer B becomes the draw buffer, Buffer A becomes the fill buffer).

2. You can orphan the buffer. Basically, you pass NULL into glBufferData, which tells the driver to give you a fresh block of memory to use. This also lets the GPU keep working with the previously issued commands and memory while you fill the new block that was handed to you.

Here is some more info on this
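The swap logic of the ping-pong option can be sketched without a GL context. This is only a CPU-side stand-in: `PingPongBuffers` is a hypothetical class, and the two `std::vector<float>` slots are placeholders for what would be two OpenGL buffer objects in a real renderer.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// CPU-side stand-in for two vertex buffers. In a real renderer each slot
// would be an OpenGL buffer object; std::vector is used here only so the
// swap logic can be shown without a GL context.
class PingPongBuffers {
public:
    // Buffer the CPU writes new sprite vertices into this frame.
    std::vector<float>& fillBuffer() { return buffers_[fillIndex_]; }

    // Buffer holding the previously filled data, which the GPU draws from.
    const std::vector<float>& drawBuffer() const { return buffers_[1 - fillIndex_]; }

    // Call after filling: the freshly filled buffer becomes the draw buffer,
    // and the old draw buffer is cleared so it can be refilled.
    void swap() {
        fillIndex_ = 1 - fillIndex_;
        buffers_[fillIndex_].clear();
    }

private:
    std::array<std::vector<float>, 2> buffers_;
    std::size_t fillIndex_ = 0;
};
```

With orphaning there would be a single buffer instead of two, and the step that corresponds to `swap()` would be the glBufferData call with a NULL data pointer before refilling.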
 

On 12/1/2017 at 8:24 AM, Michael Aganier said:

Should I use one draw call per sprite and let the matrices apply the transformations or should I compute the transformations in a world vbo on the CPU so that they can be rendered by a single draw call?

Using one draw call per sprite is really going to kill performance, especially if you are looking to render lots and lots of sprites on the screen at one time. You can definitely pretransform your sprite's vertices using the sprite's world (model) matrix and then place those pretransformed vertices into your vertex buffer. Once the vertex buffer is full or there is some kind of state change (texture change, shader, etc.), flush the buffer and issue a draw call. The fewer draw calls, the better for performance.
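A minimal sketch of that pretransform-and-flush loop, under some assumptions: `SpriteQuad`, `Vertex`, and `SpriteBatcher` are hypothetical names, the world transform is reduced to position, size, and rotation instead of a full matrix, and the actual upload and draw call are left to a callback so the logic runs without a GL context.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

struct Vertex { float x, y, u, v; };

// Hypothetical sprite description; names are illustrative only.
struct SpriteQuad {
    float x, y;           // world position of the sprite centre
    float width, height;  // size in world units, scale already applied
    float rotation;       // radians, counter-clockwise
    std::uint32_t texture;
};

class SpriteBatcher {
public:
    // Called on every flush; a real renderer would upload and draw here.
    using FlushFn = std::function<void(std::uint32_t, const std::vector<Vertex>&)>;

    SpriteBatcher(std::size_t maxSprites, FlushFn onFlush)
        : maxVertices_(maxSprites * 4), onFlush_(std::move(onFlush)) {}

    void add(const SpriteQuad& s) {
        // Flush on a texture change or when the next quad would not fit.
        if (!vertices_.empty() &&
            (s.texture != texture_ || vertices_.size() + 4 > maxVertices_)) {
            flush();
        }
        texture_ = s.texture;
        const float c = std::cos(s.rotation), sn = std::sin(s.rotation);
        const float hw = s.width * 0.5f, hh = s.height * 0.5f;
        const float corners[4][2] = {{-hw, -hh}, {hw, -hh}, {hw, hh}, {-hw, hh}};
        const float uvs[4][2] = {{0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f}};
        for (int i = 0; i < 4; ++i) {
            // Rotate the local corner, then translate into world space.
            const float lx = corners[i][0], ly = corners[i][1];
            vertices_.push_back({s.x + lx * c - ly * sn,
                                 s.y + lx * sn + ly * c,
                                 uvs[i][0], uvs[i][1]});
        }
    }

    void flush() {
        if (vertices_.empty()) return;
        onFlush_(texture_, vertices_);
        vertices_.clear();
    }

private:
    std::size_t maxVertices_;
    FlushFn onFlush_;
    std::vector<Vertex> vertices_;
    std::uint32_t texture_ = 0;
};
```

The `std::vector` here stands in for the vertex buffer itself; when writing through a mapped pointer, the same loop could write straight into that memory. Four vertices per sprite would also need an index buffer with a repeating two-triangle pattern, since the OpenGL 3.3 core profile has no quad primitive.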

Have you got a screenshot and some more info on what the sprites are, whether you are using a constant viewpoint, whether the sprites are changing size, rotation etc?

1 hour ago, lawnjelly said:

Have you got a screenshot and some more info on what the sprites are, whether you are using a constant viewpoint, whether the sprites are changing size, rotation etc?

I can describe the use case.

It's for a hybrid turn-based and real-time strategy game. The viewpoint changes all the time, but it's an orthographic projection, if that simplifies things. In turn-based mode, sprites are mostly static; in real-time mode, they are mostly dynamic. The sprites are scaled to simulate camera zoom. Most don't rotate, but some do.

So to confirm it is 2D? So when you say the viewpoint changes, you mean you can scroll the landscape up and down and left and right, but not rotate the landscape? The more specific you are the more likely you will get useful answers, otherwise people will only be able to give general advice.

My feeling is that you may be shooting for too many moving units for your current capabilities. I think most seasoned veterans would think twice about trying to achieve 200k independent units on screen (or even 200k in game, with 99% culled).

noodleBowl has pretty good general advice imo. In addition, you might want to google point sprites and imposters, which might give you some ideas, and think about how to do lazy updates (where you don't update units every frame) and how to group your units so you aren't processing them individually. You might also be able to move more of the AI processing to the GPU, but that is not something I have experience with.
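One simple way to read "lazy updates" is to stagger the work so each unit is only touched every few frames. This is a sketch of that interpretation only; `Unit` and `lazyUpdate` are hypothetical names, and the counter stands in for whatever the real per-unit update does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical unit; the counter stands in for real per-unit state.
struct Unit { int updates = 0; };

// Updates only the units whose index falls in this frame's slice, so each
// unit is processed once every `stride` frames.
void lazyUpdate(std::vector<Unit>& units, std::size_t frame, std::size_t stride) {
    if (stride == 0) return;  // nothing sensible to do with a zero stride
    for (std::size_t i = frame % stride; i < units.size(); i += stride) {
        ++units[i].updates;  // stand-in for the real per-unit update
    }
}
```

With a stride of 4, for example, each frame processes roughly a quarter of the units, at the cost of each unit reacting up to three frames late.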

On 04/12/2017 at 7:26 AM, lawnjelly said:

So to confirm it is 2D? So when you say the viewpoint changes, you mean you can scroll the landscape up and down and left and right, but not rotate the landscape?

Yes, exactly. You asked for a screenshot. I think this is gonna explain better than I can with words. You can see stats in the little green window. Clouds and grass are counted in the particles total and the units are counted in the entities total.

good_ol_war_00.jpg

good_ol_war_01.jpg

That's much better. :) It might be a good idea to post the pics in the collision thread too, as rendering and logic are pretty interlinked.

Some things jump out at me straight away. First of all, the grass and your background can be pre-rendered and drawn as one quad. Depending on how big your background is, you can either pre-render the lot, or have a scrolling background and render just the vertical and horizontal slices as you move around, then render as 4 quads. This may not seem to matter now, but it will help a lot once you decide to add other stuff to your landscape.
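One possible reading of the "4 quads" idea is a cache texture that wraps around in both axes, so the visible area is split at the wrap seams. The sketch below only computes that split. It assumes the cache texture is exactly the size of the screen, and `Rect`, `BackgroundQuad`, and `splitScrollingBackground` are hypothetical names.

```cpp
#include <vector>

struct Rect { int x, y, w, h; };

// One piece of the screen and the region of the cache texture it samples.
struct BackgroundQuad {
    Rect screen;  // destination in screen pixels
    Rect cache;   // source in cache-texture pixels
};

// Splits the view into up to four quads for a cache texture of
// cacheW x cacheH pixels that wraps around in both axes.
// scrollX/scrollY are the world pixel shown at the screen's top-left corner.
std::vector<BackgroundQuad> splitScrollingBackground(int scrollX, int scrollY,
                                                     int cacheW, int cacheH) {
    // Positive modulo so negative scroll positions wrap correctly.
    const int ox = ((scrollX % cacheW) + cacheW) % cacheW;
    const int oy = ((scrollY % cacheH) + cacheH) % cacheH;

    const int colX[2] = {ox, 0};            // cache x where each column starts
    const int colW[2] = {cacheW - ox, ox};  // width of each column
    const int rowY[2] = {oy, 0};            // cache y where each row starts
    const int rowH[2] = {cacheH - oy, oy};  // height of each row

    std::vector<BackgroundQuad> quads;
    int screenY = 0;
    for (int r = 0; r < 2; ++r) {
        int screenX = 0;
        for (int c = 0; c < 2; ++c) {
            if (colW[c] > 0 && rowH[r] > 0) {  // skip empty pieces
                quads.push_back({{screenX, screenY, colW[c], rowH[r]},
                                 {colX[c], rowY[r], colW[c], rowH[r]}});
            }
            screenX += colW[c];
        }
        screenY += rowH[r];
    }
    return quads;
}
```

Re-rendering the newly exposed vertical and horizontal slices into the cache as the view scrolls is the other half of the technique and is not shown here.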

Given the number of units involved and the need to match unit heights to landscape height it may be wise to keep the direct top down view. That is not to say you couldn't do an angled view, you could e.g. precalculate the landscape heights per texel of the background, however this might be more tricky to work with the unit rendering, see below.

For the units it looks like a hierarchical approach would be a good fit, for the rendering, and the game logic. If you can think of separating things into cohorts, you can then run your logic per cohort and rendering also. You could make use of either some pre-rendered cohorts or use render to texture at runtime for each cohort and split the rendering over several frames.

When you zoom in a lot you can then switch to individual processing / rendering of each soldier, when there aren't so many to deal with on screen.
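As a small sketch of the cohort idea, the code below groups soldiers by a cohort id and computes one bounding box per cohort. The names `Soldier`, `CohortBounds`, and `computeCohortBounds` are hypothetical. A box like this could plausibly serve for culling a whole cohort at once, or as the quad that a pre-rendered cohort texture is drawn onto.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical unit record: which cohort it belongs to and where it stands.
struct Soldier { std::uint32_t cohort; float x, y; };

// Axis-aligned bounds of every soldier in one cohort.
struct CohortBounds {
    float minX, minY, maxX, maxY;
    std::size_t count;  // number of soldiers in the cohort
};

// Groups soldiers by cohort id and accumulates one bounding box per cohort.
std::map<std::uint32_t, CohortBounds> computeCohortBounds(
        const std::vector<Soldier>& soldiers) {
    std::map<std::uint32_t, CohortBounds> out;
    for (const Soldier& s : soldiers) {
        auto it = out.find(s.cohort);
        if (it == out.end()) {
            // First soldier seen for this cohort: the box is a single point.
            out.emplace(s.cohort, CohortBounds{s.x, s.y, s.x, s.y, 1});
        } else {
            CohortBounds& b = it->second;
            b.minX = std::min(b.minX, s.x);
            b.minY = std::min(b.minY, s.y);
            b.maxX = std::max(b.maxX, s.x);
            b.maxY = std::max(b.maxY, s.y);
            ++b.count;
        }
    }
    return out;
}
```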

(edit) Another totally different option is to run it like a cellular automaton. But I'm not sure how well that will handle the troop formations.

These are just some first thoughts; I'm sure others will have ideas. :)

58 minutes ago, lawnjelly said:

First of all, the grass and your background can be pre-rendered and drawn as one quad. Depending on how big your background is, you can either pre-render the lot,

I didn't think there was that much grass, but upon verification, there are about 12,000 draw calls just for the grass in the second picture. I might just make a texture that represents a large area of grass instead of individual grass pieces. If I were to pre-render the whole background, it would take about 400 1024x1024 tiles at close-up resolution.
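For a sense of scale, the memory for those tiles can be worked out directly. This assumes uncompressed 4-byte RGBA pixels and no mipmaps, which is my assumption and not something stated above; `backgroundBytes` is a hypothetical helper.

```cpp
#include <cstdint>

// Total bytes for `tiles` square tiles of `tileSize` x `tileSize` pixels.
constexpr std::uint64_t backgroundBytes(std::uint64_t tiles,
                                        std::uint64_t tileSize,
                                        std::uint64_t bytesPerPixel) {
    return tiles * tileSize * tileSize * bytesPerPixel;
}

// 400 tiles of 1024x1024 at 4 bytes per pixel (assumed RGBA8):
// 400 * 4 MiB = 1600 MiB.
static_assert(backgroundBytes(400, 1024, 4) == 1677721600ULL,
              "400 RGBA8 tiles of 1024x1024 take 1600 MiB");
```

Under those assumptions the full pre-render would be around 1.6 GB of texture data, which is probably a good argument for the scrolling or tiled-grass approach on older machines.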

 

58 minutes ago, lawnjelly said:

You could make use of either some pre-rendered cohorts or use render to texture at runtime for each cohort and split the rendering over several frames. When you zoom in a lot you can then switch to individual processing / rendering of each soldier, when there aren't so many to deal with on screen

I like this idea. I have a similar concept in the game: when zoomed out far enough, it stops rendering the units and renders a symbol representing the whole regiment instead. But your idea of replacing individual units with pre-rendered cohorts, I hadn't thought of that. At that point, the number of draw calls would be extremely low and I would just need to optimise the logic. Thank you.
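Combining the regiment symbol with the cohort idea gives three levels of detail picked by zoom. A minimal sketch, assuming that a larger zoom value means the camera is closer; the threshold values are placeholders, and `RenderMode`, `LodThresholds`, and `chooseRenderMode` are hypothetical names.

```cpp
// What to draw for a regiment at the current zoom level.
enum class RenderMode { RegimentSymbol, CohortSprite, IndividualUnits };

// Zoom cut-offs; the actual values would have to be tuned in game.
struct LodThresholds {
    float symbolBelow;  // below this zoom, draw one symbol per regiment
    float cohortBelow;  // below this zoom, draw pre-rendered cohorts
};

// Picks the level of detail. Assumes larger zoom means a closer camera.
RenderMode chooseRenderMode(float zoom, const LodThresholds& t) {
    if (zoom < t.symbolBelow) return RenderMode::RegimentSymbol;
    if (zoom < t.cohortBelow) return RenderMode::CohortSprite;
    return RenderMode::IndividualUnits;
}
```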

