Big problem sorting sprites

8 comments, last by Katie 12 years, 6 months ago
So I've made a little 2D engine using OpenGL.

It uses VBOs and VAOs to store batches of quads before sending them to the GPU.


My current problem is that my game requires at minimum 3 different textures to render a whole scene (this cannot be changed, so please don't suggest using only one texture instead).

I've managed to overcome this problem by using 3D textures. The problem is that the extension that allows me to do this is only available on DX11 hardware, which is unacceptable for our intended target platforms.

So what I'm looking for is a way to switch textures within a single glDrawElements call (which I'm thinking is impossible).

Thanks
So you want to bind 3 textures and then use one call to draw? IMO it's not possible.

Why don't you pack all 3 textures into one big texture? Then you create 3 texture coordinate buffers with the different coordinates and switch between those ...

I open sourced my C++/iOS OpenGL 2D RPG engine :-)



See my blog: (Tutorials and GameDev)


http://howtomakeitin....wordpress.com/


Because the amount of needed textures can vary (the minimum being 3)
Simple answer; you can't.

Things which are bound to the pipeline cannot change during the execution of a draw command.

Less simple answer;
As you've already hinted at, texture arrays (not 3D textures; those are different and have worked since the GF3 days) allow you to select a texture by 'layer', but as you say this is only available on recent hardware.

You can simulate it, however, by having multiple samplers and giving each instance an id which indicates, via a simple 'if' statement, which sampler to read for the texture data. As all the pixels in an execution will head the same way, the overhead will be minimal (the cost of the 'if' statement). You might even be able to use an array of samplers, each with a texture bound, and use the id value to reference the correct sampler directly; however, I'm not sure when GLSL/hardware support for that came in.
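For illustration only, here is a rough sketch of the 'if'-per-sampler idea, assuming GLSL 1.20 and exactly three texture units. The names (kSpriteFragmentShader, uTex0..uTex2, vTexId, vUV) are made up for this example and are not from the original poster's engine. The shader source is held in a C++ string, the way an engine might hand it to glShaderSource.

```cpp
#include <string>

// Hypothetical fragment shader: one sampler per bound texture unit, chosen
// by a per-quad id that the vertex shader forwards as a varying.
// Every identifier here is an assumption made for the sketch.
const std::string kSpriteFragmentShader = R"glsl(
#version 120
uniform sampler2D uTex0;
uniform sampler2D uTex1;
uniform sampler2D uTex2;
varying vec2 vUV;
varying float vTexId;   // same value on all four corners of a quad

void main() {
    vec4 color;
    // Every fragment of a quad takes the same path, so the branch is coherent.
    if (vTexId < 0.5) {
        color = texture2D(uTex0, vUV);
    } else if (vTexId < 1.5) {
        color = texture2D(uTex1, vUV);
    } else {
        color = texture2D(uTex2, vUV);
    }
    gl_FragColor = color;
}
)glsl";
```

The host side would still need to bind the three textures to units 0 to 2, point the sampler uniforms at those units, and supply the id as a per-vertex attribute; those steps are left out here.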
Probably the best solution is to sort the vertex list by texture and render it in three goes, but doing that on the host might be more expensive in terms of host CPU and transfer.

It might actually be faster to render the VBO several times, binding a different texture each time. Set a uniform to tell the vertex shader which texture is bound. If it's the "wrong" one, it can just output (0,0,0). That'll be degenerate quads/tris and there should be no frag runs at all.

You should be able to do this in the vshader without an if statement by suitable application of the built-in functions. (Using divisions/floor/ceil etc. to arrive at a number which is either 0.0 or 1.0 depending on whether this pass is invalid or valid, and then multiplying the final output by that.)
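One possible way to build such a 0.0/1.0 factor, sketched on the CPU in C++ so it can be checked without a GPU. It assumes texture ids are whole numbers stored as floats; Vec4, passMask and maskedPosition are hypothetical names, and std::fabs / std::fmin stand in for GLSL's abs / min.

```cpp
#include <cmath>

// Minimal stand-in for a GLSL vec4 (assumption for this sketch).
struct Vec4 { float x, y, z, w; };

// Returns 1.0 when the quad's texture id matches the texture bound for this
// pass and 0.0 otherwise, using only arithmetic built-ins.
// Assumes both ids are whole numbers, so |difference| is 0 or at least 1.
float passMask(float quadTexId, float boundTexId) {
    return 1.0f - std::fmin(std::fabs(quadTexId - boundTexId), 1.0f);
}

// Scales x, y, z by the mask: quads for the "wrong" texture collapse to a
// single point (0,0,0) and so produce degenerate triangles.
Vec4 maskedPosition(Vec4 clipPos, float quadTexId, float boundTexId) {
    float m = passMask(quadTexId, boundTexId);
    return Vec4{clipPos.x * m, clipPos.y * m, clipPos.z * m, clipPos.w};
}
```

In a vertex shader the same expression would be applied to the position just before writing gl_Position, with the bound texture id supplied as a uniform per pass.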

The reason for doing this is that branches are *extremely* expensive operations in the large parallel nVidia/ATI architectures; it's actually faster in many cases to do more work and multiply by zero at the end than to branch around the instructions.

Things like array-of-sampler are annoyingly driver/card dependent. Try them and if the compiler says yes, then hurrah, but you need a fallback plan.

I guess what I'll be doing is sorting all my draw calls FIRST, then starting a new batch every time the next quad in the queue uses a different texture than the currently active one?
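A minimal sketch of that sort-then-batch approach, assuming each queued quad carries a texture id. Sprite, Batch and buildBatches are hypothetical names for illustration, not the poster's actual code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical queued quad: only the texture id matters for batching.
struct Sprite { unsigned textureId; float x, y; };

// A run of consecutive sprites (after sorting) that share one texture.
struct Batch { unsigned textureId; std::size_t first; std::size_t count; };

// Sorts the queue by texture, then starts a new batch whenever the texture
// changes. stable_sort keeps the submission order within each texture.
std::vector<Batch> buildBatches(std::vector<Sprite>& sprites) {
    std::stable_sort(sprites.begin(), sprites.end(),
                     [](const Sprite& a, const Sprite& b) {
                         return a.textureId < b.textureId;
                     });
    std::vector<Batch> batches;
    for (std::size_t i = 0; i < sprites.size(); ++i) {
        if (batches.empty() || batches.back().textureId != sprites[i].textureId) {
            batches.push_back(Batch{sprites[i].textureId, i, 0});
        }
        ++batches.back().count;
    }
    return batches;
}
```

Each batch would then map to one texture bind followed by one draw call over its range. Note that sorting purely by texture changes the draw order across textures, so if overlapping blended sprites depend on that order, the sort key would need to include the layer or depth first.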

"The reason for doing this is that branches are *extremely* expensive operations in the large parallel nVidia/ATI architectures; it's actually faster in many cases to do more work and multiply by zero at the end than to branch around the instructions."


I'm going to disagree here totally.

They are very costly IF different pixels/fragments in the same group take different paths, however if everyone takes the same path then the cost practically vanishes (you take some overhead from having to compute the 'if' but the branching itself is performed by the sequencer based on the information calculated), which will be the case in this instance.

Branching in shaders is FINE and has been fine on any hardware produced in the last 5 years PROVIDED you don't branch at a high frequency in the same group of pixels.

(I've done work on the 360/PS3 where branching in the pixel shader has IMPROVED the frame rate/dropped the draw time, as the majority of fragments went one way or the other with a few 'border cases' needing to take both paths, although one path was a trivial 'float4 = float4(0,0,0,0)' path.)
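To make the coherent-versus-divergent point concrete, here is a deliberately crude toy model in C++. It is not a hardware simulator: it just assumes a group of fragments pays for each side of a branch that at least one fragment in the group takes. groupCost and its cost parameters are invented for the illustration.

```cpp
#include <vector>

// Toy model (an assumption, not measured hardware behaviour): a group of
// fragments executes a side of the branch if any fragment in it needs that
// side, so a group that splits pays for both sides.
int groupCost(const std::vector<bool>& takesBranch,
              int costIfTaken, int costIfSkipped) {
    bool anyTaken = false;
    bool anySkipped = false;
    for (bool taken : takesBranch) {
        if (taken) {
            anyTaken = true;
        } else {
            anySkipped = true;
        }
    }
    int cost = 0;
    if (anyTaken) cost += costIfTaken;
    if (anySkipped) cost += costIfSkipped;
    return cost;
}
```

Under this model a group where every fragment agrees costs only one path, while a single disagreeing fragment makes the whole group pay for both. That matches the sprite case above, where every fragment of a quad selects the same texture.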

[quote name='FlyingDutchman' timestamp='1318078882' post='4870480']
Because the amount of needed textures can vary (the minimum being 3)
[/quote]

How can the amount vary? Are the textures created during gameplay or what? I don't understand the problem, to be honest. Normally you know: OK, I load 10 meshes with 10 textures each, so I will need an array of pointers to 100 textures in memory, and then just before the glDrawXY call I switch the pointer.

Don't worry, I've solved my problem now. Thanks for trying to help, though.

Also, thanks to everyone. Love gamedev, as always.
"They are very costly IF different pixels/fragments in the same group take different paths,"

Sorry, yes. I was unclear on this; the expense comes from having to park contexts and come back for them to run the program to completion later, so if none get parked then there isn't a problem.
