Sorting by Texture?

Started by
8 comments, last by DirectXFreak 18 years, 3 months ago
Greetings, I am going to write an overhead 2D game, but with a fixed 3D camera. I had considered using ID3DXSprite, but it's strictly 2D, and doesn't support world coordinates. Anyway, I've written my own sprite engine that contains a dynamic vertex and index buffer. The player may call onBegin()/onEnd(), just like ID3DXSprite, and Draw(...) within them. Now, I have the system rendering just fine, however I've come accross an issue with sorting my quads by texture. Basically the system goes through the following steps for each Draw(...) call: 1) Create a CSpriteInstance object and fill in its parameters such as texture, world matrix, and color. Add this object to a list. Then, within the onEnd(...) function: 1) Sort the list by texture 1) Loop through each instance 2) while we're on the same texture, add a quad to the dynamic buffers. Otherwise, flush the buffer, and set the new texture. So, this implimentation runs quite fast, except when I try to sort the list. Instead of 60fps, it gets cut to 30fps, while drawing about 500 quads. I tried using std::list::sort(), but quickly found out what a terrible idea that is; Next I tried std::multimap or the data structure, and it works twice as fast as the std::list (30fps), but still didn't reach the unsorted 60fps. What's a more efficient (and preferably easy to code) method of accomplishing this task? Thanks
--m_nPostCount++
Advertisement
Well because you are sorting, you're not going to run as fast as when you aren't sorting. A frame rate cut in half seems quite severe, but one optimization you could do is caching, aka keeping track of what's changed so you're not sorting your entire scene for every frame. Of course your method of sorting could be at fault as well, but I'm going to assume that you're smart enough to not be using bubble sort.

What sorting algorithm are you using and how big is a given scene?
Hey,

I used std::sort(...) and the speed hit isn't nearly as much.
I think it'll be fine,
thanks.
--m_nPostCount++
...
I'm confused. If the sorting version of whatever isn't as fast as the unsorted, why are you sorting?
I like the DARK layout!
Probably cause hes using a Vertex Buffer, which will allow him to draw multiple polys with the same texture with one draw call. I have successfully pulled this off in VB6 using DirectX8, but if you are using C++, you are gonna have to convert the code to C++ and possibly DX9 if you are using it. Here's a link to my 2D DX8 Tutorial:

http://www.vbforums.com/showthread.php?t=382077

It'll be located in the Tile Engine folder. Good luck!
Hey, thanks Jacob Roman, I'll check out the code =D
--m_nPostCount++
Is that 30 fps in a release build? Did you try adjusting optimization settings?

You might want to try using a map(or hash table) of vectors, and doing something like TheMap[TextureID].push_back(ThisThingy)
You could even use the last few frames to know about how big each vector needs to be to help avoid unneeded allocations.
Using a list or map directly to store the items will likely result in the data being spread all over the place, which is very bad for cache coherency. Using a vector would be better, but you need things divided by the texture ID which means a map makes sense. Having a map that does the dividing and a vector that does the actual storing seems like a win-win situation, as long as you don't have too many texture IDs. If practically every object uses it's own texture, using a nested structure like this would just increase overhead.
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
I use this guy's merge sort for lists, adapated for Amiga style double-linked lists. Simon apparently works for ARM, as in the processor people, so he likely knows his stuff. In Raze's Hell this sort used about 0.01% of the CPU time. This was with a non-inlined comparison... it was doing a function call to compare, like qsort() does, to account of all the various things we needed to sort by. Non of the objects sort order was cached, it was rebuild from scratch each frame. We found this speed to not only be acceptable, but amazingly quick.
Thanks guys, you've helped a ton. I'm considering using a map with vectors as Extrarius proposed.

--m_nPostCount++

This topic is closed to new replies.

Advertisement