Architecture for Isometric Style Game

Started by
6 comments, last by Alberth 8 years, 6 months ago

I am developing an isometric style game. My main game loop is

  1. Process Input
  2. Update
  3. Draw Screen

I have one c++ vector that contains all the tiles and objects that I will draw. I have them in this structure for one reason: Since this is isometric style, I need to sort this structure by depth so that, when drawn, objects properly overlap each other. This much works fine.

I am implementing more features now that involve several moving objects on the screen. Here's where I started crashing.

I was crashing because I tried to move objects in the vector while the drawscreen routine was reading from it (or sorting it). My solution was to perform all movements in the update function so that by the time drawscreen executed, all manipulations have finished.

This appears to have caused a bit of a performance hit that I have yet to explain. But before delving in too deep in debugging this, I wanted to run by current setup by you guys to see what you thought. I have no idea how this stuff should be done, kinda just winging it as I go.

Does it make sense to keep all objects (tiles, enemies, players) in a single structure? It seems like a bad idea.. I would have rathered a different structure for each type, but that makes it much less efficient to sort and draw them in the right order.

Is there anything else that stands out as being weird or wrong?

Thanks

Advertisement

It is best to keep update and rendering options separate, so that was a good change.

Do you make a lot of use of partial transparency in your sprites, or are they all alpha masked instead? If masked, then you might be able to sort your scene front to back with depth testing enabled to take advantage of early rejection.

Do you pack your sprites into texture atlases? Doing so will allow you to compose your draw calls into batches, rather than a single draw call per object.

There are quite a few optimizations that can be made if you study the ways that more modern scene architectures work.

Do you literally use a std::vector?

If so, adding something near the start of the vector will cause copying (for moving to make room) of all elements behind it, which depending on the length of the vector can be very costly.

You may want to use a different data structure instead. I use a std::multiset structure that I fill with the things that need to be drawn at the screen (my world is bigger than the screen). As sorting order (The '<' relation), I use the order of rendering. After filling the map, I iterate from small to big through the elements, and due to my sorting order I thus draw things in order I want.


/**
 * Data temporary needed for ordering sprites and blitting them to the screen.
 * @ingroup viewport_group
 */
struct DrawData {
    int32 level;                 ///< Slice of this sprite (vertical row).
    uint16 z_height;             ///< Height of the voxel being drawn.
    SpriteOrder order;           ///< Selection when to draw this sprite (sorts sprites within a voxel). @see SpriteOrder
    const ImageData *sprite;     ///< Mouse cursor to draw.
    Point32 base;                ///< Base coordinate of the image, relative to top-left of the window.
    const Recolouring *recolour; ///< Recolouring of the sprite.
    bool highlight;              ///< Highlight the sprite (semi-transparent white).
};

/**
 * Sort predicate of the draw data.
 * @param dd1 First value to compare.
 * @param dd2 Second value to compare.
 * @return \c true if \a dd1 should be drawn before \a dd2.
 */
inline bool operator<(const DrawData &dd1, const DrawData &dd2)
{
    if (dd1.level != dd2.level) return dd1.level < dd2.level; // Order on slice first.
    if (dd1.z_height != dd2.z_height) return dd1.z_height < dd2.z_height; // Lower in the same slice first.
    if (dd1.order != dd2.order) return dd1.order < dd2.order; // Type of sprite.
    return dd1.base.y < dd2.base.y;
}

/**
 * Collection of sprites to render to the screen.
 * @ingroup viewport_group
 */
typedef std::multiset<DrawData> DrawImages;

DrawImages draw_images;
draw_images.clear();

// Do lots of draw_images.insert(), adding everything that needs to be drawn.

for (const auto &iter : collector.draw_images) {
    const DrawData &dd = iter;
    const Recolouring &rec = (dd.recolour == nullptr) ? recolour : *dd.recolour;
    _video.BlitImage(dd.base, dd.sprite, rec, dd.highlight ? GS_SEMI_TRANSPARENT : gs);
}


Do you make a lot of use of partial transparency in your sprites, or are they all alpha masked instead? If masked, then you might be able to sort your scene front to back with depth testing enabled to take advantage of early rejection.

I've been using partial transparency in the sprites. Art isn't really my strong point so I'm considering everything I have to be placeholder, but I've done nothing fancy with it. Just made some stuff in paint.net as a .png with transparency and loaded em in.

The scene is sorted back to front before every draw. You've lost me on the part about "early rejection" though..


Do you pack your sprites into texture atlases? Doing so will allow you to compose your draw calls into batches, rather than a single draw call per object.

I do (although I didn't know it was called a texture atlas). I have a single .png containing my ground textures in a grid. On initialization, I create a texturemanager class to load in the individual sprites to prepare them for use (though I suppose if I wanted it to be more memory efficient I could load them in on demand.. but I think that would just trade off processing power for memory).


Do you literally use a std::vector?

I do. At the time it was between an array and a vector and I found vector was more useful for what I was doing.

I'm a bit lost looking at your code, I guess I'm not advanced enough in C++ to understand that at a glance. I don't know why you'd need more than one way to sort it .. I think it comes down to your level/slice being a more advanced implementation than I am doing. I consider everything to be on the same "level".

For every object in my vector, I use the coordinates to calculate the "base" coordinates and the "draw" coordinates. The "base" coordinates represent the coordinates at which, if an object were to overlap, which one would be drawn on top.

Here's a crude representation of what I did: 5Zxjxul.png

So the green dot represents the actual coordinates stored on the object. I've also stored the height and width of the "foot" or "base" of the object which outlines where the object touches the ground. Using those, I calculate the red dot, the "base coordinates". The y base coordinate is what I use when sorting the vector (in ascending order) because if your y coordinate is higher than another's, you need to be drawn after them (to show up in front of them).

I spent a long time (1-2 months) messing with this to get it right and it involved an awful lot of trial and error considering I was just making it up as I went.

I do like the idea of having a smaller structure only containing stuff that will be drawn to the screen. Right now my code looks like


drawScreen()
{
   sort gameObjects;
   loop through all gameObjects
   {
      If gameObject is within a set distance from the player
      {
         gameObject.draw();
      }
   }
}


But I could change it to


drawScreen()
{
   loop through all gameObjects
   {
      Find all gameObjects within a set distance from the player
      Add them to drawObjects
   }
   sort drawObjects;
   loop through all drawObjects
   {
      gameObject.draw();
   }
}

It will probably be more efficient.

Your objects' relative sorted positions should be fairly consistent frame-to-frame, right? In that case, use a sorting algorithm that works quickly with already-mostly-sorted data (like insertion sort).


Do you make a lot of use of partial transparency in your sprites, or are they all alpha masked instead? If masked, then you might be able to sort your scene front to back with depth testing enabled to take advantage of early rejection.

Is that always true? From my understanding, alpha-testing will interfere with early-z rejection, so this won't always be a performance win (and is only relevant anyway if the OP's performance issue is GPU-related).

For performance it is good to break up the draw process to the update/move process. You may want to time how many ticks it is taking to process the update loop and adjust your frame rate based on this decision. Depending on the number of cache misses your running into this could also be pulling down your performance. If your data is creating a lot of cache misses then you might want to consider using a more contiguous data structure or sorting your data so that cache pulls related neighbors for your algorithm.

The way to think about cache is imagine an accountant going through tons of boxes searching for the information he needs for a report. He can only get one crate from the warehouse at a time so he tells the fork lift driver to pick up a few extra boxes from neighboring bins. When he opens a couple of boxes that has related information in them then he has a cache hit and is more productive. But say he doesn't find anything in the boxes, he has to send them back to the warehouse and keep looking, a cache miss.

If you have a lot of misses in your cache then your performance will be slow. I know you thought about using an array but the standard vector seemed more useful. This is fine but if you adding and removing a lot from the vector you could be taking a performance hit here. You will need to do some investigating to figure out how much this is affecting your performance.

Another thought I had is if your using Visual Studio C++ then you might want to check your build. I know when I build in debug most of the std data structures perform very poorly. There is a lot of overhead in the debug build. Try and change it to release and see how it performs. I was shocked how much of a difference this made for one std data structure I was using and had a 100x performance gain.


Your objects' relative sorted positions should be fairly consistent frame-to-frame, right? In that case, use a sorting algorithm that works quickly with already-mostly-sorted data (like insertion sort).

For an isometric game the ordering is going to be *EXTREMELY* consistent frame to frame. Whatever is inside the frustum may change, but the process of rendering is walking a grid.

There was an approach that worked extremely well when hardware was based on tile-based rendering, but tends to degrade quickly on today's hardware. An isometric world is just a regular grid. You traverse the grid so the most distant row's sprites are drawn bottom to top. Then you draw the next closest row drawing the sprites bottom to top. SO if a tile has three sprites on it you draw the bottom sprite, the next higher sprite, and the highest sprite. You can go right to left or left to right based on whatever traversal pattern works easier for you, always working from graphically back row to the graphically front row, each tile bottom to top.

That worked amazingly well back when graphics cards were designed for tile-based sprite rendering because ultimately you were passing a simple array of sprite indexes.

These days when graphics cards are designed for point-cloud soup, those same tile-based sprite rendering calls that worked before tend to quickly saturate the modern system's data bus. You may end up making 5000+ or 8000+ or 10000+ individual draw calls for all those tiny little tiles, and it degrades performance.

So as an alternative these days, one magical solution is to either pre-render the static section of the world into large images or to render sections of the world at run time, and use a fun tool called a "depth image".

A depth image is basically a snapshot of the depth buffer. You render the isometric screen as a static 2D image and also capture the recorded depth at each point of the image. Then when it comes time to render you make a single draw call that does all the background (ignoring the depth test) and another call that writes the values to the depth buffer. After you've got your background static sprites rendered in one large block, you can then draw your dynamic sprites, whatever for or five or ten small images you need.

This gives you a small number of draw calls for the big background of static tiles -- perhaps four or six draw calls if you broke the mega-image into pieces -- plus the five or ten or whatever you need for your dynamic items.

We'll call that roughly 20 draw calls versus 8000 draw calls. A little more work up front, but far more efficient for the frame-to-frame rendering of isometric worlds.

I'm a bit lost looking at your code, I guess I'm not advanced enough in C++ to understand that at a glance. I don't know why you'd need more than one way to sort it .. I think it comes down to your level/slice being a more advanced implementation than I am doing. I consider everything to be on the same "level".

A slice is a horizontal row at the screen, or a diagonal line on the grid. One tile position is actually a stack of tiles, as you can have things above each other. Within a tile stack, you sort on height from low to high (low gets drawn first, so higher things stay visible). Within a single tile at a single height I draw several things, again from back to front, eg first fences in the background, then people, then fences near the front.

In my game (FreeRCT), there are actually 4 views; the user can rotate the world 90 degrees. To implement, you either need to rotate the entire map-data, or you rotate the view. I chose the latter, which means "in front of" above changes with rotation.

But I could change it to

* Find all gameObjects within a set distance from the player
- Add them to drawObjects
* sort drawObjects;
* loop through all drawObjects
- gameObject.draw();

This is one of the points I was trying to get across. This is how I draw things currently. The std::multiset uses '<' order to sort its values internally.

So std::multiset<int> will work out of the box, std::multiset<MyClass> needs a '<' operator.

Now the clever bit that I do is the 'MyClass::operator<' (written as function 'bool operator<(const MyClass &c1, const MyClass &c2)') uses the sort order you need for "sort drawObjects". Thus when I insert an object into 'std::multiset<MyClass> drawObjects', it uses the drawing order sort to place the new element in its storage. As a result, when I am done adding all drawObjects, it is already sorted in the order I want. I can go straight to drawing them.

frob is correct in stating that you don't actually need to do sort (at least not on this scale). If you know the direction of viewing, you can compute the visible tiles, and code the sorting order into a "draw-tile" routine (I'd need 4 such routines, but still very doable). Call that routine fro every visible tile, (from back to front) and done! That does imply that you can quickly find tile contents by map coordinate though.

I could code such a drawing routine (my map has all tile contents), and have plans to eventually do that. My sorting was added in the beginning of the project, before I really knew what I had to do for drawing. One of the nice things is that changing order of drawing is simple, just change the '<' implementation. That has proven beneficial for development a few times. For this reason I am not in a big hurry to replace it until development is close to completion.

Performance is horrible in my game currently, as indeed access to the GPU and video memory is too slow, as already stated by frob. This problem is a lot higher on my list of things to fix, but not at the top.

This topic is closed to new replies.

Advertisement