I'm having doubts about my map loading method and its performance.


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

17 replies to this topic

#1 Sunsharior   Members   -  Reputation: 521


Posted 15 January 2014 - 03:50 PM

I have implemented map loading logic in the 2D sidescroller I'm making, and it works. But recently I've started to question whether what I did was the right way to do it, and whether it will come back to bite me in the future.

 

One of the first things I coded in my game engine is the map loading logic.

The way I did it is to read a large text file from disk and store its contents in a string.
For reference, the text file is formatted this way:

 

Header~Tile1Param1;Tile1Param2;Tile1ParamN;~Tile2Param1;Tile2ParamN;~TileNParamN;~

 

As you can see, each cell is separated by '~' and the cell parameters by ';'. To sort everything out, I tokenize the whole string into tiles, then retokenize each tile to extract its parameters, and finally feed the tile to the game engine.

 

But the problem is that I make several thousand tokenize calls to get to that end. I've done some tests and the load time "seems" okay with a map size of about half a million tiles (750 by 750 tiles, each tile 16 pixels wide). If I could improve this, I wouldn't have to limit myself to that size.

 

Besides this performance issue, I'm wondering if this is the right approach for the future. What will happen when I use something other than a txt file? Right now, I load a map when the actor passes through the door trigger. Should I preload all adjacent maps before that point?

 

By the way, just for the curious, this is the tokenize function that I call.

//////////////////////////////////////////////////////////////////////////
//
void CMapManager::Tokenize(vector<string> &tokens, const string &str, char delim)
{
		stringstream ss(str); //convert string to stream
		string item;
		while(getline(ss, item, delim)) {
			tokens.push_back(item); //add token to vector
		}
}

I am very sorry if my English sounds terrible, and thank you for your time.




#2 RedactedProfile   Members   -  Reputation: 169


Posted 15 January 2014 - 04:40 PM

Man, it's been a while since I've even thought about tile engines.

 

I can't really contribute to this, as I did ALL of my tile map engines like this:

 

3,3,3,3,3,3,3,3,3
3,0,0,0,1,2,0,0,3
3,8,6,0,1,1,1,0,3
3,0,0,0,0,0,0,0,3
3,3,3,3,3,3,3,3,3

 

I would use a matrix like that, with layers: bottom layer for tile graphics, middle layer for entities, top layer for collision.

 

I kind of like how you've parametrized the maps though, that's very interesting


Signed: Redacted


#3 TheUnnamable   Members   -  Reputation: 803


Posted 15 January 2014 - 05:13 PM

For starters, switch from text to binary :) It will be much faster, since you don't have to go through hundreds of characters, you just read a piece of data and there you are.

I remember seeing people just read directly into the variables ( i.e. foo.read(reinterpret_cast<char*>(&mapWidth), sizeof(int)); ) and write them out as they are. This is certainly fast, but I'm not sure it is good practice: issues could emerge when you copy a map from your computer to another and they happen to use different endianness. Still, I don't know how relevant this problem is, I hope somebody more experienced will answer :)

If it is of concern, you could use a fixed endianness in your files, and swap the byte order while reading/writing if the running machine uses a different one.

 

In case it's of any help, I've written a simple library which does what I've described in my previous sentence. At start it checks what endianness the machine uses. Then you can create buffers, fill them with ( endian-correct ) data and flush those buffers to some file stream or wherever you wish. When reading, you can load the whole file into a buffer and read from it, and endianness will be handled. You do have to tell each buffer what endianness it should use, though ( i.e. if the buffer's endianness differs from the system's, it will flip the byte order ). Feel free to use it or just check the code :)
https://github.com/elementbound/binio



#4 Vortez   Crossbones+   -  Reputation: 2704


Posted 15 January 2014 - 05:56 PM

As the previous user posted, you should really switch to a binary format. That way, you can create a structure for storing a tile's information, then it's just a matter of creating an array of those and loading/saving it in one go. It will not only be faster, but will also save a lot of space.


Edited by Vortez, 15 January 2014 - 05:57 PM.


#5 Sunsharior   Members   -  Reputation: 521


Posted 15 January 2014 - 06:34 PM

Thank you all for the replies.

 

Looks like going binary is the solution my problem needs.

Vortez, is this what they call serialization, or am I mistaken?

 

I currently have a class CTileInfo that contains:

int row
int col
int frontrow
int frontcol
int state
SDL_RendererFlip flip //(can be an int)
SDL_RendererFlip frontflip
string walksound
string script //(script need the same tokenization before they can be processed)
vector<CVector> *vertices

I've never done that before, so I'm unsure how to proceed. I'll give Google a go.



#6 SeanMiddleditch   Members   -  Reputation: 7144


Posted 15 January 2014 - 06:55 PM

Binary formats are not necessarily faster and they have a bazillion other drawbacks.  Disk seek time will completely swamp any CPU processing time.  Text files can in many cases compress better than binary equivalents and so (using a ZIP asset file or zlib to compress your source assets) you can actually get faster loading time with text than binary.  To be sure, measure, measure, measure.
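One way to act on the "measure, measure, measure" advice is a small timing helper like the sketch below. SecondsTaken is a made-up name, and the helper only measures wall-clock time around whatever callable it is given; it assumes a C++11 compiler for std::chrono.

```cpp
#include <chrono>

// Run `fn` once and return how long it took, in seconds.
// steady_clock is used because it never jumps backwards.
template <typename Fn>
double SecondsTaken(Fn fn)
{
    std::chrono::steady_clock::time_point t0 = std::chrono::steady_clock::now();
    fn();
    std::chrono::duration<double> dt = std::chrono::steady_clock::now() - t0;
    return dt.count();
}
```

Usage would look something like SecondsTaken([&] { manager.LoadMap("level1.map"); }), where manager and the file name are placeholders. It is worth timing several runs, since the operating system's file cache usually makes the second read of the same file much faster than the first.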

 

It is very common to use text file formats especially chosen to be easy to inspect and merge in source control for development and then use an asset pipeline to convert that to a "final" form for shipping.  In early development, there's no reason to even think about that final form or efficiency.  Just make it work.  Make the format easy to work with.  Your time is significantly more limited and precious than the computer's time.  I'm sure you have a thousand things your game desperately needs more than faster tile loading right now.



#7 dejaime   Crossbones+   -  Reputation: 4119


Posted 15 January 2014 - 09:29 PM

Text files can in many cases compress better than binary equivalents and so (using a ZIP asset file or zlib to compress your source assets) you can actually get faster loading time with text than binary.  To be sure, measure, measure, measure.

I think he means storing the entire map data in binary, as in copying the data directly instead of reading the config file, interpreting it and creating the data.
Not actually storing the config in a binary file.
 
edit: No, apparently he doesn't. Forget what I said.
Indeed, it may or may not give you an increased load speed.

I've never witnessed a situation where it actually lost performance, though.
 
 
 

Thank you all for the replies.
 
Looks like going binary is the solution my problem needs.
Vortez, is this what they call serialization, or am I mistaken?
 
I currently have a class CTileInfo that contains:
int row
int col
int frontrow
int frontcol
int state
SDL_RendererFlip flip //(can be an int)
SDL_RendererFlip frontflip
string walksound
string script //(script need the same tokenization before they can be processed)
vector<CVector> *vertices
I've never done that before, so I'm unsure how to proceed. I'll give Google a go.

 
I strongly advise against your use of strings.
 
You can use some sort of numeric reference instead (like an std::vector or even an [ int , string ] matrix).
 
This way, you can avoid loading and creating all the same strings multiple times, and also makes your life easier when you want to store the map in a binary format. Whenever you need the string, look it up in your matrix, and you're good to go.
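To make the "numeric reference" idea concrete, here is a minimal sketch of a string table: each distinct string is stored once and tiles keep only its integer id. StringTable, Intern and Lookup are names invented for this example, and the file names in the usage note are placeholders.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class StringTable {
public:
    // Return the id for `s`, adding it to the table on first use.
    int Intern(const std::string &s)
    {
        std::map<std::string, int>::const_iterator it = ids_.find(s);
        if (it != ids_.end())
            return it->second;
        int id = static_cast<int>(strings_.size());
        strings_.push_back(s);
        ids_[s] = id;
        return id;
    }

    // Look the string up again when it is actually needed.
    // Throws std::out_of_range for an id that was never handed out.
    const std::string &Lookup(int id) const
    {
        return strings_.at(static_cast<std::size_t>(id));
    }

    std::size_t Size() const { return strings_.size(); }

private:
    std::vector<std::string> strings_;   // id -> string
    std::map<std::string, int> ids_;     // string -> id
};
```

A tile would then hold something like int walksound_id = sounds.Intern("step_grass.wav") instead of a std::string, which also leaves it as a fixed-size record that is easy to write to a binary file. The table itself still has to be saved once per map (or be global to the game) so the ids mean the same thing on load.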
 
Also, avoid the using namespace std statement.
 
I'll cook a little snippet for you in a moment.


Edited by dejaime, 17 January 2014 - 03:51 PM.


#8 Sunsharior   Members   -  Reputation: 521


Posted 15 January 2014 - 10:43 PM

 

Also, avoid the using namespace std statement.

 

 

Alright thank you for the tip, i'll be sure to remove the using namespace.



#9 dejaime   Crossbones+   -  Reputation: 4119


Posted 16 January 2014 - 12:47 AM

I want you to take a look at some code I wrote comparing two possible approaches to the problem.
As it is too big to post here directly, I'll link you to a gist on GitHub: here.
It is relatively simple code in a single file, so it should be easy to compile.

This is one output for a large set of "tiles" (results vary slightly from one run to another).
 
Starting test for a [7500, 7500] wide set.
Searching for a big enough memory block (450 mb )
Creating the map manually.
Saving the map to the files.
Will now Load the map from the files.

Total Time for each of the tasks -
    Manual Creation:        1.16 s
    Save to TEXT file:      48.12 s
    Save to BINARY file:    0.61 s
    Load from TEXT file:    22.27 s
    Load from BINARY file:  0.18 s
Total Time:                 72.34 s
I guess there's no mistake after all. I was expecting a big difference, but not one this big.
If anyone finds where I messed up, please let me know and I'll fix it.
 
As a curiosity, the files came to 1.4 GB in plain text format and 450 MB in direct binary format (the same size as the RAM block) for the 7500² set.

Disclaimer: I don't know what I'm doing nor what I'm talking about.

Edited by dejaime, 17 January 2014 - 11:29 AM.


#10 Vortez   Crossbones+   -  Reputation: 2704


Posted 16 January 2014 - 02:05 AM


Vortez, is this what they call serialization or am i mistaken ?

 

Hmm, I've never tried serialization, but it's similar. The thing with your example is that there is too much information that is irrelevant to saving and loading your file. You don't need the x and y; you need to store the map's width and height (to calculate the number of tiles), but since that information is only needed once, you don't need to include it in the structure either. I'm not very good at explaining things, so let me give you an example:

#include <cstdio>

struct CTile {
	int state;        // ? <-- not sure this should be there, but i don't know what it does...
	int texture_id;   // those complicate things a bit, you will have to make sure they always
	int walksound_id; // get assigned the same id
	int script_id;
};

struct CTilesMap {
	int Width, Height;
	CTile *pTiles;    // must be set to NULL before the first LoadMap call
};

class CTilesManager
{
public:
	CTilesMap TilesMap;

	bool LoadMap(const char *fname);
	void SaveMap(const char *fname);

	bool LoadScripts();
	bool LoadSounds();
	// ...
};


bool CTilesManager::LoadMap(const char *fname)
{
	// Make sure the tiles aren't already allocated
	if(TilesMap.pTiles)
		return false;

	FILE *f = fopen(fname, "rb");
	if(!f)
		return false;

	// Read the width and height of the map (note the & to pass addresses)
	bool ok = fread(&TilesMap.Width,  sizeof(int), 1, f) == 1 &&
	          fread(&TilesMap.Height, sizeof(int), 1, f) == 1 &&
	          TilesMap.Width > 0 && TilesMap.Height > 0;

	if(ok){
		// Calculate the number of tiles from that
		size_t NumTiles = (size_t)TilesMap.Width * TilesMap.Height;

		// Allocate memory to hold the tiles data (one flat array)
		TilesMap.pTiles = new CTile[NumTiles];

		// Load the rest of the file
		ok = fread(TilesMap.pTiles, sizeof(CTile), NumTiles, f) == NumTiles;
		if(!ok){
			// Short or corrupt file: don't keep half-loaded data around
			delete[] TilesMap.pTiles;
			TilesMap.pTiles = NULL;
		}
	}

	fclose(f);
	return ok;
}

Of course, this is very sketchy and I had a little trouble writing the code because the code is not mine, it's yours, but it should give you an idea about how to load binary files and structure your code. Also, scripts will need to be loaded from text files, and sounds are generally binary but shouldn't be mixed with the map data imo, so that's why I provided 2 extra loader functions. The vertices could be calculated all at once when you're done loading the map and know its size.

 

If you're a beginner it will probably seem overwhelming at first, but as I said, this code is just to show you the general idea of how I would do it, and I wrote it in 5-10 mins, so don't take it too seriously and use whatever method you like.
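The snippet above declares SaveMap without showing it. A minimal sketch of the matching save side could look like the following; it is written as a free function called SaveTiles (a made-up name) over a std::vector so that it stands alone, and it assumes the same layout the loader expects: width, then height, then the raw tile array, all in the host machine's byte order.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

struct CTile {
    int state;
    int texture_id;
    int walksound_id;
    int script_id;
};

// Write width, height, then all tiles in one go.
// Returns false on a size mismatch or any I/O error.
bool SaveTiles(const char *fname, int width, int height,
               const std::vector<CTile> &tiles)
{
    if (width <= 0 || height <= 0 ||
        tiles.size() != static_cast<std::size_t>(width) * height)
        return false;

    std::FILE *f = std::fopen(fname, "wb");
    if (!f)
        return false;

    bool ok = std::fwrite(&width,  sizeof(int), 1, f) == 1 &&
              std::fwrite(&height, sizeof(int), 1, f) == 1 &&
              std::fwrite(&tiles[0], sizeof(CTile), tiles.size(), f) == tiles.size();

    // fclose flushes buffered data, so its result matters too.
    return std::fclose(f) == 0 && ok;
}
```

Writing the struct array raw like this is only portable between builds that agree on sizeof(int), struct padding and endianness, which is the trade-off discussed earlier in the thread.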


Edited by Vortez, 16 January 2014 - 02:32 AM.


#11 Krohm   Crossbones+   -  Reputation: 3245


Posted 16 January 2014 - 02:17 AM


For starters, switch from text to binary It will be much faster, since you don't have to go through hundreds of characters, you just read a piece of data and there you are.
I have to point out that considering binary files to be just "the data, right there" is a good way to get into trouble. Soon. TheUnnamable points out a first example, dealing with endianness. While a real problem, its relevance is overstated in my opinion.

What the binary file does is guarantee each chunk of data has a known footprint, easily inferred from state. It does not guarantee the value itself is coherent with previous state. Input sanitization is still required; what you gain is a much, much more compact parser, which can ideally be 5 LOC for a pure data blob.

 


With small modifications by me

Binary formats are not necessarily faster and they have a bazillion other drawbacks.

  1. Disk seek time will completely swamp any CPU processing time.
  2. Text files can in many cases compress better than binary equivalents and so (using a ZIP asset file or zlib to compress your source assets)...
  3. you can actually get faster loading time with text than binary.
  4. To be sure, measure, measure, measure.

I would like to know what those drawbacks are supposed to be as...

  1. not a binary file problem. Seek is always a problem, no matter if binary or text. But with text you have the additional complexity of parsing, especially if the format is designed to be human-usable;
  2. to a smaller file? Or in percentage? Information is information. Both files store the same amount of information, with data preferring one representation or the other depending on values themselves;
  3. sure you "can" in some circumstances. I don't recall it happening to me however;
  4. please stop with that "to be sure..." thing. Time is not free. Either you think something is worth it or not.

Now, back to the original problem.

Text files are suitable for small amounts of data, such as level configuration in a tower defense game, or a set of tile indices for parts of levels in a tile-based game.

If you can guarantee the syntax is simple, loading text has the inconvenience of variable-length but loading complexity will still stay low. If you care about performance, you might cheat by terminating each buffer with a null temporarily and removing that after processing the token. Or, more nicely, you might switch to a pointer-length approach where string termination is not assumed. This way, memory allocations are lower and performance goes up.
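The pointer-length approach can be sketched roughly as below: instead of copying every token into its own std::string, each token is just a pointer and a length into the buffer that was read from disk. Token and SplitInPlace are names invented for this example; the buffer must stay alive for as long as the tokens are used. Like the getline-based Tokenize from the first post, a trailing delimiter does not produce an extra empty token.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// A token is a (pointer, length) pair into the original buffer; no copy is made.
struct Token {
    const char *ptr;
    std::size_t len;
    // Copy out to a std::string only when one is really needed.
    std::string str() const { return std::string(ptr, len); }
};

// Split data[0..size) on `delim` without modifying or copying the buffer.
std::vector<Token> SplitInPlace(const char *data, std::size_t size, char delim)
{
    std::vector<Token> tokens;
    std::size_t start = 0;
    while (start < size) {
        const void *hit = std::memchr(data + start, delim, size - start);
        std::size_t end = hit
            ? static_cast<std::size_t>(static_cast<const char *>(hit) - data)
            : size;
        Token t = { data + start, end - start };
        tokens.push_back(t);
        start = end + 1;   // skip over the delimiter
    }
    return tokens;
}
```

The map string would be split once on '~', and each resulting token split again on ';' by passing its ptr and len back in, so the only allocations left are the token vectors themselves.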

If you need even more performance, it's probably time to drop text. Filters to binary can cook the data for you in awesome ways.

Also consider json.



#12 ankhd   Members   -  Reputation: 1356


Posted 16 January 2014 - 05:57 AM

Hello.

I always make a header class with info about the map and the data size, then I write the map data after the header.

Eg.

class Header
{
public:
	DWORD MapWidth;      // DWORD is the Windows 32-bit unsigned type
	DWORD MapHeight;
	char MapName[1024];
	int version;
	DWORD DataSize;
	// other things needed
};

That's the header; you write and read this data first.
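Reading a header first also gives a natural place for the input sanitization mentioned earlier in the thread: check the header before trusting it to size any allocation. The sketch below is only an illustration under stated assumptions. MapHeader is a simplified stand-in for the class above using fixed-width types, and the "TMAP" signature, the version number 1 and the kMaxSide limit are all invented for the example.

```cpp
#include <cstdint>
#include <cstring>

struct MapHeader {
    char          magic[4];   // hypothetical file signature, "TMAP"
    std::uint32_t version;
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t dataSize;   // bytes of tile data that follow the header
};

// Arbitrary sanity limit so a corrupt file cannot request a huge allocation.
const std::uint32_t kMaxSide = 16384;

// Returns true only if the header is self-consistent for tiles of `tileSize` bytes.
bool HeaderLooksValid(const MapHeader &h, std::uint32_t tileSize)
{
    if (std::memcmp(h.magic, "TMAP", 4) != 0) return false;
    if (h.version != 1) return false;
    if (h.width == 0 || h.height == 0) return false;
    if (h.width > kMaxSide || h.height > kMaxSide) return false;

    // Do the multiplication in 64 bits so it cannot overflow.
    std::uint64_t expected =
        static_cast<std::uint64_t>(h.width) * h.height * tileSize;
    return expected == h.dataSize;
}
```

A loader would read the header, call HeaderLooksValid(header, sizeof(CTile)), and only then allocate and read the tile data.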



#13 V3ntr1s   Members   -  Reputation: 435


Posted 16 January 2014 - 06:42 AM

 

 

Also, avoid the using namespace std statement.

 

 

Alright thank you for the tip, i'll be sure to remove the using namespace.

 

 

Isn't the using namespace statement only a bad thing if you're using it globally where you have multiple libraries?

I don't see it as a bad idea if you're writing your own library, for example...



#14 dejaime   Crossbones+   -  Reputation: 4119


Posted 16 January 2014 - 10:19 AM

 

Isn't the using namespace statement only a bad thing if you're using it globally where you have multiple libraries?

I don't see it as a bad idea if you're writing your own library, for example...

 

 

As you can see, he was using it inside the file that defines CTileInfo. Chances are, if he is using object orientation, that it is some sort of header file and it is being used globally.

 

For one, spelling out the namespace adds clarity to the code.

 

The std namespace is a special case since every single C++ library will avoid name conflicts with it. But if he, as a beginner, makes this a common practice, he is really likely to run into a naming problem somewhere in the future. I already ran into a problem where a colleague named a solid colours class Solid, just to save the extra characters, and didn't use a namespace. Of course, he also tried to name the solid bodies base class Solid. That didn't work; he then needed to refactor his part to add the namespace and rename them to SolidColor and SolidBody, and we lost an hour of work waiting.

 

If you think you are immune to something like that, well, let me give you another example:

using namespace boost;

using namespace std;

The C++ standard library and Boost may have lots of naming conflicts, depending on your versions and what you are using from each of them.

 

Another point, he is a game developer and probably wants a portfolio to show, with lots of code samples. It is not a good idea to show "lazy" code instead of safe code.

 

So be safe, never use using namespace *; globally. I personally don't even use it locally. This will make your code safer: it will not break if a library changes and starts to conflict with another one in the future.

 

If you really, really, hate typing a namespace, you always have the option to use it like this instead:

using namespace std::vector;

 

This will allow you to use vector without typing std::, while saving you from a conflict if you ever create, say, a class called string.


Edited by dejaime, 16 January 2014 - 10:27 AM.


#15 Sunsharior   Members   -  Reputation: 521


Posted 16 January 2014 - 11:04 AM

When I removed the "using namespace" from the stdafx file (dejaime guessed correctly that I used it globally), I already found a problem where it had been using the "max" from the std namespace instead of my own max. Luckily, it didn't break anything.

 

Tonight after work I will look further into dejaime's example and Vortez's snippet and will provide feedback.

Again, I really appreciate all your help.


Edited by Sunsharior, 16 January 2014 - 11:05 AM.


#16 Dragonsoulj   Crossbones+   -  Reputation: 2126


Posted 16 January 2014 - 11:21 AM


If you really, really, hate typing a namespace, you always have the option to use it like this instead:

using namespace std::vector;

 

I tend to just use:

using std::vector;
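For reference, the two forms are different constructs. A using-directive (using namespace X;) expects a namespace name, so using namespace std::vector; does not compile because std::vector is a class template. A using-declaration (using std::vector;) imports exactly one name. A small sketch, with MakeSounds and the file name being placeholders made up for the example:

```cpp
#include <string>
#include <vector>

using std::vector;   // using-declaration: brings in only the name `vector`

// `string` is still not visible unqualified here, so a user-defined class
// called `string` would not clash with std::string.
vector<std::string> MakeSounds()
{
    vector<std::string> sounds;
    sounds.push_back("step.wav");
    return sounds;
}
```

Keeping even this form out of headers, or at least inside a function or a project namespace, limits how far the imported name spreads.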


#17 V3ntr1s   Members   -  Reputation: 435


Posted 17 January 2014 - 02:13 AM

Dejaime, thank you for clearing that up for me, I can see your point...



#18 Sunsharior   Members   -  Reputation: 521


Posted 19 January 2014 - 08:14 PM

After changing from text to binary, making some adjustments and doing the needed refactoring, everything is working fine. Loading is pretty fast now, but at the cost of a bigger file size, since I can't skip empty tile information the way I could when I used text. Oh well, I guess that's the price to pay. I'll probably work around it. At any rate, I'll optimize it later.
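One possible way around the file size cost, sketched here under assumptions rather than taken from the thread: store only the non-empty tiles, each tagged with its index in the grid, and rebuild the full array on load. The Tile layout, the notion that "empty" means all fields are zero, and the names Pack and Unpack are all invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Tile {
    std::int32_t state;
    std::int32_t texture_id;
};

// A non-empty tile together with its position in the flat tile array.
struct SparseEntry {
    std::uint32_t index;
    Tile tile;
};

// Assumption for this sketch: a tile with every field zero counts as empty.
inline bool IsEmpty(const Tile &t)
{
    return t.state == 0 && t.texture_id == 0;
}

// Keep only the non-empty tiles; this is what would be written to disk.
std::vector<SparseEntry> Pack(const std::vector<Tile> &tiles)
{
    std::vector<SparseEntry> out;
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        if (!IsEmpty(tiles[i])) {
            SparseEntry e = { static_cast<std::uint32_t>(i), tiles[i] };
            out.push_back(e);
        }
    }
    return out;
}

// Rebuild the full array: start from all-empty tiles, then fill in the entries.
std::vector<Tile> Unpack(const std::vector<SparseEntry> &entries, std::size_t count)
{
    std::vector<Tile> tiles(count);   // value-initialized, so every field is zero
    for (std::size_t i = 0; i < entries.size(); ++i) {
        if (entries[i].index < count)   // ignore out-of-range indices from bad data
            tiles[entries[i].index] = entries[i].tile;
    }
    return tiles;
}
```

This only pays off when most of the map is empty, since each stored tile costs four extra bytes for its index. For dense maps, compressing the raw array (for example with zlib, as suggested earlier in the thread) is likely the simpler option.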


Edited by Sunsharior, 19 January 2014 - 08:16 PM.




