
How can I save/flush the offline data to disk and resume from there next time?

This topic is 819 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.


I need to generate a large amount of map data from my game editor offline.

But the generation takes more than 6 hours to complete.

When the end of the day comes, you need to stop working at some point.

You need to flush the data to disk.

But how can I resume from the checkpoint next time?

What variables do I need to save for next time commencing?

I don't want to calculate everything from the start.

What strategy is good for this kind of data dumping?

Thanks

Jack


"What variables do I need to save for next time commencing?"

This depends heavily on your algorithm and code.

 

"What strategy is good for this kind of data dumping?"
I assume that your algorithm has some kind of state. For example, you are converting a heightmap to a normal map. Your input is the heightmap, your output is the normal map, and your state is your position in the heightmap.

For example, you have a 65536x65536 heightmap, you are at [17, 8192], and you want to continue tomorrow. So you save the normals you have calculated so far (that is, you flush the offline data) and write your position somewhere else, possibly to a cache file.

We can probably help you further if you tell us more about your code/algorithm. 

 

EDIT: One more thing to look out for: you want to be able to save your output incrementally. An obvious solution here is to keep the full grid of normals. The ones you haven't calculated yet are assumed to be zero, and the ones you've already calculated hold their values. The default value doesn't matter, because the saved state I mentioned earlier already tells you which values have been calculated.

When you flush your data again, the only difference is that you have fewer default values in your output.
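To make the idea concrete, here is a minimal sketch of that save-and-resume loop. Everything in it is an assumption for illustration: the `Checkpoint` struct, the file layout, and the `computeCell` callback, which stands in for the real per-cell work such as computing one normal. A flat cell index is used as the position; for a grid of a given width, row and column would be `index / width` and `index % width`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical checkpoint: results so far plus the position to resume from.
struct Checkpoint {
    std::size_t next = 0;       // first cell that has NOT been computed yet
    std::vector<float> output;  // cells at index >= next still hold the default 0
};

// Writes position and output in one file (native binary layout, not portable).
bool saveCheckpoint(const std::string& path, const Checkpoint& cp) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    const std::size_t count = cp.output.size();
    out.write(reinterpret_cast<const char*>(&cp.next), sizeof cp.next);
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    out.write(reinterpret_cast<const char*>(cp.output.data()),
              static_cast<std::streamsize>(count * sizeof(float)));
    return static_cast<bool>(out);
}

// Returns false if there is no usable checkpoint, so the caller starts fresh.
bool loadCheckpoint(const std::string& path, Checkpoint& cp) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::size_t next = 0, count = 0;
    in.read(reinterpret_cast<char*>(&next), sizeof next);
    in.read(reinterpret_cast<char*>(&count), sizeof count);
    if (!in || next > count) return false;
    std::vector<float> data(count);
    in.read(reinterpret_cast<char*>(data.data()),
            static_cast<std::streamsize>(count * sizeof(float)));
    if (!in) return false;
    cp.next = next;
    cp.output = std::move(data);
    return true;
}

// Processes at most `budget` cells, then flushes to disk.
// Returns true once every cell has been computed.
template <class Fn>
bool runSlice(const std::string& path, std::size_t total,
              std::size_t budget, Fn computeCell) {
    Checkpoint cp;
    if (!loadCheckpoint(path, cp) || cp.output.size() != total) {
        cp.next = 0;                    // nothing usable on disk: start over
        cp.output.assign(total, 0.0f);  // default value for uncomputed cells
    }
    const std::size_t stop = std::min(total, cp.next + budget);
    for (; cp.next < stop; ++cp.next)
        cp.output[cp.next] = computeCell(cp.next);
    const bool saved = saveCheckpoint(path, cp);
    assert(saved && "could not write checkpoint");
    (void)saved;
    return cp.next == total;
}
```

Calling `runSlice` again the next day with the same path and total picks up at the stored position. In a real tool you would probably flush every few minutes rather than once per run.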


Initializing the values to some default values is a good idea though, I think I'll look into that method.

The only problem is when I perform something like

void Grid::calculateActualCosts() {
    for (auto& walkable : m_walkables) {
        for (auto& walkables : m_walkables) {
            AStarNode* fromNode = acquireNode(...);
            AStarNode* toNode = acquireNode(...);
            AStarNodePair pair(fromNode, toNode);
            ///
            astar(fromNode, toNode, totalCost);
            actualCosts.insert(std::make_pair(pair, totalCost));
        }
    }
}

Do I just write a placeholder totalCost such as NaN into the file and calculate only the node pairs that aren't initialized? But then tomorrow I would still have to restart the whole loop regardless.

How can I simplify this?

 

Update:

Do I put a conditional branch there that reads the file at that position to see whether it has been initialized or not?

I think so.

 

Update2:

I think I can just sort the data set and seek to the point where it was last saved.

 

Update3:

I've got a better idea. Let's put the whole process into a VM and resume it from there. How easy...

 

Thanks

Jack

Edited by lucky6969b


Yes, the quick and dirty-ish solution would be to use a conditional branch. Something like this:

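void Grid::calculateActualCosts() {
	// Assumes (last_i, last_j) is the first pair that has NOT been processed yet.
	unsigned last_i = 0;
	unsigned last_j = 0;
	if(readingFromFile())
	{
		last_i = getLastI();
		last_j = getLastJ();
	}
	
	unsigned i = 0;
	for (auto& walkable : m_walkables) {
		const unsigned row = i++;
		if(row < last_i)
			continue; // this row was finished in an earlier run
		
		unsigned j = 0;
		for (auto& walkables : m_walkables) {
			const unsigned col = j++;
			// Skip columns only in the row where the last run stopped;
			// later rows must be processed from column 0.
			if(row == last_i && col < last_j)
				continue;
			
			AStarNode* fromNode = acquireNode(...);
			AStarNode* toNode = acquireNode(...);
			AStarNodePair pair(fromNode, toNode);
			///
			astar(fromNode, toNode, totalCost);
			actualCosts.insert(std::make_pair(pair, totalCost));
		}
	}
}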

Or, you could keep a std::set<std::pair<walkable, walkable>>. If a pair of two walkables is in the set, it has already been processed and you can just continue, like this:

// Inside the outer loop over walkable:
for (auto& walkables : m_walkables) {
	if(progressSet.count({walkable, walkables}))
		continue; 
	
	AStarNode* fromNode = acquireNode(...);
	AStarNode* toNode = acquireNode(...);
	AStarNodePair pair(fromNode, toNode);
	///
	astar(fromNode, toNode, totalCost);
	actualCosts.insert(std::make_pair(pair, totalCost));
	
	progressSet.insert({walkable, walkables});
}

At the beginning of the function you'd start with an empty set, and if you are continuing, read the set from a file. 
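Reading and writing that set could look roughly like the sketch below. It assumes each walkable can be identified by an integer index, and the names `PairId`, `loadProgress`, `recordProgress` and `processAllPairs` are made up for this example. The `work` callback stands in for the A* call; the computed costs themselves would have to be written out as well, which is left out here.

```cpp
#include <cassert>
#include <fstream>
#include <set>
#include <string>
#include <utility>

using PairId = std::pair<int, int>;  // hypothetical: indices of two walkables
using ProgressSet = std::set<PairId>;

// Reads all finished pairs; a missing file simply yields an empty set.
ProgressSet loadProgress(const std::string& path) {
    ProgressSet done;
    std::ifstream in(path);
    int from = 0, to = 0;
    while (in >> from >> to)
        done.insert(PairId{from, to});
    return done;
}

// Appends one finished pair to the file and to the in-memory set.
void recordProgress(const std::string& path, ProgressSet& done, PairId id) {
    std::ofstream out(path, std::ios::app);
    out << id.first << ' ' << id.second << '\n';
    done.insert(id);
}

// Runs `work(i, j)` for every pair not yet recorded in the progress file.
// Returns how many pairs were actually computed in this run.
template <class Work>
int processAllPairs(const std::string& path, int count, Work work) {
    assert(count >= 0);
    ProgressSet done = loadProgress(path);
    int computed = 0;
    for (int i = 0; i < count; ++i) {
        for (int j = 0; j < count; ++j) {
            if (done.count(PairId{i, j}))
                continue;  // already handled in an earlier run
            work(i, j);
            recordProgress(path, done, PairId{i, j});
            ++computed;
        }
    }
    return computed;
}
```

Appending after every pair keeps the loss small if the run is interrupted, at the price of one small write per pair. With hours of pathfinding per run that overhead is probably negligible, but that is a guess about your workload.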

 

Your update2 also sounds valid, if it is practical to sort your data. 

At first, a VM sounds like overkill, but if it's something you only run on your dev computer (I assume so), it is the fastest way to a solution.


You should also consider optimizing your algorithms.

QFE.

I find it hard to believe that your program is actually doing 6 hours of work. Even Dwarf Fortress' notoriously involved world building process is a matter of minutes on a relatively modern machine.

It is quite easy to take the straightforward solution to a particular problem and find that it has turned an O(N*N) problem into an O(x^N) one, or even worse.

Apart from optimizing, you could of course simply leave the computer on overnight. Or you could go the same route that every IDE goes.

 

What happens if you build a project with 20,000 files in Visual Studio (or in Eclipse) and you abort the build after 1,500 files? Next time you tell it "build all", it will not build the 1,500 files for which object files exist that have a timestamp equal to the corresponding source file.

 

Surely, any task that takes several hours can be broken down into sub-tasks that take only a few minutes and that can be saved to disk as you go. Then just a final pass is needed assembling all the pieces together (just like the link stage). If the output of one step is needed for another, you can also restore to a workable state very quickly from saved intermediate results when starting the build process again the next day.

 

If a terrain file exists that has the same timestamp as your terrain creation parameters, you need not recreate that patch of terrain. If a "walkable" file that refers to this terrain exists, and it has the same timestamp as the terrain, you need not recalculate its path. And so on.
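A rough sketch of such a check with std::filesystem (C++17) follows. Note one deliberate change: instead of requiring equal timestamps, it uses the make-style rule that an output is current if it is at least as new as its input. The function names are invented for this example, and `generatePatch` is only a placeholder for the expensive generation step.

```cpp
#include <cassert>
#include <chrono>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// True when `output` exists and is at least as new as `input`,
// so it does not need to be regenerated.
bool isUpToDate(const fs::path& input, const fs::path& output) {
    std::error_code ec;
    if (!fs::exists(input, ec) || !fs::exists(output, ec))
        return false;
    const auto inTime = fs::last_write_time(input, ec);
    if (ec) return false;
    const auto outTime = fs::last_write_time(output, ec);
    if (ec) return false;
    return outTime >= inTime;
}

// Placeholder for the real work: it only writes a marker file.
void generatePatch(const fs::path& params, const fs::path& patch) {
    std::ofstream out(patch, std::ios::trunc);
    out << "generated from " << params.string() << '\n';
}

// Runs `build` only when the output is missing or stale.
// Returns true if a rebuild actually happened.
template <class Build>
bool rebuildIfStale(const fs::path& input, const fs::path& output,
                    Build build) {
    if (isUpToDate(input, output))
        return false;
    build(input, output);
    return true;
}
```

Chaining works the same way: the terrain patch is the output of the parameters and in turn the input of the "walkable" file, so an interrupted run only redoes the pieces whose outputs are missing or stale.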

Edited by samoth

