Advertisement Jump to content
Sign in to follow this  

What algorithm would minimize repeat loading of every member of nxn grid+neighbors?

This topic is 1811 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hello! I'm loading a large batch of objects and I'd like to optimize a bit. The way the load code currently works, it loads each member of an nxn grid and its 8 neighbors (if they aren't loaded already) while also offloading any loaded object that isn't a neighbor. Rather than brute-force with something like (for x in [0,n] do for y in [0, n] do "if not loaded, load index x+y*n and any not loaded neighbors"), I'd rather precompute a sequence of indices that minimizes the number of times any given tile must be loaded, then iterate across it. Since no object can be operated on without its neighbors having been loaded, I must iterate over every object. What algorithm would produce such a sequence for any given nxn array? I suspect a quad tree would be useful, but there might be simpler ways that I haven't thought of. n in the target case is 512. I'd prefer to assume that each object is very large, and therefore only a minimum of objects should be loaded. Objects are also slow to load/unload, which is why I'd prefer to minimize the "hits." Thanks for any input!

Share this post

Link to post
Share on other sites
I'm having a hard time understanding the full problem. You want to minimize the number of times you load a tile. What currently causes you to load a tile more than once in the first place?

To prevent misunderstandings, what do you mean by the following?:


When do you unload things? Are you scrolling the loaded NxN grid around in a larger grid or something? Are you loading objects only so that you can process them once, then immediately unloading them after that? Edited by Nypyren

Share this post

Link to post
Share on other sites
Sorry if I wasn't clear, I'll clarify:

There is an nxn grid, say:

A1 B1 C1 D1 E1
A2 B2 C2 D2 E2
A3 B3 C3 D3 E3
A4 B4 C4 D4 E4
A5 B5 C5 D5 E5

The current brute force algorithm loads A1, B1, A2 and B2 then processes A1. Next, it loads C1 and C2, and processes B1. Next it loads D1,E1,D2, and E2 while unloading A1 and A2, then processes C1. And so fourth.

In short, iterate over every member of the grid - load any unloaded neighbors, unload any loaded non-neighbors.

My current algorithm takes about 27,000 ms to cover a 512x512 grid. I'd like to get this down. Several orders of magnitude if possible...

edit: WHY does this forum delete newlines?!?!?!?!?!?

Share this post

Link to post
Share on other sites
I guess the algorithm I would use is:

- Load the first row.

- Load the second row.
- Process the first row.

- Load the third row.
- Process the second row.

- Unload the first row.
- Load the fourth row.
- Process the third row.


You only load each cell once, but your memory requirements are 3*n.

If those memory requirements are too high, then instead of a full row, load as much as you can in ways that minimize the amount of overlap when you move to the next loadable region.

If you load anything less than whole rows at once, I believe you will have to load at least one cell multiple times. The larger amount of cells you can keep loaded at once, the less redundant loads you'll have to make.

Another benefit of keeping lots of cells loaded at once is that you could parallel process them. Edited by Nypyren

Share this post

Link to post
Share on other sites
If you're too tight on memory to load 3 rows, here are some other ideas:

Depending on what a cell needs its neighbors for, you may be able to unload everything EXCEPT for that data after you've processed that cell.

So, like:

- Load row 1.
- Load row 2.

- Process row 1.
- Unload the no-longer-needed portions of row 1 (i.e. only keep what row 2 will need to access).

- Load row 3.
- Process row 2.
- Fully unload the remainder of row 1.
- Partially unload row 2.


If you can do this, then you actually only need to keep 1 row (the 'next' row) + 1 cell (the current cell being processed) FULLY loaded at a given time, and can leave the minimum necessary data loaded for the rest of the cells in those 3 rows:

- Load A1, B1, A2.
- Process A1, then partially unload it.
- Load C1, B2.
- Process B1, then partially unload it.

This will continue until you reach the end of row 1, at which point row1 will be partially loaded and row2 will be fully loaded.

From then on:

- Load A3.
- Process A2, then partially unload it.
- Fully unload A1.

- Load B3.
- Process B2, then partially unload it.
- Fully unload B1.

(repeat this process for all further rows)

Doing it this way might drastically fragment memory if your objects use irregular amounts of RAM. If you're using a language that can compact the heap, that could help. If your objects all use the same amount of data, no noticeable fragmentation should occur.

You should be able to do parallel processing in this case as well, as long as you never let the fully-unload action occur on a cell that needs to be used by another active task (i.e. keep the number of tasks limited to <= the number of cells per row), and ensure that the horizontal load step of each task occurs before the next queued task begins. Edited by Nypyren

Share this post

Link to post
Share on other sites

What exactly do you mean by "load" if you want to load 512x512, but you also want to assume they are so large you cannot have more than a few in memory at the same time?


If you're just processing them, you'll want to have the data you actually need to process for all 512x512 tiles in a contiguous block of your data file so you can take one disk hit instead of 250,000 disk hits. As an example, game programmers often do this with height information, pathing data, and tons of other things by just throwing the data in a texture. An added bonus is then we have that data easily accessed in shaders if we want it. You can do this with any data you want.


Beyond that, make sure you're doing block reads instead of byte-by-byte or variable by variable. Just read the whole file into a char* and sscanf off of the char*, or a std::stringstream if you don't like sscanf, or whatever your language uses if you're not using c/c++.


If that doesn't get the performance you want, you should really provide more details. Optimization is very application-specific, so it's kind of hard guessing what will get you orders of magnitude performance improvements or if that is even possible without knowing what you need to do.

Share this post

Link to post
Share on other sites
Sign in to follow this  

  • Advertisement

Important Information

By using, you agree to our community Guidelines, Terms of Use, and Privacy Policy. is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!