Optimization of a method

Started by
6 comments, last by frob 10 years, 6 months ago

The method below works fine for what I want to accomplish. The problem with it is that it is way too slow so I'm hoping that someone can help me out with it. Without the method, the game loads almost immediately but with the method, it takes 2 or 3 minutes for the window to even open.


 private void CheckAdjacentCubes(List<Cube> cubes, List<Cube> cubeToDraw)
        {
            //cubes is main list of cubes
            //cubeToDraw is the list of cubes to draw after adjacency check

            foreach (Cube cube in cubes)
            {
                foreach (Cube adjacentCube in cubes)
                {
                    if (cube.World.Translation.Z + 0.5f == adjacentCube.World.Translation.Z - 0.5f)
                    {
                        backSide = true;
                    }
                    if (cube.World.Translation.Z - 0.5f == adjacentCube.World.Translation.Z + 0.5f)
                    {
                        frontSide = true;
                    }
                    if (cube.World.Translation.X + 0.5f == adjacentCube.World.Translation.X - 0.5f)
                    {
                        rightSide = true;
                    }
                    if (cube.World.Translation.X - 0.5f == adjacentCube.World.Translation.X + 0.5f)
                    {
                        leftSide = true;
                    }
                    if (cube.World.Translation.Y + 0.5f == adjacentCube.World.Translation.Y - 0.5f)
                    {
                        topSide = true;
                    }
                    if (cube.World.Translation.Y - 0.5f == adjacentCube.World.Translation.Y + 0.5f)
                    {
                        bottomSide = true;
                    }
                }

                if (bottomSide != true || topSide != true || leftSide != true || rightSide != true || frontSide != true || backSide != true)
                {
                    cubeToDraw.Add(cube);
                }

                bottomSide = false;
                topSide = false;
                leftSide = false;
                rightSide = false;
                frontSide = false;
                backSide = false;
            }
        }

I'm assuming it's because I'm looping through every single cube in the list, and with a 20x20x10 chunk that'll take a while. Any thoughts on what I could do differently?

If you see a post from me, you can safely assume its C# and XNA :)

Advertisement

Where to start.. it's not the method itself you need to optimize, it's the way you're storing the cubes. At the moment your method is identifying adjacent cubes on the fly, for each cube. But the problem is that each cube is being checked against every single other cube to determine if it is surrounded by cubes on all sides, and hence if it is occluded and should not be rendered. The main issue here is that you are not using the right data structure to store your cubes. Your current data structure (a raw array of cubes) provides no sane way to check if a cube is completely surrounded by other cubes. It's just not efficient, and no amount of micro-optimizing your current approach will make it efficient: your data structure is not giving you the information you need. What you want is a better way to represent your world such that you can very quickly answer the question "should I draw this cube" without iterating through the entire list of cubes to find out.

There are many possible approaches, the easiest (but probably least flexible) is perhaps to store your cubes inside a 3D array, with their positions inferred from their index in the array, and then adjacency check consists of simply checking if the 6 surrounding array indices contain cubes. Of course this is still not particularly efficient, you will have poor cache locality and will probably still be doing too much work. Another option is to store pointers to adjacent cubes inside your cube structure while constructing your world (and keeping track of them as you add and remove cubes) which could be interesting, and might make flood-fill type occlusion algorithms easy to implement.

But here's something to consider: do you really need 100% accuracy on the adjacency test? What if you could have a very quick CPU test that culls, say, 95% of all the cubes that will never be seen, and let the graphics card churn through the rest? Wouldn't that be a better tradeoff than going to great lengths to ensure that every last occluded cube is never drawn, at an immense computational cost to the CPU? Think of frustum culling, which is a related - though quite distinct - optimization technique. Does the CPU cull every off-screen triangle itself? No, that would be stupid, it only roughly checks if large blocks of the world are off-screen, and if it misses some that's fine, all that matters is that it culled most of the off-screen triangles at a very low cost. Of course frustum culling does not do the same thing as occlusion culling (the former reduces draw calls whereas the latter tries to get less triangles to the rasterization stage) but the basic idea is the same: it doesn't have to be perfectly accurate, just correct and good enough for your game to run faster.

With that in mind, what would be a fast but approximate approach to determining occlusion for a cube world? One way which springs to mind is to store your cubes in a special octree, which tracks which nodes are completely full with cubes, and that way you can take a look at the topmost nodes of the tree, discard huge chunks of cubes if they are occluded by other huge chunks of the same size. For instance, think of the inside of a mountain, there is probably a 5x5x5 chunk inside it which is surrounded by other 5x5x5 chunks on all six sides, and so you can immediately discard those 125 cubes as occluded and not draw them at all - that is the idea I am trying to convey. You can then look at the smaller nodes one level deeper, and do the same thing. Eventually, if you carried on this process down to the deepest level of the octree, you would be left with only the surface of your cube world, with no occluded cubes drawn at all. It would probably be faster than your current approach, but you will notice that it is likely not worth it going through the entire tree, as going deeper gets progressively more expensive. It's up to you to decide where to stop to achieve optimum balance between fast CPU-side culling and letting the GPU deal with leftover cubes, likely through empirical testing.

Note I haven't tested the above approach, and there are likely much better alternatives out there for voxel-style worlds like yours, and you should probably not use the above algorithm literally. It was just to illustrate that using a better data structure which maps well to the problem you are trying to solve is the way to go, and that sometimes going for perfection can actually make things run slower. And, yes, it's a bit more involved than storing cubes in a list and doing it the naive way, but if everything was simple enough to be done efficiently with only arrays and for loops there would be no need to come up with advanced data structures! :)

(btw, you are comparing floats in your code, this is not very robust: if your positions can only be integers, then make it so)

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

For one, I’ve heard that “foreach” is extremely slow.

For another, you are checking every single side of the cube even though it will get added to the list if any sides are not touching. In other words, there is no reason to continue checking sides as soon as a single side is found not to be touching.

 private void CheckAdjacentCubes(List<Cube> cubes, List<Cube> cubeToDraw)
        {
            //cubes is main list of cubes
            //cubeToDraw is the list of cubes to draw after adjacency check

            foreach (Cube cube in cubes)
            {
                foreach (Cube adjacentCube in cubes)
                {
                    if (cube.World.Translation.Z + 0.5f != adjacentCube.World.Translation.Z - 0.5f)
                    {
                        cubeToDraw.Add(cube);
			break;
                    }
                    if (cube.World.Translation.Z - 0.5f != adjacentCube.World.Translation.Z + 0.5f)
                    {
                        cubeToDraw.Add(cube);
			break;
                    }
                    if (cube.World.Translation.X + 0.5f != adjacentCube.World.Translation.X - 0.5f)
                    {
                        cubeToDraw.Add(cube);
			break;
                    }
                    if (cube.World.Translation.X - 0.5f != adjacentCube.World.Translation.X + 0.5f)
                    {
                        cubeToDraw.Add(cube);
			break;
                    }
                    if (cube.World.Translation.Y + 0.5f != adjacentCube.World.Translation.Y - 0.5f)
                    {
                        cubeToDraw.Add(cube);
			break;
                    }
                    if (cube.World.Translation.Y - 0.5f != adjacentCube.World.Translation.Y + 0.5f)
                    {
                        cubeToDraw.Add(cube);
			break;
                    }
                }
            }
        }

L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

You could implement the voxel storage as a tree, each leaf is connected another leaf, only if that block is adjacent. Then when you add or remove block dynamic you could just go the the leaf where the near block is at and attach it or just add a new leaf if it doesn't have adjacent blocks. You could also use divide and conquer and divide those chunks in small pieces and calculate them parallel. The method it self can be optimized if you pre-cache the position


if (cube.World.Translation.Z + 0.5f != adjacentCube.World.Translation.Z - 0.5f)
{
  cubeToDraw.Add(cube);
  break;
}
if (cube.World.Translation.Z - 0.5f != adjacentCube.World.Translation.Z + 0.5f)
{
  cubeToDraw.Add(cube);
  break;
}

auto bZ1 = cube.World.Translation.Z - 0.5f;
auto bZ2 = cube.World.Translation.Z + 0.5f;
auto cZ1 = adjacentCube.World.Translation.Z - 0.5f;
auto cZ2 = adjacentCube.World.Translation.Z + 0.5f;

if (bZ1 != cZ2 || bZ2 != cZ1 || ....)
    cubeToDraw.add(cube)

But like Batterius said, its the design rather than the method.

Feel free to comment and star my project! <https://github.com/Ghrum/Ghrum>

Well i went with the 3d array after researching octrees for a bit because they confused me. I understand how they work, but I still have a few questions about them. Is the whole map a giant octree, or do I need one for each chunk? And even with an octree, I'm not quite sure how to use it to check for adjacency of my cubes. I haven't put any into action yet, but I figure I'll give it a shot today.

Anywho, the following bit of code doesn't work and I don't know why. The CubesToDraw list has no values in it at all. This is my attempt at using a 3d array :P


 private void CheckAdjacentCubesArray(Cube[, ,] cubes)
        {
            for (int x = 1; x < chunkWidth - 1; x++)
            {
                for (int y = 1; y < chunkHeight - 1; y++)
                {
                    for (int z = 1; z < chunkDepth - 1; z++)
                    {
                        if (cubes[x, y, z].World.Translation.Z + .5f == cubes[x, y, z + 1].World.Translation.Z - .5f)
                        {
                            backSide = true;
                        }
                        if (cubes[x, y, z].World.Translation.Z - 0.5f == cubes[x, y, z - 1].World.Translation.Z + 0.5f)
                        {
                            frontSide = true;
                        }
                        if (cubes[x, y, z].World.Translation.Y + 0.5f == cubes[x, y + 1, z].World.Translation.Y - 0.5f)
                        {
                            topSide = true;
                        }
                        if (cubes[x, y, z].World.Translation.Y - 0.5f == cubes[x, y - 1, z].World.Translation.Y + 0.5f)
                        {
                            bottomSide = true;
                        }
                        if (cubes[x, y, z].World.Translation.X + 0.5f == cubes[x + 1, y, z].World.Translation.X - 0.5f)
                        {
                            rightSide = true;
                        }
                        if (cubes[x, y, z].World.Translation.X - 0.5f == cubes[x - 1, y, z].World.Translation.X + 0.5f)
                        {
                            leftSide = true;
                        }

                        if (bottomSide != true || topSide != true || leftSide != true || rightSide != true || frontSide != true || backSide != true)
                        {
                            CubesToDraw.Add(cubes[x, y, z]);
                        }

                        bottomSide = false;
                        topSide = false;
                        leftSide = false;
                        rightSide = false;
                        frontSide = false;
                        backSide = false;
                    }//end z
                }//end y
            }//end x

        } 

I kept the bool variables in there just because most of the method was copy and paste. I don't know why CubesToDraw doesn't get anything added to it, but I've stared at it long enough that I've decided to just figure out how an octree could work in my case

If you see a post from me, you can safely assume its C# and XNA :)


if (cube.World.Translation.X + 0.5f == adjacentCube.World.Translation.X - 0.5f)
...
if (cube.World.Translation.X - 0.5f == adjacentCube.World.Translation.X + 0.5f)
...

Maybe:


cube.World.Translation.X - adjacentCube.World.Translation.X == -1.0f 
cube.World.Translation.X - adjacentCube.World.Translation.X == 1.0f

if(absf(cube.World.Translation.X - adjacentCube.World.Translation.X) == 1.0f)

I dont know will it work with floats corectly


private void CheckAdjacentCubes(List<Cube> cubes, List<Cube> cubeToDraw)
        {
            foreach (Cube cube in cubes)
            {
                foreach (Cube adjacentCube in cubes)
                {
                    if (absf(cube.World.Translation.X - adjacentCube.World.Translation.X) != 1.0f ||
                        absf(cube.World.Translation.Y - adjacentCube.World.Translation.Y) != 1.0f ||
                        absf(cube.World.Translation.Z - adjacentCube.World.Translation.Z) != 1.0f)
                    {
                        cubeToDraw.Add(cube);
			break;
                    }
                }
            }
        }

For one, I’ve heard that “foreach” is extremely slow.


Do you have a source for that? Would be interested to see it.
if you think programming is like sex, you probably haven't done much of either.-------------- - capn_midnight

For one, I’ve heard that “foreach” is extremely slow.


Do you have a source for that? Would be interested to see it.

Foreach requires a small amount of space for a hidden temporary object to yield over iteration, and that amount of space depends on the things being iterated over and the implementation (MS vs Mono). It isn't usually too bad or even noticeable, unless you are in an extremely constrained environment.

In one game engine being ported to handhelds we discovered that a foreach() over one of our largest collections of game objects unexpectedly required around 100KB of space. By itself that isn't much, until you consider that you could potentially iterate over the collection with a regular for() loop using just a single integer to index into the collection.

Another consideration is that if your game implements a "pause the world" type of save game system so that all C# code is frozen mid-execution, the internals of foreach can be difficult to serialize when the execution environment is written to disk. In our engine we discovered that we could only save a single foreach loop, if two foreach loops were nested the deserialized world would be corrupt.

As for the algorithm, beyond the earlier suggestions of spatial partitioning and early exit, another improvement that can be made is that the current algorithm tests A against B, and then tests B against A. You can reduce the processing by half because if one is true, both are true, if one is false then both are false.

This topic is closed to new replies.

Advertisement