Basic Operations (add, sub., mult., etc.) - Speed?

Started by
13 comments, last by Spoonbender 18 years ago
Yeah, that's what I did (sort of), before I switched to D3D. I still have the same code, but I haven't really tested it out. Linkage

EDIT: Holy Schnee, I see what you mean with AND. Gotta change that.
Projects: Thacmus - CMS (PHP 5, MySQL). Paused: dgi, MegaMan X Crossfire
A better way to speed up your collision detection is to change the algorithm. In a 3D game, checking every triangle against every other triangle would be far too time-consuming; it's roughly an n^2 algorithm, where n is the number of triangles. A more efficient approach is to check only triangles from objects that are close enough to possibly collide. This is called a "broad scope" phase (more commonly, the "broad phase"), and in 3D games it usually means that a sphere (or other simple shape) around each object is tested against the spheres of other objects. Only if two spheres overlap can any of their triangles collide, and only then do you check them. If the spheres do not overlap, no triangles can collide, so they can all be skipped. That can save several thousand triangle tests per object, against hundreds of other objects each made of thousands of triangles of their own. This is a HUGE saving in calculation, more than you could ever hope to gain by optimizing an all-triangles method.
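
To make the sphere idea above concrete, here is a minimal sketch of a broad phase. The `Sphere` struct and the function names are made up for illustration and are not from anyone's engine; it assumes each sphere fully encloses its object.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical bounding sphere: centre (x, y, z) and radius.
struct Sphere {
    float x, y, z;
    float radius;
};

// Two spheres overlap when the distance between their centres is at most
// the sum of their radii. Comparing squared values avoids a square root.
bool spheresOverlap(const Sphere& a, const Sphere& b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    const float r = a.radius + b.radius;
    return dx * dx + dy * dy + dz * dz <= r * r;
}

// Broad phase: collect the index pairs of objects whose spheres overlap.
// Only these pairs need the expensive triangle-level test afterwards.
std::vector<std::pair<std::size_t, std::size_t>>
candidatePairs(const std::vector<Sphere>& bounds) {
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t i = 0; i < bounds.size(); ++i)
        for (std::size_t j = i + 1; j < bounds.size(); ++j)
            if (spheresOverlap(bounds[i], bounds[j]))
                pairs.emplace_back(i, j);
    return pairs;
}
```

Note that this loop is still n^2, but over objects rather than triangles, so n is in the hundreds instead of the hundreds of thousands. A spatial grid or tree could cut that down further if it ever becomes a problem.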

Similarly, in 2D, you can first check a simple shape around each object for collision BEFORE you ever need to worry about pixel-level collision. Depending on the type of game, this shape is most often a circle (a 2D sphere), a box (an axis-aligned bounding box), or both. If these simple 2D shapes do not overlap, none of their pixels can overlap either, saving you hundreds (or thousands, depending on the size of the sprite) of pixel-perfect collision checks.
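
A rough sketch of the 2D version, assuming a hypothetical `Sprite` type with an integer position, a size, and a per-pixel opacity mask (none of these names come from the thread). The box test runs first, and the pixel loop only covers the rectangle where the two boxes actually overlap.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sprite: top-left corner, size, and a row-major opacity mask
// where mask[py * w + px] is true if that pixel is solid.
struct Sprite {
    int x, y, w, h;
    std::vector<bool> mask;
    bool solid(int px, int py) const { return mask[py * w + px]; }
};

// Cheap test: do the axis-aligned bounding boxes overlap at all?
bool boxesOverlap(const Sprite& a, const Sprite& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

// Pixel-level test, restricted to the rectangle shared by both boxes.
bool spritesCollide(const Sprite& a, const Sprite& b) {
    if (!boxesOverlap(a, b)) return false;  // early out: no pixel can touch
    const int left = std::max(a.x, b.x);
    const int right = std::min(a.x + a.w, b.x + b.w);
    const int top = std::max(a.y, b.y);
    const int bottom = std::min(a.y + a.h, b.y + b.h);
    for (int y = top; y < bottom; ++y)
        for (int x = left; x < right; ++x)
            if (a.solid(x - a.x, y - a.y) && b.solid(x - b.x, y - b.y))
                return true;
    return false;
}
```

Most sprite pairs on screen are nowhere near each other, so nearly every call ends at the four comparisons in `boxesOverlap`.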

These are the Cardinal rules of optimization:
1) Do the work you need the smartest way possible.
2) Do the work you need only! Never any more.
3) Do the work you need as little as possible.
4) Do the work you need as fast as possible.

Each of these is worth approximately one order of magnitude more than the one after it. That is, #1 provides more optimization value than #2, which provides more than #3, and so on. Start at the top of the list and work your way down. A single optimization at level 1 is worth 100 optimizations at level 4.

throw table_exception("(ノ ゜Д゜)ノ ︵ ┻━┻");

Quote:Original post by deadimp
Also, would there be any speed difference between things such as "a=a+1", "a+=1", and "a++"?

Just wanted to point out that "a++" is not equivalent to the other two.
Quote:Original post by Fred304
Quote:Original post by deadimp
Also, would there be any speed difference between things such as "a=a+1", "a+=1", and "a++"?

Just wanted to point out that "a++" is not equivalent to the other two.


As a statement, they are equivalent. As expressions, you are right. [grin]
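
For anyone following along, a small sketch of the statement-versus-expression difference, assuming plain `int` (the helper names are invented for this example). Each function returns the value of the expression first and the value of `a` afterwards.

```cpp
#include <utility>

// Each helper takes its own copy of a and returns
// {value of the expression, value of a afterwards}.
std::pair<int, int> viaAssign(int a)   { int r = (a = a + 1); return {r, a}; }
std::pair<int, int> viaCompound(int a) { int r = (a += 1);    return {r, a}; }
std::pair<int, int> viaPostInc(int a)  { int r = a++;         return {r, a}; }
std::pair<int, int> viaPreInc(int a)   { int r = ++a;         return {r, a}; }
```

All four leave `a` one higher, so as standalone statements they do the same thing. Only `a++` yields the old value as its result; `++a` is the one that matches `a = a + 1` and `a += 1` when used as an expression.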
If you really want to know how long these operations take, check with AMD and Intel's CPU documentation. They list the latency of every instruction.

But as others have said, this is definitely premature optimization (and micro-optimization at that).

Let's do a bit of math.

You have a CPU. It's probably around 3 GHz (a bit less if it's AMD, a bit more if it's Intel).

That means it can do 3 * 10^9 cycles per second. Each cycle, it can perform up to three instructions.

So in one second, it can do 9 * 10^9 instructions (theoretical peak).

Now, most instructions have a latency of 1-4 cycles.
So, what if you do pick a 4 cycle instruction instead of one that only takes 1?

Well, you waste a whopping 3 cycles *if*, and only if, the following instructions depend on the result of this instruction. But usually that won't happen, because the compiler tries to schedule instructions to avoid these dependencies, and even if it fails, the CPU itself attempts the same. So most likely, it won't slow you down at all.
3 cycles? At 3 GHz, that's about a nanosecond. And this is the worst case. You could do this a million times in a second, and it still wouldn't make a noticeable difference (it'd slow you down by one millisecond).
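
The arithmetic above, written out as a sketch so the numbers are easy to change. The constants are just the assumed figures from this post (3 GHz, three instructions per cycle, 3 wasted cycles, a million repetitions), not measurements.

```cpp
// Assumed figures from the post, not measured values.
constexpr double kClockHz = 3.0e9;              // 3 GHz
constexpr double kInstructionsPerCycle = 3.0;   // theoretical peak
constexpr double kWastedCycles = 3.0;           // 4-cycle op instead of 1-cycle
constexpr double kRepetitions = 1.0e6;          // times per second

// Theoretical peak instruction rate: 3e9 * 3 = 9e9 per second.
constexpr double peakInstructionsPerSecond() {
    return kClockHz * kInstructionsPerCycle;
}

// Time lost on one slow instruction: 3 / 3e9 s, about one nanosecond.
constexpr double secondsLostPerOp() {
    return kWastedCycles / kClockHz;
}

// Time lost over a million repetitions: about one millisecond.
constexpr double secondsLostTotal() {
    return secondsLostPerOp() * kRepetitions;
}
```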

Oh, and just because everyone else skipped past it, RAZORUNREAL had a valid point too. *If* you iterate through all pixels (or anything else in a 2D array), make sure you take one row at a time rather than one column at a time. Why? With the usual row-major layout, the pixels of a row sit next to each other in memory, and the CPU cache likes that *much* better. It can easily double your performance.
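
A sketch of the two traversal orders, assuming the image lives in one contiguous row-major buffer where pixel (x, y) is at index `y * width + x` (the function names are made up here). Both functions return the same sum; only the order of memory accesses differs.

```cpp
#include <cstddef>
#include <vector>

// Row by row: the inner loop walks consecutive addresses, so each cache
// line that gets fetched is fully used before moving on.
long long sumRowByRow(const std::vector<int>& pixels,
                      std::size_t width, std::size_t height) {
    long long total = 0;
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < width; ++x)
            total += pixels[y * width + x];
    return total;
}

// Column by column: the inner loop jumps `width` elements each step, so on
// a large image almost every access touches a different cache line.
long long sumColumnByColumn(const std::vector<int>& pixels,
                            std::size_t width, std::size_t height) {
    long long total = 0;
    for (std::size_t x = 0; x < width; ++x)
        for (std::size_t y = 0; y < height; ++y)
            total += pixels[y * width + x];
    return total;
}
```

How big the gap is depends on the image size and the hardware, so time both on your own data before relying on any particular speedup.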

This topic is closed to new replies.
