Why is collision cast expensive?

32 comments, last by EWClay 11 years, 1 month ago

this is because (unlike a traditional BSP) my approach doesn't "choose" the plane from the geometry; it calculates it.
basically, all you really need is an averaged center point, and a vector describing how the "mass" is distributed relative to that point.
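To make the idea above concrete, here is a minimal sketch (not from the original posts) of calculating a split plane from the averaged center point and a per-axis measure of how the points are spread around it; the names and the "pick the axis of greatest spread" rule are illustrative assumptions, not the poster's actual code:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };
struct Plane { Vec3 normal; double d; };  // plane: normal . p == d

// Calculate a split plane rather than choosing one from the geometry:
// average the points, measure how the "mass" spreads around the average,
// and split along the axis of greatest spread, through the center.
Plane computeSplitPlane(const std::vector<Vec3>& pts) {
    Vec3 c{0, 0, 0};
    for (const Vec3& p : pts) { c.x += p.x; c.y += p.y; c.z += p.z; }
    double n = static_cast<double>(pts.size());
    c.x /= n; c.y /= n; c.z /= n;

    // Accumulate squared offsets from the center: a rough measure of
    // how the mass is distributed relative to that point.
    Vec3 v{0, 0, 0};
    for (const Vec3& p : pts) {
        v.x += (p.x - c.x) * (p.x - c.x);
        v.y += (p.y - c.y) * (p.y - c.y);
        v.z += (p.z - c.z) * (p.z - c.z);
    }

    Plane pl{{1, 0, 0}, 0};
    if (v.y >= v.x && v.y >= v.z)      pl.normal = {0, 1, 0};
    else if (v.z >= v.x && v.z >= v.y) pl.normal = {0, 0, 1};
    pl.d = pl.normal.x * c.x + pl.normal.y * c.y + pl.normal.z * c.z;
    return pl;
}
```

Since it is just an averaging pass over the points, the cost is linear in the input, which is why this kind of calculated plane is cheap compared to classic BSP plane selection.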

Ah, I see, that sort of calculation should be relatively cheap compared to what I was imagining. Thanks for explaining.

this is part of why/how it is "quick and dirty"...

even then, it isn't usually as much of an issue at present, as memory bandwidth has increased considerably over the past several years (relative to CPU speed increases), making the limit harder to run into. currently it typically only really happens during bulk memory copies and similar (AFAICT), rather than in general-purpose code.

it was a much worse problem 10 years ago though.

This is good to know. A lot of what I know about game programming is, unfortunately, dated to roughly 10 years ago. It doesn't help that I was recently reading a bunch of articles from Intel to catch up; they may be blowing the bandwidth issue out of proportion (I don't know, just a guess).

yeah.

granted, it probably depends a lot on the code.

but, 10 years ago, it was fairly common to hit the limit whenever things didn't all fit in cache, such as when working with large arrays, ...

now it generally requires doing lots of SIMD operations or similar, or running all cores at high load, ..., since normal scalar code usually can't generate memory traffic fast enough to hit it.

basically, while CPU clock speeds have increased by a factor of around 2-3, memory speeds have increased by a factor of around 8-10.

granted, it is a little worse if one considers it per-core, or includes lots of multithreaded SIMD-based code, which can also hit the limit. but multithreaded SIMD-heavy code is still a relative minority of the code in use, and at least from the POV of "generic" single-threaded scalar code, things have gotten better...

granted, Intel tends to assume a lot more aggressive use of SIMD than is often seen in practice, where at least AFAICT, most of us (?) are mostly using SIMD as a nifty feature to speed up 3D vector math, rather than writing piles of highly vectorized code (probably because vectorization is generally a huge PITA...).
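As a minimal sketch of the "SIMD as a nifty feature to speed up 3D vector math" usage described above (my own illustration, not from the posts; assumes an x86 target with SSE), a 3D vector stored as 4 floats can be added or multiplied in one instruction instead of three scalar ones:

```cpp
#include <immintrin.h>  // SSE intrinsics (x86)

// 3D vector padded to 4 floats so it fits one SSE register.
struct alignas(16) Vec4 { float v[4]; };

// One SIMD add replaces three scalar adds (the 4th lane is padding).
Vec4 addSimd(const Vec4& a, const Vec4& b) {
    Vec4 r;
    __m128 ra = _mm_load_ps(a.v);
    __m128 rb = _mm_load_ps(b.v);
    _mm_store_ps(r.v, _mm_add_ps(ra, rb));
    return r;
}

// Dot product: SIMD multiply, then a small scalar horizontal sum
// over the first three lanes.
float dotSimd(const Vec4& a, const Vec4& b) {
    __m128 m = _mm_mul_ps(_mm_load_ps(a.v), _mm_load_ps(b.v));
    alignas(16) float t[4];
    _mm_store_ps(t, m);
    return t[0] + t[1] + t[2];
}
```

This is exactly the "speed up vector math" level of SIMD use, as opposed to restructuring whole loops and data layouts for vectorization.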

granted, in any case, relatively speaking it is worse than 20 years ago (when CPU speeds and RAM speeds were much closer).

I'll have another go at explaining why ray casts in many game engines are asynchronous and why that's important for performance.

Let's assume that the physics is already multithreaded, as this is often the case. Let's assume also that game logic and AI are not multithreaded, because that's quite difficult to do correctly. So game logic and AI are taking up a big portion of the frame time, and any calculations that can be offloaded to another thread will be a win.

Now consider ray casts. Game logic and AI may require thousands of them. Line of sight checks for AI. Damage tests for weapons. They are performed by the physics engine, which is thread safe already.

So you tell the game programmers they can use all the ray casts they want as long as they request them early and use the result later. A queue builds up all the requests for raycasts in a frame. Then, at a suitable point they are all processed. The main thread is notified that the results are ready. Everyone is happy, and the game programmers don't complain about slow ray casts anymore.
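The request-early/use-later pattern described above can be sketched roughly like this (hypothetical names and a deliberately simplified single-threaded batch; in a real engine processAll() would run on the physics engine's worker threads):

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

struct Ray       { float ox, oy, oz, dx, dy, dz; };
struct RayResult { bool hit; float distance; };

// Game code requests ray casts early and gets back a handle; the whole
// batch is processed at a suitable point in the frame, and results are
// read afterwards.
class RayCastQueue {
public:
    using CastFn = std::function<RayResult(const Ray&)>;
    explicit RayCastQueue(CastFn fn) : cast_(std::move(fn)) {}

    // Called during game logic / AI: queues the request, returns a handle.
    std::size_t request(const Ray& r) {
        rays_.push_back(r);
        results_.push_back({false, 0.0f});
        return rays_.size() - 1;
    }

    // Called once per frame at a suitable point: process every request.
    void processAll() {
        for (std::size_t i = 0; i < rays_.size(); ++i)
            results_[i] = cast_(rays_[i]);
        ready_ = true;
    }

    // Called after processing: look up a result by its handle.
    const RayResult& result(std::size_t h) const {
        assert(ready_);  // results are only valid once the batch has run
        return results_[h];
    }

private:
    CastFn cast_;
    std::vector<Ray> rays_;
    std::vector<RayResult> results_;
    bool ready_ = false;
};
```

The key design point is the contract: game code may issue as many casts as it likes, but it must tolerate the one-phase delay between requesting and reading.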

The article I linked to is a classic. It isn't academic and dreamy. It's about reality. The next generation of consoles will have eight cores (http://www.eurogamer.net/articles/df-hardware-spec-analysis-durango-vs-orbis). The PS3 already has that many. PCs won't be far behind.

Concurrency isn't "a tool that has its place", it runs through the whole design of a modern game engine, and will be even more important as time goes on.


some of this depends a lot on the game engine architecture though.

for example, in my engine, ray casts aren't handled by the physics engine, but rather by the "server end" logic (which basically deals with things like managing the scene graph, communicating with the client, and providing the environment for the game logic to do its thing).

so, the physics engine basically sits off to the side, with the server end mostly shuffling data between the physics engine and the main scene graph, ...

(I am using a custom physics engine with an OpenGL-like API design).

I would guess all this presumes an event-driven approach to game logic though, like, say, triggering an event when the results of the trace come back, ... rather than performing a trace and expecting to get the results back immediately (which is more how it currently works in my case), ...
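The event-driven alternative mentioned above could look something like this (a sketch with invented names, not any engine's actual API): game logic registers a handler per trace, and the events fire whenever the results come back, instead of the caller blocking on an immediate answer:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct TraceResult { bool hit; std::string entity; };
using TraceHandler = std::function<void(const TraceResult&)>;

// Game logic queues traces with callbacks; the engine delivers the
// results later, triggering each handler as an event.
class TraceDispatcher {
public:
    // Called from game logic: no result yet, just a pending handler.
    void requestTrace(TraceHandler onDone) {
        pending_.push_back(std::move(onDone));
    }

    // Called by the engine once results are available: fire the events.
    void deliver(const std::vector<TraceResult>& results) {
        for (std::size_t i = 0; i < pending_.size() && i < results.size(); ++i)
            pending_[i](results[i]);
        pending_.clear();
    }

private:
    std::vector<TraceHandler> pending_;
};
```

The trade-off is exactly the one the thread describes: the logic must be written as "react when the answer arrives" rather than "trace, then branch on the result on the next line".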


yes, interesting...

the problem though is that it is currently a bit of a challenge to gradually migrate away from the more traditional "single giant thread which does everything" style of software development.

as-is, it more ends up looking like "several giant threads which do everything...".

Yes, I'm making a few assumptions:

1. That the physics engine does the ray casts, and it is already multithreaded.

2. That there is a performance issue on the main thread.

3. That ray casts account for a significant amount of processor time.

If any of those don't apply, it's not an appropriate technique. And it does require some reorganisation of the code, so it's not free.

The ideal, really, would be for all systems to be multithreaded perfectly. But this is an unsolved problem, so we do what we can.

