Physics bottlenecks

Started by
8 comments, last by Elmer Fudd 17 years, 9 months ago
I'm curious as to where processing power is lacking these days, and where the greatest gains could be made by moving functionality into hardware rather than using the CPU(s). I'm guessing this is in the realistic physical modelling area, e.g. motion calculations for complex objects (trees, plants, sails in a varying breeze), particle motion (liquid, smoke, etc.), and motion of any other high-polygon-count objects. Is this true, or is some motion calculation already offloaded to the graphics card? Or does the graphics pipeline on video cards start at the geometry transformation from the 3D scene to its 2D representation? Thanks a lot!
Have a look at PhysX physics engine.
---------------------------------------------------Life after death? No thanks, I want to live NOW --- Sturm 2001
1) The graphics card only runs the programs that are sent to it: the various shaders and fixed-function
commands that tell it what to do with geometry and textures. There are no other calculations that can
be done with the GPU directly. That said, you can set up texture and geometry calculations whose end product you
later interpret as a non-graphical result (through another shader, or by reading the data back to main memory).
That way you can do fluids (Plasma Pong), particles (Havok), or even sort data (GPUSort).
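As a CPU-side sketch of the data-parallel idea those shader tricks exploit (the particle layout and step sizes here are made up for illustration), the per-element update a shader would run over a texture can be written as one vectorized operation:

```python
import numpy as np

# A "texture" of particle state: one row per particle (x, y, vx, vy).
# In GPGPU terms each row is a texel and step() plays the fragment shader.
state = np.array([[0.0, 0.0, 1.0, 2.0],
                  [1.0, 1.0, 0.0, -1.0]])

def step(state, dt=0.1, gravity=-9.8):
    """One data-parallel update: every particle advanced at once."""
    pos, vel = state[:, :2], state[:, 2:]
    vel = vel + np.array([0.0, gravity]) * dt   # integrate acceleration
    pos = pos + vel * dt                        # integrate velocity
    return np.hstack([pos, vel])

state = step(state)
```

The point is that no element of `state` depends on any other, which is exactly the shape of problem a GPU (or PPU) eats for breakfast.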

2) Physics seems to take up the following computational realms:
a) massive numbers of items (think parallel computation)
b) joint-constraint/collision-response iterations (the number of collisions quickly slows the sim)
c) fast spatial grouping (you need to know what is "near" any object in order to collide things quickly)

So, for small simulations you don't have many issues to deal with. But as soon as your simulation begins
to grow, you can run into problems at any level. Well-formed algorithms can give you excellent results for 'c', so this is
often of little concern. Well-formed simulations can keep collision/joint complexity to a minimum for most
rigid bodies, limiting the effect of 'b'. So the main bottleneck becomes the sheer number of objects in a scene.
The massively parallel vector computations that a GPU or PPU can perform quickly help with 'a'. Do you need the
special hardware, or will a CPU do? I think there has been plenty of debate about this, and it is one of those
wait-and-see sort of things.
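As a rough sketch of the "fast spatial grouping" in (c) (the cell size and object list are invented for illustration), a uniform-grid spatial hash is the classic way to keep the near-object query cheap:

```python
from collections import defaultdict

CELL = 2.0  # grid cell size; pick it near the typical object diameter

def cell_of(x, y):
    """Map a position to its integer grid cell."""
    return (int(x // CELL), int(y // CELL))

def build_grid(objects):
    """Bucket object ids by cell so the broad phase can skip far-away pairs."""
    grid = defaultdict(list)
    for oid, (x, y) in objects.items():
        grid[cell_of(x, y)].append(oid)
    return grid

def nearby(grid, x, y):
    """Candidate colliders: everything in the 3x3 block of cells around (x, y)."""
    cx, cy = cell_of(x, y)
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            out.extend(grid.get((cx + dx, cy + dy), []))
    return out

objects = {0: (0.5, 0.5), 1: (1.5, 0.5), 2: (40.0, 40.0)}
grid = build_grid(objects)
```

Only the handful of objects returned by `nearby` go on to the expensive narrow-phase collision test, instead of all N-1 others.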

I should note, though, that there are setups you can create for a physical simulation that cause HUGE
problems with 'b' and 'c'. SPH/N-body simulations quickly grow out of hand (the equivalent of lots of joints).
The famous "box stack" runs smoothly until disturbed, at which point the resulting number of collision responses can quickly
bring the simulation to a stop.

Look at ODE, Havok, or PhysX and see what they do with their sims. As with any program, the more work you can
remove, the faster the remaining work gets done. There are lots of tricks, like "scenes"/"spaces" and
disabled objects, that let you separate out the parts of the scene that don't have to be included in the
main loops.
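A minimal sketch of the disabled-object trick (the sleep threshold and body fields here are assumptions, not any engine's real API): bodies whose velocity stays near zero for a while get flagged asleep and skipped by the main loop until something touches them.

```python
SLEEP_SPEED = 0.01   # below this speed a body is a candidate for sleeping
SLEEP_FRAMES = 10    # how many quiet frames before we actually disable it

class Body:
    def __init__(self, vel):
        self.vel = vel
        self.asleep = False
        self.quiet_frames = 0

def step(bodies, damping=0.5):
    """Integrate only awake bodies; put persistently quiet ones to sleep."""
    for b in bodies:
        if b.asleep:
            continue                      # the whole point: zero cost when idle
        b.vel *= damping                  # stand-in for real integration
        if abs(b.vel) < SLEEP_SPEED:
            b.quiet_frames += 1
            if b.quiet_frames >= SLEEP_FRAMES:
                b.asleep = True
        else:
            b.quiet_frames = 0

def wake(body, impulse):
    """A collision or applied impulse re-enables the body."""
    body.asleep = False
    body.quiet_frames = 0
    body.vel += impulse
```

In a settled box stack almost everything ends up asleep, which is why those scenes run smoothly right up until something wakes the whole pile.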
Thanks folks for your help. What I'm really trying to figure out is: would an FPGA (a programmable logic device) be of benefit in a gaming PC, and furthermore, would it be feasible from a business point of view?

I've seen FPGAs used to prototype real-time ray-tracing graphics hardware, and can't help thinking that having configurable hardware available could produce stunning results.
Considering the clock speed of most FPGAs is considerably lower than that of a standard CPU or microprocessor,
I'm not sure how much benefit you would see from one inside a standard computer setup.
BUT, that doesn't mean there are no uses for it. If you can come up with some aspect of your program that you can
implement in hardware faster than the equivalent CPU instructions, then you may actually see a ton of benefit.
Quote:Original post by KulSeran
Considering the clock speed of most FPGAs is considerably lower than that of a standard CPU


This is a common misconception. True, the clock speed is lower, but the amount of data that can be processed is way higher. The latest and greatest FPGAs are quoting ~256 GMACs (that's 256 giga multiply-and-accumulates per second, with 18-bit multipliers and 48-bit accumulators). In general, for any task that can be split into many parallel components, hardware will be way faster. Even the cheapest (well, cheaper anyway) FPGAs can outperform today's processors on certain tasks.
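A back-of-the-envelope version of that comparison (the CPU numbers and the 200 MHz fabric clock are my own assumptions for illustration; only the 256 GMAC figure comes from above): a 3 GHz core issuing one 4-wide SIMD multiply-accumulate per cycle manages 12 GMAC/s, so the quoted FPGA is roughly 20x that — the low clock is swamped by the sheer number of parallel multipliers.

```python
# Hypothetical CPU: 3 GHz, one 4-wide SIMD multiply-accumulate per cycle.
cpu_macs_per_sec = 3.0e9 * 4          # 12 GMAC/s

# Quoted FPGA figure from the post above.
fpga_macs_per_sec = 256e9             # 256 GMAC/s

# Equivalently: at a modest assumed 200 MHz fabric clock, how many MAC
# units must run in parallel to hit that throughput?
fpga_clock = 200e6
parallel_macs = fpga_macs_per_sec / fpga_clock   # 1280 units

speedup = fpga_macs_per_sec / cpu_macs_per_sec   # ~21x
```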

The beauty of the thing is that it is completely reprogrammable, e.g. you run a ray-tracer app and accelerating hardware is loaded onto the FPGA; run a game and it loads a physics engine; encode video and it loads hardware to do so; etc.

The problem is that it's harder to write the apps (or parts of apps) for hardware, and there would need to be a critical mass of apps that support this for it to take off.
I really like the idea. It is akin to loading part of the OS into flash so that the system boots instantly.
It would be much like the hyped PhysX chip, but not specific to any one task.
The main highlight of the thing wouldn't be an add-in card for "parallelizing a task" but an add-in card
for "your specific task acceleration". I could definitely see it for programmable decoder/encrypter/transform
type software, where some complex math is repeated on a dataset over and over.
Especially since such operations can be reduced to a few complex hardware steps that would otherwise be many convoluted CPU instructions (logic like (a|b)&(c|d)&(e|f)&(g|h) would be several instructions, or just a couple of hardware gates).
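To make that instruction-count point concrete (the input values are arbitrary): the expression below costs seven bitwise operations on a CPU, one per operator evaluated in sequence, while in hardware it maps to a two-level tree where the four OR gates all switch at once, followed by the ANDs.

```python
def combine(a, b, c, d, e, f, g, h):
    """Seven CPU ops: four ORs then three ANDs, issued sequentially.
    As gates: one layer of ORs in parallel, then a small AND tree."""
    return (a | b) & (c | d) & (e | f) & (g | h)

# Bitwise, so it works per-bit on whole words at once - like the hardware.
result = combine(0b1010, 0b0101, 0b1111, 0b0000,
                 0b1100, 0b0011, 0b1010, 0b1010)
```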

The only real issue I see happening is that an FPGA is only good for so many write cycles, like a flash card.
But a flash card lasts most people forever, since there is little swapping to/from the card.
An FPGA under the wrong load could experience rapid failure of the add-in card.
Imagine a resource-sharing scheme where an app gains FPGA access at startup and holds it until the app shuts down.
If several apps are started and shut down over the course of each day, you cut 50k cycles down to about 4 years.
If each app has several operating modes (say, Winamp having one FPGA program per media codec), then
50k cycles is probably less than 3 years of operation.
3 years is a long time, but the lifetime of any add-in card is something to think about.
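Spelling out that lifetime arithmetic (the reconfigurations-per-day counts are my own guesses; the 50k endurance figure is the one from the post above):

```python
WRITE_CYCLES = 50_000          # assumed reconfiguration endurance

def lifetime_years(reconfigs_per_day):
    """How long the part lasts if it is reprogrammed this often."""
    return WRITE_CYCLES / reconfigs_per_day / 365.0

few_apps = lifetime_years(30)      # ~30 app launches a day  -> about 4.6 years
many_modes = lifetime_years(50)    # codec switching on top   -> about 2.7 years
```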

Even with that drawback, a properly constructed add-in card could support multiple programs on the FPGA,
i.e. 3 programs that each use only 8 output pins, or 4 programs that use 6 pins each.

Hopefully you can make one of these for the community in the next year. I'm sure many enthusiasts would buy one
as soon as someone promised appropriate software packages to complement the system.

Thanks again for the reply. Although some FPGAs are flash-based, the big hitters (Xilinx, Altera) are pretty much exclusively SRAM-based, which means fatigue failure will not be a problem.

You've got the idea though: "specific task acceleration", as you call it, i.e. developers have the freedom to design whatever hardware co-processing functions they like to accelerate their application. They can even change the hardware on the fly for different parts of the application.

Of course, the downside is that the typical developer has no idea about hardware, HDLs, etc.!

I also think that if an initially small selection of apps supported this, it could take off. I know I'd buy it.

Think I'll start some serious research into this one; at worst it'll keep me off the streets for a few months...
I like this discussion. I got to play with an Altera chip in my intro EE class.
We made a simplistic computer inside it. Lots of fun. They told us lots of stuff about the chips,
but since my focus was CS, I didn't pay much attention or search further.
I'm surprised to learn from you that there are fewer drawbacks than they made it seem.


p.s. so many page views... but so few comments...
Probably not many comments because not many people here are familiar with hardware design. Most hardware designers aren't really familiar with designing for FPGAs either!
And not many FPGA designers check out gamedev.net, I would guess.

I'm gonna start with a simple ray-tracing app, and then a physics-type problem. The physics problem might be a flag flapping in a variable wind.

Will post results when I'm done.
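For anyone curious what that flag problem looks like: it is usually sketched as a mass-spring grid (every constant below is invented for illustration). Here is a 1D strip of the flag, pinned to the pole at one end, with a gusting crosswind; note that the per-node force loop is exactly the kind of uniform repeated math that would map onto parallel hardware.

```python
import math

# A chain of point masses joined by springs, pinned at node 0 (the pole).
N, REST, K, MASS, DT, DAMP = 8, 1.0, 40.0, 1.0, 0.01, 0.98

pos = [[i * REST, 0.0] for i in range(N)]   # start stretched out flat
vel = [[0.0, 0.0] for _ in range(N)]

def step(t):
    """One explicit-Euler step: spring forces plus a time-varying wind."""
    wind = [0.0, 2.0 * math.sin(3.0 * t)]   # gusting crosswind
    forces = [list(wind) for _ in range(N)]
    for i in range(N - 1):                  # spring between node i and i+1
        dx = pos[i + 1][0] - pos[i][0]
        dy = pos[i + 1][1] - pos[i][1]
        dist = math.hypot(dx, dy) or 1e-9
        f = K * (dist - REST)               # Hooke's law along the spring
        fx, fy = f * dx / dist, f * dy / dist
        forces[i][0] += fx; forces[i][1] += fy
        forces[i + 1][0] -= fx; forces[i + 1][1] -= fy
    for i in range(1, N):                   # node 0 stays pinned to the pole
        vel[i][0] = (vel[i][0] + forces[i][0] / MASS * DT) * DAMP
        vel[i][1] = (vel[i][1] + forces[i][1] / MASS * DT) * DAMP
        pos[i][0] += vel[i][0] * DT
        pos[i][1] += vel[i][1] * DT

for frame in range(200):
    step(frame * DT)
```

A real 2D flag is the same thing with a grid of nodes and extra shear/bend springs; the structure of the update doesn't change.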

