Physics Hardware at GDC

Some of the most impressive demos at GDC were of "physics hardware":
1. Ageia PhysX
2. nVidia/Havok SLI Physics
3. PS3/Cell Ducky Demo

Ageia showed a cool in-game demo with many movable objects (Cell Factor), as well as implicit-surface-like fluid (gore effects). The demo did not allow switching between hardware and software, so it was not possible to determine whether the demo was actually hardware accelerated and/or how much of an improvement the hardware provides over the already quite fast and impressive Novodex software engine.

The PS3 Rubber Ducky / pirate boat + cloth sails + water demo was quite cool. The demo allowed spawning of a huge number of duckies, and ended with a very large ducky smashing and forcing most of the little duckies out of the tub. There was no visible sign of slowdown; I suspect the PS3 hardware could support much more than the demo showed.

The most surprising and most impressive demo was the nVidia demo running Havok on a GPU (SLI). The demo starts out with a relatively small stack of boxes, then progresses to a larger stack, then larger, and so on, until a huge ring/tower stack is shown with so many boxes (tens of thousands) that, while it collapses, the view is pulled out so that the boxes look almost like particles of sand flowing. The frame rate did not appear to slow down! Another demo showed thousands of boxes flying in a ring around a player, with switching between hardware and software. For this demo, software ran at perhaps 1-2 fps, while hardware ran fully real time (appeared to be 60+ fps).

It's clear from the nVidia hardware physics demo that the GPU is plenty general and powerful enough to provide compelling hardware physics acceleration. It will be interesting to see how ATI responds. It would appear that future GPU's will easily be able to support physics without dual-card solutions (more parallel float units, multi-core-like solutions, etc.).

Physics hardware manufacturers will need to provide compelling demos to show that hardware acceleration provides significant value over software physics. From what I could see at GDC, nVidia provided the only compelling demo of hardware acceleration value vs. software-only for the PC space.

It will be some time before games require hardware accelerated physics (perhaps similar to how software rendering moved to hardware rendering). Graphics progressed reasonably well through OpenGL and DirectX: some type of OpenPL or DirectPhysics will probably be needed to help hardware physics become well-adopted by consumers. Both Havok and Ageia are positioned to promote various types of software and hardware accelerated physics. As a game developer, I want my customers to be able to choose the best product for the money (competition): thus a standardized physics API would be best for consumers as well as hardware manufacturers. Standardized motion behavior shouldn't be too difficult, even for multiplayer games (typically server-owned ultimate position authority; lock-step would not work); it just has to be "close enough" (within some reasonable tolerance).

[Edited by - John Schultz on March 27, 2006 5:37:33 PM]
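To make the "close enough" idea concrete, here is a minimal sketch of server-authoritative position reconciliation with a tolerance. It is purely illustrative: the 0.25 m tolerance and the snap-instead-of-blend choice are assumptions, not taken from any engine discussed in this thread.

// Hypothetical sketch: keep the local prediction while it stays within a
// tolerance of the server's authoritative result, otherwise correct it.
#include <cmath>
#include <cstdio>

struct Vec3 { float x, y, z; };

static float distance(const Vec3& a, const Vec3& b)
{
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// The server owns ultimate position authority; the client only keeps its own
// result when it is "close enough".
Vec3 reconcile(const Vec3& local, const Vec3& server, float tolerance)
{
    return (distance(local, server) > tolerance) ? server : local;
}

int main()
{
    const Vec3 local  = { 10.2f, 0.0f, 5.1f };            // client-side physics result
    const Vec3 server = { 10.0f, 0.0f, 5.0f };            // authoritative server result
    const Vec3 used   = reconcile(local, server, 0.25f);  // assumed tolerance in meters
    std::printf("using position (%.2f, %.2f, %.2f)\n", used.x, used.y, used.z);
    return 0;
}

In practice one would blend toward the server result rather than snap, but the tolerance test is the core of the idea.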
Some co-workers that attended GDC brought back some info.

Apparently the GPU solution is limited to eye candy effects, whereas the Ageia solution is usable for gameplay objects as well, implying to me that getting data back from it isn't a big deal.

Apparently they didn't recommend using the Havok/GPU acceleration for gameplay objects, which to me cripples the usage a bit.

It seems obvious from the CellFactor movie that a ton of gameplay objects are accelerated, which seems even more exciting, because in theory one could use Ageia as a tool for smarter AI, assuming dirt-cheap raycasting is available.

movie: http://www.fileshack.com/file.x?fid=8558

This is all secondhand info; I make no claims to its accuracy, though it seems to make sense. Doing much reading back from GPUs has never been recommended. I do wish Ageia were cheaper, though; hopefully the cards will come down in price.
Quote:Original post by DrEvil
Some co-workers that attended GDC brought back some info.

Apparently the GPU solution is limited to eye candy effects, whereas the Ageia solution is usable for gameplay objects as well, implying to me that getting data back from it isn't a big deal.

Apparently they didn't recommend using the Havok/GPU acceleration for gameplay objects, which to me cripples the usage a bit.


Spin from Havok

Interesting "industry standard" comment:
Quote:
Will Havok FX Support AGEIA?

Havok FX will support hardware that can execute standard OpenGL and Direct3D code at the Shader Model 3.0 level. If the AGEIA card and drivers adopt and support Shader Model 3.0 industry standard, Havok FX support will be possible.


Ageia's 100% custom hardware physics should blow away SM3.0 GPU physics; however, for GDC 2006, no such demo was presented (at least on the exhibit floor). It would appear that uploading collision/environment primitives to GPU memory would allow for the highest performance acceleration (reduced bus traffic, direct rendering of results in GPU memory, etc.). More complicated environments could cache/stream in collision primitives, etc. While I'm rooting for the little guy (Ageia), they're going to have a tough time competing with nVidia and ATI. A standardized physics API would help. Ageia could perhaps perform their own SLI-like physics: multiple PPU's to improve performance for hard-core gamers, etc. Bus traffic+complexity may be an issue, though.

See the Cell Factor HD demo here. This is the same demo shown on the floor at GDC. The best way to showcase HW accelerated physics over software is to run the exact same demo on software and hardware, and compare frame rates. Given the previous excellent Novodex software demos, I would not be surprised if the demos were running on fast PC hardware (limited or no HW acceleration); the real-time interactive demos were not as smooth as the video (hiccups, not 60Hz, sometimes appearing to be less than 30Hz, etc.).

Compare to the nVidia+Havok HW physics demo. Appears to be many more moving, colliding objects, with software/hardware switching (demonstrates significant HW acceleration). Watching the brick demos in real-time, there were no hiccups/slowdowns. There were slowdowns with the Dino demo (see movie/link).

These are early demos for physics HW acceleration: it's a pretty good start...
Hi John,

Thanks for the overview.

Actually the PS3 Ducky Demo is Sony's in-house physics, not middleware.

PS3/Cell supports both Ageia and Havok physics middleware with SPU optimizations. Technically I think the Havok FX can run on the PS3 GPU too, although SPU would make more sense for general purpose physics.

The main purpose of Havok FX on the GPU is effects physics, but it can definitely be used for gameplay. The benefit of the Ageia PPU processor is that for little cost you can offload physics from the CPU to the PPU without taking GPU graphics resources.

Physics is hot, and apart from which hardware gives the best performance, a big factor will be the number of popular games supporting the hardware.

Erwin Coumans
Here are some dumb questions:
Do nVidia and Havok have an agreement that lets Havok use special non-disclosed features of the GPU?
Isn't shader model 3 a generic programming model that can be used by anybody?
If so, are Havok's physics algorithms so unique that they are the only ones that can be implemented on a GPU?
Or can other algorithms be adapted to run on the GPU as well?

If the answer to those questions is no, then it seems to me that this is a natural consequence: as GPUs become more powerful and more flexible to use via a high-level language, people will have the same idea to use them for things other than graphics.
What stops ODE, Newton, Tokamak, TrueAxis, Bullet, or anyone else from implementing their code and algorithms on a GPU?
I can understand them not having access to AGEIA's board, but the shader model programming languages are free, aren't they?

Say Havok can stack 50,000 boxes because they have a lightning-fast algorithm, and say another free engine can only do 1,000, a ratio of 50 to 1 (and those are very extreme estimates). At what point does the number of stacked boxes become meaningless (1,000, 2,000, 50,000, 1,000,000; is there a limit)?

I have been working in the video game industry for a long time, and I have worked on AAA titles with very crappy engines, and also on games with engines of very fine pedigree.
In my opinion what makes a game successful is not the power of the technology, but the quality of the design.
Technology in fact plays a very small role in that; games released based solely on the merits of a new technology have a very short lifetime and fade out very quickly.
Quote:Original post by Anonymous Poster
Here are some dumb questions:
Do nVidia and Havok have an agreement that lets Havok use special non-disclosed features of the GPU?
Isn't shader model 3 a generic programming model that can be used by anybody?
If so, are Havok's physics algorithms so unique that they are the only ones that can be implemented on a GPU?
Or can other algorithms be adapted to run on the GPU as well?

If the answer to those questions is no, then it seems to me that this is a natural consequence: as GPUs become more powerful and more flexible to use via a high-level language, people will have the same idea to use them for things other than graphics.
What stops ODE, Newton, Tokamak, TrueAxis, Bullet, or anyone else from implementing their code and algorithms on a GPU?
I can understand them not having access to AGEIA's board, but the shader model programming languages are free, aren't they?


For the present and near future, the ideal HW solution is very fast general purpose parallel FPU units with unified memory between physics and graphics. As the dynamic object count goes up, the rigid bodies become more like particles than typical game element objects. Particle systems naturally batch render states/data for very high speed rendering. General purpose, unsorted/non-batched rendering requires too many GPU state changes, resulting in reduced frame rates (GPU makers should improve this weakness).
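To illustrate the batching point, here is a minimal sketch of sorting particle-like instances by material and mesh so that render state changes happen once per group rather than once per object. The bindMaterial/drawMeshInstanced functions are placeholder stubs for this sketch, not any real graphics API.

// Illustrative batching sketch: sort instances by (material, mesh) so render
// state changes happen per group rather than per object.
#include <algorithm>
#include <cstdio>
#include <vector>

struct Instance { int material; int mesh; float x, y, z; };

// Stand-ins for real renderer calls (hypothetical).
static void bindMaterial(int m)                   { std::printf("bind material %d\n", m); }
static void drawMeshInstanced(int mesh, size_t n) { std::printf("  draw mesh %d x%zu\n", mesh, n); }

void drawBatched(std::vector<Instance> instances)
{
    // One sort per frame keeps identical-state objects adjacent.
    std::sort(instances.begin(), instances.end(),
              [](const Instance& a, const Instance& b)
              { return a.material != b.material ? a.material < b.material : a.mesh < b.mesh; });

    int lastMaterial = -1;
    size_t i = 0;
    while (i < instances.size())
    {
        const int mat = instances[i].material, mesh = instances[i].mesh;
        size_t j = i;
        while (j < instances.size() && instances[j].material == mat && instances[j].mesh == mesh)
            ++j;
        if (mat != lastMaterial) { bindMaterial(mat); lastMaterial = mat; }  // state change per group
        drawMeshInstanced(mesh, j - i);  // ideally a single instanced draw call per group
        i = j;
    }
}

int main()
{
    drawBatched({ {2, 1, 0,0,0}, {1, 1, 1,0,0}, {2, 1, 2,0,0}, {1, 3, 3,0,0} });
    return 0;
}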

Without unified physics+graphics memory, the very large amount of bus traffic, GPU state changes, and synchronization issues will dramatically limit the number and quality of dynamic/physical objects. General environment interaction (height field, BSP, box/sphere tree, parametric/implicit surfaces) requires interaction primitives to be native in PPU/GPU memory for maximum performance. This means that for very large environments, interaction primitives will have to be paged into memory, much in the same way textures are paged in for graphics. Given that some consumer GFX products now have 512+ MB of RAM, some games will be able to upload all interaction primitives to the GPU (and perhaps the GPU driver (or dev API) will bias memory use for physics, causing textures to be paged in from CPU memory (still fast via AGP 8x, PCI-e/Xn, etc.)).

The limit for the number of dynamic objects can be computed from the update state size (orientation+position, etc.), FPU rate, and bus bandwidth for PPU's. For GPU's (or GPU's with integrated PPU's), the bus bandwidth/sync delays can be removed from the equation.
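For a rough sense of scale, here is a back-of-the-envelope sketch of that calculation. All of the numbers (bus bandwidth, bus share, state size) are illustrative assumptions, not measured figures for any particular PPU or bus.

#include <cstdio>

int main()
{
    // Assumed figures for illustration only.
    const double busBandwidth   = 2.0e9;  // bytes/sec one-way over the peripheral bus
    const double busShare       = 0.25;   // fraction of the bus budgeted for physics state readback
    const double frameRate      = 60.0;   // target frames per second
    const double stateSizeBytes = 28.0;   // per-object update: quaternion (16 B) + position (12 B)

    const double bytesPerFrame = busBandwidth * busShare / frameRate;
    const double maxObjects    = bytesPerFrame / stateSizeBytes;

    std::printf("~%.0f dynamic object updates per frame fit in the bus budget\n", maxObjects);
    return 0;
}

With unified physics+graphics memory the bus term drops out, leaving FPU rate and memory bandwidth as the limits, as noted above.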

SM3.0 is an industry standard; there should be nothing stopping anyone from porting their physics engine to run on a SM3.0 GPU. Some physics engines may be harder (or impossible) to port without radical changes. Any constraints imposed by the GPU/SM3.0 will make it obvious how the physics engine must be implemented/changed (by obvious, I mean not novel (nor patentable): the process of solving the problem by one reasonably skilled in the field is straightforward, and will be solved the same way by others due to the constraints of the system). Thus, published examples/papers on implementing GPU physics will help prevent non-novel patents from being granted. If many papers/demos are published, and no one discovers some novel aspect of GPU physics, a patent related to such implementation is fair game.

Once my game ships, I'll take a look at SM3.0 to see how much time it would take to port my physics engine. If my physics engine is well-matched to an SM3.0 implementation, it will be able to run on nVidia and ATI hardware, which is a far larger market than any other hardware solution. Thus, it may be an advantage to anyone creating custom physics hardware to support an SM3.0 API. Again, as with early graphics (OpenGL+DX), game developers desire solutions that can be run on as many consumer devices as possible. The cool thing about something relatively low-level like SM3.0 is that it is either supported or it isn't: no device cap. specializations to deal with (will not be a problem with DX10+ hardware for GFX). Again, another possibility is an OpenPL/DirectPhysics API, however general purpose FPU elements as provided by SM3.0 may be more generally useful (can be used for anything; AI, advanced audio, etc.). Perhaps something like OpenFP, DirectFP, etc. to show generality (not just GFX (SM)).

I agree that content+gameplay is the most important element of a game. However, accelerated physics is very useful in the same way accelerated GFX is useful: no one uses software GFX anymore (except in certain special cases). Multicore CPUs will provide another option for accelerated physics (providing at most 2x speed up for dual core, 4x speed up for quadcore, etc. Not as dramatic as GPU/PPU acceleration).
DrEvil..

Effects physics are just the low-hanging fruit for this generation. I think you need DX10-level hardware that allows for a reasonable amount of readback before you'll see general purpose physics CPU/GPU hybrids.

Anony (get an account!)..

You are very correct, already/soon more of the smaller players will have pieces of their physics on the gpu. In fact.. I believe it will be the smaller players that will end up doing the most impressive stuff. I say give it less than two years and we'll see a book called.. GPU Physics Gems.
very unhelpful and/or unfriendly
Quote:Original post by billy_zelsnack
You are very correct, already/soon more of the smaller players will have pieces of their physics on the gpu. In fact.. I believe it will be the smaller players that will end up doing the most impressive stuff.


IIRC, you have a pretty decent physics engine... Now (partially?) running on the GPU?

Quote:Original post by billy_zelsnack
I say give it less than two years and we'll see a book called.. GPU Physics Gems.


Perhaps change the "Graphics" in GPU to "Game": Game Processing Unit... This covers graphics, physics, AI, etc.
Quote:Original post by Anonymous Poster
Or can other algorithms be adapted to run on the GPU as well?


Of course other physics algorithms can be implemented on the GPU. And, of course they have already been implemented, almost since programmable GPU's were first available. Check out General-Purpose Computation Using Graphics Hardware

Quote:Original post by Anonymous Poster
I have been working in the video game industry for a long time, and I have worked on AAA titles with very crappy engines, and also on games with engines of very fine pedigree.
In my opinion what makes a game successful is not the power of the technology, but the quality of the design.
Technology in fact plays a very small role in that; games released based solely on the merits of a new technology have a very short lifetime and fade out very quickly.


In the interest of adding credibility to your opinions and analyses (which to date you have never backed up with verifiable technical data or personal background info), please tell us what specific AAA games you have worked on, when and how long you were in the industry, what your role was at the time (physics lead, lead developer, testing & Q/A, etc.), why you are no longer in the game industry, and (if applicable) how you have contributed publicly to the advancement of physics technology in the industry. Since you apparently are no longer in the industry, and therefore probably not under any active confidentiality agreements regarding as-yet unreleased games, there should be no problem with you writing about specific games.
Graham Rhodes Moderator, Math & Physics forum @ gamedev.net
First off, I should say I work at Havok, but happily as a developer it is not my job to add 'spin', so this is not what I am trying to do in this post. This is by far the best thread I have read on physics hardware post GDC, and in particular I think most of what John says is spot on.

I wanted to add a note to the discussion on what I see as a current difference between hardware and software physics simulations as they apply to PC games. It's best illustrated by an example:

Say you want to implement fracturing rigid bodies. You probably have your own fracture model of how you want things to break. Things may break into pre-authored or procedural pieces. If the pieces are procedural, you may want them to conform to certain constraints you need for your game. The criteria for fracture may be based on physical conditions (strong forces at contacts, internal stress analyses, etc.), non-physical conditions arising from the game, or anything in between.

The problem is that you often need to evaluate this information right in the middle of the physical simulation step, and take appropriate action (removing bodies, adding new ones, canceling collisions, etc.). There are definitely ways around doing this (e.g. modifying the simulation after a step and resetting velocities), but doing so introduces all sorts of limitations and inaccuracies, particularly if you want to use a continuous simulation, which most of our customers want to, to some extent at least. In this case consider a collision which occurs at the start of a step. It causes all sorts of other things to happen in the same step, e.g. the objects bounce off, hit other things, etc. But you wanted that collision to break the objects involved, which means that completely different things should have happened. Trying to fix this up as a post-process sounds like a nightmare to me.

One way around this issue is to allow users of the hardware physics solution to insert user code to be evaluated at collisions, but there is probably a limit to what such code can do. Another solution is to break up the monolithic stepPhysics() hardware call into finer-grained calls, which can be hidden behind a software physics driver, but this granularity may destroy the performance of the hardware, which typically gains its performance wins by being able to process very large batches of instructions and data per call.
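As a rough illustration of the first option (user code evaluated at collisions), here is a hypothetical sketch; the callback interface, ContactAction enum, and impulse threshold are invented for illustration and are not Havok's or Ageia's actual API.

// Hypothetical mid-step collision callback: the engine asks the game whether
// a contact should resolve normally or cancel and trigger fracture.
#include <cstdio>
#include <vector>

struct Contact { int bodyA, bodyB; float impulse; };

// Result the callback returns to the (hypothetical) solver.
enum class ContactAction { Resolve, CancelAndFracture };

// Game-side fracture criterion: purely illustrative threshold on contact impulse.
ContactAction onContact(const Contact& c)
{
    const float fractureImpulse = 50.0f;          // assumed game-tuned value
    if (c.impulse > fractureImpulse)
        return ContactAction::CancelAndFracture;  // engine swaps the body for fracture pieces
    return ContactAction::Resolve;                // normal bounce
}

int main()
{
    // Simulated contacts that a mid-step callback might see.
    std::vector<Contact> contacts = { {1, 2, 12.0f}, {3, 4, 75.0f} };
    for (const Contact& c : contacts)
    {
        const ContactAction a = onContact(c);
        std::printf("bodies %d-%d: %s\n", c.bodyA, c.bodyB,
                    a == ContactAction::Resolve ? "resolve" : "cancel + fracture");
    }
    return 0;
}

Whether such a callback could run on the hardware itself, or would force a round-trip back to the CPU mid-step, is exactly the granularity problem described above.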

I don't think these difficulties are insurmountable, but I personally would be very wary of an OpenPL / Direct Physics style physics API at this early stage as a way for everyone to implement all their game physics. Without a really good shared-memory architecture I still see hardware physics as a special-effects add-on (which is still very cool). But then I guess I would say that…
