
## physx chip

Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

224 replies to this topic

### #41Anonymous Poster  Guests

Posted 21 April 2006 - 12:31 PM

The GPU has already filled in the niche market for parallel data tasks. There are already basic physics simulation "graphics gems" and it's only going to get better over time. The new shader models will only make physics simulation easier and more powerful, and they don't involve a middleman. I believe the people at AGEIA and some investors took what seemed like a good idea and ran with it without really thinking things through. Time will tell that I am right.

### #42Sneftel  Senior Moderators

Posted 21 April 2006 - 12:40 PM

The problem with the GPU for general-purpose computation is, it's stupid at doing things that don't fit neatly into four-float vectors. The thing is, that's why a PPU is also (in my opinion) a stupid idea. I DO think we're going to have something like a PPU in our systems, but it's going to be a general-purpose vectorized processing unit. Its application will not be etched into its silicon.

### #43Cubed3  Members

Posted 21 April 2006 - 12:45 PM

Quote:
 Original post by Anonymous Poster
The GPU has already filled in the niche market for parallel data tasks. There are already basic physics simulations "graphic gems" and its only going to get better over time. The new shader models will only make physics simulation easier and more powerful, and doesn't involve a middleman. I believe the people at AGEIA and some investors took what seemed like a good idea and ran with it without really thinking things through. Time will tell that I am right.

The PhysX SDK, aka NovodeX, is also the physics engine included with the PS3 dev kits. Ageia's software has very good support for parallel processing, which is why they haven't really dug themselves into a hole even if the PPU doesn't take off.

### #44Cubed3  Members

Posted 21 April 2006 - 01:07 PM

Quote:
 PhysX is nothing but a 400 MHz chip with a handful of SIMD units that are very much like SSE. And given the extra overhead the actual available processing power of a PPU could be closer to a CPU than you think.
By the way, let's have a look at Cell. 4 GHz x 8 vector coprocessors x 8 floating-point calculations per clock cycle = 256 GFLOPS. This just wipes the floor with PhysX. Also, a GeForce 7800 GTX is 165 GFLOPS. And yes, Cell is a CPU! x86 processors are evolving in the same direction.

Cell is overhyped :-P

Anyways, I dunno who is the better performer, but I will find out this summer ;-)

I plan on upgrading my Athlon 64 system to a dual-core FX and purchasing a PPU. (I'm a Mech E; I figure it will be nice for some fluid simulations and rigid body dynamics. Not 100% accurate, but who knows, maybe it will be?)
Another good thing is that NovodeX (aka the PhysX SDK) is free to use in a commercial product if you support hardware-accelerated features via a PPU. On top of that, NovodeX is better than ODE; dunno how it compares to Newton though.

### #45JBourrie  Members

Posted 21 April 2006 - 01:12 PM

Quote:
Quote:
 Once upon a time, real-time lighting wasn't efficient either. When it became practical, it started appearing everywhere.

What are you talking about? Lighting has been real-time since the first 3D game. Don't mistake a modern CPU for a pocket calculator.

I'm not sure that's what he meant. Real-time lighting is a simple "choose greyscale light based on dot product" calculation. This is pretty much required for 3D. But good looking dynamic lights, lighting based on blended diffuse, pixel shading, and the newest techniques like HDR and normal maps are lighting that wasn't efficient until more recently. Now it's everywhere, possibly even on those pocket calculators.

Check out my new game Smash and Dash at:

### #46Cubed3  Members

Posted 21 April 2006 - 01:54 PM

FYI from BFG techs website

Specifications

Processor: AGEIA PhysX

Memory Interface: 128-bit GDDR3

Memory Capacity: 128MB

Peak Instruction Bandwidth: 20 Billion/sec

Sphere-Sphere Collisions: 530 Million/sec max

Convex-Convex (Complex Collisions): 533,000/sec max

smokin!

### #47C0D1F1ED  Members

Posted 21 April 2006 - 10:26 PM

Quote:
 Original post by Cubed3
Processor: AGEIA PhysX
Memory Interface: 128-bit GDDR3
Memory Capacity: 128MB
Peak Instruction Bandwidth: 20 Billion/sec
Sphere-Sphere Collisions: 530 Million/sec max
Convex-Convex (Complex Collisions): 533,000/sec max

Processor: Intel Pentium D 950
Memory Capacity: ~2 GB
Instruction Bandwidth: 20.4 guops/sec (sustainable)
Sphere-Sphere Collisions: 1.7 billion/sec (theoretical)
Triangle-Triangle Intersection: 425 million/sec (theoretical)

I should also add that this processor has a crappy architecture compared to next generation's standards. The CPU's efficient branching and cache also allow advanced optimizations that avoid wasting time on brute-force approaches. So let's not fixate on the raw numbers. I'm sure Ageia fears a direct benchmark between PhysX and the latest CPU in a real game.

Smokin?

### #48Cubed3  Members

Posted 22 April 2006 - 02:40 AM

Quote:
 Original post by C0D1F1ED
Processor: Intel Pentium D 950
Memory Capacity: ~2 GB
Instruction Bandwidth: 20.4 guops/sec (sustainable)
Sphere-Sphere Collisions: 1.7 billion/sec (theoretical)
Triangle-Triangle Intersection: 425 million/sec (theoretical)

Somehow I highly doubt those numbers...
Where did you get them?

I don't think it's nearly as powerful as that.

PhysX vs Pentium XE 840 HT

[Edited by - Cubed3 on April 22, 2006 9:40:40 AM]

### #49C0D1F1ED  Members

Posted 22 April 2006 - 05:28 AM

Quote:
 Original post by Cubed3
Where did you get them?

Just calculate them. 3.4 GHz x dual-core x 3 instructions per clock (sustained) = 20.4 gigainstructions per second. SSE can do 4 floating-point operations per clock (but only one can start every clock cycle) so GFLOPS might be even higher. It's a theoretical maximum but so is the number from PhysX. The other numbers are derived from this, assuming optimal SSE code.

Quote:
 PhysX vs Pentium XE 840 HT

What's the source of this video? Ageia? Of course they will show a demo with a badly optimized software version! Don't be fooled by that. Their marketing is perfect, they want to sell the hardware, but I'm only interested in the capabilities of the product in a real situation versus optimized software.

Besides, a 6+ GFLOPS CPU not capable of handling 6000 objects at more than 5 FPS? Please. That's 200,000 floating-point operations per object. Two-hundred-thousand! Unless you're doing some really stupid brute force collision detection that's a vast amount of processing power.

### #50Cubed3  Members

Posted 22 April 2006 - 05:42 AM

Quote:
Original post by C0D1F1ED
Quote:
 Original post by Cubed3
Where did you get them?

Just calculate them. 3.4 GHz x dual-core x 3 instructions per clock (sustained) = 20.4 gigainstructions per second. SSE can do 4 floating-point operations per clock (but only one can start every clock cycle) so GFLOPS might be even higher. It's a theoretical maximum but so is the number from PhysX. The other numbers are derived from this, assuming optimal SSE code.

Quote:
 PhysX vs Pentium XE 840 HT

What's the source of this video? Ageia? Of course they will show a demo with a badly optimized software version! Don't be fooled by that. Their marketing is perfect, they want to sell the hardware, but I'm only interested in the capabilities of the product in a real situation versus optimized software.

Besides, a 6+ GFLOPS CPU not capable of handling 6000 objects at more than 5 FPS? Please. That's 200,000 floating-point operations per object. Two-hundred-thousand! Unless you're doing some really stupid brute force collision detection that's a vast amount of processing power.

How did you calculate the sphere-sphere collisions? I'm quite curious :P

Also, the source of the video is from Ageia, of course... But the software physics is being done via NovodeX, which is a very optimized physics engine.

200,000 floating-point operations per object? Is that per frame or per second?

### #51RedDrake  Members

Posted 22 April 2006 - 06:27 AM

Quote:
Original post by C0D1F1ED
Quote:
 PhysX vs Pentium XE 840 HT

What's the source of this video? Ageia? Of course they will show a demo with a badly optimized software version! Don't be fooled by that. Their marketing is perfect, they want to sell the hardware, but I'm only interested in the capabilities of the product in a real situation versus optimized software.

Besides, a 6+ GFLOPS CPU not capable of handling 6000 objects at more than 5 FPS? Please. That's 200,000 floating-point operations per object. Two-hundred-thousand! Unless you're doing some really stupid brute force collision detection that's a vast amount of processing power.

That demo seemed like a Rocket demo (the Ageia PhysX demo framework), and I think you can download it (not 100% sure though). See for yourself: there are various demos in that suite, and compared to other physics engines I've seen, they are very fast and accurate.

IMO, a thing to note from the video is that with the PhysX card the CPU usage is minimal, merely for updates. Even if the CPU can calculate physics as fast as the PPU, the PPU is still useful, since the CPU can't be dedicated to physics only. And having a realistic real-time physics solution is worth a couple of hundred $ to a true gamer, since the PPU shouldn't need constant upgrading the way a GPU does (every couple of months); anyway, a person who can afford a dual-core CPU (like you proposed) and other equipment can probably afford the extra $ for a PPU.
The Ageia PhysX simulation can run in software and is probably optimized for the next-gen consoles (it is available for Xbox 360 & PS3); since PC CPUs aren't built with that many vector units, the PhysX card is a good replacement.
The only real barrier I see is the lack of games for the PPU to be useful... but this is going to change, I hope...

### #52C0D1F1ED  Members

Posted 22 April 2006 - 08:24 AM

Quote:
 Original post by Cubed3
How did you calculate the sphere-sphere collisions? Im qutie curious :P

```asm
// Vector between sphere centers
movaps   xmm0, c0x
subps    xmm0, c1x
movaps   xmm1, c0y
subps    xmm1, c1y
movaps   xmm2, c0z
subps    xmm2, c1z
// Length calculation (reciprocal)
mulps    xmm0, xmm0
mulps    xmm1, xmm1
mulps    xmm2, xmm2
addps    xmm0, xmm1
addps    xmm0, xmm2
rsqrtps  xmm0, xmm0
// Compare to sum of radii
movaps   xmm1, r0
addps    xmm1, r1
mulps    xmm0, xmm1
cmpltps  xmm0, one
```

Some explanation might be needed: This calculates sphere-sphere collision of four sets of spheres in parallel. Also, rsqrtps calculates the reciprocal square root at half precision. To avoid using rcpps to invert that again (and lose more precision than acceptable), I multiply by the sum of the radii (1/x < y becomes y/x < 1). So this is 16 instructions for 4 tests, or 4 instructions per test. Which should bring us to the number I wrote earlier. This is obviously peak performance but I'm pretty sure Ageia counted their performance in the same way. Strictly speaking there's a movmskps needed to get the final results but that's compensated by the fact that movaps instructions are not arithmetic and will be executed in parallel by the load/store units.
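For readers who don't speak assembly, here is a sketch of the same four-wide test using SSE intrinsics. The function and variable names are mine, and I flip the final compare so that a set mask bit means the pair *does* overlap:

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Tests four sphere pairs at once, structure-of-arrays style, mirroring
   the assembly above. Returns a 4-bit mask; bit i set = pair i collides. */
static int sphere_collide4(__m128 c0x, __m128 c0y, __m128 c0z, __m128 r0,
                           __m128 c1x, __m128 c1y, __m128 c1z, __m128 r1)
{
    /* Vector between sphere centers */
    __m128 dx = _mm_sub_ps(c0x, c1x);
    __m128 dy = _mm_sub_ps(c0y, c1y);
    __m128 dz = _mm_sub_ps(c0z, c1z);

    /* Approximate 1/distance via the reciprocal square root */
    __m128 d2  = _mm_add_ps(_mm_add_ps(_mm_mul_ps(dx, dx),
                                       _mm_mul_ps(dy, dy)),
                            _mm_mul_ps(dz, dz));
    __m128 inv = _mm_rsqrt_ps(d2);

    /* (r0 + r1) / distance > 1  <=>  distance < sum of radii */
    __m128 ratio = _mm_mul_ps(inv, _mm_add_ps(r0, r1));
    __m128 hit   = _mm_cmpgt_ps(ratio, _mm_set1_ps(1.0f));
    return _mm_movemask_ps(hit);
}
```

As a sanity check on the earlier 1.7 billion/sec figure: with one SSE arithmetic instruction starting per clock on each of two 3.4 GHz cores, 16 instructions per 4 tests gives 6.8e9 × 4 / 16 = 1.7 billion tests per second.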
Quote:
 Also the source of the video is from ageia of course... But the software physics is being done via NovodeX which is a very optimized physics engine.

NovodeX is an Ageia product. If it runs several times slower than optimal then that doesn't really hurt Ageia. And for this demo they wanted PhysX to look good so I wouldn't be surprised if they crippled the software version to make it look even better (e.g. by forcing a brute-force approach).

Part of the problem is that we don't have independent benchmarks and quality specifications for physics in games yet. But I think we can expect that to be standardized pretty soon.
Quote:
 200,000 floating point operations per object? Is that per frame or per second?

Left your pocket calculator at home?

### #53C0D1F1ED  Members

Posted 22 April 2006 - 08:58 AM

Quote:
 Original post by RedDrake
That demo seemed like a rocket demo (Ageia PhysX demo framework), and i think you can download the demo (not 100% sure tough). See for your self, there are various demos in that suit and compared to other physics engines i saw, they are weary fast and accurate.

I'm sure it's pretty fast and robust. They couldn't have created hardware suited for physics calculations without superior knowledge and excellent software. But that doesn't mean it's absolutely optimal yet, or that the comparisons are fair. By the way, what other physics engines did you compare it with? Most freely available physics engines are C/C++ code that use suboptimal algorithms (lack of knowledge) and no SSE (for portability). It can't be that hard for a professional product to beat that. But in the comparison demos they specifically wanted PhysX to look good. They know it's hard to market, so a bit of cheating has to be very tempting... They pretty much have a monopoly anyway, and using non-standardized benchmarks works in their favor.
Quote:
 Even if CPU can calculate physics as fast as PPU, PPU is still useful since CPU can't be dedicated to physics only.

You can pretty much dedicate a whole core to it with dual-core processors.
Quote:
 ...and anyway a person who can afford dual core CPU (like you proposed) and other equipment can probably afford the extra $ for a PPU.

Within a few months, Intel's Conroe processors will set new performance records for a very modest price. That's extra dual-core performance that will soon be available for all games, and for other applications as well. So you won't need big investments to run impressive games with great physics. Buying a $250 card with a PPU really seems like a waste of money to me. Of course there will still be people who buy it anyway.

This is just my personal opinion of course. I just think multi-core CPUs have more future than PPUs. In three years from now, I envision 5+ GHz octa-cores with improved SIMD units (higher issue width, reverse hyper-threading) that are capable of doing everything except the heaviest graphics work. I don't see PCs with weak CPUs and a third processor. Both Intel and AMD produce the most advanced chips on the planet and the competition between the two will yield some very interesting results.

### #54Alpheus  GDNet+

Posted 22 April 2006 - 09:03 AM

Honestly, I don't see the big deal. A dedicated PPU takes the math-intensive load off the CPU; the GPU does its regular job. If you have a dual-core chip or two processors, you can use the other core/processor for AI. Plus, if the chip is made specifically for physics processing, it's going to beat a CPU/core hands down.

### #55C0D1F1ED  Members

Posted 22 April 2006 - 09:44 AM

Quote:
 Original post by Alpha_ProgDes
honestly, i don't see the big deal. a dedicated PPU processor takes the math intensive load off the CPU. the GPU does its regular job. If you have a dual core chip or two processors you can use the other core/processor for AI. Plus if the chip is made specifically for physics processing it's going to beat a CPU/core hands down.

Do you really expect everyone who wants to play games 'the way they are meant to be played' to buy an extra card? It's simply not economical. For my next PC I don't want to be forced to choose yet another component. Also think about laptops, where the trend has been to integrate everything into the fewest components while keeping them cool. Laptop sales have surpassed desktop sales, and people expect to be able to play games on them almost equally well.

And where's the limit, really, if we introduce more dedicated hardware into our systems? There has been speculation about A.I. chips as well (this is the A.I. forum after all). But A.I. uses mainly integer calculations, so another chip would have to be added if the CPU isn't considered fast enough. This adds even more bandwidth and synchronization overhead. And while we're at it, let's add a chip for volumetric sound propagation, with a specialized architecture for accessing advanced data structures. Heck, why not implement each game separately as a chip? Ok, I'm getting ridiculous now, but I had pretty much the same feeling when I first heard about a physics chip for games. It's a reasonable idea for consoles, but PCs have to be general-purpose to keep the complexity manageable.

Besides, like I said before, let's not underestimate CPUs! In the Quake era people knew how to get the best performance out of every clock cycle. Nowadays an average programmer doesn't understand software performance beyond big O. They also expect the compiler to optimize their crappy code and blame the CPU when it doesn't run at the expected performance (instead of themselves). I personally have optimized several applications by 2-5 times using assembly code and improving the design in the right spots. Last but not least, CPU manufacturers finally realized that clock frequency isn't everything, and started to focus on vast improvements in IPC and thread parallelism.

If extra processing power is really absolutely required then I see a lot more future in using the GPU for physics. Especially with DirectX 10 and unified architectures, GPUs will be pretty much a huge array of SIMD units anyway. Borrowing a few GFLOPS shouldn't be much of a problem, and there would be little or no overhead to get the results from the physics stage to the rendering stage. So even if I'm wrong about future CPU performance, PhysX still has another huge threat.

### #56Nick Gravelyn  Members

Posted 22 April 2006 - 09:49 AM

Quote:
 Original post by C0D1F1ED
Investing in a robust physics engine would have made all the difference.

### #57C0D1F1ED  Members

Posted 22 April 2006 - 10:35 AM

Quote:
Original post by NickGravelyn
Quote:
 Original post by C0D1F1ED
Investing in a robust physics engine would have made all the difference.

Indeed, thanks for the info. I can't see such bugs in Half-Life 2 and it's powered by Havok as well. So apparently the integration of Havok didn't go too well. It's again a lack of knowledge as far as I can tell...

### #58BrianL  Members

Posted 22 April 2006 - 06:56 PM

1) I like the idea of a specialized CPU for particular types of tasks, but this hardware only runs a single physics library. Unless this library works for physics research (which seems unlikely, as games take shortcuts all over) or they provide a scientific version of the lib (not sure what the profit margin is there), this first generation of cards seems very limited.

2) The library runs on next-gen hardware, but I would be careful about comparisons with Cell. This hardware has 128 megs of dedicated memory; the Cell SPUs have 256k of local store each and require custom programs to run on them. I am sure they leverage the SPUs, but I don't know how similar the implementations would be under the hood.

3) Until these cards become mass market, games won't be able to rely on them being present. Until then, they will likely increase your framerate or add visual candy, but little else. That said, that may be enough to sell them. They may take off, but I doubt they will become big for several (4-8) years.

4) Half-Life 2 physics look good because Valve got a source code license and spent a big chunk of time customizing/rewriting it (something like a year+). I am sure the content/tweaking/polish played a big role too.

### #59Nick Gravelyn  Members

Posted 24 April 2006 - 02:16 PM

Honestly, I wouldn't be surprised if companies do start coming out with games that have different levels of physics or require the PPU altogether. After all, Oblivion can't run on most PCs because the video cards it recommends start at the X800 (and even that can hardly handle it), so the belief that companies may one day require this for certain titles isn't out of the question.

In today's world, it just comes down to the cash you want to pay to play. Some people would rather spend $250 and have developers utilize that technology (if only for a minority) and create a very physically realistic experience.

As I said before, my next PC will have the PhysX chip. I will probably make a game that relies on the hardware, because I am such a huge fan of intense physics, as well as of the improvement to the AI that could be done using the CPU power freed by outsourcing the physics.

And City of Villains is actually about to support the PhysX chip to allow for more particles and more destructive environments. http://ve3d.ign.com/articles/697/697302p1.html

Quote:
 Original post by C0D1F1ED
If extra processing power is really absolutely required then I see a lot more future in using the GPU for physics. Especially with DirectX 10 and unified architectures, GPUs will be pretty much a huge array of SIMD units anyway. Borrowing a few GFLOPS shouldn't be much of a problem, and there would be little or no overhead to get the results from the physics stage to the rendering stage. So even if I'm wrong about future CPU performance, PhysX still has another huge threat.

But that's just dumb as well. Why continue to make GPUs huge and expensive so they can power physics, something they were never meant to do? I think the 'using GPUs for physics with DX10' idea is just dumb. And honestly, comparing Ageia's physics demos with their PPU against the Havok/nVidia videos, Ageia wins hands down. Maybe it's just the demos, but I personally don't think that using GPUs for physics will work out as well as everyone hopes. My card (an X800) is still striving hard to run BF2 on the highest settings. If they were to take away from the rendering power for physics, I'd be more than upset, because essentially instead of saying "You have to buy this PPU to play our game," they are saying "You need to buy nVidia's top-of-the-line GPU because, although the rendering could be done on an X800, we are using your GPU for physics as well." Clearly it's not as large a deal now, but to achieve what the PhysX chip can do, the GPU requirements for games would go from the top 20% of cards to the top 5% of cards.

### #60C0D1F1ED  Members

Posted 25 April 2006 - 03:34 AM

Quote:
 Original post by NickGravelyn
Honestly, I wouldn't be surprised if companies do start coming out with games that have different levels of physics or require the PPU altogether. After all, Oblivion can't run on most PCs because the video cards it recommends start at the X800 (and even that can hardly handle it), so the belief that companies may one day require this for certain titles isn't out of the question.

In today's world, it just comes down to the cash you want to pay to play. Some people would rather spend $250 and have developers utilize that technology (if only for a minority) and create a very physically realistic experience.

I'm sorry but that's just silly. I'm playing and enjoying modern games on my budget laptop with integrated graphics. I don't want to be forced to pay several hundred bucks extra just so a game is runnable. I'm sure there are always people willing to spend all their money on new hardware, but a game with insane demands just won't sell enough copies. Optimizing a game for mid-end systems vastly increases the market.
Quote:
 And City of Villains is actually about to support the PhysX chip to allow for more particles and more destructive environments. http://ve3d.ign.com/articles/697/697302p1.html

I still want to see proof that a dual-core wouldn't be capable of handling that. They talk about 10 times the number of particles. If the physics processing budget goes from 10% to 100% thanks to dual-core then the exact same thing is possible.
Quote:
 But that's just dumb as well. Why continue to make the GPUs huge and expensive so they can power physics, something they were never meant to do?

GPUs and PPUs are not that much different. Each has a number of SIMD units with a very similar instruction set. Especially the DirectX 10 generation will be capable of efficiently processing floating-point intensive tasks other than vertex and pixel processing. There's nothing in the PPU that makes it any better suited for physics.
Quote:
 If they were to take away from the rendering power for physics, I'd be more than upset because essentially instead of saying "You have to buy this PPU to play our game." they are saying "You need to buy nVidia's top-of-the-line GPU because, although the rendering could be done on an X800, we are using your GPU for physics as well."

What's the difference, really? If I invested the price of a PPU into my graphics card, it would have plenty of extra FLOPS to perform the physics. I'd have SLI, actually... But the most interesting thing is that people have the choice of buying a budget graphics card and still being able to play their games, even if at lower detail. What's better: playing a game at slightly lower detail, or not playing it at all?

The same is true for dual-core. Budget versions are on their way, and their extra processing power will be available to every application. By the time I see a PPU card in my local store, Intel will have a worldwide launch of Merom/Conroe-based processors. Not to mention DirectX 10 will appear around the same time.

Everyone has a CPU and a GPU. Relying on a third processor that only a fraction of end-users will have is crazy. Worst of all, it's simply not cost-effective.
