How many triangles can the Xbox 360/PS3 push per frame?


Assuming that every polygon has multiple texture passes (color, normal, gloss, specular coefficient, and lightmap), and all the textures are high resolution (512x512 and 1kx1k), how many polygons/triangles can the Xbox 360 and PS3 handle on a per-frame basis? Thanks.

oil-on-glass, I can't answer your question specifically, because it is very unrealistic. First, an inaccuracy: color, normal, gloss, spec, and lightmap should all be renderable in a single pass. Only shadows and effects which look up more than 4 textures (and which couldn't be produced by math alone) should require multiple passes.
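To make the single-pass point concrete, here's a toy Python sketch (my own illustration, not actual console shader code; the function name and the lighting math are hypothetical): all five map inputs feed one evaluation per pixel, the way a single pixel-shader invocation would, rather than one blend pass per map.

```python
# Hypothetical sketch: five texture inputs combined in one "pass".
# Each argument stands in for a texel fetched from the named map.

def shade_pixel(albedo, normal_dot_l, gloss, spec_coeff, lightmap):
    """Combine all five inputs in a single evaluation, the way one
    pixel-shader invocation would, rather than one blend pass per map."""
    n_dot_l = max(normal_dot_l, 0.0)
    diffuse = albedo * n_dot_l * lightmap          # color * lighting * lightmap
    specular = spec_coeff * (n_dot_l ** (gloss * 32.0))  # gloss drives the exponent
    return diffuse + specular

# One call = one pass over the pixel, regardless of how many maps feed it.
color = shade_pixel(albedo=0.8, normal_dot_l=0.7, gloss=0.5,
                    spec_coeff=0.3, lightmap=1.0)
print(round(color, 4))
```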

Second, it would be downright stupid for developers to render all polys with the same quality and texture resolution. High-quality pixel and vertex shaders will be LOD'ed out 5-10 metres from the camera, and high-res textures are terrible for texture cache coherency, so they cost not only memory but per-pixel speed. 75% of pixels can be rendered at top speed with minimal shaders without any appreciable difference in quality.
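A minimal sketch of that LOD idea, with the thresholds and tier names made up purely for illustration:

```python
# Hypothetical distance-based shader LOD: only near geometry gets the
# expensive material; everything else falls back to cheaper shading.
# Thresholds and tier names are illustrative, not from any real engine.

def pick_shader_tier(distance_m):
    if distance_m < 5.0:
        return "full"      # normal + gloss + spec + lightmap, high-res textures
    elif distance_m < 10.0:
        return "reduced"   # normal map dropped, lower-res textures
    else:
        return "minimal"   # diffuse texture only, fastest path

print([pick_shader_tier(d) for d in (2.0, 7.0, 50.0)])
```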

So, I couldn't begin to guess what the "maximum possible" triangle count would be, but for a normal 60 FPS game: whereas for the PS2 it was 50k - 75k, for the PS3 I would guess 100k - 300k. Of course, that ballpark figure alone should demonstrate how meaningless triangle count is... those foreground triangles will be dramatically shaded and dynamic in ways far beyond what shaders alone can provide (dynamics calculated by the insane extra CPU power).

Relevant specs:

Xbox 360:

Polygon Performance - 500 million triangles per second

Pixel Fill Rate - 16 gigasamples per second fill rate using 4x MSAA

Shader Performance - 48 billion shader operations per second

Why would you want to measure triangles per frame? I would assume that would be limited by the amount of memory in the system. If you want to measure performance, you would want to measure triangles per clock or per second.

Quote:
Original post by teh_programerer
Quote:
Original post by parasolstars
correct me: essentially, the Xbox has a Radeon X850 XT GPU


It is indeed an ATi R520 Chip in the XBOX 360.


Actually, it isn't an R520 chip; it's called Xenon and has a unified shader architecture, so you really can't compare it to an R520. It's probably more like a hybrid R580.

To clarify, the ATI GPU is not named Xenon, but Xenos. Xenon is the triple-core IBM CPU. As has been said, the Xenos is not comparable to any current desktop GPU. The two primary distinguishing features are the unified pixel/vertex shader units and the 10MB of INCREDIBLY FAST embedded framebuffer, which sits on a bus that is 2560 (yes, 2-5-6-0) bits wide and is what permits anti-aliasing to be essentially free. The technology does stem from what ATI intended to be the successor to the 9800 series, but for now they have simply chosen to modify and scale the 9800's basic technology on newer cards to maximize profit. ATI will almost certainly bring out a desktop part that shares much of Xenos' technology, but I would guess that's at least another generation or two out, 18-24 months or so. The Xenos also supports a feature set that is beyond DX9, but not quite DX10 either. The 360 equivalent of Direct3D also reflects this, though it is not directly derived from either API version.

It's hard to give a triangle count, but honestly triangle setup has not been the bottleneck for some time. GeForce4-generation hardware could set up something like 150 million triangles per second IIRC, but of course that says nothing about how many you can actually draw. "Triangles" has become irrelevant; nowadays it's more about fillrate and shader performance.

ravyne, is that all true?

I'm no GPU expert, but I thought multi-sample AA was done by running the pixel shader multiple times with slightly jittered sample positions within a pixel, then blending the results together. I don't see how the EDRAM would make that faster, since only the final result is written to the framebuffer... you have to run the pixel shader 4x as much, so that is the bottleneck relative to non-AA. I would see the EDRAM helping mostly with framebuffer blends.

Anyway, does 10 MB seem smallish?

Guest Anonymous Poster
For a frame buffer, that's huge, I would say.

Guest Anonymous Poster
Quote:
Original post by ajas95
ravyne, is that all true?

I'm no GPU expert, but I thought multi-sample AA was done by running the pixel shader multiple times with slightly jittered sample positions within a pixel, then blending the results together. I don't see how the EDRAM would make that faster, since only the final result is written to the framebuffer... you have to run the pixel shader 4x as much, so that is the bottleneck relative to non-AA. I would see the EDRAM helping mostly with framebuffer blends.

Anyway, does 10 MB seem smallish?


I really should have specified FSAA (Full-Screen Anti-Aliasing), which, correct me if I'm wrong, as far as I know is accomplished by rendering internally at 2x the resolution in each axis, then averaging each 2x2 quad down to a single pixel before displaying it (in the case of 2x FSAA). The EDRAM makes it faster simply because there's a lot of read/write traffic involved. It's really quite good for any sort of post-processing effect: motion blur, depth of field, FSAA, etc. Another reason it's so fast is that much of the final pixel-processing logic is on the same die as the EDRAM; that's actually where the 2560-bit bus exists. The bus between the two physical dies is much smaller (however, there's less data handled there). You're correct that shader performance is also taxed, though, as the shader does have to run 4x as much. Possibly the pixel-processing logic does this internally.
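That resolve step can be sketched in a few lines. This is a toy grayscale version of my own (real hardware works on multi-channel pixels and in dedicated logic, not a loop like this):

```python
# Minimal sketch of a 2x FSAA resolve: render internally at 2x resolution
# in each axis, then average every 2x2 quad down to one displayed pixel.

def resolve_2x(hires):
    """hires is a 2H x 2W grid of grayscale samples (list of lists)."""
    h, w = len(hires) // 2, len(hires[0]) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            quad = (hires[2*y][2*x] + hires[2*y][2*x+1] +
                    hires[2*y+1][2*x] + hires[2*y+1][2*x+1])
            out[y][x] = quad / 4.0  # every sample read once: heavy bandwidth
    return out

# A hard black/white edge becomes a 50% grey pixel after the resolve.
edge = [[0.0, 1.0],
        [0.0, 1.0]]
print(resolve_2x(edge))  # [[0.5]]
```

The inner loop touches every high-resolution sample exactly once, which is why a resolve is bandwidth-bound and benefits from fast embedded memory.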

10MB sounds small, but you have to remember that it's ALL framebuffer. Texture, vertex, and other data are stored in the 512MB of GDDR3 memory. 10MB is kind of an odd number, really, though I'm sure it was chosen for a good reason. The Xenos GPU supports HD resolutions by using several large 'tiles', rendering them with FSAA individually, then piecing the final framebuffer together in system RAM. It's similar to how PowerVR's architecture works, except that it's a small number of large tiles rather than a ton of 32x32 tiles.
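As a rough sanity check on why tiling is needed (my own back-of-the-envelope arithmetic, not official figures): assuming 4 bytes of color and 4 bytes of depth per sample, a 720p target with 4x multisampling overflows 10 MB by a wide margin, while a one-third-height horizontal tile fits.

```python
# Back-of-the-envelope framebuffer sizing. The 4-byte color/depth
# assumption is mine; real formats vary.

def fb_bytes(width, height, msaa, bytes_color=4, bytes_depth=4):
    # Each multisample carries its own color and depth value.
    return width * height * msaa * (bytes_color + bytes_depth)

full = fb_bytes(1280, 720, msaa=4)
tile = fb_bytes(1280, 240, msaa=4)          # one of three horizontal tiles
print(full / 2**20)   # ~28 MiB: far over a 10 MB budget
print(tile / 2**20)   # ~9.4 MiB: fits
```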

Quote:
Original post by ajas95
(dynamics calculated by the insane extra CPU power).


I've read an article posted on AnandTech which discussed the CPUs in the Xbox 360 and PS3, with feedback from developers working on actual games for these platforms. The general feedback was that these CPUs were actually a disappointment, particularly the PS3's; Sony has a reputation for overhyping things, as in the past. Anyone remember how the PS2 was supposedly banned for export because it was so powerful that it could be used to develop advanced nuclear weapons?

But for fear that Microsoft might want to track these developers down, the article was removed.

However there are still some articles which delve deep into the architecture which make an interesting read.

Clicky
Clicky2

Hmm... I'm not scared. Having a lot of first hand experience with both PS3 and XBOX360, I'd say this pretty much sums up the general opinion: It's easier to get more performance out of the XBOX360 without radically re-inventing any techniques (3 symmetric general processing cores). The PS3, on the other hand, can technically churn out more floating point operations but only if your operations are highly vectorizable.

The problem with the PS3, in my opinion, is that a lot of game code, particularly interesting game code, is non-trivial to vectorize efficiently. Yeah, it's great if you're doing generic physics solves or software-transforming tons of vertices (for particle effects), but doing animation state flow (a huge focus of next-gen) is far more straightforward on the XBOX360.

And, BTW, the GPU in the PS3 essentially is the latest and greatest desktop GPU from NVIDIA (6800?).

Quote:
Original post by Simagery
The PS3, on the other hand, can technically churn out more floating point operations but only if your operations are highly vectorizable.


I think it's a little misleading to say this; the implication is that the SPUs are only capable of vectorized floating-point operations, when in fact they are more than capable of running bog-standard C/C++ non-vectorized code as well. Even if they may not be quite as good at that as they are at highly vectorized float ops, we're still talking about seven 3.2GHz processors on top of the 3.2GHz PPU, which is a hell of a lot of power.

Guest Anonymous Poster
Quote:
Original post by JohnBolton
Relevant specs:

Xbox 360:

Polygon Performance - 500 million triangles per second

Pixel Fill Rate - 16 gigasamples per second fill rate using 4x MSAA

Shader Performance - 48 billion shader operations per second


so 500 million polygons per second divided by 60 fps = approx 8.3 million polygons drawn per frame? Would this be assuming the triangles have texturing or other effects?
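Spelling out that division, with the caveat that peak figures like "500 million triangles per second" are setup-rate maxima; real per-frame counts come in far below this ceiling once texturing and shading are involved:

```python
# The arithmetic behind the "8.3 million per frame" estimate.

peak_tris_per_sec = 500_000_000
for fps in (60, 30):
    per_frame = peak_tris_per_sec / fps
    print(f"{fps} fps -> {per_frame / 1e6:.1f} million triangles/frame (peak)")
```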




Xbox games generally run at 30fps (strictly speaking, it's 60Hz interlaced), so double that figure. But as pointed out, poly counts are not a problem and haven't been for some time.

Quote:
Original post by GamerSg
I've read an article posted on AnandTech which discussed the CPUs in the Xbox 360 and PS3, with feedback from developers working on actual games for these platforms. The general feedback was that these CPUs were actually a disappointment, particularly the PS3's; Sony has a reputation for overhyping things, as in the past.


To be honest, whoever wrote that article wasn't a console developer.

This part in particular is damning:
Quote:

The Cell processor doesn’t get off the hook just because it only uses a single one of these horribly slow cores; the SPE array ends up being fairly useless in the majority of situations, making it little more than a waste of die space.


These are clearly just PC programmers trying to make the jump to consoles, hoping they don't have to learn anything new... But yeah, to use these chips effectively you have to re-think your algorithms and re-structure your code.

So it's hard, but it's not impossible. And the reward is tremendous.

[edit] So I went back and read that whole second Anand article, and this page really sums it up well.

[Edited by - ajas95 on January 21, 2006 3:11:21 PM]

Quote:
Original post by Promit
Xbox games generally run at 30fps (strictly speaking, it's 60Hz interlaced), so double that figure. But as pointed out, poly counts are not a problem and haven't been for some time.


I don't know if you could say that generally. Most of the Xbox titles I'm familiar with run at 60Hz progressive (as there isn't really an interlaced video mode on the Xbox; it's always a full frame buffer). Though there is certainly a fair share (just like on the PS2) that run at 30Hz progressive.

Quote:
Original post by WillC
Quote:
Original post by Simagery
The PS3, on the other hand, can technically churn out more floating point operations but only if your operations are highly vectorizable.

I think it's a little misleading to say this; the implication is that the SPUs are only capable of vectorized floating-point operations, when in fact they are more than capable of running bog-standard C/C++ non-vectorized code as well. Even if they may not be quite as good at that as they are at highly vectorized float ops, we're still talking about seven 3.2GHz processors on top of the 3.2GHz PPU, which is a hell of a lot of power.


You're right, "operations" was the wrong word to use. I didn't mean literally the assembler operation; I was thinking more of an algorithmic operation such as "cull objects against the frustum," etc.

Re-reading my post, I guess I could state my concern more clearly: because the PS3's SPUs have specific constraints on how they can access memory, various bus-contention issues, as well as instruction-ordering specifics, your code has to be specifically designed for a very parallelized architecture to soak up all those cycles. This really just reiterates my opinion that while the PS3 may have more cycles for the developer to use, there's no guarantee that the developer will be able to write code to use them effectively. On the XBOX360, where you have 3 cores, it's a bit more obvious how to break up a game engine into 3 fairly independent (or cooperative) parallel tasks... but breaking it up into 7? Well, you start getting ordering dependencies, etc...
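A toy contrast of the two kinds of work being discussed (the data, state names, and transitions are hypothetical): a particle update is the same arithmetic on every element, which maps cleanly onto SIMD lanes and SPU-style processing, while an animation state machine branches per entity and resists vectorization.

```python
# Illustrative only: uniform math vs. per-entity branching.

def update_particles(positions, velocities, dt):
    # Uniform math over a flat array: easy to vectorize.
    return [p + v * dt for p, v in zip(positions, velocities)]

def step_anim_state(state, event):
    # Per-entity branching: each element can take a different path.
    transitions = {("idle", "move"): "walk",
                   ("walk", "stop"): "idle",
                   ("walk", "jump"): "airborne"}
    return transitions.get((state, event), state)

print(update_particles([0.0, 1.0], [2.0, -1.0], dt=0.5))  # [1.0, 0.5]
print(step_anim_state("idle", "move"))                    # walk
```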

Quote:
Anyone remember how the PS2 was supposedly banned for export because it was so powerful that it could be used to develop advanced nuclear weapons?

What do you mean "supposedly banned"? From memory, it was in fact actually banned.

Quote:
Original post by Simagery
This really just reiterates my opinion that while the PS3 may have more cycles for the developer to use, there's no guarantee that the developer will be able to write code to use them effectively. On the XBOX360, where you have 3 cores, it's a bit more obvious how to break up a game engine into 3 fairly independent (or cooperative) parallel tasks... but breaking it up into 7? Well, you start getting ordering dependencies, etc...


The "ordering dependencies" point is a really good one (in my mind it's the only point, since it will be the most significant bottleneck). And not only within the logic of a frame, but also with regard to the GPU (getting back to the "maximum triangles" topic).

For instance, say you want to render some cloth you've attached to a character, and this cloth's vertices get calculated on an SPE. Well, you need to have issued the GPU instructions to render that cloth and then provide synchronization so that when the GPU gets around to fetching those vertices, they've already been calculated. BUT doing the cloth physics is dependent on the collision geometry being in place, which depends on the character having already been animated, which depends on the game logic setting the current animation...

To do it properly will be very complicated, but I think it's still possible. Of course, algorithm sequencing and dependencies will have to be planned very thoroughly, which recalls my comments about PC programmers. I think PC game developers are used to processors and memory architectures where many good things happen by accident. Console developers I know say that performance never happens by accident... and so it probably is with parallelism.
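That dependency chain can be written down as an explicit frame graph and scheduled in dependency order. A sketch using Python's standard `graphlib` (the stage names come from the post; the scheduler itself is illustrative, not how a console runtime works):

```python
# Each entry maps a stage to the set of stages it depends on.
from graphlib import TopologicalSorter

deps = {
    "animate_character":  {"game_logic"},
    "collision_geometry": {"animate_character"},
    "cloth_physics":      {"collision_geometry"},
    "gpu_fetch_cloth":    {"cloth_physics"},
}

# static_order() yields stages only after all their dependencies.
order = list(TopologicalSorter(deps).static_order())
print(order)
```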

Quote:
Original post by ajas95
For instance, say you want to render some cloth you've attached to a character, and this cloth's vertices get calculated on an SPE. Well, you need to have issued GPU instruction to render that cloth and then provide syncronization so that when the GPU gets around to fetching those vertices, they've already been calculated.
Those SPEs can feed directly to the GPU; you just have it take the data when it's ready.
Quote:
BUT, doing the cloth physics is dependent on the collision geometry being in place, which depends on the character having already been animated, which depends on the game logic setting the current animation...
Screw that. Just use the data from the previous frame. It won't even be a noticeable difference.
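The previous-frame approach amounts to double buffering: keep two copies of the simulated data, let the renderer consume frame N-1 while the simulation writes frame N, and swap at the frame boundary. A minimal sketch (the class and method names are my own, not any console API):

```python
# Double-buffered simulation data: one frame of latency in exchange for
# removing the ordering dependency between simulation and rendering.

class DoubleBuffered:
    def __init__(self, initial):
        self.buffers = [list(initial), list(initial)]
        self.write_idx = 0

    def writable(self):          # simulation writes here (frame N)
        return self.buffers[self.write_idx]

    def readable(self):          # renderer reads here (frame N-1)
        return self.buffers[1 - self.write_idx]

    def flip(self):              # swap at the frame boundary
        self.write_idx = 1 - self.write_idx

cloth = DoubleBuffered([0.0, 0.0])
cloth.writable()[0] = 9.9       # frame N simulation result
print(cloth.readable()[0])      # renderer still sees frame N-1: 0.0
cloth.flip()
print(cloth.readable()[0])      # after the flip: 9.9
```

The cost, as the next reply points out, is a second copy of the data in memory.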

Quote:
Original post by Promit
Screw that. Just use the data from the previous frame. It won't even be a noticeable difference.


And where will that memory come from?

MSN

I believe that "pipeline" designs will become much more common in game engines. And the extra buffering memory will just have to be taken out of the half a gigabyte that's available on these new consoles -- sure beats the 32 MB you had on the PS2 :-)

Quote:

I don't know if you could say that generally. Most of the Xbox titles I'm familiar with run at 60hz progressive


Last I checked, the Xbox 1 doesn't actually have field rendering hardware, so games run at 30 Hz progressive.

