So what's the verdict?


Started by Gaiiden May 17, 2007 07:30 PM

6 comments, last by _the_phantom_ 16 years, 11 months ago

5,720

Author

May 17, 2007 07:30 PM

So ATI has its new part out to challenge NVidia. I've seen some blurbs here and there along my web travels over the past few days and it sounds like ATI is still struggling behind NVidia. True? Anyone have the time to take a comprehensive look at the latest benchmarks and reviews?

Drew Sikora
Executive Producer
GameDev.net

_the_phantom_

11,263

May 17, 2007 09:06 PM

Well, the 2900XT is slower than the GTX; however given that it isn't the 'top of the range' product that's hardly shocking (2900XTX was meant to be, more on that in a bit).

It's hotter and more power hungry than the GTS, but it does beat it to a degree; however it's priced to be at that point as well. I would expect things to improve over a few months, speed wise, with newer drivers.

It has a programable tesalator, which, depending on the API support, might be the tipping point for me getting one (hey, I like shiny new tech).

Geometry shaders are also (currently) significantly better than NV's solution (currently GS totally tanks out the threads on a G80, I hope it's a driver problem).

The thing has plenty of raw ALU power, that's for sure; it'll be interesting to see how they get their compiler up to standard for producing efficient code for it.

Two issues of concern however are the texture samplers and the ROPs/AA;
- NV's texture samplers are better, mostly because they run in their own clock domain and because they have more precision; AMD's are very good at FP16 and int8, taking no speed hit, however at int16 the rate drops off, as it does at fp32, the former being down to a lack of precision.
- The ROPs/AA is an interesting one; the ROPs don't do the AA resolve, instead it is done in the ALUs in the shader core. The upside of this is that it is very programmable, the downside is it slows things down. Apparently their reasoning was that future games will be using shader-based AA resolves rather than hardware ones due to FP render targets; this isn't a new thing really, as ATI has played the future game before now (R580 vs GF7 is a good example of where a poor performing product at launch came from behind after driver updates), however this might be too early.

The other problem they have is that the process the chip is built on has very poor leakage; a shrink to 65nm is going to be needed to get the XTX out the door, as currently the chip is making too much heat and using too much power to make it possible to supply one.

Personally, while I think the R600 is a really cool piece of tech, if I had the cash now I'd wait for the refresh at 65nm (rumored to be only a few months away!) rather than buy now. I wouldn't go NV because I've heard too much about recent driver cluster fucks, certainly with regards to Vista, and their DX10 performance doesn't look that great when compared to the R600 even though their hardware has been out forever. Also, the programmable tessellator has me interested [grin]

As it turns out I don't have the cash right now so it's not an issue [wink]

ps ATI don't exist, it's AMD now [razz]

nts

968

May 17, 2007 09:43 PM

I outlined some of my thoughts in the other thread.

Quote:Original post by phantom
Two issues of concern however are the texture samplers and the ROPs/AA;
- NV's texture samplers are better, mostly because they run in their own clock domain and because they have more precision; AMD's are very good at FP16 and int8, taking no speed hit, however at int16 the rate drops off, as it does at fp32, the former being down to a lack of precision.

AFAIK only the ALUs are double pumped and on a separate clock domain. The TMUs and everything else run at the core clock.

As for the rates it depends on the channels...

Also ATi has the bandwidth so why didn't they stick more samplers on the chip?

Quote:(R580 vs GF7 is a good example of where a poor performing product at launch came from behind after driver updates), however this might be too early.

Check here for the most recent benchmarks between the X1900XT/X and 7900GTX. The X1900 wins in every benchmark, so better drivers or better hardware (more future looking).

Where are you hearing rumours about the 65nm XTX? I don't think that this is likely at the moment. AMD seems to be having problems with the process on the low end parts (RV610/RV630), so doing a 700 million transistor part on it might not be the best idea.

Would AMD also be interested in releasing the XTX? It's not where the money is made (they need it), and would having a halo effect be worth it for them? Or should they just jump right into R700 and get that out the door as fast as possible? NVIDIA has a 6 month lead on them right now...

I still think the XT has a bit of broken silicon (ROPs) and was a cost cut rush to market so I'm also gonna wait to see what the refreshes bring before making up my mind...

google

_the_phantom_

11,263

May 17, 2007 11:08 PM

Quote:Original post by nts
Also ATi has the bandwidth so why didn't they stick more samplers on the chip?

Well, it's not just bandwidth and samplers; apparently there isn't enough precision in the samplers to fetch int16s quickly, and it looks like it's just set up for FP32 in two cycles.

Also, with how the samplers are arranged with respect to the ALU arrays I would guess it becomes harder to just dump more in. I'll have to look at the data paths in the Beyond3D architecture article again once I've had some sleep today, to get my brain around it all.

Still, I think an increase in precision in the samplers might be a good idea, if only to grab int16 textures faster (although a boost to fp32 would be nice as well).

Quote:
Where are you hearing rumours about the 65nm XTX? I don't think that this is likely at the moment. AMD seems to be having problems with the process on the low end parts (RV610/RV630), so doing a 700 million transistor part on it might not be the best idea.

Just chatter over on rage3d; I wasn't aware of any process problems, that could well delay things if they can't ramp it up for the low end parts.

Quote:
Would AMD also be interested in releasing the XTX? It's not where the money is made (they need it), and would having a halo effect be worth it for them? Or should they just jump right into R700 and get that out the door as fast as possible? NVIDIA has a 6 month lead on them right now...

Well, the performance crown does matter when it comes to sales of mid/low end parts (OEM deals and the like) so I wouldn't be surprised if they did a limited release. If they can get a 65nm XT out the door as well in the refresh then it would be worth it.

As for the R700, the work on that would have been progressing for some time now; remember it's not a case of everyone focusing on one thing, it's all pipelined, so when the R600 was a year into design is probably when they started kicking about ideas for the R700. I don't think NV has a lead on them; if anything AMD might well have the lead when it comes to experience with things which are 'possibles' for DX11 and things such as shader AA (which, while it was a bad move now, might well prove to be a good one in the future).

I expect the R700 will be a modified R600-refresh chip, much like how the R300 core survived a few incarnations; the R800 will probably be the next redesign job (probably with AMD input as well).

Quote:
I still think the XT has a bit of broken silicon (ROPs) and was a cost cut rush to market so I'm also gonna wait to see what the refreshes bring before making up my mind...

Maybe, maybe not... apparently this was posted about the AA resolve/ROPs:

Quote:
We asked Richard Huddy, Worldwide Developer Relations Manager of AMD's Graphics Products Group, to go into more detail about why the Radeon HD 2000-series architecture has been optimised for shader-based AA rather than traditional multi-sample AA. He told us that 'with the most recent generations of games we've seen an emphasis on shader complexity (mostly more maths) with less of the overall processing time spent on the final part of the rendering process which is "the AA resolve". The resolve still needs to happen, but it's becoming a smaller and smaller part of the overall load. Add to that the fact that HDR rendering requires a non-linear AA resolve and you can see that the old fashioned linear AA resolve hardware is becoming less and less significant.' Huddy also explained that traditional AA 'doesn't work correctly [in games with] HDR because pixel brightness is non-linear in HDR rendering.'

While many reviews of the HD 2900XT have made unflattering comparisons between it and Nvidia's GeForce 8800-series, Huddy was upbeat about AMD's new chip. 'Even at high resolutions, geometry aliasing is a growing problem that can only really be addressed by shader-based anti-aliasing. You'll see that there is a trend of reducing importance for the standard linear AA resolve operation, and growing importance for custom resolves and shader-based AA. For all these reasons we've focused our hardware efforts on shader horsepower rather than the older fixed-function operations. That's why we have so much more pure floating point horsepower in the HD 2900XT GPU than NVIDIA has in its 8800 cards... There's more value in a future-proof design such as ours because it focuses on problems of increasing importance, rather than on problems of diminishing importance.'

I wouldn't be surprised if this was the case, I don't see how they could have broken the ROPs and forced a fall back to the ALUs, unless it was designed in such a manner the data paths just wouldn't be there to do it...
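
For anyone wondering what Huddy means by HDR needing a non-linear resolve, here is a rough sketch in Python rather than shader code. The Reinhard tone mapping operator and the sample values are my own assumptions for illustration; nothing here is taken from AMD's hardware or drivers. It just compares averaging the HDR samples before tone mapping (what a fixed linear resolve effectively gives you) with tone mapping each sample and then averaging (what a custom shader resolve can do).

```python
# Illustrative only: the Reinhard operator and the sample values are assumptions,
# not something named in the thread or in the quoted article.

def tonemap(x):
    """Reinhard operator: maps HDR luminance in [0, inf) into [0, 1)."""
    return x / (1.0 + x)

def linear_resolve(samples):
    """Fixed-function style: average the HDR samples, then tone map the result."""
    avg = sum(samples) / len(samples)
    return tonemap(avg)

def shader_resolve(samples):
    """Custom resolve: tone map each sample first, then average."""
    return sum(tonemap(s) for s in samples) / len(samples)

# One edge pixel at 4x MSAA: two samples land on a bright light, two on a dark wall.
edge_pixel = [100.0, 100.0, 0.01, 0.01]

print("linear resolve:", linear_resolve(edge_pixel))
print("shader resolve:", shader_resolve(edge_pixel))
```

With these made-up numbers the linear resolve comes out almost fully bright, so the half-covered edge pixel looks nearly the same as the light itself, while the per-sample version lands around the midpoint you would expect from 50% coverage.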

nts

968

May 18, 2007 01:53 AM

Quote:Original post by phantom
Well, it's not just bandwidth and samplers; apparently there isn't enough precision in the samplers to fetch int16s quickly, and it looks like it's just set up for FP32 in two cycles.

How many games right now are making extensive use of int16 and FP32?

Quote:Also, with how the samplers are arranged with respect to the ALU arrays I would guess it becomes harder to just dump more in. I'll have to look at the data paths in the Beyond3D architecture article again once I've had some sleep today, to get my brain around it all.

Right now a sampler unit is dedicated to one math/alu unit (4 of them). So to add another sampler unit would mean adding another math unit. They could also improve the sampler units, fetch more, filter more, etc.

Quote:Just chatter over on rage3d; I wasn't aware of any process problems, that could well delay things if they can't ramp it up for the low end parts.

When they originally announced the delay they were aiming for a family launch (low to high end), which didn't happen. From what I've heard it was problems with the process.

Quote:Well, the performance crown does matter when it comes to sales of mid/low end parts (OEM deals and the like) so I wouldn't be surprised if they did a limited release. If they can get a 65nm XT out the door as well in the refresh then it would be worth it.

Only to a degree though. AMD will also have a cost advantage over NVIDIA: 65nm compared to 80nm for mid and low end parts. Also one 80nm chip compared to two separate 90nm chips on NVIDIA's high-end parts (8800).

Going from 80nm to 65nm isn't as simple as an optical shrink so if an XTX does show up I don't think it'll be on 65nm.
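
To put a number on why a shrink is attractive anyway, here is a back-of-envelope sketch. It assumes every dimension scales directly with the node name, which real processes do not do, so treat it as a best case rather than a prediction for any actual chip. The function name is just something I made up for the example.

```python
def ideal_area_scale(old_nm, new_nm):
    """Best-case area ratio if both dimensions scaled exactly with the node name."""
    return (new_nm / old_nm) ** 2

scale = ideal_area_scale(80, 65)
print(f"ideal 80nm -> 65nm area ratio: {scale:.2f}")
```

That works out to roughly two thirds of the original area in the ideal case. Since the move isn't a simple optical shrink, the real saving (and the effort to get there) will differ.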

Quote:As for the R700, the work on that would have been progressing for some time now; remember it's not a case of everyone focusing on one thing, it's all pipelined, so when the R600 was a year into design is probably when they started kicking about ideas for the R700. I don't think NV has a lead on them; if anything AMD might well have the lead when it comes to experience with things which are 'possibles' for DX11 and things such as shader AA (which, while it was a bad move now, might well prove to be a good one in the future).

Yeah I am sure the R700 is well under development right now and I agree that AMD has a technology lead over NVIDIA but they need to catch up to NVIDIA's release schedule. If NVIDIA releases 6+ months earlier then that isn't good for AMD, their parts will be short lived.

google

hplus0603

11,916

May 19, 2007 01:52 PM

Quote:It has a programable tesalator

You might be missing an "m" and an "s" :-)

Anyway, I think that any DX10 card has a programmable tessellator, because you can do that in the GS unit.

enum Bool { True, False, FileNotFound };

_the_phantom_

11,263

May 20, 2007 07:39 PM

heh, yeah, I think I did miss them [grin]

And nope, the G80 doesn't; it has a GS, but so does the R600. The PTU serves a similar but different function (and it really is only for tessellation, so it's not programmable in the same sense as the other shader units).

It sits BEFORE the vertex shader stage, whereas the GS sits conceptually AFTER it (although with stream out you can do another trip around).
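
If it helps to see the ordering written down, here is a toy sketch of the stage order as described above. The stage names are informal labels I picked, not API identifiers, and the list is a simplification of the real hardware.

```python
# Informal stage labels; the ordering follows the description in this post.
R600_PIPELINE = [
    "tessellator",      # tessellation-only unit, runs before vertex shading
    "vertex_shader",
    "geometry_shader",  # can add geometry, but only after the VS has run
    "stream_out",       # optional: write results to memory for another pass
    "rasterizer",
    "pixel_shader",
]

def runs_before(pipeline, a, b):
    """True if stage a comes earlier in the pipeline than stage b."""
    return pipeline.index(a) < pipeline.index(b)

print(runs_before(R600_PIPELINE, "tessellator", "vertex_shader"))
print(runs_before(R600_PIPELINE, "geometry_shader", "vertex_shader"))
```

The practical difference is that geometry created by the tessellator still goes through the vertex shader, while anything a GS emits has already passed it unless you stream out and go around again.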

_the_phantom_

11,263

May 22, 2007 12:35 PM

Quote:Original post by phantom
I expect the R700 will be a modified R600-refresh chip, much like how the R300 core survived a few incarnations; the R800 will probably be the next redesign job (probably with AMD input as well).

Hmmmmm, quoting myself [grin]

Anyways, I'm wrong about this; I forgot that the R700 is going to be another large redesign into a multiple core solution (XB360 daughter board in there maybe?).

I'm sure bits of the R600/R650 will make it in there but it'll be interesting to see how.

And on that news; there is a rumor being bounced about that the R650 will get more texturing power in its incarnation; I'm still hoping they can get AA resolve into the ROPs for it as well. Combine the two with higher clock speeds and it should fly [grin]
