R600 rumors

Started by
15 comments, last by wodinoneeye 16 years, 11 months ago
So, rumor has it that the ATI R600 will trounce the NV 8800 for performance, at least for DirectX. Rumor also has it that the new ATI OpenGL driver (worked on for many years) will finally be released in conjunction. However, those rumors were from January, and it's now March and there are no parts in the stores. I haven't kept up since then, except I've heard that a press event scheduled for March got moved to sometime later. Comments? Btw: this post mentions NVIDIA. Any reply to this post that mentions NVIDIA will be deleted, because I want comments on the ATI specifically, not any kind of comparison or discussion war. Post another thread for that!
enum Bool { True, False, FileNotFound };
The parts were supposed to be out by March 15, promised for Q1 by an ATI executive. After that an AMD executive (Richard?) got the launch moved to Q2; no good reason was given. The assumption is that they wanted to launch the whole family at the same time, or to coincide with the launch of the Barcelona server chips... not very likely. The tech day and everything else was scheduled, then got cancelled 2-4 weeks (I forget the exact dates) before it was supposed to take place. It's now rumored to take place around March 30, with the launch later in April or in May at the latest.

What is known about the chip is that it has a 512-bit memory bus, this was confirmed by an ATI exec in a speech.

So a 512-bit bus with memory clocked at an effective 2.4GHz (GDDR4, already used on the X1950s) would give the R600 about 154GB/sec of bandwidth; it's not known what it will be spent on though.

AMD also put on a demonstration a while back with R600s in Crossfire and stated that they hit over a teraflop, which equates to roughly 500GFlops per R600 card. AMD also mentions 320 'multiply-accumulate units' on the R600; the speculation here led to the conclusion that they were either using 64 vec5 shaders (vec4 + scalar, maybe) or 80 vec4 shaders. It is still very possible that they are all scalar.

Now the most interesting rumour out of CeBIT is that the R600 family is all based on the 65nm node. The low end R610 and R630 were already confirmed to be on 65nm but the R600 was thought to be on 80nm.

Here are some leaked slides from CeBIT talking about the video decoder on the R600, which is now dedicated.

About the OpenGL drivers, parts of them should already be available on Vista with Catalyst 7.2.

Well, looking at what ATI does for GPGPU with the 500 series, and looking at the unified shader model in DX10, I would assume that they would spend all those FP units on "micro-threaded" programmability. I seem to recall having seen a block diagram with a big fat sequencer in the middle, and lots and lots of execution units surrounding it. Apparently, this will let them pipeline around latency issues. (From my measurements, ATI has always had a few clocks of penalty on texture fetches)

I've also heard a 270W power requirement, which means you need something like a 1000W power supply in your PC for dual-card graphics :-)
enum Bool { True, False, FileNotFound };
Quote:Original post by nts
Now the most interesting rumour out of CeBIT is that the R600 family is all based on the 65nm node. The low end R610 and R630 were already confirmed to be on 65nm but the R600 was thought to be on 80nm.


Yes, that one has made a fair few people sit up; it might even be a reason for the delay (the stated reason of late being a complete top-to-bottom solution launch for DX10).

There was also this from the MS Flightsim X guys;
Quote:
Given the state of the NV drivers for the G80 and that ATI hasn’t released their hw yet; it’s hard to see how this is really a bad plan. We really want to see final ATI hw and production quality NV and ATI drivers before we ship our DX10 support. Early tests on ATI hw show their geometry shader unit is much more performant than the GS unit on the NV hw. That could influence our feature plan.

(source)

Which I found interesting, and slightly baffling given the unified arch they are both using.

Having considered the previous hardware and the unified nature of things, I'd reached the conclusion that, unless they went with scalar, the R600 would have to be a vec5 (vec4+scalar) core: the vertex unit in the previous generations is already vec4+scalar, so you wouldn't want to cost any performance in vertex operations, and it should still gain you in geometry and fragment ops.

A decent OGL driver would be nice (currently running Vista Business x64 with Cat7.2 and the OGL component is kinda wonky on my X1900XT); hopefully they'll follow it up with a decent OpenGL Longs Peak implementation as well...
Quote:Original post by phantom
Yes, that one has made a fair few people sit up; it might even be a reason for the delay (the stated reason of late being a complete top-to-bottom solution launch for DX10).

If it is a 65nm chip I'm guessing that they skipped the original 80nm part and just pushed the refresh (R650/680) ahead a good 4 months. You don't just go from 80nm to 65nm easily (it's not an optical shrink), but then that brings up another point: are they really eating all the costs for their work on 80nm (multiple tape-outs)?
Quote:Which I found interesting, and slightly baffling given the unified arch they are both using.

Not too surprising to me, if true. I've read other reports that using GS and SO on the G80 incurs a pretty big penalty if you hit a hard limit (I think it was 2-4KB but I'm not 100% sure). If you remember back to the NV40, the first SM3 generation, the batch sizes were huge and dynamic branching performance wasn't great as a result. It could be something similar with first-generation GS on the G80.
Quote:
Having considered the previous hardware and the unified nature of things, I'd reached the conclusion that, unless they went with scalar, the R600 would have to be a vec5 (vec4+scalar) core: the vertex unit in the previous generations is already vec4+scalar, so you wouldn't want to cost any performance in vertex operations, and it should still gain you in geometry and fragment ops.

Well, if the 320 units figure is true then you could reach it with 64 vec5 (vec4+scalar) units, but since this is the second-generation unified part from ATi (the first being Xenos in the XB360) that number looks a bit small. Xenos already has 48 vec4 units enabled (the hardware has 64, with one quad disabled) and adding just 16 more seems too little for the years they have been working on it...

There has to be something more to it, the other guess for 320 units was 80 vec4 which I don't think is very likely either.

As for the performance, I think scalar is the future. If your scheduler is good enough then you shouldn't lose any performance over a dedicated vec4 unit.

Quote:I've also heard a 270W power requirement, which means you need something like a 1000W power supply in your PC for dual-card graphics :-)

Unfounded rumours; with 2x6-pin connectors (confirmed) you can't draw 270W...

Quote:Original post by nts
Well, if the 320 units figure is true then you could reach it with 64 vec5 (vec4+scalar) units, but since this is the second-generation unified part from ATi (the first being Xenos in the XB360) that number looks a bit small. Xenos already has 48 vec4 units enabled (the hardware has 64, with one quad disabled) and adding just 16 more seems too little for the years they have been working on it...

There has to be something more to it, the other guess for 320 units was 80 vec4 which I don't think is very likely either.


Yeah, I tend to agree with your reasoning; I threw my vec5 idea out there before we had any MADD numbers to work with. While it would be interesting if I were right...
Quote:
As for the performance, I think scalar is the future. If your scheduler is good enough then you shouldn't lose any performance over a dedicated vec4 unit.

... this I also agree with.

As you say, it's a 2nd generation USA (unified shader architecture); I'm more likely to be surprised if it's not scalar than if it is.
R600 on e-bay... damn, wish I'd seen that sooner, and for only $50 too. Lost or Broken probably means a threat from AMD...

That would be the OEM card; retail is about 9.5 inches


9.5" tall? You mean 9.5" long? I can't see something the size of that pic fitting in a normal case without causing some problems.

This thread makes me wish I knew more of what you guys are talking about with the vec4 and vec5 stuff (and hardware design in general).
Quote:Original post by Moe
9.5" tall? You mean 9.5" long? I can't see something the size of that pic fitting in a normal case without causing some problems.


9.5" long; it's not going to fit into a small case, but this is a high-end card and people with high-end equipment tend to have large cases for just this reason [wink]
The mid/low end of things where most people will end up will be smaller.

Quote:
This thread makes me wish I knew more of what you guys are talking about with the vec4 and vec5 stuff (and hardware design in general).


Well, vec4 and vec5 are pretty easy to explain [smile]

Basically it's shorthand for a vector of 4 or 5 components; in C++ it would be represented as, say, a structure of 4 components:
struct vec4
{
    float r;
    float g;
    float b;
    float a;
};


When it comes to hardware, the description vec4 indicates that the GPU can work on 4 components at the same time; vec5 is a natural extension of this.

When it comes to the rest of the hardware stuff, it's not so easy [grin]

Ah, thanks for explaining that, phantom. It certainly makes a bit more sense now. That reminds me, I need to get around to learning HLSL... it's something I have been meaning to do for some time now.

I, too, am looking forward to seeing the R600 and what it can do. The ATi demos always amaze me (especially the last Toyshop demo they had for the X1900).

This topic is closed to new replies.
