AIMeD Corporation: Barcelona Benchmarks Are Non-Compliant

11.07.2007

Barcelona Benchmarks Are Non-Compliant - SPEC

As if the initial Barcelona benchmarks submitted eons ago weren't bad enough, SPEC.org appears to have labelled them as non-compliant. IBM, having problems securing enough Barcelonas, failed to meet the requirement of shipping systems within 90 days of submitting results.

What I find really surprising is the fact that even a Tier 1 vendor like IBM can't seem to get hold of AMD's latest quad-core. If an important customer can't even ship a lower-bin 1.9GHz Barcelona months after product launch, one has to wonder how many AMD has really managed to get out of the door. I was joking around when I said AMD most likely shipped 20,003 Barcelonas because it's the lowest end of the statement "shipped tens of thousands". With the news spreading and the gathering noise about K10's lack of availability, it appears my tongue-in-cheek comment may be a bit more accurate than first thought.

IBM fails to ship even the most meager four-core Opteron box - The Reg
"In the meantime, IBM has to look the sucker with NON COMPLIANT stamped all over its benchmarks."

If you're one of those who ever wondered what the worst kind of evidence is when caught doing a paper launch... well, now you know!

81 comments:

Anonymous said...: DAAMIT; 7 November 2007 at 23:37
Anonymous said...: It's not a paper launch - AMD said they only do hard launches!

It sure does look like AMD is having a hard time launching Barcelona, so I think AMD's description is mo' better.; 7 November 2007 at 23:50
Tonus said...: What speed grade were those scores posted at?; 8 November 2007 at 00:06
Tonus said...: Nevermind, it's the 1.9GHz Barcelona. I was wondering if it was a higher speed-grade and that they had pulled from the lineup.

That's pretty bad. The chart claims availability in Nov 2007 and the month is barely a week old, does this mean that they do not expect any systems to ship before December? Does Spec canvass OEMs/retailers to gauge availability?; 8 November 2007 at 00:10
Anonymous said...: Oh I know where IBM can buy some Barcelonas! Right here! Do you think I should drop them an e-mail?

It seems the GERMANS came FIRST, forget the tier one vendors! After all, they are the ones who are subsidizing the Dresden facility.

Cray, now IBM? Boy, you really got to hand it to those guys at AMD. They sure know how to run a company and keep their vendors happy!

DOH!!

http://geizhals.at/eu/a282686.html

SPARKS; 8 November 2007 at 01:00
Anonymous said...: It's not a paper launch. They're just waiting to, err, launch the complete lineup top to bottom. Or something like that.; 8 November 2007 at 01:48
Anonymous said...: Where's that abinstein douchebag? Looks to be hiding from the Penryn massacre.

Let's motivate George Ou to write an article calling out AMD on this lapse.

LOOOOOOL; 8 November 2007 at 02:05
Anonymous said...: AMD cancels analyst get together. Know they're PWND.

http://www.theregister.co.uk/2007/11/08/amd_cancels_gig/

Overclocked Phenom X4 destroyed by Intel

http://www.google.com/translate?u=http%3A%2F%2Fwww.planet3dnow.de%2Fcgi-bin%2Fnewspub%2Fviewnews.cgi%3Fcategory%3D1%26id%3D1194431751&langpair=de%7Cen&hl=en&ie=UTF8; 8 November 2007 at 02:16
Anonymous said...: This is beyond ridiculous. Even beyond beyond ridiculous is Dementia still holding the faith.

The final straw comes November 19th. Mark my words, HD 2900 repeat. 45nm Shanghai will suffer the same fate again like the HD 3xxx will against the 8800 GT; 8 November 2007 at 02:45
Anonymous said...: Anyone care to guess what the 4Q losses will be??? Hmmmmmmm?

Do I hear 300M?
400M? Hmmmmm?

SPARKS; 8 November 2007 at 02:56
Anonymous said...: "Well, according to AMD spinmeister John Taylor, the company has decided that early next year will be a better time for the analyst event. AMD likes to give the technical analysts a "deep dive" under NDA into its future products."

Hmmm...I can just see the Dementia blog explaining this one:

1) The delay is just to show off DTX? (Damn, I used that one last time)
2) AMD just wants to show the progress on the F30 to F38 conversion (damn, did I use that one too?)
3) They'll already have a 3GHz part out by then and they're just waiting to show it off to give Intel a chance
4) Instead of trying to drum up PR for the December holidays they are trying to get momentum going into the all important Groundhog Day buying season.

Now for the real reasons (speculation):
1) Hector is planning to "resign" (amicably of course) and they are trying to get ducks in a row - oops this was mentioned in the article, didn't see it.
2) They are hoping for a miracle and that Phenom/Barcy parts come close to original intended clockspeed by then (and don't want to deal with the question of what the ---- happened this year)
3) They are hoping to shift some of the focus toward graphics parts, given the apparent failure of current CPU's? (bit of a reach)
4) They are working intently on developing a new benchmark metric to justify K10's 40% better.

I just find it hilarious how they now just suddenly realized (after all of the planning) that hey maybe next year is a better time...what a freakin joke this management team has become! I'll ask again what the hell is the board doing?

"Again, Taylor said, this has nothing to do with Barcelona. It's all about future roadmap tweaks and wanting to give the analysts the most information possible at the right time."

I hope this guy remembers to wash off all of that slime he is oozing... and on a related note he asked, for those of you who believe what I just said, do you have any interest in some convertible notes?; 8 November 2007 at 04:06
Anonymous said...: "Anyone care to guess what the 4Q losses will be??? "

150-300Mil (probably in the mid 200's).

ATI acquisition cost might now be off the books (I think that was something like 120Mil for the last few qtrs). Q4 should be seasonally healthy and Intel is capacity constrained (by their own admission). AMD should be able to push enough K8's and K8 Opterons with Intel's supply constraints without having to resort to slashing prices further, plus the overall market continues to shift to mobile (which helps AMD's ASP's more than Intel's as AMD is trading REALLY cheap desktop chips for CHEAP, but higher ASP, notebook chips). Also I'm not sure if some of the 200mm equipment sales from F30 will hit the books.

I actually think there's an outside chance AMD will be under 150Mil loss. The sad thing is with expectations SO LOW, most AMD fans will consider any loss under 400Mil a major positive step/trend - my how things have changed from Q3/Q4 last year! You mean there's a chance we can finish with under 2Bil in losses for the year - SUCCESS!; 8 November 2007 at 04:22
Unknown said...: It's not a paper launch. They're just waiting to, err, launch the complete lineup top to bottom. Or something like that.

That's right. Opteron quad core is ready now. But dual core K10 won't be ready until next year. AMD is just waiting so they can launch all at once. Honest!! Don't you believe good ol' AMD?!

Do I hear 300M?
400M? Hmmmmm?

My bet is on 350M, sparks. Their situation hasn't improved at all in terms of new products. The loss will be slightly smaller because of larger CPU shipments. Expect Intel to report another record quarter in terms of revenue, and profits should pass $2bn this quarter.; 8 November 2007 at 04:25
Anonymous said...: You know I had a tough time reconciling IBM pulling the spec scores with AMD claiming they shipped 10's of thousands by the earnings call and hundreds of thousands by the end of the year....

They're shipping them all to the nearest LANDFILL as they are not functional. I ask you - did AMD say they had shipped 10's of thousands of **FUNCTIONAL** K10's or just 10's of thousands of K10's?

I say to you all it just depends on what the meaning of "is" is. Those AMD PR folks are getting real clever, next thing you know they'll be canceling an analyst meeting last minute in order to tweak their roadmap and give the analysts a better picture of the roadmap! :); 8 November 2007 at 04:32
Anonymous said...: I’ve never been inside a FAB, but I’ve seen clips on YouTube where INTC super guru Mark Bohr gives the viewers a relatively unabashed tour of the facility. It was very impressive to say the least.

However, I tried to imagine what it would be like inside the AMD FAB while workers sorted out the good Barcelona’s from the bad ones.

I can imagine newly recruited migrant farm workers sifting through chips in old joint cement buckets. They pull them out and pop them in an AM2 motherboard. There are 40 people in this testing/search phase of production. Suddenly an old woman cries out, “I’d god won, I’d god won!” “It go to 2, it go to 2”, she cries! They give her a banana and a cookie.

Going on to the handicapped (or more appropriately, the Special Section), as AMD is an equal opportunity employer, these folks are charged with the newly developed Triple Core Section. They’ve developed a special bond with chips through Human Resources’ mastery of psychological identification.

The joint cement buckets are color coded to avoid confusion after sorting. The Blue one goes to IBM, Orange one goes to SUN, the black one goes to the Channel, the gold one gets saved for Special Section, and the remaining 36 black buckets go to the dumpster.

Occasionally, Dirk Meyer stops by to give a little pep talk, and reassures them if it were not for their diligent efforts he wouldn’t have moved on to the Board of Directors. But he also leverages his excitement with caution as he reminds them of the “tens of thousands” of buckets that still need to be processed.

SPARKS; 8 November 2007 at 04:52
Anonymous said...: AMD unable to introduce 2.6GHz Phenom CPU by 2008 on barriers over 65nm conversion

http://www.digitimes.com/mobos/a20071107PD220.html

Can someone taunt Scientia and Sharikou with this? I don't have a blogger id.; 8 November 2007 at 05:40
Anonymous said...: "AMD unable to introduce 2.6GHz Phenom CPU by 2008 on barriers over 65nm conversion"

This makes no sense:
- AMD has APM3.0 which allows for the world's best and speediest conversion
- Scientia has told us the AMD 65nm conversion was the fastest ever
- AMD has stated all starts in F36 are now 65nm (which means it's not a CONVERSION issue, it's a PROCESS issue)
- Scientia told us that there would be a 2.6GHz by year end, maybe even a 2.8GHz
- Scientia told us AMD doesn't do paper launches
- AMD uses SOI which is the best technology in the world

I'm so confused, do I believe digitimes or Scientia's blog - Scientia has such a well documented background in manufacturing and technology (second only to Sharikou of course), I have to believe everything he writes even though he doesn't provide any facts to support it.

Hmmm... Phenom speeds dropping...Analyst meeting cancelled... what a KO-IN-KEY-DINK. It's a lot easier to duck the what's wrong question when you don't meet with the analysts.; 8 November 2007 at 06:13
Anonymous said...: Good news...I just got off the phone with AMD. They aren't shipping the 1.9 GHz CPU's to OEM's since they just created a "dancing in the aisles" stepping of 2.8 GHz. No sense in sending out legacy gear. So hang on tight for their imminent arrival.

Keep the faith, O green believers!; 8 November 2007 at 06:36
Anonymous said...: You know I think we need to start taking a hard look at the IBM/AMD 65nm process:

1) K8's top 65nm clock is 2.6GHz - this is clearly not a K8 architecture issue as AMD can get 90nm K8's to 3.2GHz. So they are either doing this by choice (as Scientia or others would have you believe) or there are issues specific to the 65nm process.

2) Barcelona launched late AND down from the original top speed bin of 2.3 (or 2.6 depending on who you ask) to 1.9GHz.

3) Phenom looks to be launching at 2.4GHz at best (or 2.3GHz) with the faster parts (2.6+GHz) out to next year

4) AMD's claims of a quick clock speed ramp on K10 by end of year seem to be BS

5) And most importantly - the rumor of PS3 switching over to 65nm cell is apparently not true and will come later. If you recall IBM struggled miserably with Cell yields and as they use the same process as AMD, this seems more than just a coincidence.

6) Two of the key changes IBM/AMD implemented on 65nm are known to have yield and clockspeed difficulties. NiSi was delayed for many generations at all IC manufacturers (including Intel) due to yield issues. Intel finally implemented it on 90nm. Selective SiGe process also is very pattern sensitive and potentially yield sensitive - again Intel implemented this on 90nm.

- If you look at Intel's 90nm yield curves, compared to the other generations, there were a couple "kinks" on the 90nm curve and the overall 90nm yield learning rate was a bit slower than 65 and 45nm. It's difficult to say whether those issues were the cause of the kinks but those were the 2 major changes on Intel's 90nm process.

7) Even with Barcelona's lower speeds and AMD's claims that F36 is fully converted to 65nm, parts are still hard to find. "100's of thousands of K10 by year end" is TERRIBLE volume if you break it down to actual fab production:

- Each wafer should have ~180-190 potential die (approximation from using wafer calculator at geek.com)

- Assume poor initial yield of ~50% or ~90 good die per wafer.

- 200,000 chips would be ~2200 wafer starts TOTAL over the quarter (we'll throw away Sept and assume all 200,000 are being done in Q4), this is about 750 WSPM.

- F36 is somewhere ~20,000 WSPM (or more by now)...

- 750 WSPM of K10 would represent <4% of AMD's production - that is a HORRENDOUSLY BAD RAMP, getting to 4% would definitely NOT be an issue with fab operations like mask availability, logistics, batching, etc...

- This back of the envelope calculation assumed a 50% yield, if the yields are "mature" as AMD claims then you would have more good die per wafer and even FEWER K10 wafer starts and a conversion rate to K10 in the 2-3% range!

To put things in perspective 750 WSPM is a little less than one production lot started per day of K10....Hey Joe, it's 3:00 - did you start that K10 lot yet? Awww, crap I almost forgot!
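The back-of-the-envelope numbers above can be reproduced in a few lines of Python. This is only a sketch of the same arithmetic: the 185 die per wafer (midpoint of the ~180-190 estimate), 50% yield, 200,000 chips, three-month window and 20,000 WSPM for F36 are the approximations quoted in this comment, while the function name, the 25-wafer lot size and the 30-day month are assumptions added for illustration.

```python
def k10_wafer_starts(chips, die_per_wafer, yield_fraction, months, fab_wspm):
    """Return (total wafers, wafer starts per month, share of fab capacity)."""
    good_die_per_wafer = die_per_wafer * yield_fraction
    total_wafers = chips / good_die_per_wafer
    wspm = total_wafers / months
    share = wspm / fab_wspm
    return total_wafers, wspm, share


total, wspm, share = k10_wafer_starts(
    chips=200_000, die_per_wafer=185, yield_fraction=0.50,
    months=3, fab_wspm=20_000)

# Assumed for illustration: 25 wafers per lot, 30 days per month.
lots_per_day = wspm / 30 / 25

print(f"{total:.0f} wafers, {wspm:.0f} WSPM, {share:.1%} of F36, "
      f"{lots_per_day:.2f} lots/day")
```

With these inputs the result lands a little under the rounded figures in the text: roughly 2,200 wafers, a bit over 700 WSPM, under 4% of F36, and just under one lot per day.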

For all the bluster about efficiency and flexibility, K10 is going through an awful slow conversion from K8 wafer starts - in my humble opinion this indicates a process marginality and/or such poor yield that AMD is only starting minimal wafers with K10 to save face and keep their critical customers from abandoning ship until they can hopefully get the process issue(s) fixed.

The reason I say process and not design - if it was a design issue, I would think AMD would produce higher volumes of low speed bins (especially server) as there does seem to be demand for them. The only reason not to produce more parts is if you are throwing so many bad die away there is a negative ROI for starting a wafer. Given these are server parts and fairly high ASP, you'd have to be throwing a LOT of die away on each wafer not to break even.

Anyways, given the recent rumors on Phenom, the Barcy "launch", the K8 65nm clock speeds and the recent rumors of PS3 not converting over to 65nm Cell's, it seems to me that there is a PROCESS issue. There may be architecture issues with K10 as well (difficult to say with the lack of clockspeeds and testing), but it doesn't appear to be ONLY an architecture issue.

If you put any stock into the INQ rumors way back when the AMD engineers were dancing in the aisles over a 3GHz part, that would also be consistent with the current issues being process related and not architecture related.

But hey what do I know, I hear from well informed bloggers that AMD continues to close the technology gap and that Intel's 45nm process may not be so great as they are throwing away parts according to their earnings call (sarcasm intended!)

Seriously though - does any of my theory make sense? Is there any data that would contradict it? (there is the low pricing on the Phenoms - but that may be more of a case where AMD has no choice for performance reasons as opposed to the pricing being related to good yield)

-GURU; 8 November 2007 at 07:18
Ho Ho said...: "K8's top 65nm clock is 2.6GHz"

It is actually 2.7GHz, they released a new CPU about a month ago. Of course it is interesting that they launched 65nm K8 at 2.6GHz last November and since then have got a whopping 100MHz increase. During the same time 90nm has gone from 3GHz (FX 74 on 30 Oct) to 3.2GHz (6400+ in August). I always thought it is easier to tweak and get faster speeds out from a newer process than from one that has been used and tweaked for years.

"Phenom looks to be launching at 2.4GHz at best (or 2.3Ghz)"

Launch is at 2.3GHz max but 2.4 should follow in December.

"And most importantly - the rumor of PS3 switching over to 65nm cell is apparently not true and will come later."

PS3 version of Cell: yes. The version used in IBM servers should be 65nm.; 8 November 2007 at 08:07
Ho Ho said...: Whoops, that 2.6GHz 65nm K8 was actually released on 5th December.; 8 November 2007 at 10:49
Anonymous said...: The 2.7 GHz Brisbane was a rumor, reported by Dailytech.... it has yet to materialize.; 8 November 2007 at 12:14
Ho Ho said...: Hm, perhaps so. I used the data on their site and they listed the CPU there. Also Froogle showed it being available in a couple of places.; 8 November 2007 at 12:30
Anonymous said...: If that anonymous poster was Guru, well done, I think you hit the nail on the head.

I never actually took the time to run the numbers on AMD's expected Q4 Barcelona shipments and compare them to AMD's total wafer capacity as a function of yield and realize how utterly pathetic things actually are. It might be a fun exercise to graph the die output vs. wafer starts as a function of different yields based on 200,000 or even a million total die produced in Q4.
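A rough sketch of that exercise, printed as a table rather than graphed. It assumes the figures already floated in this thread (190 die per wafer, a three-month quarter, about 20,000 WSPM for F36); the yield steps and the function name are arbitrary choices for illustration.

```python
def wafer_starts_per_month(total_die, yield_fraction, die_per_wafer=190, months=3):
    """Wafer starts per month needed to produce total_die good die."""
    return total_die / (die_per_wafer * yield_fraction) / months


FAB_WSPM = 20_000  # rough F36 capacity figure used in this thread

rows = []
for total_die in (200_000, 1_000_000):
    for yield_pct in (30, 50, 70, 90):
        wspm = wafer_starts_per_month(total_die, yield_pct / 100)
        rows.append((total_die, yield_pct, wspm, wspm / FAB_WSPM))

for total_die, yield_pct, wspm, share in rows:
    print(f"{total_die:>9,} die  {yield_pct:>2}% yield  "
          f"{wspm:>6.0f} WSPM  {share:5.1%} of fab")
```

The higher the assumed yield, the fewer wafer starts a given shipment figure implies, so better yields make the implied K10 share of the fab smaller, not larger.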

I bet the numbers would be startling across the board considering your very conservative numbers pulled out of the air show how completely ridiculous the results are right now. There is definitely something wrong and AMD won't admit exactly what it is, but the writing is on the wall.

I imagine their product shipments will continue to be K8 for the vast majority well into 2008 even when Phenom is supposedly in full swing.; 8 November 2007 at 16:39
Khorgano said...: Just a quick number crunch off the top of my head to support what Guru said.

Let's assume 1,000,000 Barcy shipped this quarter, which would be very aggressive based on AMD's own guidance. Let's also assume the absolute worst: that their yields are a mere 30%, which is based on previous calculations done here at this blog, with approximately 190 possible die per wafer.

This gives ~57 yielding die per wafer. 1,000,000/57 = 17,543 wafers or in other words ~5850 WSPM. F36 should have a minimum of over 20,000 WSPM, although I think it's actually closer to 25,000. Which means ~25% of total wafer capacity.
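As a quick check of those figures, here is the same arithmetic in Python. Nothing new is assumed beyond the numbers in this comment (190 die per wafer, 30% yield, 1,000,000 chips, a three-month quarter, and the two capacity guesses for F36); the variable names are just for illustration.

```python
good_die_per_wafer = 190 * 0.30                 # ~57 yielding die per wafer
wafers_needed = 1_000_000 / good_die_per_wafer  # total wafers for the quarter
wspm_needed = wafers_needed / 3                 # one quarter = 3 months

for fab_wspm in (20_000, 25_000):               # the two capacity guesses for F36
    print(f"{fab_wspm:,} WSPM fab -> {wspm_needed / fab_wspm:.0%} of capacity")
```

The two capacity guesses bracket the ~25% figure, so the estimate is not very sensitive to which one is closer to the truth.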

Either way, in this "worst-case-scenario" if only 25% of AMD's capacity was designated to Barcelona, they could ship 1 million CPU's this quarter.... Now what does that say about how bad things really are if they only plan to ship "hundreds of thousands" and still there are no vendors actually shipping them.; 8 November 2007 at 17:20
Anonymous said...: "Lets also assume the absolute worst and that their yields are a mere 30% which is based on previous calculations done here at this blog."

I stand by the math (and I clearly delineated it as approximation). Other factors which should make the K10 conversion rate WORSE than the 4% (on a wafer out basis) include:

1) You should still factor in Chartered's production - AMD likes to have folks only look at F36 to make conversions and ramps look better but Chartered is providing foundry capacity for AMD (at least 1000 WSPM) and should be considered part of AMD's total chip production

2) It's impossible to tell what yields are, I chose 50% arbitrarily. I think 30% is too low (at that point I think you are taking too much of a hit compared to using that capacity for additional K8 supply). If AMD's "mature" yield claims are true - then as I previously posted the die/wafer should be higher and K10 would account for even less of total production.

4) AMD's claims were cumulative by end of the year, while my calc was based solely on Q4; theoretically it should be based on time from launch to end of year. This means slightly longer time (~3 weeks) for the same amount of production, dropping the conversion rate lower

HoHo the 2.3GHz top bin was a rumor, the pricing details release covered 2.4GHz, that's why I included it (and put 2.3 in "()"). It's hard for me to say what will be available as neither is available today.

The bottom line on the 4% estimate was I find it amusing how Scientia harps on such a "slow ramp" of Intel 45nm and it having minimal impact (and the slow ramp of C2d when it was released) and yet ignores the cold hard facts of AMD merely switching products on the SAME (theoretically mature) process.

For enjoyment, I would suggest going back and looking at his estimated K10 ramp rates...; 8 November 2007 at 18:30
Khorgano said...: ^^^^

another excellent point. Considering that Chartered is also contributing wafer capacity and yields are likely much higher than 30%, this means the actual number of Barcelona wafers in the line is very small.

Obviously, AMD still has to provide for the existing K8 market, so they can't just dump it completely, so the Barcelona ramp will start out slow before it overtakes a majority of the product in line, but based on the information we have of AMD's total capacity and assuming conservative yields it still seems like unusually light volume for a new chip.

Surely they are starting more than a couple of lots per day, but based on availability, how could they be doing anything more than that if there wasn't a major problem?; 8 November 2007 at 20:12
Anonymous said...: Hm, perhaps so. I used the data on their site and they listed the CPU there. Also Froogle showed it being available in a couple of places.

Ooops, my bad... I recall the press on this CPU and tracked it, after several months of no show I assumed it was a goner and did not follow up before posting...

Thanks,
Jack; 8 November 2007 at 21:13
Anonymous said...: "If that anonymous poster was Guru, well done, I think you hit the nail on the head."

Of course it was GURU, he has turned scientific chip analysis into an art form. (And, you work for INTC!)

Hail GURU!

SPARKS; 9 November 2007 at 02:26
Anonymous said...: Regarding the 2.7 GHz Brisbane... ha, it is stepping G2, this just came out... one stepping and 100 MHz...

This is probably why I did not notice it, and there was no whoopla around a new stepping.
http://products.amd.com/en-us/DesktopCPUResult.aspx?f1=&f2=&f3=&f4=&f5=AM2&f6=G2&f7=65nm+SOI&f8=&f9=&; 9 November 2007 at 04:03
Anonymous said...: The way I see it, AMD's problem comes from K10's design, though process also contributes to the problem. It seems the process issues have been beaten to death already. But, I think it's worthwhile to look at K10's layout for an idea of how design contributes to poor yields.

Barcelona is an amazingly messy design. 280 mm^2 of silicon, every bit essential for a functional quadcore, even the caches. In contrast Conroe's critical area is about 80mm^2. Brisbane's appears the same or lower.

From what I understand, certain cache defects can be tolerated but will create speed bottlenecks.

Now, with these cache defects in Conroe, Intel can choose to disable the bad portions or just run with them, at lower speed bins.

What I don't understand is how AMD expects to achieve any kind of respectable yield when K10's critical area is 4x as large as Brisbane's or how they expect to ramp it any effective capacity.

Not only do you have a chip that is more than 2x the area, decreasing the number that fits on a wafer, but it has a much greater likelihood of failing. And the ones that pass for quadcores have to deal with cache defects that inhibit speed.
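One common first-order way to put numbers on this is the Poisson yield model, Y = exp(-A * D0), where A is the critical area and D0 the defect density. This is a textbook simplification and not anything AMD or Intel has published; the areas are the rough figures from this comment, and the defect density of 0.5 per cm^2 is a purely hypothetical value picked for illustration.

```python
import math


def poisson_yield(critical_area_mm2, defects_per_cm2):
    """First-order Poisson yield: fraction of die with zero killer defects."""
    area_cm2 = critical_area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


D0 = 0.5  # hypothetical defect density (defects per cm^2), illustration only

for label, area in (("~80 mm^2 critical area", 80),
                    ("~280 mm^2 critical area", 280)):
    print(f"{label}: {poisson_yield(area, D0):.0%} yield")
```

Whatever defect density is plugged in, the yield for the larger area is the yield for the smaller area raised to the power 3.5 under this model, which is why the gap widens quickly as the process gets dirtier.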

Remember, AMD expects to shutdown Fab30 by year's end, or is it "on December 31st"? The cost of making K10 is just too high in terms of how many K8's are sacrificed instead. Coupled with the increasing demand for mobiles, Fab30 shutdown might just be AMD borrowing from Peter to pay Paul. Sure, they're timing it to take advantage of seasonal demand keeping one Fab churning when demand is low in Q1 and Q2, then get Fab38 up and running when demand picks up in Q3. It's a gamble against inventory depletion. But, how does K10 figure into this? Can AMD screw up an already poor plan with even shittier execution? What happens when they fail to supply the OEMs?

Anyway, to me, the performance differences between AMD's and Intel's quadcores reflects the differences between their respective design philosophies.

On the one hand there is a single level shared cache, neat, and easier for manufacturing to refine the QA/QC system from the yield standpoint. It's all about manufacturing efficiency. On the other hand, there is a massive jambalaya of cores and caches shared and private. It's all about design flexibility or was that "willy-nilly"?

I recall a certain asstard calling Core 2's cache a "bandaid" solution. Seems more appropriate to call K10's L3 a bandaid solution. It even looks like one. One fat bandaid to hold 4 cores together. What happens when the bandaid breaks? No quadcore.

Sad really...; 9 November 2007 at 04:05
Anonymous said...: "Regarding the 2.7 GHz Brisbane... ha, it is stepping G2, this just came out... one stepping and 100 MHz..."

5000+ model:

G1 voltage = 1.25/1.35

G2 voltage = 1.325/1.35/1.375

Maybe they're saving the "good wafers" for K10 and using the shitty ones for this new stepping.; 9 November 2007 at 04:12
Ho Ho said...: bao, actually Nehalem will be quite similar to K10: 3 levels of cache, 8M L3 total. Though I agree that with K10 there is more non-cache silicon area compared to most Intel CPUs.; 9 November 2007 at 10:46
Anonymous said...: Quote: "Maybe they're saving the "good wafers" for K10 and using the shitty ones for this new stepping."

I don't know, I recall back in the Julyish/August timeframe a story was printed on a few sites that AMD was releasing a 2.7 GHz Brisbane... so I tracked it and after a while it never showed up so I assumed it was a bad rumor at that point.

Then hoho points to it, and then I took note -- it is a G2 stepping, so this is likely how they squeezed another bin... not sure how they are binning up the chips, but the voltages are a bit odd.; 9 November 2007 at 10:46
Anonymous said...: NVIDIA REPORTS RECORD PROFIT AND REVENUES!

http://phx.corporate-ir.net/phoenix.zhtml?c=116466&p=irol-newsArticle&ID=1075229&highlight=

For the third quarter of fiscal 2008, revenue increased to a record $1.12 billion compared to $820.6 million for the third quarter of fiscal 2007, an increase of 36 percent. Net income computed in accordance with U.S. generally accepted accounting principles (GAAP) for the third quarter of fiscal 2008 was a record $235.7 million, or $0.38 per diluted share, an increase of 121 percent compared to the third quarter of fiscal 2007. GAAP gross margin improved by 550 basis points from a year ago to a record 46.2 percent.

Non-GAAP net income for the third quarter of fiscal 2008, which excludes stock-based compensation charges and the associated tax impact, was $264.2 million, or $0.44 per diluted share, an increase of 77 percent compared to the third quarter of fiscal 2007. Non-GAAP gross margin improved to a record 46.4 percent, an increase of 350 basis points from a year ago.

Everyone wants Nvidia Geforce GPUs and Intel Core 2 CPUs. This is clearly indicated in Nvidia and Intel's record revenues while AMD posts massive losses.

The Geforce 8800 GT is a KILLER price/performance GPU. Nvidia's Q4 will be massive.; 9 November 2007 at 14:31
InTheKnow said...: And this from elsewhere on the web...

Also, a market that is as heavily dependent on R&D as this is always going to have a very strong reliance on a bit of luck. Intel having a horribly leaky and hot 90nm process: bad luck. AMD having a 90nm process that's, even today, very competent: good luck. AMD having, reportedly, a weak 65nm process: bad luck. Intel having an amazing 45nm process: good luck.

We could focus on the errors regarding the inability of the poster to differentiate between process capability and design, but something else disturbed me more.

The poster's statement seriously downplays the effects of good R&D work. If your R&D process is working properly, most of the disasters should be identified during R&D. That is the role of first pathfinding and then R&D, to find the solutions that will work.

To say, for example, that Intel was lucky to end up with an excellent 45nm process is ridiculous. Intel spent years researching the right materials and the process technology to deposit those materials. Not to mention the untold millions of dollars. Credit should go to a well thought out experimental program, not luck.

It is true, I have learned more from experiments that didn't go as planned than from a series of experiments that told me exactly what I expected it to. But it is the ability to sort out the wheat from the chaff as it were that separates good research from poor research, not luck. All the luck in the world isn't worth anything if you miss the critical data points when the clues are there.; 9 November 2007 at 15:00
Roborat, Ph.D said...: "Luck is the residue of design" - some lucky guy; 9 November 2007 at 16:32
Anonymous said...: With all the talk of Hector Ruiz about to be leaving, I can't help but find this post hilarious. It was from an AMD fan right after Intel demo'd Conroe benchmarks:

http://episteme.arstechnica.com/eve/forums/a/tpc/f/174096756/m/887005808731

"When it comes down to bare knuckles, AMD is the proverbial David against the Goliath Intel. But what I just said would make Goliath roll over the grave.

Intel's corporate culture is FUD+Hype-based marketing foremost, and technology second, with technology trying to keep up with the grand pronouncements of marketing, and so far the technical smarts have done quite well in keeping tow. The smarts in far-away Israel, inspired by the lack of marketing people clouding their thinking, produced the great Pentium M and Centrino, which begat Merom and Conroe. Still, the culture is that of the techs being subservient to the fudder/hyper-marketers. That is evidenced by the bunny mascot cum CEO Otellini.

On the other hand, AMD is marketing without Hollywood glitter and confetti choking down your windpipe. Hector Ruiz is as good as they come- engineer cum CEO. He isn't about to let lies and deception take over the technical mastery of the art and science of making chips out of the best minds in the industry crammed into a smaller R&D budget. He can spot the fake chipmasters from a mile away and gently shove them into Intel's doorstep, where they are welcomed and elevated.

Which is why I think that after the grand illusory spectacle that is the IDF has settled down, it comes down to the corporate culture, the culture that decides who gets to lead the company. And we will know which CEO will lead the company to greater heights.

Intel's designs are not borne out of careful strategic planning, they are a tactical response to the threat posed by a competitor. As such, its designs will have the hallmarks of workarounds to compensate for its poor design. A little band-aid here and there, a little reinforcement here and there, and there! Inspiring and creative genius! On the other hand, AMD has a well thought-out blueprint. It doesn't need workarounds. It adopts and grafts new technologies quickly as the design can accommodate them, and it responds to the markets needs more quickly as its designs are more flexible.

And this kind of vision has to be directed by capable and visionary leadership, one that is not in evidence in Intel but is seen in AMD. This difference is what separates the wheat from the chaff, much like what separates Toyota from GM.

So while much of the thread is focused on what Intel has now presented, it matters really little. It's like what does it matter if GM just produced a new model of the Hummer.

Let's see some substance for a change."

Lol, he wanted substance, I believe he got it. =p; 9 November 2007 at 17:22
Anonymous said...: Intel for cheap cpus tiny 45nm dies yeah!; 9 November 2007 at 23:48
Anonymous said...: Intheknow...taken from web:

"Also, a market that is as heavily dependent on R&D as this is always going to have a very strong reliance on a bit of luck. Intel having a horribly leaky and hot 90nm process: bad luck. AMD having a 90nm process that's, even today, very competent: good luck. AMD having, reportedly, a weak 65nm process: bad luck. Intel having an amazing 45nm process: good luck.
"

Alas, this is a prevailing problem in the back alley circles of the net.... people who make such statements know nothing about how a CPU is made or how process interacts with architecture to deliver the final result.

I suppose that 'leaky' Intel 90 nm process was soooo horridly leaking they managed to slip in a 31 Watt Pentium-M (dothan) and dominate the mobile market. :) :) ...

The truth --- Intel's 90 nm process is no more or less leaky than AMD's 90 nm process... and it takes just a little digging to find the right data to see the fallacy of such statements.

jack; 10 November 2007 at 02:55
Anonymous said...: Guys, here is a bit from VR-Zone. It is, however, an unconfirmed story. Nevertheless, for the most part, it confirms what you boys have been saying about AMD's 65nm scale-down for months. Why am I getting a sneaking suspicion that Pheromone will never see, in volume, 3 Gig?

What gets me is how the hell can they even talk about 45nm when clocks over 2.6 GHz at 65nm, from all accounts, seem to be very difficult, if not impossible, to obtain.

Guru, especially, has been pounding home the fact that the thinner layers in the existing process (among other things) made things much more difficult (leaking like a sieve), obviously, when they moved from 90nm to 65nm.

I believe it was 'In The Know' who spoke of uneven layer thickness.

As this motherboard manufacturer is validating a product, shouldn’t he know what the Pheromone can and cannot do?

Therefore, the question remains, will this chip ever see 3 GHz or better? Is it possible there is a “barrier” with AMD’s current process and it CAN'T go any faster? (Read: no hafnium nuclear control rod alloys)

Further, will the 45nm process exacerbate these issues? Then what are they going to do?

Hey, maybe they'll get lucky?

Incidentally, they are also saying 2.6 will come in 2008, not 2007 as AMD hyped.

http://forums.vr-zone.com/showthread.php?t=202659

SPARKS; 10 November 2007 at 04:20
Anonymous said...: Ho Ho said...
bao, actually Nehalem will be quite similar to K10: 3 levels of cache, 8M L3 total. Though I agree that with K10 there is more non-cache silicon area compared to most Intel CPUs.

I don't think the Nehalem quadcore will have L3 cache. This idea was pushed by Hans deVries who lacks credibility in my view. If anyone remembers, poor Hans actually had AMD fans believing K10 was to be 170mm^2 then revised to 220mm^2 or so, which was utterly laughable as anyone with any sense eyeballing the "K8L" CAD diagram would have said it would be closer to 300mm^2. Hans failed again to estimate Penryn die size, overshooting it by some 15%, though I forget the exact numbers. Hans's pictorial analysis illustrates a type of fanboism he is loath to admit. I call that dishonest.

The Nehalem analysis by Hiroshige Goto from the Japanese site PC Watch makes more sense, though he clearly means to indicate that his work is also speculative which is all anyone can do at this point. He points out 8MB of cache which appears to be L2. I agree there. First, the die shot does not indicate an L2/L3 architecture. Second, it makes little sense for Intel to discard the superior unified L2 cache design rather than develop it further for Nehalem. Third, and finally, there is the reason I already explained in the previous post with respect to yields.

Part of the confusion with Nehalem's cache hierarchy stems from PC Watch's last article on future Intel processors. That article discusses Dunnington and Beckton aka "Nehalem EX". Dunnington will be Penryn-based hexacore while Beckton will be Nehalem-based octocore. Both will carry L3s. The problem is people might understand this to mean both will be monolithic designs. Both are meant for MP servers, being replacements for Tigerton which replaces Tulsa. Tulsa is an MCM design with a huge L3 just like the Extreme Edition P4s. Judging by Intel's past and present course with regards to L3 designs and the relatively large L3 sizes, I would have to conclude that Dunnington and Beckton will also be MCMs and that these will be low-volume, high ASP parts that won't be seeing service on the desktop. They will take up huge chunks of silicon. But, the pay-off will be well worth it, IMO. Again these will have separate L3s integrated during packaging while maintaining their own distinctive L2 architectures, 3x6MB for Dunnington and 2x8MB for Beckton.; 10 November 2007 at 05:14
Anonymous said...: My apologies for not breaking the above down into more readable paragraphs.; 10 November 2007 at 05:17
Unknown said...: humm... one interesting question for you bao..

How does Intel implement 3 levels of cache without penalties from latency? As you said, Dunnington and Beckton will feature both shared L2 and L3. How will Intel implement it so the CPU doesn't spend all of its time trying to synchronize both of them?; 10 November 2007 at 08:52
pointer said...: bao said...

I don't think the Nehalem quadcore will have L3 cache.

No, Nehalem QC will have L3 cache. There are many references on the web that confirm that, including sources from Intel.

However, I do believe that Intel's cache will have better latency, and somewhere on the web I read that it also has a different access policy than Barcelona (something to do with how the cores access it; I read it somewhere, but I'm too lazy to dig out the details); 10 November 2007 at 09:30
Ho Ho said...: bao
"I don't think the Nehalem quadcore will have L3 cache."

Intel itself has talked about multi-level caches with Nehalem

"Second, it makes little sense for Intel to discard the superior unified L2 cache design rather than develop it further for Nehalem"

Well, the more cores are attached to the same L2 the slower it gets.

Though there is one possibility that might also be viable:
each core (or pair of cores) has its own L2, and the L2 of the other cores acts as L3 for it.; 10 November 2007 at 11:20
Anonymous said...: Scientia from AMDZone said...

Actually, it is a fact that QX9650 is rated at 125 watts. You would have to give some proof showing that Intel's rating is wrong.

Ho Ho said...

You do know that 45nm 3GHz quadcore Xeons will be sold with 80W TDP, don't you? From 80W at 3GHz it shouldn't be too hard to get way over 3.4GHz with 130W.
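
Ho Ho's estimate can be sanity-checked with a crude scaling rule. What follows is only a rough sketch: it assumes power grows as some power of clock frequency (an exponent of about 3 if dynamic power goes as C*V^2*f and voltage has to rise roughly in step with frequency), and it ignores leakage, binning, and how a TDP rating is actually assigned. The function name and the exponent values are illustrative assumptions, not Intel data.

```python
def scaled_clock_ghz(base_ghz, base_watts, budget_watts, exponent=3.0):
    """Clock reachable within budget_watts if power scales as frequency ** exponent.

    exponent=1 models frequency-only scaling at fixed voltage (optimistic);
    exponent=3 models voltage rising in proportion to frequency (pessimistic).
    """
    return base_ghz * (budget_watts / base_watts) ** (1.0 / exponent)


# 80 W at 3 GHz and a 130 W budget are the figures from the quoted comment.
for exponent in (1.0, 2.0, 3.0):
    ghz = scaled_clock_ghz(3.0, 80.0, 130.0, exponent)
    print(f"power ~ f^{exponent:.0f}: roughly {ghz:.2f} GHz within 130 W")
```

Even under the pessimistic cubic assumption the result lands a little above 3.5 GHz, which is at least consistent with the claim in the quote.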

Just to clarify a few points here. Yes, there is an obvious disparity in power ratings for server chips and the Extreme Editions. The reason is twofold.

1. Sure there are differences in power usage from chip to chip that are evidenced when speed binning, but does anyone honestly believe that difference amounts to over 50% in total power usage? I didn't think so. The EEs carry such lofty TDPs for a different reason. A standard desktop or server chip is intended to always operate within a certain TDP. They are not marketed for overclocking (even though plenty of enthusiasts do :p). The EE are sold at a premium with an unlocked multiplier so there is an implicit acknowledgement that the chips are intended to be pushed farther than any other standard product or their stock ratings, thus they rate the chips with a much higher TDP to ensure that system builders make sure the entire system is capable of handling higher loads.

2. Even though a server/desktop/mobile Penryn family chip has the exact same physical transistor layout, there are actually differences in certain physical transistor parameters. There are slight changes in the masks at certain layers to adjust features to optimize transistors for either performance or power consumption. Thus, some products will naturally have a slightly higher power consumption at a given clock rate than others, but be able to scale to much higher speeds.

On another note, I'd love to shed light on Nehalem architecture for everyone, but unfortunately, not even I have details about it. In fact, details are under strict lock and key on a need-to-know basis and will be published when Intel feels the time is right.; 10 November 2007 at 17:29
Anonymous said...: There are rumors that TSMC will do 45nm CPUs for AMD in 1H08.; 10 November 2007 at 22:49
Anonymous said...: "here are rumors that TSMC will do 45nm CPUs for AMD in 1H08."

Old rumors:
1) It is questionable given the sources on this.
2) If it is 45nm it is likely not CPUs but chipsets and/or GPUs.
3) The amount of effort required to port the AMD CPU design over to the TSMC process would be daunting (remember AMD and TSMC use different 45nm processes). It's not just a matter of AMD showing TSMC the masks and saying, hey, please make CPUs for us.
4) This move would only make sense if AMD was stopping all in-house 45nm CPU production (it would not be cost effective, as AMD would have to pay TSMC a profit margin on top of the basic wafer cost, in addition to the cost of maintaining 2 different process files: one internal, one external).
5) If AMD is really ramping 45nm production in "H1'08" (trying real hard not to laugh), it makes no sense to do this also at TSMC in the same time frame.

Other than that the rumors could be true... I hear TSMC will be implementing reverse hyperthreading for AMD in those CPUs too!; 11 November 2007 at 00:26
Anonymous said...: “The EE are sold at a premium with an unlocked multiplier so there is an implicit acknowledement that the chips are intended to be pushed farther than any other standard product or their stock ratings, thus they rate the chips with a much higher TDP to ensure that system builders make sure the entire system is capable of handling higher loads.”

See, this is the kind of stuff that warms my soul, the crème de la crème, if you will. The ever lovely and cherished Extreme Editions, my prized collection, immortalized forever in my display case, after they released all their wonderful pleasures and magic moments. I’ve never had one fail me.

Here in the Big Apple, I see my fair share of Ferraris and Bentleys. But to think that at a mere $1000 I can have a Specialty product specifically selected for un-clockable goodness, manufactured at a 4 billion dollar factory, makes me swoon.

You just keep cranking out those badboys, Ortho baby. Oh, my new heart throb QX9770, has me positively giddy.

Hey, can you tell us the juicy details about the “implicit acknowledgement” thingy and how the little JEWELS are selected, without giving away company secrets, of course?

Are they sorted by scantily clad, female employees, with Sade playing in the background, as they are delicately tucked into their packages with black lace gloves? Is the box sealed with a kiss?

SPARKS; 11 November 2007 at 03:25
Khorgano said...: This comment has been removed by the author.; 11 November 2007 at 04:34
Orthogonal said...: LOL Sparks,

Hey I decided to finally sign up for a Google account but some bastard already took my Handle, fortunately it allows me to assign my own Blogger alias...

Now to your question, unfortunately the process of sort, bin, assembly and test isn't as exciting or glamorous as you make it out to be, but if it were, you can be sure I'd be waiting in line for a rotation. In Arizona we only have a Chipset sort facility so I don't know all the details involved in the CPU sorting (done in Malaysia I believe).

I don't have all the data, but you pick up on things here and there... Once the process is stable and volume has ramped, it probably comes as no surprise that bin splits follow a roughly normal distribution. The process of determining the range of product speed grades depends on the overall distribution which is then plugged into a model (Obviously, market forces and competition can pull it one way or another). Since the lower speed bins are generally much higher volume products, it's not uncommon to bin down mid range product to fit the bill. Further, they obviously want to sell every yielding chip they make, but you have to weigh the consequences of how low you want the product speed range to go in order to not have to bin down too much product and depress margins.
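
The bin-split idea described above can be sketched numerically. This is a minimal illustration, assuming the maximum stable clock of yielding die is exactly normally distributed; the mean, sigma, and bin edges are made-up numbers, not real Intel data, and `bin_fractions` is a hypothetical helper.

```python
from statistics import NormalDist


def bin_fractions(mean_ghz, sigma_ghz, bin_edges_ghz):
    """Fraction of yielding die landing in each speed bin.

    A die is sold in the highest bin whose rated clock it can sustain.
    Keys are the rated clock of each bin, plus "below_lowest_bin" for die
    that cannot reach even the slowest rated speed.
    """
    dist = NormalDist(mean_ghz, sigma_ghz)
    edges = sorted(bin_edges_ghz)
    fractions = {"below_lowest_bin": dist.cdf(edges[0])}
    for i, low in enumerate(edges):
        # The top bin has no upper edge, so it takes the whole upper tail.
        upper = dist.cdf(edges[i + 1]) if i + 1 < len(edges) else 1.0
        fractions[low] = upper - dist.cdf(low)
    return fractions


# Hypothetical process: clocks centred on 2.6 GHz with a 0.2 GHz sigma.
splits = bin_fractions(2.6, 0.2, [2.4, 2.6, 2.8])
for name, share in splits.items():
    print(name, f"{share:.1%}")
```

If market demand for the lowest bin exceeds its natural share, the shortfall has to come from die that would have qualified for a faster bin, which is the "binning down" that depresses margins.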

In order to combat this effect, they purposely design product to fill the low end (Think Allendale vs. Conroe). The Allendale has less cache, and thus, less silicon real-estate, so what it lacks in features is a cost advantage in order to compete in that market space.; 11 November 2007 at 04:50
Anonymous said...: yomamafor2 said...
humm... one interesting question for you bao..

How does Intel implement 3 levels of cache without penalties from latency?

I'm not sure that they must and I don't know how they would.

As you said, Dunnington and Beckton will feature both shared L2 and L3.

That's what Hiroshige Goto is saying.

How will Intel implement it so CPU doesn't spend all of its time trying to synchronize both of them?

It would be faster to synchronize three L2s using an L3 rather than using the FSB in Dunnington's case, or the way Tigerton currently does it. No reason why it wouldn't still be faster than QP in Beckton's case.

These designs are meant for MP servers, in blades and rackmounts where higher density is a main selling point. 8 cores>>6>>4. That's what it comes down to. Some IPC can be sacrificed to achieve these ends, IMO. I'm not saying they will lose, gain or maintain, IPC either.; 11 November 2007 at 05:59
Anonymous said...: Pointer

No, Nehalem QC will have L3 cache. There are many references on the web that confirm that, including sources from Intel.

As I already said, Hans de Vries was the one who pushed this idea. Here are some of his posts at Aces Life Raft:

http://aceshardware.freeforums.org/viewtopic.php?t=140

"The image and the transistor count (713M) belongs to a four core
Nehalem. The L3 seems to be indeed the 8MB as was known but the
L2s seems to be larger as the 0.5MB previously mentioned. They
are more like 1MB. Could this have something to do with the rumor
of a later Shanghai version with 1MB L2s?

Regards, Hans"

Note how Hans says Nehalem QC would have 12MB of cache on 713M transistors. It's kind of funny but not if I have to explain it.
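
The commenter leaves the joke unexplained. One plausible reading is a simple transistor budget, sketched below under the assumption of a conventional 6-transistor SRAM cell and counting only the data arrays (tags, ECC, and redundancy would add more). The constant and function names are illustrative.

```python
BITS_PER_BYTE = 8
TRANSISTORS_PER_SRAM_BIT = 6  # assumption: standard 6T SRAM cell


def sram_data_transistors(megabytes):
    """Transistors needed for the data arrays alone of a cache of this size."""
    return megabytes * 1024 * 1024 * BITS_PER_BYTE * TRANSISTORS_PER_SRAM_BIT


total_reported = 713_000_000   # transistor count given in the quoted post
cache_mb = 8 + 4 * 1           # 8MB L3 plus four 1MB L2s, as the post suggests
cache_transistors = sram_data_transistors(cache_mb)
remaining = total_reported - cache_transistors

print(f"cache data arrays: {cache_transistors:,} transistors")
print(f"left for 4 cores and everything else: {remaining:,}")
print(f"per core at most: {remaining // 4:,}")
```

Under these assumptions roughly 604M of the 713M transistors would be cache, leaving about 109M for four cores, the memory controller, and all other logic.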

Continuing here also:

http://aceshardware.freeforums.org/viewtopic.php?t=141&postdays=0&postorder=asc&start=0

Any website that reports a QC Nehalem with L3 cache would either have done it after Hans made his pictorial analysis or would have been edited to that effect.
Believe Hans if you want. But, the only other "source", PC Watch, in this regard is far more credible.

PC Watch's Hiroshige Goto has been reporting not just on architecture but is also very prolific in reporting OEM guidance. This tells me his source(s) fits into one/more of these categories:
1) Intel marketing person(which is unlikely)
2) someone working Tier1 OEMs
3) parts supplier for a major OEM such as board makers

Whoever that someone is, they are privy to enough information to allow Goto-san to paint some pretty accurate pictures. However, he is very straightforward when he speculates, as he will say it up front.

The only original Nehalem analysis from PC Watch is here:

http://pc.watch.impress.co.jp/docs/2007/0916/kaigai386.htm

Note the article is from before Fall IDF and indicates QC Nehalem with last level shared L2 cache.
Several articles from PC Watch continue to say the same (previous related articles are linked at the bottom of more current articles) all the way until now.

In this article:

http://pc.watch.impress.co.jp/docs/2007/1018/kaigai394.htm

Dunnington and Beckton are said to be MCMs

Here's a more recent article

http://pc.watch.impress.co.jp/docs/2007/1023/kaigai396.htm

where he merely states Beckton will have 24MB "last level shared cache". Now if Beckton is going to be single die, it wouldn't surprise me either because these designs are going towards an extremely lucrative segment.

The latest guidance projections from PC Watch were released yesterday if you want to check them out:

http://pc.watch.impress.co.jp/docs/2007/1109/kaigai399.htm

Finally, the latest webcasts from Fall IDF are here:

http://www.intel.com/pressroom/kits/events/idffall_2007/webcasts.htm#

Nehalem info comes about 20min into Otellini's presentation and near the end of Gelsinger's. Compare with what journalists reported coming from these events. Anything that says QC Nehalem will have L3 cache finds roots in fudster Hans de Vries and not from leaks or whatever Intel has revealed.; 11 November 2007 at 06:08
Anonymous said...: Ho Ho
Intel itself has talked about multi-level caches with Nehalem

Recall, I was really talking about QC layouts and comparing design philosophies. It's not my intention to speculate on Nehalem EX. Again, Intel has not stated that Nehalem QC will have shared L3 and non-shared L2. And again, I'm definitely agreeing that progressive versions of Nehalem will have shared L3 as that really is consistent with past experience.

Well, the more cores are attached to the same L2 the slower it gets.

I'm not so sure about that. There might be a concern that increasing the number of cores using the same L2 might lead to L2-bandwidth congestion. It depends on how Intel implements and how much L2 bandwidth each core demands.; 11 November 2007 at 06:14
Roborat, Ph.D said...: Orthogonal said: In Arizona we only have a Chipset sort facility so I don't know all the details involved in the CPU sorting (done in Malaysia I believe).

The ATD group in Chandler AZ do all development work for assembly and test. They also have a small manufacturing capability for pre-launch volume and the IPF line.

They have done some interesting stuff which is really a world away from wafer manufacturing, more physics related than chemistry (e.g., the hermetically sealed Pentium(1), which is supposed to last forever, or the organic packaging technologies in a collapsed chip interconnect).; 11 November 2007 at 08:45
pointer said...: Orthogonal said...
In Arizona we only have a Chipset sort facility so I don't know all the details involved in the CPU sorting (done in Malaysia I believe).

I thought sort has always been done in the fab? I believe there must be one in the 'new' Arizona CPU fab.; 11 November 2007 at 09:01
Anonymous said...: “In Arizona we only have a Chipset sort facility so I don't know”

No, No, No, Mon Ami! C’est si bon! C’est si bon!

What good would the “Precious Ones” be without a beautiful bed to make l’amour! Non, Oui!

For example, the very exotic 850E, panned by most mainstream pencil necks and enthusiasts as Intel’s foray into the obscene, was a revolutionary change in technology. The MCH, for its time, was blindingly fast. It wasn’t until DDR2-3200 that the bandwidth approached RDRAM’s fast and furious speeds. They complained how high the prices were going to be. Now look what we are paying for high end DDR2 and DDR3 lovemaking! The fools, stuck in the memory missionary position, took the very fast chipset on with evangelical fervor. I saw it as a smear campaign by the Asian memory cartel to eliminate a serious performance threat, from this country, at the time.

In conjunction with my recently (sadly) retired P4 3.06, there was nothing but sheer joy for years. Oh the wonderful encounters with PC1066, I wish they would have let that platform mature. Ah, c’est la vie.

However, and more currently, XDR simply blows away DDR3 offerings. It’s “déjà vu all over again”, as INTC is looking at the technology once again! This is not to mention the reintroduction of Hyper-Threading, possibly returning with Super Model, Nehalem.

Hyper Threaded Nehalem with CSI, and XDR, stop I can’t take it!

But please, if you please, feel free if you can, talk about the hot bed of joy X48. This lusty combination of X48 (1600 FSB, I get the chills), QX9770, and DDR3 1800, is giving me that warm, gooey feeling I had with my 850E platform.

Also, don’t downplay the chipset thing. We all know how valuable they are in maintaining a great marriage! There’s nothing like a nice stable platform to have wonderful encounters.

SPARKS; 11 November 2007 at 11:02
Anonymous said...: It's freaking hilarious and pathetic that Scientia deletes Lex's post and then proceeds to selectively debate points that Lex brought up in the deleted post.

"expletives and insults" - yeah right, the only insult I remember seeing in Lex's post was that Scientia has no business commenting on process technology when he has no clue what he is talking about. I think Lex is not the only one who agrees with that. =p; 11 November 2007 at 23:24
Anonymous said...: It's freaking hilarious and pathetic that Scientia deletes Lex's post and then proceeds to selectively debate points that Lex brought up in the deleted post.

The argument for AMD has long been lost and should only get worse tomorrow. I don't understand the point of standing around and waiting for AMD to put up a fight.; 12 November 2007 at 00:53
Anonymous said...: "It's freaking hilarious and pathetic that Scientia deletes Lex's post and then proceeds to selectively debate points that Lex brought up in the deleted post."

Not quite as comical as the 'well he was just rehashing old arguments and points, anyway' (paraphrasing)

As opposed to how many times Dementia has ERRONEOUSLY claimed that AMD is only behind 1 year based on 65nm launch dates? Or how we can compare the Si process capabilities by looking at the top clockspeeds of 2 different architectures? Or perhaps, how Intel's 90nm process was crap due to P4 thermals? (Obviously ignoring the Pentium M data).

It's funny, I think he believes if he keeps saying 'AMD is only a year behind, AMD is only a year behind', that he may wake up one morning and it may actually be true?!? And it is OK for him to keep re-iterating his same old stale points in the form of paper bullets as he obviously has no more real ammo left in his own pro-AMD gun.

K10 will rock...things will be different in H2'07...well K10 was a good start...well 2008 is where things will be interesting...well it just needs to be competitive clock for clock...AMD will quickly ramp clocks.... well if they get the clocks ramped within A YEAR... well it's not a paper launch...well they don't need volume until 2008...

Is there any point where he just might look at things objectively and then come to a conclusion? Sure he clearly has some emotional investment in AMD, but there are times where you have to tip your cap and say, you won this round, we'll try to get you the next time. At what point does spin and wishful thinking turn into delusion?; 12 November 2007 at 05:32
Anonymous said...: Where have abinstein and baronhowell now that Intel's Penryn performance numbers have been outed?

Chicken shits.

AMDZone has gone the way of the dodo. I guess the owner wanted to prevent mass suicides due to Penryn.

Scientia will try his unique brand of spin and censorship and claim that he does not have AMD bias.

Chicken shits.; 12 November 2007 at 08:04
Khorgano said...: Come on folks, be nice, you all need to remember this.

Scientia is never wrong, but on occasion, reality has failed to meet his expectations.; 12 November 2007 at 14:53
pointer said...: Khorgano said...

Come on folks, be nice, you all need to remember this.

Scientia is never wrong, but on occasion, reality has failed to meet his expectations.

yeah, agree.

similarly, I fast everyday, days and night, except when I am eating. :); 12 November 2007 at 16:18
Anonymous said...: Retraction:

Previously from this link,

http://pc.watch.impress.co.jp/docs/2007/1018/kaigai394.htm

I had said Dunnington and Beckton were MCMs, despite the fact that the pictures have them as single dies. This error resulted from using the Google language translator. I've asked my wife. She has translated that they will be single die "rather than" MCM.

Ooops...

Also, Tukwila will be a massive single die. So, Intel appears prepared to make very large single-die chips, at least for the high-end server segment.; 12 November 2007 at 22:01
Anonymous said...: "So, Intel appears prepared to make very large single-die chips, at least for the high-end server segment."

Intel has always done this in the server area (didn't they have 24MB cache server chips or something big like that?). I think most people don't realize this and choose to believe AMD's problems are solely a large die problem that Intel will also eventually run into.

The subtlety people fail to appreciate is that it is much more likely that the large die highlights AMD's process issue / marginality (and the fact they are running it near a cliff). We'll see with Intel on 45nm, but Intel has been able to produce massive server die in the past on 90nm and 65nm, so I think the whole "big die theory" is a bit too simplistic and is more wishful thinking than anything else by the AMD fans.; 12 November 2007 at 22:42
Unknown said...: Are you still feeling rich sparks?! If so, may I present:

http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=3451428

QX9650 In stock right now. That's a hard launch. I'll be waiting until January for mine. $1000 is a tad too much for my liking :-); 13 November 2007 at 01:02
Anonymous said...: GIANT,

Ooooh, such a tease!

I’ve got my finger on the credit card trigger, and it’s itch’n and bitch’n. The Asus P5E3 Deluxe Wi Fi and Corsair Dominator DDR3 1800 look real good. If the QX9650 were released with a native 1600 FSB, I would have shot the load.

Nope, I want that QX9770! I’m gonna wait, just in time for INTC’s nice New Year’s dividend check and X48.

SPARKS; 13 November 2007 at 02:34
Anonymous said...: Hey DOC, going back to your original post, it seems IBM does have egg all over its face. They pulled the spec scores. Whoops!

It must have pained Charlie to report it, but here it is:

http://www.theinquirer.net/gb/inquirer/news/2007/11/12/ibm-pulls-barcelona-spec-scores

SPARKS; 13 November 2007 at 02:46
Anonymous said...: Sure Ok, AMD is going to 45nm in 2008
Right, sure hundreds of thousands!
CTI, right.
New steppings, really!
40% over Clovertown!
Performance per watt with Pheromone.

FAB 38 undergoing major retooling!

BLAAAAA, wrong again!

http://blogs.barrons.com/techtraderdaily/2007/11/12/amd-in-talks-on-sale-of-fab-38-to-tsmc-jefferies-says/?mod=yahoobarrons

Do ya think I can get some bicycles from Wrector, cheap?

SPARKS; 13 November 2007 at 03:24
Anonymous said...: Hector lie?

Come on now

He also says he and Sanders get along well, despite their differences in style. "I am a collaborator and a team player," Ruiz says. "I am not going to go in public ranting and raving about Intel. That is not going to happen."

http://www.statesman.com/business/content/business/stories/archive/042102.html

Someone please post this nugget where others can see.; 13 November 2007 at 03:49
InTheKnow said...: The subtlety people fail to appreciate is that it is much more likely that the large die highlights AMD's process issue / marginality (and the fact they are running it near a cliff). We'll see with Intel on 45nm, but Intel has been able to produce massive server die in the past on 90nm and 65nm, so I think the whole "big die theory" is a bit too simplistic and is more wishful thinking than anything else by the AMD fans.

That is because those outside of manufacturing often only see part of the picture. Too many people want the whole thing to be distilled down to a nice simple solution. So all they see is that a large die and equivalent defect density = fewer good die.
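
The "simple" view being criticized here is usually expressed with a first-order yield model. As a minimal sketch, the snippet below assumes the basic Poisson model, yield = exp(-area x defect density); the die areas and defect density are hypothetical round numbers chosen only to show the trend, and the model deliberately leaves out the parametric and process-control effects discussed in this comment.

```python
import math


def poisson_yield(die_area_mm2, defects_per_cm2):
    """First-order Poisson estimate of the fraction of die with zero killer defects."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)


# Hypothetical comparison: a small die vs. one twice its size, same defect density.
for area in (150, 300):
    print(f"{area} mm^2 die: {poisson_yield(area, 0.5):.1%} defect-limited yield")
```

Doubling the area squares the yield fraction in this model, which is why die size gets so much attention, but it says nothing about die that are defect-free and still miss speed or power targets.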

They don't understand what it takes to keep a process tool on target, what those targets are or what being off target will do to the product. They don't understand control limits and spec limits or why it is bad to have your control limits and spec limits close to each other (i.e. running near a process cliff), let alone what it is like to try and run a process with spec limits inside the control limits. (I've had the misfortune of trying to run such a process until newer, more capable equipment could be installed.)
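
The relationship between control limits and spec limits is commonly summarized with the process capability index Cpk. The sketch below assumes a normally distributed process parameter with control limits at the mean plus or minus 3 sigma; the numbers are invented for illustration and the helper names are hypothetical.

```python
from statistics import NormalDist


def cpk(mean, sigma, lower_spec, upper_spec):
    """Process capability index: distance to the nearest spec limit in units of 3 sigma."""
    return min(upper_spec - mean, mean - lower_spec) / (3.0 * sigma)


def fraction_out_of_spec(mean, sigma, lower_spec, upper_spec):
    """Share of output outside the spec limits for a normally distributed parameter."""
    dist = NormalDist(mean, sigma)
    return dist.cdf(lower_spec) + (1.0 - dist.cdf(upper_spec))


# Hypothetical parameter: target 10.0, sigma 1.0, so control limits sit at 7 and 13.
cases = [("comfortable", 4.0, 16.0),
         ("near the cliff", 7.0, 13.0),
         ("spec inside control", 8.0, 12.0)]
for label, lsl, usl in cases:
    print(label, f"Cpk={cpk(10.0, 1.0, lsl, usl):.2f}",
          f"out of spec={fraction_out_of_spec(10.0, 1.0, lsl, usl):.2%}")
```

A Cpk below 1 corresponds to spec limits sitting inside the control limits, the situation the commenter describes having had to run: a process that is perfectly "in control" still produces out-of-spec material every day.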

Those that don't have the experience and background and are unwilling to learn from those that do will never understand the idea of maintaining a process tool. The idea of those people understanding an entire process is laughable.

Since Nehalem seems to be a large die it would seem that Intel's concern with quad-core before 45nm wasn't a concern about defect density and dead die as I (and others) had originally assumed. It would seem it was more an issue about architectural capability.

With the ability to pack more transistors into the same space, Nehalem offers the potential for more computing power in a lower power threshold than would have been possible on 65nm. In retrospect, I can only assume this was what Intel was referring to when they said that native quad core wasn't practical on 65nm.; 13 November 2007 at 05:46
Anonymous said...: "Since Nehalem seems to be a large die it would seem that Intel's concern with quad-core before 45nm wasn't a concern about defect density and dead die as I (and others) had originally assumed. It would seem it was more an issue about architectural capability."

Another theory...this one may be a little heavy on the semiconductor physics:

Intel runs a dual Vt process - this is a process which has both NMOS and PMOS transistors running at 2 different threshold voltages (the cost of doing this is that it adds an extra litho and implant step for both NMOS and PMOS). For reference, I believe AMD runs a tri-Vt process. The reason behind doing this is that the low Vt devices are faster, however at the cost of increased leakage. By running a dual Vt (or tri-Vt) process, you can run the critical speed portions of the chip on the low threshold voltage transistors (and deal with the leakage) and run the non-critical area at the higher Vt (and get better leakage performance in those areas).

Typically the cache uses high Vt devices. In the past Intel has had a fairly high % of cache vs logic; with Nehalem, however, this ratio will go down, and therefore in general, everything else being equal, Nehalem will be leakier (obviously architecture, sleep states, and clockspeed differences come into play).

I believe the issue AMD is having with binsplits on 65nm is leakage (conjecture on my part, but let's say with the background I have, this is not at the Dementia level of conjecture). If they can get leakage under control they can run at higher clockspeeds, however to get into the TDP's advertised they are forced to limit clockspeeds.

Perhaps, Intel saw this same issue with Nehalem (remember they knew the % of logic would be going up with Nehalem as there is less cache on a % basis) and knew that 65nm, which is basically at certain leakage limits, would struggle with higher clocks on quad core die. With a dual core and MCM approach you should be able to clock them better (which makes me wonder why AMD has made no mention of dual K10's - in theory these should be yielding/binning much better).

However with high K/MG, both subthreshold and gate leakage (the 2 main components of leakage) are substantially reduced which should allow for acceptable TDP's at reasonable die size/clock speed.

To summarize:
* larger die (or more transistors) = more overall leakage (thank you captain obvious, eh?).
* IMC means more logic / less cache which means a greater percentage of those low Vt ("leaky") logic transistors. With cache "heavy" processors like C2D, this is less of an issue as a greater % of the chip is this high Vt/low leakage area.
* More manageable leakage on 45nm with the advent of high K/metal gate means it is likely to have more stable / better bin splits.
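
The first two bullets can be sketched as a simple weighted sum. This is a toy model under loud assumptions: every cache transistor is treated as high Vt, every logic transistor as low Vt, and the per-transistor leakage figures and transistor count are arbitrary units chosen for illustration, not measured values for any real chip.

```python
def chip_leakage(total_transistors, cache_fraction, leak_low_vt, leak_high_vt):
    """Total leakage (arbitrary units) for a chip split into cache and logic.

    Simplifying assumption: cache is all high-Vt, logic is all low-Vt.
    """
    cache = total_transistors * cache_fraction
    logic = total_transistors - cache
    return logic * leak_low_vt + cache * leak_high_vt


# Hypothetical per-transistor leakage, arbitrary units (assumed ~30x gap).
LEAK_LOW_VT = 30.0
LEAK_HIGH_VT = 1.0
TRANSISTORS = 1_000_000  # same transistor budget for both cases

cache_heavy = chip_leakage(TRANSISTORS, 0.75, LEAK_LOW_VT, LEAK_HIGH_VT)
logic_heavy = chip_leakage(TRANSISTORS, 0.50, LEAK_LOW_VT, LEAK_HIGH_VT)

print(f"cache-heavy design: {cache_heavy:,.0f} units")
print(f"logic-heavy design: {logic_heavy:,.0f} units")
print(f"logic-heavy leaks {logic_heavy / cache_heavy:.2f}x more")
```

With the same transistor count, shifting the mix from cache towards logic raises total leakage, which is the point of the second bullet.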

I believe intheknow is dead on about this not being a simple particle issue (although obviously defect density comes into play a bit) - it is another case of AMD fans wishing the problem away by saying that Intel will have to deal with the same problem too. This is much like when they suggest Intel will botch the Nehalem schedule as this is their first rev of IMC (which by the way it is NOT!) I believe Scientia's past prediction for the absolute best case Nehalem release was H1'09....

Back to the point - I think the native quad issue is more a process capability (specifically leakage) issue. If that is true Nehalem should be well positioned as 45nm will be up and running for a year and is better equipped to deal with leakage issues than 65nm.....it also means AMD will likely continue to struggle on K10 with 45nm as they will still be dealing with leakage issues as they are not moving to high K gate oxides (though 45nm should help a bit due to the lower active power)

Sorry for the ramble...; 13 November 2007 at 06:48
Anonymous said...: Anonymous said...

Intel has always done this in the server area (didn't they have 24MB cache server chips or something big like that?).

The Itanium line has been and still is MCM. Tulsa has 16MB of off-die L3.

It's not like Intel will be using 400mm^2 monolithic dies for the desktop. Increasing die size will always decrease yields, all things being equal. But Intel appears to have the advantage here.; 13 November 2007 at 07:51
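
A rough way to see the die size point is the simple Poisson yield model, Y = exp(-A * D), where A is die area and D is defect density. The sketch below compares one 400mm^2 monolithic die with two 200mm^2 dies in an MCM. The defect density is an assumed, illustrative number, and the model ignores bin splits, packaging yield and everything else that matters in a real fab.

```python
import math


def poisson_yield(die_area_mm2, defects_per_mm2):
    """Fraction of defect-free dies under the simple Poisson yield model."""
    return math.exp(-die_area_mm2 * defects_per_mm2)


DEFECT_DENSITY = 0.002  # defects per mm^2, assumed for illustration only

mono_yield = poisson_yield(400.0, DEFECT_DENSITY)
half_yield = poisson_yield(200.0, DEFECT_DENSITY)

# Silicon area consumed per good product (bad dies are thrown away).
mono_area_per_good = 400.0 / mono_yield
mcm_area_per_good = 2 * (200.0 / half_yield)  # two good small dies per MCM

print(f"monolithic yield: {mono_yield:.1%}, silicon per good part: {mono_area_per_good:.0f} mm^2")
print(f"200mm^2 die yield: {half_yield:.1%}, silicon per good MCM: {mcm_area_per_good:.0f} mm^2")
```

Because small dies can be tested and paired after the fact, a defect only costs half as much silicon, so the MCM needs less wafer area per good part in this model.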
Anonymous said...: One factor that has often been overlooked is the effect of increased transistor density with progressive process shrinks on thermal density, specifically hotspots that critically increase resistivity.

AMD may have been too aggressive with 65nm designs, to increase die count/wafer. Chalk another one up to poor design... that would be Mister "marketshare at all costs"'s fault.; 13 November 2007 at 07:55
Roborat, Ph.D said...: ...it also means AMD will likely continue to struggle on K10 with 45nm as they will still be dealing with leakage issues as they are not moving to high K gate oxides (though 45nm should help a bit due to the lower active power)

but the big question is how much AMD can successfully scale its gate CD at 45nm without switching to a high-k/metal gate solution. 65nm already is at the limit for SiO2 as a gate dielectric without drastic leakage implications and we can see that AMD's process is already at the cliff edge. I'd say AMD's leakage problems will only magnify at 45nm. Of course they'll soften the impact by partial scaling, then again this provides only partial performance gains. I can only see AMD falling further behind because of this.; 13 November 2007 at 22:52
Anonymous said...: "but the big question is how much AMD can successfully scale its gate CD at 45nm without switching to a high-k/metal gate solution."

Robo, you are dead on - they won't be able to scale very much, the 1.2 nm EOT (or so) is as thin as it gets without leakage out the wazooh (that's technical for a lot). Also at this point nitridation techniques (to put N into the SiO2 and achieve some minor K value scaling) are pretty much at their limit too. Without scaling the gate oxide thickness, scaling the gate CD becomes difficult and AMD will suffer from various SCE's (short channel effects) from the smaller gate CD's. You can minimize these a bit with implant and anneal and other process tricks, but you really need to scale the oxide to keep this under control. So the choice is to scale the oxide (which really can't be done due to leakage), not scale the CD's as aggressively as desired, or start running a much more complicated process and tweak the heck out of some of the other steps (which doesn't lend itself to a very manufacturable process)
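
For anyone who wants to put numbers on the EOT point: equivalent oxide thickness is the physical film thickness scaled by the ratio of the SiO2 dielectric constant (3.9) to the film's own K. The sketch below shows why a higher K film can be physically much thicker, and so leak less, at the same 1.2 nm EOT. The K of 20 used for the high K film is an assumed round number for a hafnium-based dielectric, not a figure for any particular production process.

```python
K_SIO2 = 3.9  # relative dielectric constant of SiO2


def eot_nm(t_phys_nm, k):
    """Equivalent oxide thickness of a film with physical thickness t and constant k."""
    return t_phys_nm * K_SIO2 / k


def physical_thickness_nm(eot, k):
    """Physical film thickness needed to hit a target EOT with constant k."""
    return eot * k / K_SIO2


TARGET_EOT_NM = 1.2   # the "1.2 nm EOT (or so)" from the comment above
K_HIGH_K = 20.0       # assumed, illustrative value for a hafnium-based film

sio2_thickness = physical_thickness_nm(TARGET_EOT_NM, K_SIO2)
high_k_thickness = physical_thickness_nm(TARGET_EOT_NM, K_HIGH_K)

print(f"SiO2 film at {TARGET_EOT_NM} nm EOT:   {sio2_thickness:.2f} nm physical")
print(f"high K film at {TARGET_EOT_NM} nm EOT: {high_k_thickness:.2f} nm physical")
```

The thicker physical film is what cuts down gate tunnelling leakage while keeping the same electrical coupling to the channel.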

There was very little scaling done on 65nm on the gate oxide to begin with... much of the work done on this technology was strain, implants, and some nitridation work to move SiON closer to SiN which has a higher K. This gave some Ion improvements, but leakage is an issue (look no further than AMD's clock rate problems on 65nm).

Again, this is why it is extremely naive (I hate to keep saying this) to simply look at the technology node introduction to compare Si process technology like Dementia does. There is so much more to it than simply the tech node and when it is introduced... if you talk to anyone in the field (that is not working for either Intel or AMD) they will likely acknowledge that Intel has a several year lead on Si process technology over the entire industry. A minimum of 1 year on schedule and several years on critical technologies (strain, high K, salicide, anneal to name a few). The areas where AMD purports to be "ahead" (immersion and SOI) are very dubious as Intel is able to get comparable or better performance WITHOUT needing to use these technologies.

Also of note is continued high K scaling will be a challenge - the solutions implemented on 45/32nm will likely only be good for 1-2 generations and will then have their own scaling & leakage problems, so it's not like if/when AMD implements high K they will be "caught up", at that time Intel will be on their 2nd generation high K solution (beyond Hafnium).; 14 November 2007 at 07:06
Anonymous said...: From above ...

"Robo, you are dead on - they won't be able to scale very much, the 1.2 nm EOT (or so) is as thin as it gets without leakage out the wazooh ... " (not quoting whole post)....

EVERYONE READ THIS.... :) ... bang on.

In fact, the SCE and the inability to scale the gate oxide thinner at 45 nm may induce a repeat of AMD's 65 nm 'out of the gate' performance (pardon the pun). i.e. it is quite possible (perhaps even likely) that the 45 nm process will underperform (wrt Fmax) the 65 nm process.

AMD continues to fall behind, ramping in 1H 2008 is not 'catching up' frankly.

Jack; 15 November 2007 at 00:05
Anonymous said...: "it is quite possible (perhaps even likely) that the 45 nm process will under perform (wrt to Fmax) the 65 nm process."

Not sure if I agree with this - 45nm will yield active power gains (from the lower VT targets). With that alone AMD should be able to increase Vcore (or essentially hold it the same as 65nm instead of reducing it) and still live within the overall TDP's - essentially lower active power but higher off power.

The problem with this is it will hit the idle power consumption more.
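
A back-of-the-envelope sketch of that trade: dynamic power goes as C * V^2 * f and static power as V * I_leak. Every number below is hypothetical and chosen only to show the shape of the argument. In particular the sketch does not model where the active power saving comes from, it simply assumes a lower effective switched capacitance at 45nm, and it assumes leakage current goes up because the gate oxide is not scaled.

```python
def dynamic_power_w(c_eff_farads, vcore_v, freq_hz):
    """Switching power: effective switched capacitance * V^2 * f."""
    return c_eff_farads * vcore_v ** 2 * freq_hz


def static_power_w(vcore_v, leakage_amps):
    """Leakage power: supply voltage * total leakage current."""
    return vcore_v * leakage_amps


TDP_W = 95.0  # assumed power envelope

# Hypothetical 65nm part: (C_eff, Vcore, clock, leakage current)
dyn_65 = dynamic_power_w(20e-9, 1.2, 2.0e9)
stat_65 = static_power_w(1.2, 25.0)

# Hypothetical 45nm part: lower C_eff, same Vcore, higher clock, more leakage
dyn_45 = dynamic_power_w(14e-9, 1.2, 2.4e9)
stat_45 = static_power_w(1.2, 35.0)

for name, dyn, stat in (("65nm", dyn_65, stat_65), ("45nm", dyn_45, stat_45)):
    total = dyn + stat
    print(f"{name}: active {dyn:.1f} W + leakage {stat:.1f} W = {total:.1f} W "
          f"(within TDP: {total <= TDP_W})")
```

Under these made-up numbers both parts fit the same envelope, but the 45nm part spends more of it on leakage, and leakage is what is left over when the chip is idle.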

Obviously this is not the way to design the process, but with no gate oxide scaling you either don't significantly scale performance or need to steal from other areas (like power). It is neither an easy nor a pleasant decision to make.

The only problem with this discussion is we will likely never know the answer: the 65nm process is still maturing and AMD may more or less ditch trying to improve it and focus on 45nm (so the 65nm baseline will be artificially low). That is another problem with drawing technology comparisons when using the CTI approach. Also it looks like when 45nm finally comes out in H2'08 (ramp in H1'08 = no product until H2), the 65nm K10 clocks will still be fairly low.; 17 November 2007 at 22:38
Anonymous said...: "Not sure if I agree with this - 45nm will yield active power gains (from the lower VT targets). With that alone AMD should be able to increase Vcore (or essentially hold it the same as 65nm instead of reducing it) and still live within the overall TDP's - essentially lower active power but higher off power."

There is a lot dependent on the Vt, without scaling the oxide they will have a tough time maintaining Vt and keeping drive currents high enough to get clock speed up, even with the commensurate gains in thermals due to lower power density. This is essentially what happened with their 65 nm process, overall their drive currents were not as healthy and the stress engineering came up short.

Unless they work a miracle, the stressed channel will yield less performance enhancement over 65 nm simply due to the shrink (stress is a pressure, so if stress stays the same but smaller, the actual strain is less).

I could be wrong, but I am not hopeful.

Jack; 18 November 2007 at 00:37
