12.05.2007

All AMD needs is some TLB

TLB – translation look-aside buffer. Sounds techie and yet doesn’t sound so disastrous. Bear in mind that the TLB is a microcircuit within the CPU, and what AMD wants you to know is where the problem occurs, while not necessarily telling anyone what went wrong and what changes are required to fix it. Or maybe I’m just the type that wants more detail. As a consolation they did say what could happen, and I suppose the worst is a “system hang” with a very low occurrence rate. It’s not really the worst problem, but you can scratch servers that might be used for critical missions off your customer list. Above anything else, this should be the reason why no OEM is shipping Barcelona servers at the moment. The seriousness of this bug makes even the poor performance just an after thought.
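
For anyone who wants a bit more detail on what a TLB actually does, here is a minimal sketch in Python. It is a toy model under my own assumptions (4 KiB pages, a tiny least-recently-used cache, a dictionary standing in for the page table) and says nothing about how Barcelona's TLB is really built. The names `ToyTLB` and `translate` are made up for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumption: 4 KiB pages


class ToyTLB:
    """Toy model: a small LRU cache of virtual-page -> physical-frame mappings."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table   # dict standing in for the slow in-memory page table
        self.capacity = capacity       # hypothetical number of TLB entries
        self.entries = OrderedDict()   # cached translations, oldest first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Return the physical address for a virtual address."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            # Fast path: translation is already cached.
            self.hits += 1
            self.entries.move_to_end(vpn)
        else:
            # Slow path: "walk" the page table, then cache the result.
            self.misses += 1
            self.entries[vpn] = self.page_table[vpn]
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used entry
        return self.entries[vpn] * PAGE_SIZE + offset


if __name__ == "__main__":
    tlb = ToyTLB({0: 7, 1: 3, 2: 9}, capacity=2)
    for addr in (0x10, 0x20, PAGE_SIZE + 5):
        print(hex(addr), "->", hex(tlb.translate(addr)))
    print("hits:", tlb.hits, "misses:", tlb.misses)
```

The point of the sketch is only that every memory access goes through this lookup, which is why disabling or working around a faulty TLB shows up as a performance hit.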

The fact that the problem is across the K10 platform (Opteron and Phenom) makes it possible that this is a micro-architectural problem that should have been identified at the Functional Validation stage. There are several other validation screens in place to catch problems like this but because it managed to slip out makes this more of a symptom of how AMD operates these days. A rushed and under-resourced design process can lead to disasters like this. But to be fair to AMD, no amount of pre-silicon validation can fully screen the complexity of today’s microprocessors. Running a full-chip simulation on an almost infinite number of combinations of dyadic instructions is impossible, especially when time is crucial.

Validation Primer:
There are two levels of validation: pre-silicon and post-silicon. At pre-silicon, Functional Validation and Logic Testing are done.
Functional Validation is a simulation done to test different micro-architecture features such as Barcelona’s TLB. This is time-consuming and compute-intensive, typically running assembly language at low single-digit Hz speeds and completing billions of cycles per week. The test is feature-focused and localised.
The other pre-silicon validation is Logic Testing, whose purpose is to validate circuit behaviour using logic sequences and combinations.

Post-silicon validation involves Performance Verification, Design Validation and ultimately Manufacturing Validation.
At Performance Verification the chip is tested against physical specifications such as leakage, voltage, temperature and, more importantly, speed and timing. With K10 being slow and leaky, anyone can imagine how AMD would have felt when it reached this stage. Timing analysis is done here, and if Barcelona had critical path problems at the L3 TLB like GURU suggested, they should have detected the problem at this stage.
The next step is Design Validation, where a complete system-level check is done. At this stage the CPU is attached to normal peripherals (e.g., BIOS, chipsets, operating system) and all features of the chip are tested.
The last step is Manufacturing Validation, where yield becomes the major metric because it essentially drives the overall cost to manufacture. Unfortunately for AMD, their design problem is compounded by 65nm process issues.

Due to lack of information, it’s hard to say if AMD is applying a circuit/architectural design change or a process design change on the B3 stepping. A circuit design change points to a poor pre-silicon validation process, while a process design change (i.e., a change in CDs) points to post-silicon validation mistakes. To be honest, with the number of problems K10 is having, only God knows how many modifications AMD plans to include in their next stepping. We hope they get it right this time around, because I can only feel sorry for the guy who buys a tri-core Phenom with a microcode patch disabling the TLB. Surely there is a line in the sand that says when a product is broken and cannot be sold.


Update: In relation to my blog post on the 27th Nov wondering where the AMD stock would settle, well, today the stock closed at $8.91 as it continues to slide downward. It appears that institutional investors are finally bailing out, so no bottoming out just yet.

78 comments:

Tonus said...

What is most embarrassing about this is AMD's response when asked why their technical documentation had not included the TLB erratum. They claim that the person in charge of making the updates to the technical documentation is on vacation!

The response looks contrived and designed to continue a pattern of hiding information or of covering themselves with bad information. More and more you get the feeling that AMD is hoping to survive with as little damage as possible until stepping B3 is released. And more and more I worry that stepping B3 will not be enough to stem the tide of bad news.

AMD does not have Intel's resources or industry position, so they cannot scrap Barcelona the way Intel was able to scrap Netburst. They need to fix their problems and make Barcelona competitive, not with current CPUs, but with current and upcoming Intel CPUs. 2008 is shaping up to be a very bad year for AMD, it looks like they will spend the next 12-13 months mostly hanging on.

Unknown said...

Just found this:


SAN FRANCISCO (Dow Jones) -- Shares for Intel Corp. jumped more than 3% Wednesday morning after the chip giant's stock was upgraded on expectations of a robust personal computer market in 2008.

The move added to Wall Street's already bullish view on the chipmaking giant. Intel's shares have benefited from the positive sentiment, having already risen 35% so far this year. The stock is now at its highest level in two years.


Analyst Kevin Cassidy of Thomas Weisel Partners sees additional upside. He raised his rating for Intel (INTC) to overweight, citing projections of strong seasonal demand for PCs that is expected to continue onto the coming year, especially in emerging markets.

"We believe 2008 PC demand could exceed expectations driven primarily by trends in emerging markets, including Brazil, Russia, India and China," he wrote in a research note.

Notebook PCs will be particularly strong in the coming months, which will be a big boost to Intel because the company has "roughly 20% higher dollar content per notebook versus desktops," he added.

Cassidy said notebook shipments in the fourth quarter have been "limited by supply of certain components" such as panels, batteries and keyboards.

"We believe this pent-up demand will carry over into [the first quarter] of 2008 driving revenue higher than seasonality may suggest and above the street's expectations," he said.

Cassidy is hardly alone in his bullish view on the chipmaker. Currently, 29 of the 37 analysts covering Intel rate the shares as a buy or equivalent, according to data from Thomson Financial.

He echoed the view of other analysts who see Intel outpacing rival Advanced Micro Devices Inc. (AMD) on the technology front.

AMD, the No. 2 maker of computer microprocessors, has struggled to keep up with Intel, which just recently rolled out its first products based on the state-of-the-art 45-nanometer manufacturing process.

"Based on various industry reviews, we believe Intel currently has the superior [computer processing unit] line-up in all market segments -- servers, desktops, and notebooks with Intel's greatest advantage in notebook CPUs,"
he wrote. "It now appears to us AMD may not have a competitive alternative to Intel's barrage of 45-nanometer CPUs through 2008."

AMD shares were flat Wednesday morning.

Cassidy raised his revenue estimate for Intel for 2008 to $42.5 billion from $41.7 billion and his GAAP earnings per share estimate to $1.52 from $1.28.


AMD is finished.

Unknown said...

This just in from AMD:

11:29 (Dow Jones) AMD says contrary to some analyst reports, it has not
halted shipments of its new Barcelona chips, which have been subject to delays
due to a technical glitch. AMD has a software fix for the problem and is taking
the chips and the fix to customers that have made large orders, including some
OEMs, spokesman Phil Hughes says. As such, AMD still expects to ship "hundreds
of thousands" of quadcore processors for servers and desktop computers in the
4Q, as promised on its recent quarterly call. "As we get into 1Q, we'll be able
to ship the platform more broadly," to OEMs and the company's sales channel, he
says. AMD down 0.8% at $9.18. (RTR)

Eddie said...

Roborat64, what exactly do you mean by "There are several other validation screens in place to catch problems like this but because it managed to slip out makes this more of a symptom of how AMD operates these days"... I wrote an article where I collect the evidence that AMD launched Phenoms knowing they were defective, so what is the stage you mean that the bug slipped?

On the other hand, AMD assumed that disabling the thing could cost 10% performance, so, I don't find your statement that "The seriousness of this bug makes even the poor performance just an after thought", because this bug is patchable, with a performance hit between 20% according to early reviews and 1% in a patched Linux kernel; I mean, this is something that zaps performance.

Keep on the good work.

Ho Ho said...

Is anyone surprised that AMD basically scrapped K10 based X2 in favor to K8 based ones? I certainly am not.

Roborat, Ph.D said...

Eddie said: Roborat64, what exactly do you mean by "There are several other validation screens in place to catch problems ... I wrote an article where I collect the evidence that AMD launched Phenoms knowing they were defective, so what is the stage you mean that the bug slipped?

I wrote a detailed description of the different validation screens that are supposed to catch serious bugs such as this. While K10 has some serious manufacturability problems, I don't think AMD knew about the bug, or at least the seriousness of it, until recently. A bug that warrants a stop-ship order is serious enough to hold back a launch, because it always costs more when your customer already has the CPUs in the systems. The further down the supply chain, the more expensive and embarrassing it is.

On the other hand, AMD assumed that disabling the thing could cost 10% performance, so, I don't find your statement that "The seriousness of this bug makes even the poor performance just an after thought"..

What I meant with that statement is that, if an OEM were to find a reason not to start selling Barcelona, the primary reason would be the bug while its poor performance now only comes second (like I said, an afterthought because of its seriousness).

Anonymous said...

I have to take issue with the tri-core being defective. While it is a quad with a disabled core, it is by no means defective and there are countless examples in and out of the semiconductor space where this is done.

That said, the TLB is a significant issue and should have been openly stated, vacationing errata publishing folk aside - are you FREAKIN' KIDDING ME!?! Up front, any early adopters should have been given the option of taking a SW/BIOS patch which gives a 10% hit, taking a risk on it not happening much, or waiting until the next stepping came out. This was clearly botched from a PR perspective from the get go. AMD should get out in front of this and offer some sort of exchange or credit for a future K10 for the suckers, um...customers, who bought an early K10.

There is nothing wrong with AMD trying to sell tri-cores; if people want to buy them, great, if not AMD will end up cutting prices on them further or just drop them altogether. I honestly hope it's the latter, as dual and quad (and probably soon octa) cores are enough in my view from a market complexity standpoint and we don't need to give SW development folk another excuse for the pace of SW development.

Christian H. said...

Is anyone surprised that AMD basically scrapped K10 based X2 in favor to K8 based ones? I certainly am not.

Where'd you hear that? They are just moving ALL X2s to 65nm. There will still be a Kuma and probably a Rana. That will cover every market segment. That's the point. Griffin K8 entry level mobile, Kuma mobile desk replacement, etc.

I guess it'll be good to have another place to drive people crazy.

Eddie said...

Roborat64, I am sorry, but I don't feel my questions were answered: First, there is evidence that AMD launched Phenom knowing it had the L3 problem, they even used that as an excuse for not launching the 2.4GHz; so, the problem was detected and I still don't know what you mean by the error having slipped.

Second, the bug ultimately implies the performance hit of the patch and the availability of the patch. As serious as it sounds, AMD is still selling K10s bug 'n all, so performance is not an afterthought compared with the severity of the bug; it is the performance hit of the existing patches that constitutes the severity of the bug. There is a ray of hope for AMD because Linux was able to patch around the bug with a 1% performance hit, although who knows the exact details.

Again, keep on the good work, and don't be too concerned about a little pedantic comment. I understand how difficult your blogging initiative is; I am just asking for a bit more precision, if you have the disposition.

Roborat, Ph.D said...

eddie said: ...First, there is evidence that AMD launched Phenom knowing it had the L3 problem, they even used that as an excuse for not launching the 2.4GHz; so, the problem was detected and I still don't know what you mean by the error having slipped.

There is always a lead time between knowing you have a problem and quantifying the impact of the problem. A company that makes logic products for consumer-level quality requirements will always err on the side of risk rather than safety before ordering a recall. So maybe that is why AMD took a while to acknowledge the problem. You could be right with your assumption that AMD completely knew how bad the problem was when they launched Phenom and went ahead anyway, but the fact is we really don't have proof about how much they knew then.

When I said performance is an afterthought due to the bug, I meant it as an expression rather than fact. I feel we're not fully connecting.

Anonymous said...

"Is anyone surprised that AMD basically scrapped K10 based X2 in favor to K8 based ones?"

I have to agree with Christian - this allows AMD to push out the dual K10's (which I think were originally either scheduled for this quarter or Q1'08). It is rather obvious the TLB bug partially screwed this up as they were likely hoping to launch higher clocked dual cores. I don't think even Hector has the cojones to try and get away with a 2.3GHz dual core "black edition"! (Though I'd expect to keep hearing the price performance per average power per Si area per monopolistic power * innovation benchmark!) I personally can't wait for the eventual black widow platform (trademark pending) - you take a spider platform put it with a "black edition" processor and crank it 1.5Vcore and cross your fingers.

Given that AMD wants to shutdown F30 and would preferably not want to use Chartered 90nm foundry (hits margins) and that they probably have an EXCESS 65nm capacity (given the slow transition to K10) this move makes sense financially.

It is curious though as I stated earlier that they would simply drop the top 2 90nm K8 bins - this really seems like a 65nm process capability issue. What this will allow is an even slower transition to K10 should AMD continue to have problems (or are unable to get clocks up). All in all a decent business move - though probably a decision that should have/could have been made 2 quarters earlier if they hadn't been drinking the K10 coolaid.

Anonymous said...

Check this out. New AMD press release...

http://investorshub.advfn.com/boards/read_msg.asp?message_id=24935068


LOL

Anonymous said...

HoHo -
"Is anyone surprised that AMD basically scrapped K10 based X2 in favor to K8 based ones? I certainly am not."

Kuma has always been on the roadmap, but a few months ago the buzz was it was pushed out... there has been little detail here.

I don't think it was scrapped, I think this is a hedge move. Considering that, for all intents and purposes, K10 will actually launch mid Q1 2008, the Kuma tape out for a dual core version is likely waiting on a complete debug of the core logic (speculation on my part, true).

As such, AMD is probably looking to improve the costs (margins) by moving the remaining dual core for K8 over to the smaller node (a money saving move, plus the means by which they can convert Fab 30).

They are sacrificing some of their highest performing parts to get there though, as the 65 nm process will not clock as high as the 90 nm process. The hit in any ASP that this might bring evidently will be offset by Quad and Tri-core, while bringing the cost structure down.

This, in fact, seems a reasonable move considering their circumstance.

core2dude said...

We practically have no proof regarding how blatant an issue this problem is. "The CPU freezes very rarely". Guess what, if it freezes once after 200 hours of uptime, no amount of pre-silicon validation is going to catch this bug. 200 hours of CPU uptime is equivalent to probably billions of hours of pre-silicon validation. So I would not just as yet blame AMD for not catching this bug. After all, TLBs are very complex--and with the "nested paging" support on K10, it becomes even more complex. It is entirely conceivable that some particular combination slipped through.

OTOH, AMD clearly knew about the TLB problem when they launched the CPU. Intel would not have launched a CPU fully knowing that it had an outstanding bug--not without making sure that the patch was already well-deployed in the field.

But considering how much pressure AMD is under, this behavior is also understandable.

OEMs will not ship the system until the microcode patch is fully integrated and tested at their end. Typical OEMs do not allow end users to update the BIOS, and it would be a nightmare for them to deploy a system that they know requires a BIOS update
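
To put rough numbers on the 200-hour point above, here is a back-of-the-envelope sketch in Python. The 2.3GHz silicon clock and the 5 Hz simulation rate are assumptions (the latter is a stand-in for the "low single digit Hz" figure mentioned in the post), so treat the result as an order-of-magnitude estimate only. The function name `simulation_hours` is made up for illustration.

```python
SILICON_CLOCK_HZ = 2.3e9  # assumption: clock speed of the shipping part
SIM_RATE_HZ = 5.0         # assumption: full-chip simulation speed, "low single digit Hz"
UPTIME_HOURS = 200        # uptime before the hang, from the comment above


def simulation_hours(uptime_hours, silicon_hz, sim_hz):
    """Hours a single simulator would need to replay the same number of cycles."""
    cycles = uptime_hours * 3600 * silicon_hz  # cycles executed on real silicon
    return cycles / sim_hz / 3600              # seconds of simulation, converted to hours


if __name__ == "__main__":
    hours = simulation_hours(UPTIME_HOURS, SILICON_CLOCK_HZ, SIM_RATE_HZ)
    print(f"{hours:.3g} simulator-hours for {UPTIME_HOURS} hours of uptime")
```

Under these assumptions the answer comes out in the tens of billions of simulator-hours, which is in line with the "billions of hours" estimate. Running many simulators in parallel shrinks the wall-clock time but not the total amount of work.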

InTheKnow said...

Intel would not have launched a CPU fully knowing that it had an outstanding bug--not without making sure that the patch was already well-deployed in the field.

I wish this were true. Do you remember the division error that existed on an Intel processor? I don't know if Intel knew about it or not at the time of release, but Andy Grove's cavalier attitude towards his customers (whether he was right or not is irrelevant) leads me to doubt your assertion.

His attitude as I recall was "it won't affect you if you don't do intensive mathematical operations, so quit whining and live with it." With that attitude, you have to wonder if they knew and released the processor anyway.

Unless I'm mistaken, Intel ended up offering replacement processors free of charge to anyone who wanted a replacement. I wonder if AMD will end up doing the same.

Anonymous said...

K10 dual cores were originally speculated and planned (according to AMD's own foils) for Q4 release:

http://www.dailytech.com/More+AMD+Next+Generation+Desktop+Details+Leaked/article5874.htm
http://www.fudzilla.com/index.php?option=com_content&task=view&id=1046&Itemid=1
http://www.electronista.com/articles/07/07/26/amd.details.cpu.roadmap/
http://www.techpowerup.com/?19758
http://img115.imageshack.us/img115/457/amdroadmaphu0.gif
http://arstechnica.com/news.ars/post/20070727-amds-platform-roadmap-for-graphics-processing.html

The last link has an ACTUAL FOIL FROM AMD which has phenom X4 and X2 both "planned" for 2007.

The latest roadmaps now show a Q2 dual core launch (can't find a quick link), which represents a minimum of a 2 quarter slip on the dual core side. Personally I would call the quad core a 2 quarter slip too, as the original plan for both server and desktop was Q3 and neither will really be available/functional until Q1 (though some could and will argue that these are "launched").

Given the current initial data on the quads, if AMD can't get the K10 DUAL core speeds up, it makes sense to keep pumping out the K8 dual cores. Also, it appears as though the quad conversion is much slower than anticipated due to the various gotchas, so this allows them to keep pumping out the cheap K8's while they slowly convert to K10 quads. Once that gets far enough along, I'm sure AMD will begin phasing out K8 dualies as they start ramping up K10 dualies.

I would be curious to hear from the folks who thought Intel's Core2 conversion was slow (BTW - Intel was at 50% Core 2 server volume 6 months after launch) as to what they think about AMD's K10 conversion rate/ramp. I would think with AMD's glorious APM 3.0, and fab flexibility/operational excellence that they would be able to far exceed Intel's antiquated copy exactly approach.

Anonymous said...

"I wish this were true. Do you remember the division error that existed on an Intel processor? I don't know if Intel knew about it or not at the time of release, but Andy Grove's cavalier attitude towards his customers (whether he was right or not is irrelevant) leads me to doubt your assertion.
"

Actually, Intel did not discover the FDIV bug that you refer to, it was reported by a mathematics professor when he got such a bogus result on one computation that he could not rationalize it. After some clever investigative work, as I recall, he found that a certain combination of calculations with a certain bit pattern for the divisor would trigger the bug.

So technically, Intel did not knowingly release the processor with the bug, it was kind of a surprise to them.... of course, we all know how they botched the PR on that one....

I actually owned one of those 90 MHz Pentiums with the bug... never replaced the CPU, never had a problem.

Anonymous said...

Ahhh, found it.. an account, blow by blow of the FDIV bug by the person who uncovered the bug...

http://www.trnicely.net/pentbug/pentbug.html

Ho Ho said...

anonymous
"I have to agree with Christian - this allows AMD to push out the dual K10's "

Ok, maybe I was a bit too direct when saying "scrapped". My point was that for as long as K8 x2's are being produced there is no point in selling K10 x2's, at least not until their clock speed reaches the same level so new performance levels are reached. Anyone want to take a guess how long it takes to release a 3GHz+ K10 x2?


"Given the current initial data on the quads, if AMD can't get the K10 DUAL core speeds up, it makes sense to keep pumping out the K8 dual cores."

I agree, this is the reason I wasn't surprised to read about that. I was always expecting them to delay K10 dualcores for as long as K8 was at least competitive.

core2dude said...


Unless I'm mistaken, Intel ended up offering replacement processors free of charge to anyone who wanted a replacement. I wonder if AMD will end up doing the same.

AMD does not have to, as the problem is fixable using a microcode patch.

However, from the perspective of Dell and HP, this is an unfixable problem. They ship systems with locked BIOS. So, unless they recall the shipped system, they cannot fix the problem. That is a huge nightmare for an OEM that operates at razor thin margins. So they will not ship any systems unless they are fully confident that the micro-code patch reliably addresses the issue.

I do not know whether Intel knew about the FDIV bug beforehand, but I would be surprised if they did.

Also, that was then and this is now. Times are different, companies are different, expectations are different.

I believe, in good old days of K8, even AMD would not have knowingly released a buggy CPU, but today they don't have a choice. If it is fixable in the field, they will release it.

Anonymous said...

HoHo -- re pushing out... thanks for clarifying, makes more sense.

Anonymous said...

Whoops!

Phenom TLB Patch Benchmarked

Across every test we ran, the difference between the Phenom 9600 with and without the TLB patch averages out to 19.3%. However, if we rule out the synthetic memory tests and consider only the application tests, that difference drops to 13.9%.

The most troubling results here are the applications where we see large performance drops with the TLB erratum workaround active, including the Firefox web browser and the picCOLOR image analysis tool. If one happens to spend a lot of time running an application whose memory access patterns don't mix well with the TLB patch, the result could prove frustrating. The BIOS-based workaround for the TLB erratum may achieve its intended result—system stability—but it comes at a pretty steep price in terms of performance.

For the average retail PC consumer, this price might not be unacceptable. After seeing the Firefox test results, I spent some time browsing the web with our Phenom-based test system, and it didn't feel noticeably sluggish to me compared to most modern PCs. Then again, I doubt whether the average sort of consumer is likely to purchase a system with a quad-core processor. One wonders where that leaves AMD and the PC makers currently shipping Phenom-based PCs. I'm not sure a recall is in order, but a discount certainly might be. And folks need to know what they're getting into when purchasing a Phenom 9500 or 9600-based computer this holiday season. Caveat emptor, indeed.

In fact, a credible source indicated to us that at least some of the few high-volume customers who are still accepting Barcelona Opterons with the erratum are receiving "substantial" discounts for taking the chips. One would hope consumers would get the same consideration. The trouble is, I doubt AMD would have shipped Phenom processors in this state were it not feeling intense financial pressure.

Unknown said...

Who wants to hedge bets on when AMD will release a product that can best my Q6600 by a significant (10% or more) margin? My money is on 2009 with a 45nm product. By then Intel will be flooding the market with Nehalem.

Anonymous said...

http://www.tgdaily.com/content/view/35149/118/

'As a result of the erratum, AMD is currently shipping quad-core Opterons only to "specific end-user installations where customers have had the opportunity to validate the stability and robustness of the solution where it leverages the BIOS fix or some other potential software workarounds," according to Phil Hughes at AMD.'

Spin machine is in full effect - "where customers had the opportunity to validate the stability and robustness of the solution"....isn't this AMD's job - I mean especially on the spider 'PLATFORM'... you gotta be kidding me...

If you read further in the article, AMD gave their gains in quad core market for Q4? Is it just me or is Q4 not over yet? How could they have this # with a month still to go?

"No guidance has been given by AMD as to the significance of any impact on performance from the BIOS fix." Thanks to the previous anony who provided a link (apparently AMD CHOSE not to do this themselves? Or were they just embarrassed by the result?)

This issue probably isn't that big a deal, but it does show AMD in a new light in how they have handled it. Oh, the article said the BIOS fix was released PRIOR to the Nov19 launch, so clearly AMD knew about the issue and just chose to bury it. What a joke - makes me long for the days of Henri Richards, at least he could put a decent spin on things...it's like they're not even trying anymore.

Conversation at AMD:

"well let's just let it slip out that it only affects parts at 2.4GHz and up, since we can't ship those parts anyway" (everyone laughs)

"no, let's invite the press up to Tahoe, have them test the next B3 stepping without the issue and just prior to them publishing their reviews we'll tell them we *just* discovered the problem so it's too late for them to do anything about it!"

"that's good, that's good...but I need some more out of the bx thinking...."

"how about we release a fix and not mention the negative hit on performance, and then say we are only providing solutions to our most valued customers who can validate stability...."

"Great...get the email out, I'm sure some chump at THG will print it!"

"Our work is done here boys, see you in Q1 when we actually have to launch this thing again!" (More laughter follows. Hector kicks his feet up on the desk, pulls out an AMD share and lights it up in order to light up a cigar.)

Anonymous said...

Hector lights up another share...

"look everyone - ASSET LIGHT!"

Anonymous said...

"Unless I'm mistaken, Intel ended up offering replacement processors free of charge to anyone who wanted a replacement. "

Yes, I believe it was the Pentium 90 and I did order and get a replacement. Intel initially made light of the bug but changed their minds and offered replacements within weeks after all the uproar and bad press they got. I don't recall any allegations that Intel knowingly shipped processors with the bug however.

Anonymous said...

Looking back, a few things occurred to me. Back in November, IBM failed ‘spec’ validation because ‘they couldn’t get the required volume of processors’ to meet the necessary criteria. Well, based on this new (?), publicly released TLB information, IBM was probably aware of the ‘bug’ all the while. The question is why they went ahead with the validation tests in the first place.

I said Cray was quiet months ago, in fact, too quiet. Now we know why. Apparently they knew, too, as they are staying with the K8 line.

What’s worse, in this AMD processor tech conspiracy theory, is AMD deliberately sold an inferior, broken product to the enthusiast class buyers. The big vendors didn’t want it, so they dumped it on the little guys.

AMD fan boys, wake up and smell the coffee. Think twice when you call Intel the evil empire. It doesn’t get any more evil than this. Sure, Netburst was long in the tooth and was way outdated, but it worked, and worked very well, as it wasn’t a BROKEN product at launch. Intel didn’t deceive anyone; they took the bull by the horns and built a better product.

Selling an undisclosed number of broken processors knowingly to the public constitutes FRAUD and is subject to a major class action suit if such evidence were discovered.

Why did IBM go forward with the tests, why was Cray so quiet, why is every vendor steering its buyers towards Intel products or older AMD products? I’ll tell you why, because they will not knowingly sell broken product to banks, corporate buyers, etc. They DON’T want to put themselves in the same legal position AMD may ultimately face! I don’t believe IBM; let me say that again, I---B---M would miss this bug in its OWN validation test, not by any stretch.

Doc, I’m sorry to disagree with your technically brilliant comment, as I believe they didn’t miss a thing during testing. Frankly, with all due respect, I am not that naïve. I feel they shoved this dirty pig with lipstick out the door and sold to the general public with FULL knowledge of its flaws out of sheer desperation. This is why there were delays after major delays. They couldn’t get it past the BIG GUYS; therefore, everything else regarding this “SOFTWARE FIX” is utter nonsense.

Just finding the TLB ‘only recently’ is an outrageous deception. I’m sure even INTC knew about it as they scaled back on the launch speeds of Penryn.

Delay, after delay, for 6 months is indisputable. Slow launch speeds of Pheromones are indisputable. Major AMD inside corporate players bailing from AMD is indisputable. Stepping after failed stepping is indisputable. The time line is perfect, the product was broken and they sold it anyway.

This all may be conjecture. However, I’m sure there’s a paper trail somewhere. And, it may come out during discovery in a court of LAW.

SPARKS

Anonymous said...

Sparks, a couple of errors (I think) in your analysis.

The Cray order was delayed for the 8xxx series - if you recall their stock took a hit when the specific Barcies they needed were delayed, I think this had something to do with the HT speed or the # of links (not necessarily this TLB bug). I'm not saying that the TLB thing wouldn't have hit them too, but they delayed for availability issues that had to do with more than just the TLB errata.

As for IBM it is impossible to say whether they knew or not - even with no bug AMD would have had issues producing these chips in quantity this early on. That said, they must have had samples fairly early on (way before the Barcy launch) so it is difficult to say whether or not they encountered the issue - also who the hell knows what AMD was handing out for samples in terms of validation testing!

The Phenom launch however was a complete fraud (figuratively, not legally) as the BIOS fix for the Barcies was released before the launch (unless AMD is under the foolish assumption this issue only occurs on Barcy?). For Phenom, I don't see how AMD can possibly get around things without having a replacement offer after the B3 stepping comes out - if they don't it is simply a matter of time before someone sues.

As for legal issues, it is a matter of how a company reacts to the errata more so than the errata itself, especially in today's litigious society where we have idiots suing Apple because they lowered the iPhone price and they paid a higher price by being one of the idiots buying it on the first day. AMD could deal swiftly with this issue by promising a replacement chip to anyone who wants one; I would also think they should either suspend production until then (unlikely) or put a huge warning label on everything (which is what will happen).

Either way AMD botched this and the biggest issue was not missing the issue, but how they reacted to it. If they react quickly going forward they can quash this, otherwise they run the risk of permanently tainting the K10/Phenom brand.

Anonymous said...

AMD has had a bad week...

http://www.channelregister.co.uk/2007/12/06/amd_benchmarks_down/
SPEC scores will be removed.

InTheKnow said...

Jumpingjack, thank you for the link. From the link you posted I think the following points put Intel in a poor light from more than a PR perspective.

* Intel's initial failure to publicize the problem, even in a listing of errata to their OEMs and most valued customers, was in retrospect a mistake which alienated these constituencies.

* Even more baffling, Intel failed to warn their tech support desk to immediately report any external complaint about the bug, so that it could be given special handling.

* The bug was found late in the life cycle of the chip, after millions of them were already distributed or in production.


If these statements are correct, Intel found the bug after the chip was distributed, but they did not warn their OEMs by identifying it in the errata. In fact they did not even notify their own support personnel.

I would have to say that Intel's failure to be forthcoming about a known issue is as bad or worse than what AMD is doing with K10 right now.

In short neither company seems to have the moral high ground on this issue.

Anonymous said...

"Jumpingjack, thank you for the link. From the link you posted I think the following points put Intel in a poor light from more than a PR perspective. ...."

I am not disagreeing with you hear, I did not mean to imply Intel handled it poorly, but I was also always under the assumption that Intel did not know of the bug before it was uncovered by the mathematician.

After the bug was uncovered, Intel did a really poor job and it was a lesson learned. The 'FAQ' by this fellow though was even in tone, fair, and balanced -- giving it even more credibility -- he shot down IBM's analysis of the impact and concluded with Intel (if you follow the reasoning/history).... nonetheless, a 'bugged' chip has a much stronger psychological effect than the actual bug (EVEN in the TLB errata case we have today).

The AMD TLB problem will likely not cause much headache on the desktop side, but there will be this huge stigma attached to it that will most certainly shy people away.

Jack

Anonymous said...

Damn, cannot edit. ...

"I am not disagreeing with you hear, I did not mean to imply Intel handled it poorly, but I was also always under the assumption that Intel did not know of the bug before it was uncovered by the mathematician."

This should read:
I am not disagreeing with you here, I did not mean to imply Intel handled it well (they handled it very poorly), but I was also always under the assumption that Intel did not know of the bug before it was uncovered by the mathematician.

Anonymous said...

The link supplied by the anonymous poster, immediately following yours is precisely my point. Naturally, you are quite correct; there are probably more holes in my argument than a fine mesh screen door. But the fact still remains that never, ever was there a product launch with more spin, more hype, and more sequential failures, since the Ford Edsel.

You were the one who first pointed out, based on experience and empirical evidence, that AMD was faced with a bad process AND a bad architecture. How right you were! With this in mind I am FULLY convinced they all were well aware of this fatal flaw COMMON to both Pheromone AND Barcelona.

My limited understanding is that the nested algorithms associated with L3 cache are what give servers their performance boost and power, especially when VT technology is enabled. Therefore, wouldn’t that be the first validation test, among the many that DOC mentioned, that one would use in qualifying a server processor, Barcelona’s primary design target?

What I want to know is why now, why so late, after all the fixes and all the steppings, has this so-called erratum surfaced so late in the game, after they’re selling the damned thing? AMD’s answer, in light of this, is to DISABLE a server processor’s strongest suit/function, which to me seems absolutely incredible! Was this missed, after all this time, by multi-billion dollar corporations, hell-bent on validating a product they invested untold billions in supporting for future sales?!?!

Maybe I’m out there with Alien abductions, Bigfoot, and faster than light travel. But, something really stinks here, and please forgive the pun, it goes to the core of this product.

The launch has been an utter failure. They designed this architecture from the ground up. They all tested, played, and tested again ad nauseam, and still they didn’t know? Horseshit, I say. If it smells like crap, if it looks like crap, it ain’t ice cream, and it only takes a little bit of crap to ruin a whole lot of ice cream. I wouldn’t eat it. I’m surprised so many have.

This whole abomination reeks of scandal. The evidence is that it’s getting more pathetic every day, not better.

SPARKS

Anonymous said...

Sorry, my last post was to GURU.

SPARKS

Anonymous said...

Well, Intel may be having a similar PR issue brewing:
http://www.behardware.com/news/9264/the-yorkfield-delayed-confirmed.html

Ho Ho said...

It is quite funny to see people on amdzone forums and other places telling things like "Scientia was right all along and there really is a bug that is keeping K10 performance down". Too bad they don't understand that the bug has nothing to do with performance, it is the quick BIOS fix/hack that lowers performance. I'm quite sure k10 won't perform any better than the non-patched benchmarks have shown.

Anonymous said...

"I'm quite sure k10 won't perform any better than the non-patched benchmarks have shown.
"
Of course it won't :)

The bug has nothing to do with proper logical operation of the TLB other than in some cases a dirty page gets copied as un-modified (if you read the concepts of the bug/workaround in the linux kernel). Closing a window of opportunity for corruption of a shared resource caused by a missed clean/dirty tag does nothing to improve the actual core efficiency.

Orthogonal said...

Sparks, I think you can lower the pitchforks and stand-down the troops for now. While the initial knee-jerk reaction may call for an AMD lynching, I still think it's worth giving them the benefit of the doubt on whether they willfully launched a buggy chip and tried to skirt the issue.

As we have discussed on here numerous times before, using anecdotal evidence we have deduced that AMD has a very poor 65nm process likely causing significant binning issues. We know that Brisbane is slow and a year later has yet to match or exceed 90nm binned K8's. K10 is also experiencing significant TDP issues warranting a change in their power envelope. The 9900 2.6GHz chip on the roadmap at 140W TDP only further points to these issues.

While we can sit back and say the launch was sabotaged by a TLB scandal, it's more likely that they simply couldn't get the product binned at the desired levels to satiate the market. This also squares with Dirk Meyer's comments about "tuning the process to the architecture".

However, at the same time we do know that the TLB error was at least discovered relatively close to launch. We can't be certain whether AMD knew how severe the errata actually was, but they did find it soon enough to warrant not manufacturing any of them.

It was opined not long ago by GURU that Barcelona shipments were inexplicably low relative to wafer capacity (An insignificant wafer volume was required to meet the "hundreds of thousands" of units shipped guidance even with very low yields). This would suggest that they knew something was wrong with the chip and they weren't worth manufacturing at the time. It was either the binning problems or the errata that caused this (maybe a combination of both). Neither scenario is a fun predicament to defend in the PR arena. It's obvious that they were mum on the issue, but only now have customers and channel partners vented to the point that the shit has hit the fan and AMD would have to balk on one of the issues.

All I'm saying is we don't yet have the data to make a sound conclusion and, given the nature of the errata, it could likely have taken weeks or months before they realized the scope of the problem.

Khorgano said...

Taken from a THG interview found here:

http://www.tomshardware.com/2007/12/07/subsidies_for_jobs/

Tom's: Why are the new CPUs shipping at such comparatively low clock speeds?

J. Polster: As I said before, that is directly related to the market segments we are addressing. The mainstream segment is where the bulk of the quad-core CPUs are sold. There, we are ideally positioned with the Phenom models currently in the market. Of course we will still be releasing versions of the Phenom running at higher clock speeds.


I don't know about you, but that is a load of crap. Who positions their Quad core products for the Mainstream? Either AMD is purposely trying to sell their chips at low margins (not likely), or this guy is so dense he needs a handicap bumper sticker on his forehead.

In other news, looks like Intel may be getting hit by the errata bug too. Yorkfield delayed till mid-late Q1'08?

http://www.techreport.com/discussions.x/13756

No official word on it yet; it looks like only 1 website has reported on it so far. If this is true, AMD might have some breathing room. While Kentsfield only embarrasses Phenom, Penryn downright humiliates it. A couple months delay could be just what the doctor ordered ;)

Anonymous said...

Orthogonal,


“Sparks, I think you can lower the pitchforks and stand-down the troops for now. While the initial knee-jerk reaction may call for an AMD lynching”

No way baby; in my mind, the arrogant little bastards aren’t getting a free pass on this one. They wanted fair market competition; they got it, in spades. Monopoly, give them an education the hard way. They kicked sand in the big guy’s face; he got up and kicked the shit out of them.

There was something wrong at AMD from the jump. Corporate bigwigs were flying out of there like bats out of hell, while Wrector and Dork were rearranging the deck chairs on the Titanic! I don’t buy it, not for a second. There is a shadow of doubt here; I can read between the lines in your comments and in Guru’s. You guys are industry savvy. If you conservative approach guys have reservations and are thinking twice, then I smell blood.

Forget the lynching; I want Wrector’s balls in your boss’s top desk drawer. You tell the veterans you work with I said so.

QX9770! Hoo Ya!

SPARKS

Anonymous said...

I find this article about the TLB bug interesting since it's from a person who has traditionally written favorable things about AMD:

My take is that Barcelona and Phenom are simply ramping up slowly because AMD is having yield issues at the Dresden fab where it's making the parts. It's had these issues since much earlier this year, when reports first surfaced that the initial Barcelonas weren't running faster than 2.0 GHz. Such a problem could be a design flaw, which would be very serious if that were the case. At first glance, though, not getting your speed grades off of the fab is a yield problem indicative of ramp-up issues.

The other question which ties in with this is, why does there seem to be a trickle (or, "hundreds of thousands"; same thing) of Barcelonas, but no Phenoms? True, AMD has said there won't be Phenoms in quantity until 1Q. However, this staged roll-out plan also seems to tie in with a scenario where Barcelona is being buzzed out first (with the plan that the Phenom ramp would follow). It's probably the case that what happened is, the manufacturing of Barcelona has turned out to be much more of a bear than anyone anticipated.

I suspect that this has been going on for quite a while, and that AMD's process engineers are under the gun to make sure that the huge quantities of both Barcelona and Phenom are rolling off the fabs next year. I'd wish those guys a Merry Christmas, but I'm guessing they're gonna be working for the duration.


AMD Bitten by Barcelona Quad-Core Bug

Anonymous said...

"It was opined not long ago by GURU that Barcelona shipments were inexplicably low relative to wafer capacity (An insignifanct wafer volume was required to meet the "hundreds of thousands" of units shipped guidance even with very low yields)."

This is more likely due to binning issues than the TLB issue. The wafer starts are ~100 days prior to shipment (probably a bit quicker as AMD would have expedited the higher margin K10). Either way, to put things in perspective parts shipping in Dec means actual production starts in the Aug/Sept timeframe (parts shipping in Oct would have started in the fab ~Jun/Jul). I'm assuming no hot boxing here - likely the initial handful of parts were hot boxed, but this is not something that can be done on a significant quantity of wafers (<25-50 wafers at any one time).

If we accept AMD's claim of finding this issue late in the game, then clearly that would not be the cause for the lower wafer starts / slow conversion rate to K10. Let's say they found the issue in August; they would only be able to cut down capacity 4 months later or ~Dec - anything scheduled to ship prior to then would have had to be scrapped (not stopped). This is why I think the K10 capacity ramp was a reflection of binning issues.
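The timing argument above can be sketched with simple date arithmetic. This is only an illustration: the ~100 day cycle time comes from the comment, while the specific calendar dates and the function names are hypothetical, since the thread only gives months.

```python
from datetime import date, timedelta

# From the comment: wafer starts are roughly 100 days before shipment.
CYCLE_TIME_DAYS = 100


def wafer_start_for(ship_date: date, cycle_days: int = CYCLE_TIME_DAYS) -> date:
    """Latest wafer start that can still make the given ship date."""
    return ship_date - timedelta(days=cycle_days)


def first_affected_ship_date(decision_date: date, cycle_days: int = CYCLE_TIME_DAYS) -> date:
    """Earliest shipment that a change in wafer starts can influence."""
    return decision_date + timedelta(days=cycle_days)


# Hypothetical dates, picked only to illustrate the reasoning.
december_shipment = date(2007, 12, 15)
bug_found = date(2007, 8, 15)

print(wafer_start_for(december_shipment))   # wafer start in early September
print(first_affected_ship_date(bug_found))  # late November, close to the "~Dec" estimate
```

Under these assumptions, wafers already started before the bug was found would keep arriving until late November, so a late discovery could not explain low shipments earlier in the quarter.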

The other log on the fire of a poor 65nm process is AMD's switch to ACP - they have long hammered Intel on the typical heavy usage TDP. Their 140 Watt "TDP" has a built in 20-25% IDLE ASSUMPTION (too lazy to look up the WP)! While this # may be representative of normal usage, to call it a TDP is somewhat "interesting".

What is interesting is the TLB bug may perhaps turn into a very good thing for AMD. This gives them a reason to cut down on capacity and not risk saturating the market with low clocked (and lower margin parts) prior to being able to get out the higher clocked (higher margin) parts. Regardless of Intel performance their is a portion of the market that will buy AMD K10 no matter what (either HPC for scalability, AMD desktop fans, folks looking only for chip upgrade) - you don't want this rather price inelastic market (demand not really dependent on price) buying your low margin parts.

It also gives the MOBO makers more time to address any early issues (keep in mind AMD doesn't do anything close to the early MOBO work that Intel does internally, especially early on). It also forces AMD to make more K8 parts which at this point probably provides better margin for AMD for yield and binning reasons. Finally it gives AMD time to get another stepping which may help with TDP and/or speedpath issues (we'll see how B3 looks).

What will be amusing is that AMD will likely address some of the other issues in parallel with the TLB issue and when they come out of this the fans will be saying see it was just a TLB issue and if only AMD had not been so unlucky! The press of course will be clueless and will start ratcheting up clockspeed expectations again, not realizing the limitations of the 65nm process...

Anyway, that's my humble analysis/prediction..

Anonymous said...

their - there :) I speak good English

Anonymous said...

Eh hem, you speak English well.

“This gives them a reason to cut down on capacity and not risk saturating the market with low clocked (and lower margin parts) prior to being able to get out the higher clocked (higher margin) parts.”

Exactly! This is gospel. But, this is going to take time. Time is a luxury AMD cannot afford. After all, who’s going to pay the coffee vendor in the parking lot in the meantime? Shall we say additional losses in 1Q 2008?

TLB, BLT, it’s Déjà vu all over again. This is the continuing saga of AMD’s current M.O., more smoke and mirrors, no product, and no money. Does anyone see a pattern here, or am I just nuts?


SPARKS

Orthogonal said...

Excellent insight and analysis GURU, I had never considered the prospects of the TLB errata as a red herring. It's actually a brilliant strategic move considering the situation. Obviously they're in a bad position with K10 all around, but fostering a scenario that allows them to continue to rely on K8 for the immediate future is much better than the alternative. Admitting a fixable chip bug is a far better news bite than admitting a fundamental process issue. The K8 channel is strong and mature and will be far more beneficial financially than pushing K10 at this point. Even though time is of the essence, at this point, a couple extra months to get things "right" is better than forcing it now.

Anonymous said...

"But, this is going to take time. Time is a luxury AMD cannot afford"

I disagree wrt K10... this is small volumes/revenues over the next quarter or two - AMD's best bet is to milk this small volume for as much as they can - you know the early adopter, bleeding edge, cost is no object folks. Selling them 1.9GHz server parts and 2.3GHz unlocked Phenom black editions at cut-rate prices is wasting a market opportunity.

AMD's near term margins and profitability (or more precisely their ability to get closer to profitability) is now a cost issue as AMD has chosen not to raise ASP's and sacrifice or risk a bit of market share for better revenue. AMD's best cost scenario is hammering out K8's until they get the K10 binning well. This is why reduced K10 volume (short term!) is a good thing for AMD.

Long term they obviously risk losing the captive, 'I just want to drop in a chip for an upgrade' market if they don't eventually release a K10, but 3 months is not likely going to dissuade the loyalists. Now if folks hear 6-12months then suddenly the prospect of buying a new $100 mobo is no longer a huge negative for going to an Intel chip.

If I were in AMD's shoes I would want to delay the (real) launch for as long as I could - meaning just prior to folks saying 'screw it I'll buy Intel'. This allows AMD a higher probability and mix of higher bin parts (and therefore milking better margins out of folks).

The best way to accomplish this goal?

-Start with a paper launch to keep people on the hook
-Followup with a bug that will be fixed in the next stepping (which means most will at least wait for that next stepping).
-Then I would not be surprised to see AMD leak a new stepping that is tested/demo'd at around the time the B3's come out which shows more gain in an attempt to keep folks holding out for more.

It's obviously a dangerous game to play, but once folks buy a 2.2GHz Phenom that market is dead for what 2-4 years? (except for extreme enthusiasts)

Intel does similar things as well - it's just business (And I suspect come Nehalem time volumes early on will be 'tight' by design as opposed to by necessity). Folks whine about the consumer getting screwed but in all honesty, the impact is fairly negligible to the average Joe who buys a computer every 2-4 years and is buying a mainstream chip not a high bin part. The folks that get "screwed" are the early adopters (much like the folks buying HDTV's early on or any other high tech product) and typically they know that is the price they pay for being first or buying the absolute best.

AMD still has time on K10 - the only folks who know about it or are interested will wait a little while longer.

Anonymous said...

“-Start with a paper launch to keep people on the hook
-Followup with a bug that will be fixed in the next stepping (which means most will at least wait for that next stepping).
-Then I would not be surprised to see AMD leak a new stepping that is tested/demo'd…..”

Ah, didn’t they do this already?

Don’t you mean do it again?

How many times can these idiots cry wolf? Meanwhile, back on the farm, AMD enthusiasts are getting creamed by their contemporaries with INTC 65nm and then 45nm overclocking rockets during the next (perhaps final) assault on AMD during 1Q 2008. Further, as you predicted, when 180W 2.8 – 3.0 parts eke out of AMD, they will be faced with a new challenge, the Toc in Tic-Toc.

Given their past 6 month performance, I don’t share your best case scenario optimism. Murphy can be a mean son-of-a-bitch. Should I dump my xxxx shares of INTC at the ‘next launch’ and buy AMD?

I’ll be ice-skating in hell first.

SPARKS

InTheKnow said...

A couple of interesting remarks from AMD on Tom's hardware.

Tom's: The Geode processor, which has been available for several years, only consumes about 2 Watts of power. Couldn't this CPU be used in the consumer segment with a good publicity effect? Rumor has it that Intel is currently designing a CPU for the iPhone 2.

J. Polster: That is an interesting point. Basically, we're facing a situation today where nobody in the consumer market wants to know what chip or processor powers a device. In the end, the buying decision is influenced by the brand name of the product. The same applies to the competition. Apple would never advertise by focusing on the processor.

Tom's: Are there any plans for the UMPC market, i.e. compact portable PCs with low power consumption?

J. Polster: The UMPC market is one with very low sales volumes. We'll be there when that market grows. For now, there are no clear signals for this to happen. We're talking about a product here that is positioned between a notebook and an iPhone.


So according to Mr. Polster, if it doesn't have an AMD label on it they don't want to sell it? Yeah, right. You think Hector wouldn't sell his Grandmother to have a processor in the next gen iPhone?

The UMPC comment is pretty disturbing too. Initial reviews and enthusiasm for the EEE PC seem pretty positive. Last time I checked the best time to get into a market was before it peaked, not after. I find it interesting that he didn't take the opportunity to hype AMD's upcoming Bobcat processor. I have to wonder if it has been shelved.

If I were an AMD investor, those comments would really worry me. It almost seems like AMD is so worried about fixing K10 that they aren't thinking about the future. Their analysts meeting next week should be very interesting.

Anonymous said...

Oh, as a supplement to my last post, I rest my case.


http://www.fudzilla.com/index.php?option=com_content&task=view&id=4602&Itemid=1

SPARKS

Anonymous said...

Orthogonal,


“Sparks, I think you can lower the pitchforks and stand-down the troops for now. While the initial knee-jerk reaction may call for an AMD lynching”

From the 'Tech Report':

"AMD's other major concern here should be for its reputation. The company really pulled a no-no by representing Phenom performance to the press (and thus to consumers) without fully explaining the TLB erratum and its performance ramifications at the time of the product's introduction."

The Lawyers are typing the briefs as we speak. Seriously, read the article in its entirety, if you haven't already.

http://www.techreport.com/articles.x/13741/4

SPARKS

Tonus said...

sparks:

"Ah, didn’t they do this already?

Don’t you mean do it again?"


Maybe that is what they mean by "copy exactly." =)

Anonymous said...

Well the Rag (INQ) is stating that AMD will announce something big manana.

As the analyst meeting (you know the one that was postponed at the last minute last month...but that had nothing to do with K10) is this coming Thursday...so any predictions?

Asset lite? My best guess is confirmation of the beginning of outsourcing of CPU's (low end) to TSMC starting in 2008.

Or perhaps a phenom 2.3GHz black edition!?!?

Hornet331 said...

the big announcement was a new FireGL...

Anonymous said...

IBM high K plans...

http://www.eetimes.com/news/semi/showArticle.jhtml?articleID=204800387

End of 45nm? Not so much...as predicted it looks like AMD will be doing this on 32nm.

You can now see the difference between IBM's announcements and Intel's announcement - IBM announced feasibility, Intel announced implementation into production.

That said the gate first technology should be interesting... it should in theory be more scalable than Intel's technology to future nodes (the replacement gate process requires filling of extremely small spaces which will be a challenge as the features shrink).

Of course, things are moving toward FINFET / Tri-gate anyway so this will likely necessitate other changes beyond 32nm.

Anonymous said...

"FINFET / Tri-gate"

All right GURU, I'll bite, what’s FINFET, a diving Field Effect Transistor, shaped like a fish, swimming in a sea of silicon?

What’s IBM got, besides a great theory, SOI, Josephson Junction, etc.?

Will it work or is it another pipe dream that can't be implemented to volume production?

Further, gaps in the silicon? What’s in the gaps anyway, noxious chemicals from the last deposition? I thought you guys buffed them all out before you went on to the next layer?

SPARKS

Orthogonal said...

Sparks, Wikipedia is your friend ;)

http://en.wikipedia.org/wiki/Finfet

A Finfet is one of many different Multi-Gate transistor devices. This link also has a SEM image of Intel's Tri-Gate prototype. Instead of having a "2-D" plane for the gate stack over the transistor channel, the Finfet has a Gate on multiple sides of an active region. The transistor channel is built in 3-D, allowing you to have greater effective gate surface area over the device. This allows for greater drive current and less voltage and power. In the case of Intel's Tri-gate transistor, it's also possible to have Multiple FINs or Source/Drains on the same device.

Anonymous said...

"Wikipedia is your friend"

Wikipedia IS NOT YOUR FRIEND - there is so much unsubstantiated crap on there that disguises itself as "peer reviewed", especially scientific stuff. That said, the description above is OK, and if Wikipedia has real references (not references by parties which have a vested interest) it is OK.

As for gaps in Si - replacement flow (or gate last) means, the transistor is built conventionally like the old days, but before metallization the gate is etched out and the metal gate is put in its place (thus "replacing" the old gate). These are generally speaking the smallest feature on the chip so filling them is not trivial. (They are smaller than the tech node size, so 45nm will have less than 45nm gates)

Orthogonal said...

Wikipedia IS NOT YOUR FRIEND

LOL, I said that with my tongue firmly planted in my cheek. It's a good place to get started or an overview, but never for a legitimate source.

Anonymous said...

What the hell do I need Wiki when I have you guys.

Besides, when you guys want to hook up a 1200A (and up) UPS system, don't go there, trust me. It will NEVER pass the NEC.

"This link also has a SEM image of Intel's Tri-Gate prototype"

Ah, good, so long as INTC is working on the thing, too. If they can't make it fly NO one can, at least on a HV level.

"These are generally speaking the smallest feature on the chip so filling them is not trivial."

Got it, GURU. Dig out the old stuff, put in the new/better stuff.

It would seem to be a P.I.A. for a process. I can't even get ALL the butter in an English muffin correctly, let alone getting it out! I always wind up missing a few holes.

Hmmm, multiple source/drains. Electrically, I could be off here, but when I double the number of conductors, I halve the current per conductor, thus halving the resistance (heat?). Seems like a nice scheme.
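The conductor intuition above can be checked with a small calculation. This is a rough sketch that treats each fin as an identical ohmic resistor in parallel, which is a simplifying assumption (real transistor channels are not plain resistors); the function name and the sample values are hypothetical.

```python
def parallel_split(total_current_a: float, resistance_ohm: float, n: int):
    """Split a fixed total current evenly across n identical parallel conductors.

    Returns (current per conductor, heat per conductor, total heat,
    equivalent resistance of the parallel group).
    """
    per_current = total_current_a / n
    per_power = per_current ** 2 * resistance_ohm   # P = I^2 * R per conductor
    total_power = per_power * n
    equivalent_resistance = resistance_ohm / n       # identical resistors in parallel
    return per_current, per_power, total_power, equivalent_resistance


# Sample values chosen for easy arithmetic, not taken from any real device.
for n in (1, 2):
    print(n, parallel_split(total_current_a=2.0, resistance_ohm=1.0, n=n))
```

With these numbers, doubling the conductors halves the current in each one, cuts the heat in each one to a quarter, and halves both the total heat and the equivalent resistance. The resistance of each individual conductor stays the same.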

You guys didn’t say whether it will work or not, however. And, at 32?

SPARKS

pointer said...

yeah, i'd use Wiki when i get lazy, and use that for some high level picture.

One comment on Wiki though ... I believe some AMD fanbois are always quick to add bad news to the Intel page (as recent as its advertisement controversy), and you won't see that on the AMD page (no bad references to the K10, etc.).

you have to be amazed by the AMD fanbois' capability.

Anonymous said...

"I always wind up missing a few holes."

Actually your words have more wisdom than you realize.... with BILLIONS of gates, you miss one and well, you got issues...or a tri-core! (couldn't resist)

FINFET/TRIGATE will be a challenge for 32nm (my guess would be 22nm is best case). The 3D structure introduces all sorts of new issues to work out. My take is that it will work eventually though.

pointer said...

Finally, the whole background of the AMD current issue:

http://www.crn.com/white-box/204800718

"It was not until mid-November that it actually moved from just being an observation in a testing environment to being a more serious bug. We tried to do BIOS workarounds, we looked for board modifications, even in some instances, for some patches we could do that would not degrade performance," he said.

When AMD came to the decision that the glitch could "affect a real-world application," Rivas said the Sunnyvale, Calif.-based chipmaker quickly alerted customers.

"When we reached that point, it was a Friday and we started notifying the customers on Monday," he said.

AMD developed a workaround for a bug that causes data integrity problems related to the translation lookaside buffer (TLB) on quad-core Opteron and Phenom CPUs. But performance degradation caused by the workaround on the Opteron server-workstation processors was such that only a few customers decided to go ahead with orders.


Some of the AMD true supporters argued that all CPUs have errata one way or another.

Some of the AMD fanbois said that Intel CPUs have an equally serious issue, and cheered over a rumor of a Yorkfield issue.

And now finally we know that the issue is actually quite serious, as it will appear in real life usage (not yet a showstopper, as it can be worked around, albeit at a cost in performance).

Anonymous said...

You know what - I at least have some respect from that CRN interview for Rivas to say "With the data I have now, clearly, that was a stupid decision."

Keep in mind I'm sure there are likely 100's of things seen in the lab that either can't be reproduced again or are honestly considered low level issues. While the PR AMD has done on this issue has been HORRENDOUS... the explanation Rivas gives is at least plausible.

Now this I'm not so sure about:
"Those processors are listed at clock speeds that account for the degradation from the BIOS fix, he said, explaining why the first available Phenoms have listed speeds and prices below those AMD initially projected for fourth-quarter shipments."

WTF?!?! So AMD is just relabeling actual clockspeed into converted clockspeed?!? Isn't that what the model #'s are for? I can see them adjusting prices for the degradation, but for them to take say an actual 2.4 or 2.6GHz clocked chip and just relabel it at a different clockspeed seems disingenuous, no?

So should folks believe the CPU-Z listed speed... is that actual or 'corrected'?

So basically it is SOLELY the issue of the BIOS fix (5-20%) that is the cause of the lower-clocked Phenoms? He may be selling it, but I'm not buying it (it = the BS explanation on failure to hit clocks)...

And what would be the cause of the Barcies being 2.0GHz (or I should say 1.9)? The BIOS fix must be causing a >20% hit on that part, eh?

Just when you start to think they are starting to get it...they continue with the excuse mongering.

Anonymous said...

"In a statement, the firm reiterated its position at the Q3 conference call.

It said during that call that AMD would ship "hundreds of thousands of quad-core processors" into the server and desktop segments during Q4.

AMD said it's "tracking" to that guidance."

Source - INQ and a variety of other places have published this crap!

Why is AMD boasting about tracking to their plan of essentially achieving a conversion of 1-2% of its total production? 4 months after launch and K10 will be ~1-2% of their total production? This being "the most important launch of 2007" (to use AMD's words)...apparently it was so important AMD rapidly converted their production to an astounding 1-2% in a mere 4 months.

Can't wait to hear from the peanut gallery that kept hammering on Intel's "slow" conversion to Core 2. (6 months after launch, server was ~50% converted and I believe desktop was ~20-30%, not sure about mobile but it was the lowest of the 3) I also love how all of these websites fail to put "100's of thousands" into any sort of perspective.

Unknown said...

"also love how all of these websites fail to put "100's of thousands" into any sort of perspective."

Indeed. Scientia and others were pointing out that the two million quad core CPUs Intel sold last quarter only made up a tiny amount of their total production. I wonder if they'll do the same for the 'hundreds of thousands' of quad core CPUs that AMD is shipping this quarter?

Also see this:

http://www.fudzilla.com/index.php?option=com_content&task=view&id=4656&Itemid=35


AMD needs to find a sweet spot between profitability and a good price, and obviously it has to offer more for the price than Intel with its quad core Yorkfields. The 45nm Intel parts are a fierce competitor, but with the right strategy AMD might have a good chance.


So AMD is going to spark another price war with Phenom? This is going to end in tears for AMD! You're talking about a 280mm2 die compared to two 107mm2 die for a 12MB Yorkfield quad core. Intel can also make value quad core parts with 6MB cache, using two 3MB L2 cache die. I don't recall the exact die size for this (someone fill me in thanks :-) ) but they are under 100mm2 each.

GutterRat said...

Happy, happy, joy, joy!

Shanghai is MIA

“We have 45nm on the way. We will have initial samples also in January. I’m fairly confident that those puppies are going to boot,” said Mario Rivas, executive vice president of computing products group at AMD, in an interview with CRN web-site.

“The 45nm, we consider it Rev C of the device. So all the learning, all the hard knocks that we had on Barcelona, we're going to apply it to Shanghai,” Mr. Rivas added.

"Learning?" "Hard knocks"? To me these sound like euphemisms for piss poor execution and biting off more than they could chew.

Why didn't Randy Allen make these comments?

ROFLMAO

Ho Ho said...

"We will have initial samples also in January"

If we believe Scientia then it'll take one year from samples to production. I guess my statement that "we'll see as many 45nm CPUs from AMD in 2008 as we saw 65nm in 2006" is more correct than I thought.

Anonymous said...

if you guys want to read something comedic, you should check out the Phenom 9500 reviews on newegg.com.

Many of them start with "Listen Intel fanboys..." and then go into immediate defensive mode on their purchase.

Here is a recent gem:

Cons: I'm using the cons space to explain more things about this chip, because there are no cons here. Intel can say it has 3 GHz quad core processing becasue of the architecture. Intel has an L1 cache that goes to two L2 Caches and then those go to two L3 caches. So it's two Quad cores smacked together and linked by an L1 cache. It will back up, things get jammed in there, and it slows it down. AMD avoided that problem by actually creating an L1 cache that went to 4 L2 caches. It creates less back up, they go directly to the cores which goes for more reliability, and more speed.

hahaha

Anonymous said...

“I’m fairly confident that those puppies are going to boot,” said Mario Rivas,

Boot! they don't seem to have very high standards. maybe they should start debugging their chips ;)

and why is he calling them "puppies". I get nervous when supposedly technical people start talking this way. "yeah, we found some of 'em TLB bug thingies. right now we're smoking them out!"

Anonymous said...

"Boot! they don't seem to have very high standards."

Come on man - once they boot, a task manager demo showing all cores being utilized is just around the corner!

Also the "we consider it rev C" is complete crap to make things seem like they are far along on the architecture at this point. It appears as though they are still in the debug stage - far along in that stage, but still not in the 'let's just focus on mass producing these and driving cost down' mode that is typically associated with a product at ramp. The steppings at this point should be tweaks to improve power or speed or give more process marginality to drive down production costs.

As for 45nm, we won't know much, even if these things do boot - I'm sure some will call it success and on schedule and 'closing the gap', but without any clock or power or Vcore data (which I doubt AMD will share), we'll have no idea how things really are. Again AMD will use the press' ignorance and will probably get some good PR on how things are changing for the better at AMD, because the press covering this won't know the right questions to ask.

To me the analyst meeting on Thursday will be more interesting. If they are able to give SPECIFIC details on ramp rates or specifics on the roadmap then they may be starting to get a handle on things - dates vs clock and power for various products, specific ramp rates for K10, for 45nm, for dual core K10's, etc...

If it is another 'launch XYZ mid year', '45nm introduction on schedule', 'fastest ramp ever' or a focus on bulldozer, fusion and other future products - then it will clearly be a signal about K10's current health and viability. Lack of details won't mean that the architecture will never be good, but it will be a clear signal to me that AMD doesn't have a handle on the architecture and/or 65nm process.

Also I think this is the meeting where we find out if Asset light was a bunch of management speak to stall for time (while trying to raise cash) or an actual business plan. If they don't have details on this after what, 9 months?, or we hear another 'we don't want to tip our hand to our competitor', then we'll know Hector is an incompetent CEO who simply rode the success of an already designed K8 part that had nothing to do with his management. The house is on fire? Hey, can we have someone come up with a plan to get some water to put it out? Oh, and don't share the plan, because the house next door, which has sprinklers installed throughout, may learn from us. By the way, we are thinking about remodeling the house, anyone want to invest or provide us a loan - we should have the fire out soon and it (the fire = profitability) is our main priority!

Roborat, Ph.D said...

Anonymous said: "To me the analyst meeting on Thursday will be more interesting..."

the Thursday meeting is a Financial Analyst Meeting and typically AMD wants to focus more on money matters.

Analyst: Hector, can you tell us more about Barcelona's TLB bug?
Hector Ruiz: Sorry, we only talk about our products at the Technology Analyst Meeting in spring.
Analyst: Fair enough. Can you then tell us when you think AMD will return to profitability?
Hector Ruiz: Do you know that Barcelona is the world's first native quad-core? Yeah, we make them puppies.

Anonymous said...

Just curious - no one is concerned about Rivas' statement regarding K10 clockspeeds:

"Those processors are listed at clock speeds that account for the degradation from the BIOS fix"

I'm thinking/hoping he misspoke and was just trying to make an excuse for the lack of high speed parts...but his statement makes it sound like AMD is RELABELING the actual clockspeed with an 'effective' clockspeed?

Unless the TLB actually impacts the clock circuit itself? (which I find hard to believe, though admittedly I'm no expert in this area)

He makes it sound like they are taking a part with say a 2.6GHz clock and saying...well it performs like a 2.3GHz so we'll label it as 2.3GHz.
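
To put rough numbers on that reading, here is a back-of-the-envelope sketch. It assumes performance scales linearly with clock speed, which is my simplification and not anything AMD has stated; the function names are made up for illustration.

```python
# Assumption (not from AMD): performance scales linearly with clock speed,
# so an N% performance hit is treated as an N% lower "effective" clock.

def effective_clock_ghz(actual_ghz, penalty_fraction):
    """Clock speed that would match the penalized performance."""
    return actual_ghz * (1.0 - penalty_fraction)


def implied_penalty(actual_ghz, labeled_ghz):
    """Performance hit implied by labeling actual_ghz as labeled_ghz."""
    return 1.0 - labeled_ghz / actual_ghz


if __name__ == "__main__":
    # The 5-20% range is the BIOS-fix hit quoted earlier in this thread.
    for penalty in (0.05, 0.20):
        eff = effective_clock_ghz(2.6, penalty)
        print(f"2.6 GHz with a {penalty:.0%} hit ~ {eff:.2f} GHz effective")
    hit = implied_penalty(2.6, 2.3)
    print(f"Labeling 2.6 GHz as 2.3 GHz implies a {hit:.1%} hit")
```

Under that linear assumption, relabeling a 2.6GHz part as 2.3GHz would correspond to a hit of roughly 11.5%, which sits inside the 5-20% range quoted for the BIOS fix. That says nothing about whether AMD is actually doing this.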

Anonymous said...

Anonymous,

Analysis on Rivas' comments in this new TechReport article by Cyril Kowaliski

Anonymous said...

Corrected exchange below.


Analyst: Hector, can you tell us more about Barcelona's TLB bug?

Hector Ruiz: Sorry, we only talk about our products at the Technology Analyst Meeting in spring.

Analyst: Fair enough. Can you then tell us when you think AMD will return to profitability?

Hector Ruiz: Do you know that Barcelona is the world's first native quad-core? Yeah, we make them puppies, we are tracking to our Q3 guidance of 100's of thousands shipped...

Analysts: Do you mean 100's of thousands shipped in Q4, will this actualize revenue?

Hector: No, I mean 100's of thousands eventually.... we have always run inventory lean and right at the moment our inventory is lean. Yields are where we want them, we are happy with our yields as we achieved mature yields on the first 1000 wafers.

S said...

AMD is going to have some charges for goodwill impairment. So another quarter of bumper losses, I expect.

Anonymous said...

"AMD is going to have some charges for goodwill impairment" here's the link (not much in it)

http://www.sec.gov/Archives/edgar/data/2488/000119312507263283/d8k.htm

"The Company expects that the impairment charge will be material, but the Company has determined that, as of the time of this filing, it is unable in good faith to make a determination of an estimate of the amount or range of amounts of the impairment charge."

I'm surprised they didn't say they couldn't provide the amount as it might give our competitors an advantage! I guess an official government/SEC filing can't contain BS lies as there might actually be consequences, as opposed to having the press report your lies.

Hopefully they learn from the subprime guys and get the full amount out as early as possible and not have the slow death trickle.

Timing is rather coincidental, a day before the analyst meeting?!? I guess they couldn't hide it any longer and realized if they didn't bring this up before or during the meeting they'd really piss people off. And doing it this way they do the "refer to the filing, we don't know how much...next question please..." It also gets the losses on the books in 2007, and lets them start 2008 without similar charges (good for year on year comparisons).

Anonymous said...

I can always get more goodwill with the guys by not running short of vaseline.

Anonymous said...

Well at least I won't miss his moronic "Intel BK set for Q208" or more recently "Q109". His credibility sank into the single digits to match his IQ with that "prediction".

I think the problem that many are facing is that after months of posts that were looking forward to the release of AMD's Barcelona processors, the rollout has been very disappointing. And it's hard to continue to make posts about how things will improve after the most recent events. If AMD can get things rolling again (Phenom @ 2.6GHz and faster, soon) then there will be more optimism and reason to speculate with some hope.

I think that AMD burned a lot of its loyal supporters these last couple of months. Not being able to catch up or keep up with Intel, under the circumstances, is not such a bad thing as long as they were being honest about where they stood. But the many broken promises and the mess they made leave many people with a sour taste in their mouths.

"Perhaps AMD should stop spending time and energy trying to sue others and blaming others for their failures."

I tend to agree, given the current set of problems. AMD and/or governments have the right to pursue this for potential past transgressions, but clearly the 'monopoly' is not the cause of AMD's current problems. I think AMD's mgmt conveniently intermixes these hoping folks may not realize this distinction.

My question is this though... is the EU looking for recent stuff (within the last 2 years say) or older stuff? And what exactly is the SPECIFIC damage done to the EU? I can see AMD trying to make a claim (if anything is proved) but what about the EU? If rebates or whatever is deemed anticompetitive then I can see how this impacted AMD, but if the prices that consumers were paying were still low and competitive (like they are now), how exactly is the EU consumer injured? These are not European companies. Does the EU fine companies who sell clothes in the EU that are done by labor at ridiculously low salaries?


We all know that Intel fans are the biggest cock suckers of them all and just as long as we get what we want everyone else can go fuck themselves.

The truth is that this board really sucks. The main people who come here are Intel employees who just want a chance to stroke themselves anonymously.

If the issue is that AMD was injured, then AMD (NOT THE EU) should be the one pursuing this, as they are doing in the US.

This, to me, is disingenuous. If the argument is that consumers got hurt then I eagerly wait to see if the EU distributes checks to all those who purchased a computer in that time period they allege (should they actually levy a fine). Somehow though, I don't think we will see that (call me a cynic).