Wednesday, October 26, 2011

What comes after FX? FX Next

After the not-so-successful launch of its new flagship desktop processor, AMD started talking up the next chip that will succeed Bulldozer ver. 1. Say hello to FX Next, based on the Piledriver core:

Piledriver is supposed to fix certain shortcomings of the BDver1 processors. I summed up some of them in my previous blog post, so I won't reiterate them here.
What will AMD have to offer before Piledriver arrives? The following roadmap gives us some clues:

FX8170 is supposedly launching in Q1 2012. It will be based on the new (somewhat improved) B3 stepping. Hopefully B3 will be enough to iron out speed path issues (if there are any) and bring up clock speeds. The rumored 8170 is supposed to run at 3.9/4.2/4.5GHz clocks. This is a very solid uplift (~7.7% over the 8150). The bad news is that there is no "8190" on the current roadmaps, and the 8170 is supposed to tide AMD over until PD arrives in Q3. That is 2 quarters... The good news is that PD will be 10-15% faster overall than what AMD has at the moment of its introduction (so the 8170). This lines up well with the rumored ~5% IPC increase over BD, with the rest coming from clock speed.
Q1-Q2 2012: 3.9/4.2/4.5GHz FX8170
Q3 onward: 4.2?/4.5?/4.8?GHz FX8280?
Effectively 4.2/3.9≈1.077, or a 7-8% clock uplift with PD. Now count in the IPC uplift of 5% or so: 1.077x1.05≈1.13, or about 12-13% faster overall than the 8170. AMD said 10-15% more x86 performance with PD, so this lands right in the middle of the projected range. How much faster is this than the current 8150? The 8170 should be around 7% faster than the 8150, so a 4.2GHz (base) PD-based FX is going to be roughly 20% faster than the 8150. Not a bad upgrade if you look at it from a time perspective: 8 months after BDver1 we will get a 20% faster FX part (stock vs. stock). It's not unreasonable to expect better OC and thermal characteristics too, so all in all it will be a good "fix" for BD's current competitive situation on the desktop.
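For anyone who wants to play with the numbers, here is a minimal Python sketch of the same back-of-the-envelope math. It assumes the simple model performance = clock x IPC; the clocks are the rumored/guessed ones from the list above, the 5% IPC gain and the 7% 8170-over-8150 figure are the estimates used in the text, and the function and constant names are made up for illustration. With the unrounded clock ratio the result comes out at about 13% over the 8170 and about 21% over the 8150, in line with the rounded figures above.

```python
# Back-of-the-envelope projection. Every input is a rumor or an estimate
# taken from the post, not a measured or confirmed figure.
FX8170_BASE_GHZ = 3.9        # rumored FX8170 base clock
PD_BASE_GHZ = 4.2            # guessed base clock of the Piledriver FX part
PD_IPC_GAIN = 0.05           # rumored ~5% IPC gain of PD over BD
GAIN_8170_OVER_8150 = 0.07   # the post's estimate for 8170 vs 8150


def projected_speedup(new_clock, old_clock, ipc_gain=0.0):
    """Speedup under the simple model: performance = clock * IPC."""
    return (new_clock / old_clock) * (1.0 + ipc_gain)


def pd_vs_8170():
    """Projected PD speedup over the rumored FX8170."""
    return projected_speedup(PD_BASE_GHZ, FX8170_BASE_GHZ, PD_IPC_GAIN)


def pd_vs_8150():
    """Chain the PD-over-8170 projection with the 8170-over-8150 estimate."""
    return pd_vs_8170() * (1.0 + GAIN_8170_OVER_8150)


if __name__ == "__main__":
    print(f"PD vs 8170: {100 * (pd_vs_8170() - 1):.1f}% faster")
    print(f"PD vs 8150: {100 * (pd_vs_8150() - 1):.1f}% faster")
```

Change any of the constants at the top to see how sensitive the "roughly 20%" conclusion is to the rumored clocks and the IPC guess.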

PS: All info above is my interpretation of the available data, and I used a "best case scenario". The real PD core and the 8170 may turn out to be totally different from what I speculated. We have seen this happen with the BD core: back then I also based my predictions on publicly available data coming from AMD (which turned out to be a bit optimistic on their side), and this may happen again with PD and the 8170. So take it for what it is, just speculation.

Thursday, October 13, 2011

Zambezi: 2nd look.

OK, I have read a lot of reviews now, and some things are clearer.
I suppose I overreacted a bit in my previous blog post. Zambezi is hot, but overall it's not a slow chip. It performs rather well in MT applications. It does have some weaknesses which AMD must correct. Some of the weaknesses are not solely AMD's fault, but GloFo's too.

So this is what, in my humble opinion, AMD must focus on in the future (think Piledriver and Steamroller):
1) First and foremost, AMD must invest heavily in its relationship with developers. They must hire a brand new team of young and motivated guys who will literally go out and help developers maximize the potential of the Bulldozer design. This first iteration is just that, a first. It has some flaws which AMD will try to fix, and hopefully they will succeed. But the underlying design, which is truly revolutionary, will need GOOD software support in order to give the best performance to end users. This means FMA4, XOP, BMI and the rest will need to be properly supported in future multimedia desktop workloads. Notice I'm speaking about the DESKTOP space here. Server has no such need, since recompiling is the norm there.

2) AMD must improve cache performance, especially L1 and L2 writes. This is a major bottleneck and it shows its ugly face in many workloads. AMD is aware of this, and hopefully Piledriver has at least somewhat better write performance in these two levels of cache. L3 looks fine, even more than fine: it is much faster than the L3 on Thuban.
They also need to work on improving the FP unit. It may be great in FMA4 stuff, but it's much less impressive in legacy SSE or AVX128 workloads. Maybe widening it a bit and enlarging the buffers could help. Single thread performance is nowhere near what this thing SHOULD be capable of, so there must be a bottleneck somewhere, since in numerous SIMD workloads it's not faster than a single K10 core (and its 128b unit).

3) AMD must twist GloFo's arm very hard and very fast. Not only is their 32nm production yielding many defective Llano parts (which is truly a shame, since most of the time it's the GPU that is broken, and then it's not an APU any more), but now they can't break the 3.6GHz barrier on a design that was SPECIFICALLY DESIGNED FOR CLOCK speed (while it does have some IPC improvements in certain areas too). The original goal set by AMD was a 30% clock uplift at the same power draw as the previous design. We get this ONLY in limited Turbo mode now. We should have 4.1GHz 125W stock clocked Zambezi parts with 4.7GHz half core turbo and 4.5GHz full core turbo. This Zambezi would effectively be 12% faster than the 8150 at the same power draw. This Zambezi would also allow AMD to use an SMT-style core affinity scheme and release a patch for Windows 7 that would place threads on separate modules first, rather than on both cores of the same module. The performance uplift ranges from 5% to a massive 40% in some cases, averaging around 15-20%, depending on benchmark selection.

So what we need is a 95W 3.6GHz FX8150, a 125W 4GHz 8170 and a 4-4.2GHz 125W 8270 (Piledriver).
This lineup would hold off SB and IB without many problems, at least in the mid and mid-high performance segments.

4) AMD should work closely with MS and release a patch for the Windows scheduler. As the link I provided above shows, the performance uplift is not a small number but a very nice 15-20%.
The trade-off is power draw, though. It is all explained well in this great review.
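To make the "modules first" idea from points 3 and 4 concrete, here is a tiny Python sketch of the placement order. It only illustrates the concept and is not how the Windows scheduler or any actual patch works. It assumes a 4-module, 8-core chip where module m owns cores 2m and 2m+1; that numbering and both function names are hypothetical.

```python
def module_first_order(num_modules=4, cores_per_module=2):
    """Core IDs in the order a module-first policy would fill them.

    Assumption: module m owns cores m*cores_per_module up to
    m*cores_per_module + cores_per_module - 1 (hypothetical numbering).
    """
    order = []
    # Hand out the first core of every module before touching any second core.
    for slot in range(cores_per_module):
        for module in range(num_modules):
            order.append(module * cores_per_module + slot)
    return order


def place_threads(num_threads, num_modules=4, cores_per_module=2):
    """Map thread index -> core ID, wrapping around once all cores are used."""
    order = module_first_order(num_modules, cores_per_module)
    return [order[i % len(order)] for i in range(num_threads)]


if __name__ == "__main__":
    # Four threads land on four different modules, so none of them has to
    # share a module's front end and FP unit with another thread.
    print(place_threads(4))
    print(place_threads(8))
```

With four threads or fewer every thread gets a module to itself, which is where the speedup would come from. It also keeps all four modules awake, which is why power draw is the trade-off.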

So there you have it. Bulldozer is not what we expected, but it's not a complete failure either. It's a solid chip which will shine in future applications that go for multiple threads. Single core speed, while still important, is not the main selling point any more. Those who want good single core performance along with great MT performance (but still slower MT performance than the FX8150) can pick the 2500K. It's currently the best chip by Intel from a perf./$ POV. The 8150 is not as good, but very close! It needs 10% shaved off its MSRP and AMD may sell a sh*t load of these things :).

Wednesday, October 12, 2011

Zambezi launched. 2 words: hot and slow.

How else to describe AMD's latest "high-end" desktop CPU? Well, 2 words are enough: hot and slow. Nuff said. You can find reviews via Google; the best one is definitely the Lost Circuits one:

I'm going to quote the Lost Circuits author, MS, since he summed it up really well:

Final Thoughts
Bulldozer has been in the works for so long that I don’t even remember when was the first time I heard about it. The concept, at first somewhat odd, gradually started making sense but sometimes these things happen by assimilation. Or, too many geeks who are “in the know” really knew that this was not only a great but a grand design. And who was I to doubt them?
Still, all things considered, there was still more than a shadow of a doubt. Especially when Intel re-introduced HyperThreading and got enormous mileage out of it – for the cost of essentially nothing in terms of real estate which directly translates into cost per die. But we were told over and again that none of this would make sense in light of the analyses performed by AMD’s engineering team. And now we are supposed to believe that Zambezi was designed as a direct competition to the Core i5. That was not a question but a statement.
Of course, this begs the old question, how predictable is performance on a new design? Apparently, it is hit and miss and in so far, my argument still stands, even if it is against the personal religion of some of the decision makers at AMD. This is at the end of a frustrating week, trying to find that one application that would justify buying an FX processor.
Good luck AMD, you are going to need it :(

Now a bit of fun :) (found it on amdzone):

so this loader gets stuck, then other dozer comes to help, then he gets stuck coming out, so another dozer comes over and pushes him out, then gets stuck himself and in turn is pushed out by the cat dozer (first dozer stuck). see a trend? i got fed up, so i go swimming for the winch cable (it was under over 2ft of mud and water) and pull it out with my work truck. done.