Sunday, September 11, 2011

Opterons and SiSoft benchmarks: what does it all mean?

OK, a small update on the SiSoft numbers for the Opterons. Here is the 6282SE and here is the 6220. I'm now pretty sure about the following things:

1) Both Opteron SiSoft results are real, though some features are turned off.

2) The 6220 part shows correct SIMD throughput, while the 6282SE has a somewhat inflated number (its memory configuration may be specced higher).

3) Both the 6282SE and 6220 servers had Turbo off in the integer tests. The first ran at 2.5GHz (2x 16C), the second at 3GHz (2x 8C). The multimedia test uses AVX and gives an 11% better score than legacy SSE.

4) The Opterons that will launch very soon will have roughly (2P top bin next gen vs 2P top bin previous gen) a 28-30% higher spec_int_rate score and a 33-35% higher spec_fp_rate score. This is 2.3GHz Interlagos (2.8GHz Turbo on all cores for integer workloads) vs 2.5GHz 12C Magny-Cours. In IPC terms this is ~7-8% higher integer IPC and 8-10% higher floating point/SIMD IPC, all in non-recompiled workloads. AVX 256/128 brings ~10% more in floating point, and FMA brings up to 2x more over AVX, but that is a "what if" case and not the norm (we have to wait for applications to be written with FMA in mind). I guess XOP will bring speedups similar to AVX, 10% or maybe more, in recompiled integer workloads.

5) Zambezi's leaked SiSoft results are not correct. I don't know whether Turbo was on or off in that test, but even if it was off, the integer results are ~17-20% lower than what the Opterons show. The FP part is more or less correct, since the Opterons score the same per core and clock, but the test was run on an ES platform with 1333MHz DDR3 memory and unknown BIOS settings. Even if we take the legacy SSE score of 132Mpix/s, which was achieved at a fixed clock of 2.8GHz (100% sure about this), and correct it for the launch clock of the FX8150 part, we arrive at 170Mpix/s. This is 54% more than the 1100T. If we take the best case for Zambezi, that will be AVX. Then it is 71% faster than the 1100T (stock versus stock => 189Mpix/s vs 110Mpix/s).
Integer throughput is around 35% higher on the FX8150 versus the 1100T (stock+Turbo versus stock => 88Gops vs 65Gops).
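The arithmetic in point 5) can be checked with a rough sketch like the one below. It assumes the score scales linearly with clock and that the FX8150 launch clock is 3.6GHz; that clock value is my assumption for the sketch and is not stated above. The function names are just illustrative helpers. The results land close to the rounded figures quoted in the text, not exactly on them.

```python
# Rough check of the numbers in point 5), assuming linear scaling with clock.
TEST_CLOCK_GHZ = 2.8      # fixed clock of the leaked Zambezi run (from the post)
LAUNCH_CLOCK_GHZ = 3.6    # ASSUMPTION: FX8150 launch clock, not stated in the post
SSE_MPIX_AT_TEST = 132.0  # leaked legacy SSE score in Mpix/s (from the post)
AVX_GAIN = 1.11           # AVX vs legacy SSE, from point 3)
THUBAN_MPIX = 110.0       # 1100T multimedia score in Mpix/s (from the post)
FX_GOPS, THUBAN_GOPS = 88.0, 65.0  # integer throughput in Gops (from the post)


def scale_by_clock(score, from_ghz, to_ghz):
    """Scale a score linearly from one clock speed to another."""
    return score * to_ghz / from_ghz


def percent_faster(new, old):
    """How much faster 'new' is than 'old', in percent."""
    return (new / old - 1.0) * 100.0


sse_at_launch = scale_by_clock(SSE_MPIX_AT_TEST, TEST_CLOCK_GHZ, LAUNCH_CLOCK_GHZ)
avx_at_launch = sse_at_launch * AVX_GAIN

sse_gain = percent_faster(sse_at_launch, THUBAN_MPIX)  # SSE vs 1100T
avx_gain = percent_faster(189.0, THUBAN_MPIX)          # quoted AVX score vs 1100T
int_gain = percent_faster(FX_GOPS, THUBAN_GOPS)        # integer vs 1100T

print(f"SSE at launch clock: {sse_at_launch:.1f} Mpix/s ({sse_gain:.0f}% over 1100T)")
print(f"AVX at launch clock: {avx_at_launch:.1f} Mpix/s")
print(f"Quoted 189 Mpix/s is {avx_gain:.1f}% over 1100T")
print(f"Integer: {int_gain:.0f}% over 1100T")
```

Note that the AVX estimate here is simply the clock-corrected SSE score multiplied by the 11% AVX gain from point 3), which is how the ~189Mpix/s figure appears to be reached.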

I suppose the numbers can be higher for the desktop version, by about 5%, compared to the ones I posted in point 5). As for the Opterons, I'm 99% sure this is the speedup that the SPEC benchmarks will show. Oh, and STREAM (memory bandwidth) will be around 50% faster on Interlagos, but this is already known.

PS: And yes, this means Zambezi shouldn't score lower than the Thuban 1100T in Cinebench... At least not according to the above. But who knows, anything is possible. Leaks so far suggest that the top Zambezi should score around 6pts in the C11.5 64-bit test. According to the SiSoft numbers it should get close to 9pts or more.

