Tags:Microsoft, Windows XP x64, SP2, AMD, dual-core Opteron, 2224SE
To my horror I've found out that the new 3.2 GHz dual-core Opterons need about 30% longer for a typical computation than a year-and-a-half old 3 GHz dual-core Xeons.
The computations we do involve a lot of heuristics (memory thrashing due to poor locality of data) and take 6 to 12 hours to complete, so a 30% difference means hours.
I've been keeping an eye on AMD and Intel, doing comparisons every time a new generation of processors hits the market. AMD was always significantly better than Intel, so I suspect there might be something wrong.
Intel leapfrogged AMD convincingly when the Core2 architecture was released, over a year ago. Hopefully AMD can pull a rabbit out of the hat soon, or we will be stuck with a single-supplier marketplace. Nowadays Intel are the only CPU supplier worth looking at for most applications.
tomshardware.com is a good site to look at benchmarks, and compare CPUs.
I wrote this for a recent question re: why a Phenom didn't outperform a Q6600 ... but the same concept applies here:
------------------------------------------------------------
Since your goal here is performance you'll have a MUCH better system if you go with an Intel Core-based CPU than anything AMD makes ... or is likely to make anytime soon.
Intel was the "King of the CPU Hill" for many years until they took a wrong turn with the Netburst architecture in the Pentium IV's => and AMD did indeed take advantage of this misstep with the Athlon series. But Intel more-than-corrected that mistake with the new Core based architecture released about two years ago ... and have had a convincing lead in performance ever since, and are continuing to evolve the Core platform chips to ever better levels.
AMD was supposedly going to get closer on the performance curve with their new "Phenom" series CPU's; but that simply didn't happen. The "Phenom" CPU's simply don't come close to the Intel Core 2 based chips ... even the very-AMD-oriented Maximum PC magazine had this to say in their recent review of the Phenom chips: "... After all the trash talking, all the 'true quad core' pimping, the result is a chip that's slower than Intel's cheapest quad core. And more expensive to boot."
And if you compare the AMD offerings to Intel's better quad cores, it's not even close !! A 1333MHz FSB motherboard (P35 or X38 based) with a Q6600 will beat anything AMD has to offer => and if you pop in one of the faster CPU's (e.g. a QX6850) you'll nearly double the performance !! Note that X38 based boards also support the forthcoming 1600MHz FSB chips, which are going to include quad cores with hyperthreading (8 logical cores !!).
... and that performance is with the current Intel CPU's. They just started shipping the 1600MHz FSB QX9770 :-) ... and in the fall they'll be shipping Nehalem 45nm chips => which provide direct core-core communications (if their current FSB-based communications chips already blow Phenom away ... just wait until Nehalem !!).
------------------------------------------------------------
As for your question: "... Does anyone have a clue what's wrong? " ==> Nothing. You simply selected a slower CPU. Wait until later this year and get a Nehalem-based Xeon and you won't be complaining :-) Or get a current quad-core Xeon like the E5450.
I hope AMD (or someone else) manage to get something competitive out in the marketplace soon. At the moment Intel have no business case to keep improving their product, and can charge whatever they want! I have always used Intel, but realize that they have always been reactive to whatever AMD have been doing, particularly in regards to pricing.
I expect an announcement from AMD soon: either they have some new CPU they have been working on in the back room for a while, or they are calling it a day & folding. I suspect the latter; they will probably shrink & choose another niche.
I doubt AMD will fold anytime soon ... they're struggling; but as long as they accept that they're the low-end chip provider, and keep the costs low, there will be plenty of market. And their integrated CPU/Video chips expected later this year should be very popular in the low cost market.
But it seems equally unlikely they'll be competitive in the upper end anytime soon ... Intel is WAY ahead, and has already put out a roadmap of new chips that will almost certainly keep them in that position. And so far they've kept their prices at least reasonable ... I just built a new E8400 based system and only paid $210 for the CPU => which easily matches or outperforms the $1000+ X6800 of just a year or so ago.
There are, however, a lot of "AMD guys" who will buy AMD no matter what ... in some cases they're just AMD loyalists; in others, they haven't followed the technology and don't realize how much better the Intel CPU's are.
Loyalty is not a 4-letter word. Without AMD, Intel would still be selling snake oil. Just a year or two ago a 3 GHz P4 needed 30% longer for one of our typical computations than a 2.4 GHz Athlon64.
Granted, Intel had to try a little harder, but I still think there's something fishy here. A couple of years ago there was some noise about Intel's compiler refusing to run the SSEx code it generated on 3rd-party CPUs. Even today, after some new flags were introduced to specifically require generation of SSEx code, the objects still have to be statically linked to a bogus check function in order to release this hand-brake when the code runs on an AMD. I understand that AMD beats the respective Intel chip by 50% to 100% when you disable those pit-stops.
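To make the alleged "hand-brake" concrete, here is a minimal sketch of the difference between dispatching on the CPU vendor string and dispatching on the reported feature flags. It is only an illustration of the idea described above: the `Cpu` class and both `pick_path_*` functions are hypothetical names, not part of any real compiler runtime, and the only real-world values used are the two CPUID vendor strings.

```python
# Illustrative only: none of these names come from a real compiler runtime.
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    vendor: str           # CPUID vendor string, e.g. "GenuineIntel" or "AuthenticAMD"
    features: frozenset   # instruction-set extensions the chip reports


def pick_path_by_vendor(cpu: Cpu) -> str:
    """Dispatch as alleged in the post: the fast path is gated on the vendor."""
    if cpu.vendor == "GenuineIntel" and "sse2" in cpu.features:
        return "sse2"
    return "generic"


def pick_path_by_feature(cpu: Cpu) -> str:
    """Vendor-neutral dispatch: only the reported feature flags matter."""
    return "sse2" if "sse2" in cpu.features else "generic"


xeon = Cpu("GenuineIntel", frozenset({"sse", "sse2"}))
opteron = Cpu("AuthenticAMD", frozenset({"sse", "sse2"}))

for cpu in (xeon, opteron):
    print(cpu.vendor, pick_path_by_vendor(cpu), pick_path_by_feature(cpu))
```

Under the vendor-gated rule the Opteron falls back to the generic path even though it reports the same extensions; under the feature-gated rule both chips take the same path. Whether a given binary actually behaves this way has to be checked per compiler and per version.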
So how related are the MS C++ compiler and the Intel compiler, does anyone know? Does MS use Visual Studio to build the Windows kernel? I suspect not. What is MS using instead?
Backroom deals between Windows and Intel are not beneath either of them. Remember how long after the introduction of the x86_64 extensions Windows XP x64 was still in Beta? Exactly until Intel managed to catch up with EM64T, that's how long!
I am honestly looking for clues why Intel appears to be so much better than AMD all of a sudden. Condescending leap-frog hypotheses won't do. I need some harder facts or methods to make sure it is a level field.
Intel beats AMD simply because the Core architecture chips abandoned the long-pipeline Netburst architecture used in the Pentium-IV series. Period. Have a look at the CPU benchmarks on the Tom's Hardware site. You'll note that ALL of the top 10 are Intel CPU's ... AMD finally enters the chart with their best Phenom at 11th (with less than 69% of the performance of the best Intel).
... and the next-generation Intel's will add on-chip memory controllers and hyperthreading, so those numbers will get better yet. You'd like to be sure it's a "level field" ==> problem is, it's not. Intel has blown AMD away in the performance arena, and that situation is simply unlikely to change in the near future.
You might also be interested in this Anandtech article from 2006 ... shortly after Intel released the Core 2's. Note their concluding remarks: "... Compared to AMD's Athlon 64 X2 the situation gets a lot more competitive, but AMD still doesn't stand a chance. " And that was with the original Core 2 lineup => the best of those (the X6800) is now 10th on the PCMark benchmarks noted above !!. http://www.anandtech.com/showdoc.aspx?i=2795&p=1
... To borrow Anandtech's title line, the Empire has indeed struck back :-)
The reason might not be complex. In "number crunching" operations, on-die cache memory makes a lot of difference. If the info I gathered is correct, a dual core Opteron typically has 2MB of L2 cache while a dual core Xeon 3000 series (3GHz) has 4MB of L2 cache. This could be the main reason for the performance difference. On-die cache is expensive, which contributes to the price difference as well. I still remember those days when I had to wirebond those cache dies in a separate cavity in the Pentium Pro ceramic package. Nightmare. But Intel insists on using on-die cache to get extra performance (and consumers pay for the extra cost anyway).
garycase & ewavefront: Thank you for the links. A very interesting point is that an Intel with 4 MiB L2 cache is at most 10% more efficient than the same-frequency chip with 2 MiB L2 cache (DivX test), but on average across all tests only 3.5% better. So the size of the L2 cache possibly accounts for only a small fraction of the difference.
The Xeon 5160 is basically the Xeon equivalent of an E6850 desktop CPU => 1333MHz FSB, 3.0GHz speed, 4MB cache. It's not shown on the Tom's charts above, but performs slightly better than the older X6800 ... the 10th CPU listed (above ALL of the AMD CPU's). Note that 7 of the top 10 are quad core CPU's ... the other 3 are dual core models (that still outperform all of AMD's quad cores).
Intel's caching algorithms are very good, so the cache hit ratio for MOST applications gets pretty good with even 2MB. As you've noticed, the larger caches give some benefit, but not a huge amount. There are some "conjured" benchmarks that will show much better improvements ... basically if you have a large program loop that's just larger than the smaller cache [e.g. a 2.5MB program loop will perform MUCH better with 4MB of cache than 2MB of cache]. But on average, it's true that once you get above some threshold (and 2MB seems to meet that criterion) the improvements are nominal.
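The 2.5MB loop case can be sketched with a toy cache model. This is a rough simulation, not a model of either chip: it assumes a fully associative cache with strict LRU replacement and 64-byte lines, and the function name `hit_rate` is made up for the illustration. Real L2 caches are set-associative and use approximations of LRU, so real numbers will be less extreme.

```python
from collections import OrderedDict

LINE = 64          # bytes per cache line (assumed)
MB = 1024 * 1024


def hit_rate(cache_bytes: int, loop_bytes: int, passes: int = 4) -> float:
    """Hit rate of a fully associative LRU cache for a loop that sweeps
    loop_bytes of memory, one cache line at a time, `passes` times."""
    capacity = cache_bytes // LINE
    lines = loop_bytes // LINE
    cache = OrderedDict()
    hits = 0
    for _ in range(passes):
        for line in range(lines):
            if line in cache:
                hits += 1
                cache.move_to_end(line)        # mark as most recently used
            else:
                cache[line] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict the least recently used line
    return hits / (passes * lines)


loop = 5 * MB // 2   # the 2.5MB loop from the example above
print("2MB cache:", hit_rate(2 * MB, loop))
print("4MB cache:", hit_rate(4 * MB, loop))
```

In this model the 2MB cache never hits, because each line is evicted just before the loop comes back to it, while the 4MB cache misses only on the first pass. That cliff is why a loop slightly larger than the cache makes such a dramatic benchmark, and why it says little about workloads whose working set is far larger or far smaller than either cache.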
As for your specific questions here ... I presume you are now satisfied that they've been answered --> is that correct?
To wit:
"... Does anyone have a clue what's wrong? " ==> Nothing's wrong, the AMD's just slower.
"... I am honestly looking for clues why Intel appears to be so much better than AMD all of a sudden." ==> No secret clues needed; Intel simply has a superior architecture with the Core-based chips (and that lead is likely to grow even more with the Nehalem chips). For the foreseeable future, if performance is the objective, it's a "no brainer" choice ... use Intel Core-based CPU's.
Just to add a bit: the 4MB L2 cache in the Xeon 5160 is not only larger, it is also shared by the 2 cores (i.e. a unified cache), while the Opteron's 2MB L2 cache is 2x1MB (non-shared), one for each core. Hence, during heavy computation, the shared L2 cache can be fully utilized regardless of whether the program is single-threaded or multi-threaded. With the Opteron's L2 cache, however, utilization is lower than on the Xeon, and one can easily imagine what happens when the program is single-threaded (yes: it uses one 1MB cache only, leaving the other underutilized). Benchmark results have to be broken down by type of computing workload to accurately represent performance for a specific type of operation. Therefore, for a machine with a specific, specialized workload like dkrnic's, an average benchmark score usually does not tell much, and may even be misleading. Well, I hope AMD will catch up with its next generation of chip design.
Here are a heap of benchmarks, including these 2 CPUs. Being Intel's site it could be biased, but it seems to show the Intel CPU you are using as usually outperforming the AMD on most, but not all, tests. That seems consistent with the results you are experiencing.
It appears that Intel's CPUs are at the moment indeed better than AMD's in tests under SuSE 10.3 Linux.
I want to make sure that the results are not somehow related to the quality of motherboards instead of the efficiency of CPUs. With an HT controller on the die the AMD's should be doing much better than what I have so far timed.
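One way to narrow this down would be to run the same small memory-latency probe on both machines, separately from the real computation. The sketch below is only a starting point under stated assumptions: it is a pointer-chasing loop (each lookup depends on the previous one), compared between a sequential and a scattered access order. All function names are made up for this illustration, and in Python the interpreter overhead dominates the timings, so a C version of the same loop would be far more faithful.

```python
import random
import time


def single_cycle(n: int, seed: int = 0) -> list:
    """Sattolo's algorithm: a random permutation that is one n-long cycle,
    so following nxt[i] visits every slot before repeating."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)            # j < i guarantees a single cycle
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt: list, steps: int) -> float:
    """Seconds per step for a dependent chain of lookups."""
    i = 0
    start = time.perf_counter()
    for _ in range(steps):
        i = nxt[i]
    return (time.perf_counter() - start) / steps


def compare(n: int = 1 << 20, steps: int = 1 << 20, repeats: int = 3):
    """Best-of-`repeats` time per step for sequential and scattered order."""
    sequential = [(i + 1) % n for i in range(n)]   # good locality
    scattered = single_cycle(n)                    # poor locality
    best_seq = min(chase(sequential, steps) for _ in range(repeats))
    best_rnd = min(chase(scattered, steps) for _ in range(repeats))
    return best_seq, best_rnd


if __name__ == "__main__":
    seq, rnd = compare()
    print(f"sequential {seq * 1e9:.1f} ns/step, scattered {rnd * 1e9:.1f} ns/step")
```

If the scattered case is disproportionately slow on the Opteron box compared with the Xeon box, that points at the memory path (DIMM population, BIOS memory settings, NUMA placement) rather than at the cores themselves. If both cases scale the same way between the two machines, the difference is more likely in the CPU or in the compiled code.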