AMD's dual core Opteron & Athlon 64 X2 - Server/Desktop Performance Preview
by Anand Lal Shimpi, Jason Clark & Ross Whitehead on April 21, 2005 9:25 AM EST - Posted in CPUs
A Look at AMD's Dual Core Architecture
Even Intel will admit that the architecture of the Pentium D is not the most desirable, as it is two Pentium 4 cores literally glued together. The two cores can barely be managed independently from a power consumption standpoint (they still share the same voltage and must run in the same power state) and all communication between cores must go over the external FSB. The diagram below should illustrate the latter point pretty well:
Intel's Pentium D dual core architecture
AMD's architecture is much more sophisticated, thanks to the K8 architecture's on-die North Bridge. While we normally only discuss the benefits of the K8's on-die memory controller, the on-die North Bridge is extremely important for dual core. Instead of having all communication between the cores go over an external FSB, each core will put its request on the System Request Queue (SRQ) and when resources are available, the request will be sent to the appropriate execution core - all without leaving the confines of the CPU's die. There are numerous benefits to AMD's implementation, and in heavily multithreaded/multitasking scenarios, it is possible for AMD to have a performance advantage over Intel purely because of this implementation detail.
The one limitation that both AMD and Intel have is bandwidth. In order to maintain compatibility with present day Socket-940 and Socket-939 motherboards, AMD could not increase the pin count of their dual core processors. The benefit is that AMD's dual core CPUs will work in almost all Socket-940 and Socket-939 motherboards (more on this later), but the downside is that the memory bus remains unchanged at 128 bits wide and supports a maximum memory speed of DDR400. So, while single core Athlon 64 and Opteron CPUs get a full 6.4GB/s of memory bandwidth, today's dual core CPUs are given the same memory bandwidth to share between two cores instead of one.
AMD's solution to the problem will come in the form of DDR2 and a new socket down the road, but for now there's no getting around the memory bandwidth limitations. Intel is actually in a better position from a memory bandwidth standpoint. At this point, their chipsets provide more memory bandwidth than what a single core needs with their dual channel DDR2-667 controller. The problem is that the Intel dual core CPUs still run on a 64-bit wide 800MHz FSB, which makes Intel's problem more of an FSB bandwidth limitation than a memory bandwidth limitation.
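For readers who want to check the arithmetic behind the bandwidth figures in the two paragraphs above, here is a minimal sketch. It only computes theoretical peaks (bus width times transfer rate). The function name is ours, the even split between two cores is a simplifying assumption rather than a measured result, and we assume that "dual channel" means two 64-bit channels (128 bits total) and that DDR2-667 runs at a nominal 667 million transfers per second.

```python
def peak_bandwidth_gbs(bus_width_bits: int, mega_transfers_per_sec: float) -> float:
    """Theoretical peak bandwidth in GB/s, with 1 GB = 10^9 bytes."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9


# AMD: 128-bit memory bus at DDR400 (400 million transfers per second).
amd_memory = peak_bandwidth_gbs(128, 400)

# Assumption: both cores pull on memory equally, so each sees half the peak.
amd_per_core = amd_memory / 2

# Intel: 64-bit FSB at an effective 800 million transfers per second.
intel_fsb = peak_bandwidth_gbs(64, 800)

# Intel chipset: dual channel DDR2-667, assumed 2 x 64-bit channels.
intel_memory = peak_bandwidth_gbs(128, 667)

print(f"AMD memory bus:       {amd_memory:.1f} GB/s")
print(f"AMD per core (even):  {amd_per_core:.1f} GB/s")
print(f"Intel FSB:            {intel_fsb:.1f} GB/s")
print(f"Intel DDR2-667 (x2):  {intel_memory:.1f} GB/s")
```

Under these assumptions the AMD memory bus and the Intel FSB land on the same 6.4GB/s peak, while Intel's memory controller can supply more than the FSB can carry to the CPU, which is why the bottleneck sits on the FSB rather than on the memory.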
Backwards Compatibility
Intel's dual core Pentium D and Extreme Edition won't work in any previous motherboards, but as we mentioned at the start of this article, AMD has more bang. Here, the additional bang comes from the almost 100% backwards compatibility with single-core motherboards. We say "almost" because it's not totally perfect; here's the breakdown:
- On the desktop, the Athlon 64 X2 series is fully compatible with all Socket-939 motherboards. All you need is a BIOS update and you're good to go.
- For workstations/servers, if you have a motherboard that supports the 90nm Opterons, then all you need is a BIOS update for dual core Opteron support. If the motherboard does not support 90nm Opterons then you are, unfortunately, out of luck.
For desktop users, the ability to upgrade your current Socket-939 motherboards to support dual core in the future is a huge offer from AMD. While it may not please motherboard manufacturers to lengthen upgrade cycles like this, we have never seen a CPU manufacturer take care of their users like this before. Even during the Socket-A days when you didn't have to upgrade your motherboard, most users still did because of better chipsets. AMD's architectural decisions have made those days obsolete. The next generation of dual core processors will most likely need a new motherboard, but rest assured that you have a solid upgrade path if you have recently invested in a new Socket-939 desktop or Socket-940 system.
144 Comments
Opteron - Monday, June 20, 2005 - link
mikeshoup - Wednesday, June 15, 2005 - link
I think a better option for testing compiling speed would be to pass a -j argument to make when compiling Firefox, and tell it to run as many parallel operations as the processor can take threads, i.e. -j2 for a dual core or HT CPU.
fritz64 - Thursday, May 5, 2005 - link
I know what I will be getting after fall this year. Those numbers are impressive!
jvarszegi - Friday, April 29, 2005 - link
So they're reproducible, but only in secret. And you knew, as usual, about mistakes you were making, but made them anyway to, um, make a valid comparison to something else that no one can verify. Nicely done. Whatever they're paying you, it's not enough.
Ross Whitehead - Thursday, April 28, 2005 - link
Zebo -
You are correct that you cannot reproduce them, but we can, and have tens of times over the last year with different hardware. I do not believe that your being unable to reproduce them discounts their validity, but it does require that you have a small amount of trust in us.
We have detailed the interaction of the application with the database. With this description you should be able to draw conclusions as to whether it matches the profile of your applications and database servers. Keep in mind, when it comes to performance tuning the most common phrase is "it depends". This means that there are so many variables in a test that unless all are carefully maintained, the results will vary greatly. So, even if you could reproduce it, I would not recommend a change to your application hardware until it was validated with your own application as the benchmark.
The owner of the benchmark is not AMD, or Intel, or anyone remotely related to PC hardware.
I think if you can get beyond the trust factor there is a lot to gain from the benchmarks and our tests.
Reginhild - Thursday, April 28, 2005 - link
Wow, the new AMD dual cores blow away the "patched together" Intel dual cores!!
I can't see why anyone would choose the Intel dually over AMD unless all the AMDs are sold out.
Intel needs to get off their arse and design a true dual core chip instead of just slapping two "unconnected" processors on one chip. The fact that the processors have to communicate with each other by going outside the chip is what killed Intel in all the benchmarks.
Zebo - Thursday, April 28, 2005 - link
Ross,
How can I reproduce them when they are not available to me?
From your article:
"We cannot reveal the identity of the Corporation that provided us with the application because of non-disclosure agreements in place. As a result, we will not go into specifics of the application, but rather provide an overview of its database interaction so that you can grasp the profile of this application, and understand the results of the tests better (and how they relate to your database environment)."
Then don't include them. Benchmarking with tools to which no one else has access is not scientific, because the results can't be reproduced and verified by anyone with a similar setup.
I don't even know what they do. How are they important to me? How will this translate to anything real world I need to do? How can I trust the mysterious company? Could be AMD for all I know.
MPE - Wednesday, April 27, 2005 - link
#134
How can it be the best for the buck? Unless you are seeing benchmarks from Anand that say so, how could you come to that conclusion?
In some tests the 3800+ was the worst performer while the X2 and PD were the best.
You are extrapolating logic from air.
Ross Whitehead - Wednesday, April 27, 2005 - link
#131
"no mystery unreproducible benchmarks like Anand's database stuff."
It is not clear what you mean by this statement. The database benchmarks are 100% reproducible and are real life apps, not synthetic or academic calcs.
nserra - Wednesday, April 27, 2005 - link
You are discussing price and it's not correct, since Intel goes from 2800 to 3200 and AMD goes from 3500+ to 4000+ (I'm ignoring AMD's new model numbers, which are still based on the older ones).
I completely disagree with the AMD model numbers; they should be equal to the single core ones, and they should just have added the X2.
The TRUE X2 will perform better than Opteron, by 2% to 5%.