The CPU
Pentium 4 versus Athlon
No one gets fired for buying Intel… no one gets fired for buying Dell… but at one point, no one got fired for using i820 Rambus either.
If you are being asked to build a system for someone else, don’t simply take the easy way out. Someone has entrusted you to determine the best system spec for their project. You’re a knowledgeable reader – you know what you’re talking about. In other words, don’t blindly go with Intel or AMD. Look at your project’s needs and then pick the right product.
Pentium 4-based CPUs are great whenever you can use the SSE2 instructions. Because Intel’s compiler can auto-vectorize the SpecFP benchmarks, the Pentium 4 posts superb benchmark ratings. However, not all code can be auto-vectorized, and we know that the Pentium 4’s FPU performance is not very impressive. Since we’re building a workstation, you probably won’t even bother looking at P4s; Xeon is what you want. While both the P4 and Xeon are based on similar cores, the Xeon offers multiprocessor support and larger L2 caches. Intel CPUs are a good choice if you need good real-time 3D graphics or if you’re doing a lot of media encoding that is SSE2-enhanced. Large L2 caches matter most when you’re dealing with repetitive data and repetitive tasks, where the same working set is reused. Data that changes constantly won’t benefit as much.
For our scientific research, it’s important to have extreme precision throughout the entire calculation. Although SSE2 is incredibly fast, its floating-point math tops out at 64-bit double precision. The standard x87 FPU maintains 80-bit extended precision. In case you’re wondering how that compares to the GeForce FX’s 128-bit precision, remember that the 128 bits are split into only 32 bits per channel (red, green, blue, alpha). The FPU is dealing with a single number.
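To make the precision gap concrete, here is a small Python sketch. It is not taken from our research code. It assumes the standard layouts, in which a 64-bit double carries a 53-bit significand and the 80-bit x87 extended format carries a 64-bit significand. The helper name round_to_significand is hypothetical, introduced only for this illustration, and it emulates rounding with exact fractions rather than touching the real FPU.

```python
from fractions import Fraction

DOUBLE_BITS = 53    # significand bits in a 64-bit double (what SSE2 works with)
EXTENDED_BITS = 64  # significand bits in the 80-bit x87 extended format


def round_to_significand(x: Fraction, bits: int) -> Fraction:
    """Round a positive value to the nearest number with `bits` significand bits.

    Illustrative helper (not a real FPU API); ties round to even.
    """
    if x <= 0:
        raise ValueError("this sketch only handles positive values")
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    # Scale so the significand becomes an integer with `bits` bits, then round.
    scale = Fraction(2) ** (bits - 1 - e)
    return Fraction(round(x * scale)) / scale


# A small correction term added to 1, as might appear late in a long calculation.
tiny = Fraction(1, 2 ** 60)
value = 1 + tiny

as_double = round_to_significand(value, DOUBLE_BITS)
as_extended = round_to_significand(value, EXTENDED_BITS)

print("correction survives in 53 bits:", as_double == value)
print("correction survives in 64 bits:", as_extended == value)

# Python floats are 64-bit doubles, so the native result agrees with the 53-bit case.
native_lost = (1.0 + 2.0 ** -60) == 1.0
print("native double drops the correction:", native_lost)

# The GeForce FX comparison: 128 bits spread over four colour channels.
bits_per_channel = 128 // 4
print("bits per channel:", bits_per_channel)
```

The term 2^-60 is simply too small to register next to 1 in a 53-bit significand, while a 64-bit significand still holds it. Over millions of accumulated operations, those dropped bits are where the two formats start to diverge.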
The other funny thing about research is that while, by definition, we’re at the cutting edge of information, the tools we use are usually old (and reliable). So, most academic computing software is not optimized for SSE2 instructions, even in portions where 64-bit precision is sufficient. This is where the AMD Athlon CPU comes in handy. The Athlon FPU has three parallel floating-point pipelines: one for adds, one for multiplies, and one for stores. The Pentium 4 has one full FP execution unit, and one that can only move or store numbers.
Clearly, for pure high-precision FPU tasks such as our project, the Athlon architecture is superior to the Pentium 4. A system you build in the future may have different requirements. You might not care about FPU performance, and instead need a very large L2 cache. Choose wisely.
Athlon MP versus Athlon XP
Techies all know that these two CPUs are the same, right? They’re wrong.
It’s true that the Athlon MP and XP are built as identical cores at the fab. In fact, you can tweak your L5 bridges on your Athlon XP to enable support for dual processors. Nonetheless, this still doesn’t mean they’re the same thing. Have you ever wondered why the Athlon MP lags behind the Athlon XP in megahertz? The flagship Athlon MP is only at 2400+ while Athlon XP is at 2800+…
Athlon MPs are binned Athlon XPs. No two fabricated CPU cores are absolutely identical, and the Athlon MP represents AMD’s best product. The goal for the MP line of chips is lower temperatures at the same clock speed, which is why its clock ramp-up falls behind the Athlon XP line. Stability is the obvious benefit, but recall also that rackmount servers don’t have the exotic cooling solutions that your desktop may have. Binning is one way AMD ensures a superior product for multi-processor systems. The other is timing: the first Athlon MP 2400+ is going to have a later CPU stepping than the first Athlon XP 2400+.
For our system, we went with two Athlon MP 2400+ CPUs, AMD’s fastest MP processor.