FPUMark
The next test was FPUMark 99. The results of this one sure caused a stir! Prior AMD CPUs had adequate floating point capabilities, but Intel clearly had the advantage in this arena. The problem for gamers is that some games, 3D titles in particular, rely heavily on floating point operations. The 3DNow! implementation partially alleviated this, but most of us were waiting anxiously for the next AMD CPU and an improved floating point unit. In the test that we ran, however, the K7 scored 1590. The PIII 500 got a whole 2550! Why was that?
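To put the two scores in proportion, here is a quick back-of-the-envelope calculation. It uses only the two FPUMark figures quoted above; the variable names are our own and purely illustrative.

```python
# FPUMark 99 scores quoted in the text above.
k7_score = 1590
piii_500_score = 2550

# Fraction of the PIII 500 score that the K7 reached.
k7_fraction = k7_score / piii_500_score

# How far ahead the PIII 500 was, relative to the K7 score.
piii_lead = piii_500_score / k7_score - 1

print(f"K7 reached {k7_fraction:.1%} of the PIII 500 score")  # 62.4%
print(f"PIII 500 led the K7 by {piii_lead:.1%}")              # 60.4%
```

In other words, the K7 sample managed a little under two thirds of the PIII 500's score in this one test.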
Revised Floating Point
The K7's floating point unit, on paper, is nothing to scoff at. It has 3 fully pipelined FPUs: one for addition, one for multiplication, and the last for FSTORE operations. The PIIs/PIIIs have 2 FPUs, only one of which is fully pipelined. We received several good suggestions on this, which we tend to agree with:
"Because the FPU is pipelined so radically different than the P2/IIIs I wouldn't be surprised if it is severely handicapped while running code generally optimized for the P2 core. While the K6 core didn't suffer all that badly when running FP code meant for the P2 I think you'll find that due to the heavy rearchitecting of the K7's FPU that it will have a tough time running code that has been hand optimized for the P2 core (which Quake 2 is)."
Well put! The K7 is a significantly different architecture from both the previous AMD chips and Intel's. Since Intel has been the dominant force in the market, most if not all consumer applications are optimized for Intel processors. Another reader elaborates a bit:
"The P6, as was explained in the article, has only 2 FPU units, of which only one is fully pipelined. A compiler will only create machinecode to look for two of these units, since no processors (x86) exist with more. So why waste cycles looking for extra ones?"
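The readers' argument, that the order in which instructions are emitted decides how many floating point units stay busy, can be sketched with a toy model. This is only an illustration under loud assumptions: the function `cycles_to_issue`, the unit table `TOY_UNITS`, and the strict in-order issue rule are all hypothetical and are not a model of the real K7 or P6 cores. Both of those reorder instructions in hardware, and the sketch ignores latencies and data dependencies entirely, so it overstates the effect.

```python
def cycles_to_issue(ops, units):
    """Count cycles for a toy in-order machine to issue every op.

    ops   -- op kinds in program order, e.g. ["add", "mul", "store"]
    units -- dict mapping op kind to the number of pipelined units of that kind

    Each pipelined unit accepts one new op per cycle. Issue stops for the
    current cycle at the first op whose unit kind is already fully used.
    """
    cycles = 0
    i = 0
    while i < len(ops):
        free = dict(units)  # every unit is available again at the start of a cycle
        issued = 0
        while i < len(ops) and free.get(ops[i], 0) > 0:
            free[ops[i]] -= 1
            i += 1
            issued += 1
        if issued == 0:
            raise ValueError(f"no unit can execute {ops[i]!r}")
        cycles += 1
    return cycles


# Hypothetical machine: one add unit, one multiply unit, one store unit.
TOY_UNITS = {"add": 1, "mul": 1, "store": 1}

# The same twelve operations in two different orders.
grouped = ["add"] * 4 + ["mul"] * 4 + ["store"] * 4
interleaved = ["add", "mul", "store"] * 4

print(cycles_to_issue(grouped, TOY_UNITS))      # 10
print(cycles_to_issue(interleaved, TOY_UNITS))  # 4
```

The work is identical in both cases; only the ordering changes. In this toy, the grouped stream leaves two of the three units idle on most cycles, which is the flavor of the problem the readers describe when code scheduled for one core runs on another.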
Catalyst for Change
That reader's point backs up the argument that until code is optimized to take advantage of the K7's architecture, we may not see the high results that we should be seeing. This applies especially to the K7's redesigned floating point unit. Finally, another reader made a great point regarding "revolutionary" CPU architectures:
"If Merced existed today, you can be certain that existing benchmarks would show little if any improvement, simply because the object code spit out by today's compiler doesn't know how to take advantage of Merced's advanced architecture."