Intel Core i7 (Nehalem) Performance Preview
It's been a little over two years since Intel introduced the world to their first Core 2 processors, built on their next-generation Conroe microarchitecture. Based loosely on their Pentium M "Yonah" CPU core, Conroe restored Intel's leadership position in CPUs. The chip boasted a wider execution core, allowing the processor to complete up to four full instructions simultaneously, along with a more efficient 14-stage pipeline that improved IPC (instructions per clock) in comparison to Pentium 4/D.
If you recall, low IPC was one of the chief weaknesses of Core 2's predecessor, Pentium 4/D. Pentium 4 processors sacrificed the amount of work performed per clock in exchange for more pipeline stages, 31 in the case of later Pentium D processors. Essentially, Intel made a conscious decision to sacrifice IPC in exchange for higher clock speeds. Ultimately this decision came back to haunt them when Pentium 4/D had trouble scaling to clock speeds of 4GHz and beyond.
Core 2 never hit the clock speeds of Pentium 4, but because of its improved IPC, it didn't have to in order to achieve breakthrough performance.
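To put the IPC-versus-clock tradeoff in concrete terms, here is a minimal sketch that treats throughput as simply IPC multiplied by clock speed. The function names and the IPC and clock figures are hypothetical, picked only to illustrate the arithmetic; they are not measurements of any Core 2 or Pentium 4 part, and real-world performance depends on far more than these two factors.

```python
def instructions_per_second(ipc: float, clock_ghz: float) -> float:
    """Rough throughput model: instructions completed per second."""
    return ipc * clock_ghz * 1e9


def relative_speedup(ipc_a: float, clock_a_ghz: float,
                     ipc_b: float, clock_b_ghz: float) -> float:
    """How many times faster chip A is than chip B under this model."""
    return (instructions_per_second(ipc_a, clock_a_ghz)
            / instructions_per_second(ipc_b, clock_b_ghz))


if __name__ == "__main__":
    # Hypothetical figures (assumptions, not benchmarks): a high-IPC chip
    # at 2.4GHz versus a low-IPC, deeply pipelined chip at 3.6GHz.
    speedup = relative_speedup(ipc_a=2.0, clock_a_ghz=2.4,
                               ipc_b=1.0, clock_b_ghz=3.6)
    print(f"High-IPC chip is {speedup:.2f}x faster despite the lower clock")
```

Under these made-up numbers the lower-clocked chip still comes out ahead, which is the same reasoning behind Core 2 not needing Pentium 4's clock speeds.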
But Intel didn't stop there. To further enhance performance, Core 2 also featured more accurate branch prediction, improved SSE/SSE2/SSE3 performance, and a unified L2 cache, with more advanced prefetchers residing in the L1 and L2 caches to reduce memory access latency.
Ultimately Core 2 was over two times faster than Intel's previous Pentium processor, and it also significantly outperformed AMD's fastest Athlon X2 and FX processors, all while consuming very little power and leaving tons of frequency headroom for overclockers. It wasn't uncommon for Core 2 Duo E6300 and E6400 chips to push 3GHz.
Late last year Intel gave Core 2 a midlife upgrade with their Penryn architecture. Besides its smaller 45-nm manufacturing process, Penryn also featured double the divider speed over Conroe when handling math computations and a new super shuffle engine. This is a 128-bit wide, single-pass shuffle unit that improved Penryn's performance with SSE2, SSE3, and SSE4 instructions that have shuffle-like operations.
Penryn was also the first Intel processor to support SSE4.
The final ingredients Intel added to Penryn to improve performance were faster bus speeds and a larger L2 cache. Quad-core chips shipped with up to 12MB of L2 cache while dual-core parts featured 6MB of L2.
As a result of all these improvements, Penryn generally performed around 10-15% faster than Conroe/Kentsfield clock-for-clock. In apps that took advantage of SSE4, this advantage was even greater. In comparison, AMD's fastest Phenom CPU, the Phenom 9950, is just now approaching the performance of Intel's older quad-core Kentsfield CPUs like the Core 2 Quad Q6600 and Q6700.
And now, just as AMD nears the arrival of their first 45-nm CPUs, Intel's back again with the "tock" of their tick-tock model that follows every process shrink (in this case Penryn) with a next-generation microarchitecture (Nehalem) each year.
As you probably know by now, Intel's next-generation microarchitecture (previously codenamed Nehalem) was officially given a brand name by Intel in August of this year: Core i7. Over the course of the past 18 months, Intel has slowly divulged most of the tech goodies that make up Core i7, including its integrated memory controller, Intel's QuickPath Interconnect (Intel's equivalent of AMD HyperTransport that previously went under the codename CSI), its new L3 cache, the return of Hyper-Threading, and Nehalem's Turbo Mode. We're going to briefly go over these changes before we take a look at the new Core i7 platform and the processors behind it.