NVIDIA GeForce GTX 480/GTX 470 Performance Preview
The GeForce GTX 480
GeForce GTX 470 Reference Board
It’s been six months since ATI ushered in the DirectX 11 era of gaming with the launch of the Radeon HD 5870 last September. Since then, the red team has methodically introduced a top-to-bottom range of DX11 parts spanning price points from as little as $50 all the way up to $700. Meanwhile NVIDIA – the company that invented the six-month product cycle – has been left to counter with fun facts and performance teasers on Facebook and YouTube.
While we did get a really informative deep dive on the architecture back in January, the information blackout since then has left a void that the rumor mill has been more than happy to fill.
Fortunately, that all changes today. All the remaining questions you’ve been dying to have answered for the last two months (clock speeds, price points and, most importantly, performance) are about to be addressed.
But first, we’re going to give you a really brief crash course on NVIDIA’s new architecture. The most dramatic new feature – for gamers at least – is the PolyMorph Engine, which has been designed to deliver breakthrough levels of tessellation performance; tessellation is arguably DX11’s most defining new feature. Rather than handling geometry processing at the front of the pipeline, where it’s traditionally done, NVIDIA has incorporated it directly into the shading clusters found in GeForce GTX 480, which NVIDIA refers to as streaming multiprocessors (SMs). Each SM has its own dedicated hardware for tessellation and other geometry processing. All told, GeForce GTX 480 has 15 tessellation units, one per SM.
GeForce GTX 480 sports twice as many stream processors as its predecessor, with 480 CUDA Cores compared to GTX 285’s 240. And while its memory interface is narrower at 384-bit (GTX 285 used a 512-bit bus), thanks to the use of GDDR5 memory it actually features more available memory bandwidth.
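To make the bandwidth claim concrete, here is a rough back-of-the-envelope sketch in Python. Only the 384-bit width comes from the text above; GTX 285’s 512-bit bus and both effective data rates (3696 MT/s for GTX 480’s GDDR5, 2484 MT/s for GTX 285’s GDDR3) are assumed reference-board figures that this section does not state, and the function name is purely illustrative.

```python
def memory_bandwidth_gbs(bus_width_bits: int, effective_rate_mtps: float) -> float:
    """Peak theoretical memory bandwidth in GB/s (1 GB = 1e9 bytes).

    bus_width_bits      -- width of the memory interface in bits
    effective_rate_mtps -- effective data rate in mega-transfers per second
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_rate_mtps * 1e6 / 1e9


# Assumed reference specifications (not stated in this section).
gtx480_bw = memory_bandwidth_gbs(384, 3696)  # GDDR5 on a 384-bit bus
gtx285_bw = memory_bandwidth_gbs(512, 2484)  # GDDR3 on a 512-bit bus

print(f"GTX 480: {gtx480_bw:.1f} GB/s")
print(f"GTX 285: {gtx285_bw:.1f} GB/s")
```

Under those assumed clocks, the narrower bus still comes out ahead because GDDR5’s higher data rate more than offsets the 25 percent reduction in width.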
NVIDIA has also made several tweaks to GeForce GTX 400’s ROP subsystem. Each ROP partition now contains eight units, double that of prior architectures. With six partitions, this adds up to 48 ROPs total for the GTX 480, 16 more than GT200’s 32. The ROPs also boast more efficient compression and higher clock speeds.
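The ROP arithmetic above can be checked with a few lines of Python. The per-partition and total counts come from the text; GT200’s partition count is only inferred from those figures rather than stated here, and all names are illustrative.

```python
def total_rops(partitions: int, rops_per_partition: int) -> int:
    """Total ROP count for a GPU with identical ROP partitions."""
    return partitions * rops_per_partition


# Figures taken from the text.
GTX480_PARTITIONS = 6
GTX480_ROPS_PER_PARTITION = 8

# "Double that of prior architectures" implies four ROPs per partition before.
GT200_ROPS_PER_PARTITION = GTX480_ROPS_PER_PARTITION // 2

gtx480_rops = total_rops(GTX480_PARTITIONS, GTX480_ROPS_PER_PARTITION)
gt200_rops = gtx480_rops - 16  # the text says GTX 480 has 16 more than GT200

# Inferred, not stated in the text: how many partitions GT200 would need.
gt200_partitions = gt200_rops // GT200_ROPS_PER_PARTITION

print(f"GTX 480 ROPs: {gtx480_rops}")
print(f"GT200 ROPs: {gt200_rops} across {gt200_partitions} partitions")
```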
Between more ROPs, better compression, and higher clock speeds, GeForce GTX 400’s AA performance should improve over GeForce GTX 200’s, particularly when it comes to scaling up to 8xAA.
Going forward, game developers are going to increasingly incorporate GPU compute into their newest titles, especially now that it’s a part of DX11. Games like Just Cause 2 and Metro 2033 utilize GPU compute today for eye candy effects like depth of field, and of course GPU compute can be used to tackle more realistic game physics. Along these lines, GeForce GTX 400 boasts faster context switching, allowing the GPU to switch between graphics and, say, PhysX more quickly, and it can execute multiple kernels simultaneously. NVIDIA even envisions a future where select portions of a scene, like reflections, may be handled by the GPU with ray tracing.
That’s just the really quick synopsis of GeForce GTX 400’s new architecture though. We discuss it in much greater detail in our GF100 “Fermi” Architecture Overview article, so you’ll want to head there for more specifics.