Summary: With a brand new shader model 3.0 architecture and blazing clock speeds (including a 1.5GHz memory data rate), ATI's RADEON X1800 XT and X1800 XL are built for performance. But are they enough to dethrone NVIDIA's GeForce 7800 GT and 7800 GTX? Find out in our latest article!
At the time it looked as if ATI had pulled off a major coup, as archrival NVIDIA had nothing to compete against ATI’s onslaught of new products, opting instead to cut prices and pitch its SLI solution to enthusiasts.
Of course, by now we all know how the story ultimately played out: ATI’s new cards were late to market, and when they did arrive, street prices were well over MSRP. The RADEON X800 XL proved wildly popular once prices dropped, but ATI’s $200 RADEON X800 128MB SKU remained missing in action for most of the first half of 2005, while the RADEON X850 XT Platinum Edition’s early availability problems prevented the card from picking up many of the early adopters that are so critical when a new graphics card launches.
When NVIDIA shocked the world with immediate availability of its GeForce 7800 GTX and GeForce 7800 GT GPUs, ATI was forced to slash prices on all of its cards. This move killed the profit margins ATI had been enjoying on its high-end cards, but it was necessary to keep product moving. A perfect example is ATI’s stealth launch from earlier this summer: in August, ATI quietly announced two new mainstream SKUs, the RADEON X800 GT and X800 GTO. Both cards are merely re-badged R480 and R423 chips that went from powering $350+ RADEON X800/X850 boards to $150-$200 X800 GT/GTO cards. They both retain ATI’s robust 256-bit memory interface, but they’ve had pipelines disabled so they don’t perform too closely to their more expensive cousins (the X800 GT features 8 pixel pipelines, while the X800 GTO gets 12).
As you can imagine, moves like this have had a significant impact on ATI’s balance sheet as of late. Just last month, ATI lowered its revenue forecast for the quarter by roughly $100 million, citing declining margins: ATI’s desktop product line missed both in units and average selling prices for the retail and add-in-board (or System Integrator) channels. "This has clearly been a challenging and disappointing quarter for ATI and we are committed to resolving our operational issues," said David Orton, President and Chief Executive Officer of ATI Technologies. "Despite our short term difficulties, we are optimistic about the future. We continue to gain traction in our integrated and consumer businesses. We are also confident that our upcoming desktop product launch will allow us to reclaim top-to-bottom technology leadership in discrete graphics."
The “upcoming desktop product launch” Orton refers to in the statement above is finally ready for a formal introduction: today ATI is introducing an entire family of new products ranging from the low end to the high end. ATI hopes these new products will not only help get its financials back on track, but also re-establish the leadership position it enjoyed when it first introduced the world to DX9 graphics three years ago.
The cards we’re taking a look at, the RADEON X1800 XL and X1800 XT, are ATI’s first steps in that direction. But do they deliver enough to trounce the competition? That’s what we’re here today to find out!
Sure, each of these follow-up cards delivered substantially more performance than ATI’s original RADEON 9700 PRO, but ATI didn’t exactly reinvent the wheel in the process. This was beginning to catch up with the company, as an increasing number of shader model 3.0 titles shipped over the past six months (fortunately for ATI, it was largely able to work around the dilemma with clever patches that added 2.0b shader support to the most prominent titles).
With RADEON X1800, all that has changed, as the R520 VPU it’s based on has been built from the ground up on an entirely new architecture. Let’s take a look at the specs:
90 Nanometer Technology
New Performance Architecture
New Memory Controller Design
Next-Generation Image Quality
Arguably the most notable items in the specification list above are the X1800 XT’s high clock speeds (600MHz+, roughly 200MHz faster than the GeForce 7800 GTX) and its pipeline count. After rumors that R520 would boast 24 or even 32 pixel pipelines, we now see that the chip sports the same 16 pipelines as its predecessor (with one texture unit per pixel pipeline). With approximately 320 million transistors inside (twice that of the RADEON X800 from a year ago), however, ATI has obviously incorporated quite a few new features into R520. ATI is quick to point out that the RADEON X1800 supports HDR with multisample AA, a feature NVIDIA currently doesn’t offer on any of its GPUs. Apple Cinema Display users will also be happy to know that the RADEON X1800 XT supports dual-link DVI.
Of course, with a new architecture comes new nomenclature to read up on. With that in mind, let’s quickly go over a few of the most notable changes ATI has integrated into the RADEON X1800.
The RADEON X1800 is ATI’s first shader model 3.0 graphics part. As we learned with the GeForce 6800 launch a year ago, shader model 3.0 brings with it support for more instructions, thus allowing developers to write more complex shader programs. In addition to this, another important feature that shader model 3.0 added was dynamic branching (flow control), allowing developers to add loops to their programs.
This feature was designed to make writing shaders easier for developers; one common example is multiple light sources. In previous shader models, the developer would have to write a shader for each light. Dynamic branching makes it possible to write a single shader that loops through the active lights and exits once all of them have been processed, reducing both the number of distinct shaders in a title and the complexity of managing them.
Besides eased development, shader model 3.0 also presents potential performance improvements. For example, developers can use dynamic branching to skip large portions of code that are determined to be unnecessary, and thus help to speed up the shader.
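The ideas above are easy to see outside of shader code. Here is a hypothetical sketch in Python (standing in for real HLSL/GLSL shaders), with made-up light values; it illustrates the concept only, not ATI's or Microsoft's actual shader machinery:

```python
# A toy contrast between the SM 2.0-era approach (one unrolled shader
# per light count) and the SM 3.0 approach (one looped shader with a
# dynamic branch). All values and function names are illustrative.

def shade_sm2_style(pixel, lights):
    # Without loops, developers effectively shipped a separate
    # unrolled shader variant for each light count.
    if len(lights) == 1:
        return pixel + lights[0]
    elif len(lights) == 2:
        return pixel + lights[0] + lights[1]
    raise ValueError("need yet another shader variant")

def shade_sm3_style(pixel, lights):
    # One shader loops over however many lights are active, and a
    # dynamic branch skips work that contributes nothing.
    color = pixel
    for light in lights:
        if light == 0.0:   # branch: skip a dark light entirely
            continue
        color += light
    return color

print(shade_sm3_style(0.1, [0.2, 0.0, 0.3]))  # one shader, any light count
```

The branch in the loop is also where the performance upside comes from: the zero-contribution light is skipped rather than computed and discarded.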
If not used carefully, however, branching can hurt performance. With the RADEON X1800, ATI sought to improve both branching and texture fetching. After all, if a pixel shader needs to look up a texture value that isn’t in the texture cache, it must go out to graphics memory, which can introduce hundreds of cycles of latency.
To improve flow control, ATI breaks the pixel processing workload down into a large number of small threads, an approach it calls ultra-threading. Each thread consists of a small 4x4 block of 16 pixels on which the same shader code is executed.
Second, ATI adds dedicated flow control logic. The RADEON X1800 features an ultra-threading dispatch processor, a central dispatch unit that tracks and distributes up to 512 threads across the RADEON X1800’s shader processors. Each shader processor consists of four pixel shaders, what has traditionally been referred to as a “quad”. Each processor is autonomous and contains its own dedicated branch unit to help eliminate flow control overhead in the shader processors.
Whenever the dispatch processor determines that a core has become idle, it assigns the core a new thread to execute. If a thread is waiting on data, it is temporarily suspended until that data becomes available, freeing the ALUs to work on other threads. ATI claims this enables the RADEON X1800’s pixel shader cores to maintain over 90% utilization in practice, with negligible idle time regardless of the shader code being run.
In closing, ATI feels that by breaking the pixel processing workload into smaller threads, the RADEON X1800 works more efficiently. Ultra-threading also hides the latency normally encountered with texture fetching. Meanwhile, the X1800’s dedicated flow control logic minimizes shader processor idle times and wasted cycles. All this adds up to improved flow control, which will become increasingly important as developers continue to implement branching in their code.
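The latency-hiding effect is easy to demonstrate with a toy simulation of a single shader core. The three-step thread program and the latency figure below are illustrative assumptions on our part, not ATI's actual numbers:

```python
from collections import deque

def utilization(num_threads, fetch_latency=4):
    # Fraction of cycles the core spends doing useful work. Each toy
    # thread runs: one ALU op, a texture fetch taking `fetch_latency`
    # cycles, then one final ALU op.
    ready = deque((t, 0) for t in range(num_threads))  # (thread id, step)
    waiting = []                                       # (wake-up cycle, thread id)
    clock = busy = finished = 0
    while finished < num_threads:
        # Resume any threads whose fetched data has now arrived.
        due = [w for w in waiting if w[0] <= clock]
        waiting = [w for w in waiting if w[0] > clock]
        ready.extend((t, 1) for _, t in due)
        if ready:
            t, step = ready.popleft()
            busy += 1                                  # core did work this cycle
            if step == 0:
                # First ALU op issues a fetch; suspend the thread so
                # the core is free while the data is in flight.
                waiting.append((clock + 1 + fetch_latency, t))
            else:
                finished += 1                          # final ALU op done
        clock += 1                                     # idle cycle if nothing was ready
    return busy / clock

print(f"1 thread:  {utilization(1):.2f}")   # stalls dominate
print(f"8 threads: {utilization(8):.2f}")   # latency fully hidden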
One of the most important changes ATI has made is to the number of memory controllers in the RADEON X1800. Whereas previous ATI products featured four 64-bit memory controllers, the RADEON X1800 doubles that to eight controllers, each 32 bits wide. With more controllers onboard, the X1800 can serve more read/write requests simultaneously, increasing efficiency.
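A quick sketch shows why more, narrower controllers can help. The 32-byte interleave granularity below is an assumption for illustration; ATI has not published its actual address mapping:

```python
# Hypothetical address interleaving across memory controllers:
# consecutive granules of the address space rotate across the
# available controllers, spreading independent requests out.

GRANULE = 32  # bytes mapped to one controller before moving on (assumed)

def controller(addr, num_controllers):
    # Which controller services this address under round-robin interleave.
    return (addr // GRANULE) % num_controllers

requests = [0, 32, 64, 96, 128, 160, 192, 224]  # eight independent reads

# With four wide controllers, requests queue up two-deep; with eight
# narrow ones, every request lands on its own controller.
print(sorted({controller(a, 4) for a in requests}))
print(sorted({controller(a, 8) for a in requests}))
```

Same total bus width either way; the win is in how many independent requests can be in flight at once.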
In addition to reworking the memory controller configuration, ATI also switched to a ring bus architecture. According to ATI, this new design was necessary to allow the Radeon X1800 to hit much higher memory clocks more efficiently than previous designs.
The ring bus consists of two 256-bit rings and four ring stops, one for each pair of memory controllers. To simplify the routing of the wires and to provide a cleaner signal at high clocks, ATI routes the ring bus around the outer edge of the chip. Data then travels between ring stops until it reaches its destination. The two rings run in opposite directions to ensure that this happens as quickly as possible.
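The payoff of running the two rings in opposite directions can be sketched in a few lines of Python; the hop counts here are our own illustration of the topology ATI describes, not measured chip behavior:

```python
# With four ring stops and two counter-rotating rings, data can
# always take whichever direction is shorter, so no transfer travels
# more than half-way around the chip.

STOPS = 4  # one ring stop per pair of memory controllers

def hops(src, dst):
    clockwise = (dst - src) % STOPS
    counterclockwise = (src - dst) % STOPS
    return min(clockwise, counterclockwise)  # pick the shorter ring

print([hops(0, d) for d in range(STOPS)])  # worst case: 2 hops out of 4 stops
```

With only one ring direction, the worst case would be three hops; the second ring caps it at two.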
ATI has also moved from a direct-mapped cache to a fully associative cache for its texture, color, and depth/stencil buffer caches, and has integrated new arbitration logic that’s more efficient at managing the read/write requests sent to the memory controllers. Finally, ATI has implemented better hidden surface removal and better compression in the X1800. For instance, ATI claims that its implementation of hierarchical Z in the RADEON X1800 can cull up to 50% more hidden pixels than previous ATI architectures, thanks to a more accurate visibility checking algorithm.
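For readers unfamiliar with hierarchical Z, the core trick is rejecting whole tiles of pixels with a single conservative depth test. The sketch below is a generic illustration of that idea, with a made-up depth convention and tile bookkeeping, not ATI's implementation:

```python
# Simplified hierarchical Z: instead of testing all sixteen depths in
# a 4x4 tile, keep one conservative value per tile (the farthest depth
# already stored, smaller = closer) and reject whole blocks at once.

FAR_PLANE = 1.0
tile_max_z = {}  # tile id -> farthest stored depth in that tile

def block_occluded(tile, block_nearest_z):
    # The whole incoming block is hidden if even its nearest fragment
    # lies behind everything already drawn in the tile.
    return block_nearest_z > tile_max_z.get(tile, FAR_PLANE)

tile_max_z[(0, 0)] = 0.4            # something close was drawn here
print(block_occluded((0, 0), 0.7))  # True: the whole tile is skipped
print(block_occluded((0, 0), 0.2))  # False: must be shaded normally
```

A more accurate per-tile bound lets more of these coarse tests succeed, which is presumably where ATI's claimed 50% improvement in culled pixels comes from.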
Looking over the specs, the memory subsystem of ATI’s RADEON X1800 XT is certainly impressive: ATI is delivering considerably more bandwidth than anything else on the market, although its fill-rate and the GeForce 7800 GTX’s are pretty close, with the 7800 GTX holding a slight advantage. The X1800 XL and the GeForce 7800 GT are pretty evenly matched on paper in terms of raw performance metrics, although finding a stock GeForce 7800 GT board that runs at NVIDIA’s default frequencies may prove a little difficult: practically every manufacturer is overclocking their boards now.
On paper ATI’s pricing is competitive with NVIDIA’s, but the reality is that the GeForce 7800 GTX can easily be found for under $500, while GeForce 7800 GT listings on Price Watch start as low as $351. Based on this, we’re inclined to give the price advantage to NVIDIA. We should also mention that ATI is producing a 256MB RADEON X1800 XT SKU priced at $499 with the same clocks (625MHz core/750MHz memory). The board we’re reviewing today ships with 512MB of memory and is therefore priced a little higher at $549.
As you can see in the pictures, ATI’s new flagship board, the RADEON X1800 XT, is one daunting-looking card. You’ve got dual-slot cooling, just like the RADEON X850 XT Platinum Edition; in fact, ATI borrows the exact same fan that was used on the X850 XT PE for the RADEON X1800 XT.
Like the RADEON X850 XT, the fan operates dynamically, with the RPMs ranging from mild to wild based on temperature. Fortunately during our testing with the card, the only time we ever saw the wild fan setting was when we were booting up the system.
Once again ATI uses a ducted cooling system: the RADEON X1800 XT’s fan draws in air from within your PC and passes it across the VPU and its memory before it finally exits your system’s case. ATI uses a combination of copper and aluminum to cool the graphics core and memory modules, while a bank of VRM circuitry is cooled by a second aluminum heatsink.
ATI continues to provide OVERDRIVE support solely for their XT cards such as the RADEON X1800 XT.
ATI’s RADEON X1800 XL board is a single-slot design with an all-new copper/aluminum heatsink combo that looks rather unassuming but actually generates quite a bit of noise. Like the XT, its fan dynamically adjusts its RPMs based on temperature, only it spins at higher RPMs, resulting in more noise when it cranks up, something that unfortunately happened frequently, even at the Windows desktop. We’d really like to see ATI integrate a quieter fan on future X1800 XL boards.
So far ATI’s been pretty mum on CrossFire details for RADEON X1800. All they’ve acknowledged is that they’ve implemented a new compositing engine chip which supports resolutions of 2048x1536 @70Hz+ for the X1800 and X1600, while the RADEON X1300 will use the PCI Express bus to link the two cards together. ATI expects to deliver more news on the CrossFire topic sometime next month.
Like NVIDIA, ATI provides a new anti-aliasing mode designed to improve anti-aliasing image quality on thin-lined objects such as chain-link fences and jungle foliage. Dubbed “adaptive anti-aliasing”, ATI’s mode uses the same trick NVIDIA’s does, combining the image quality of supersampling with the speed of multisampling. We’ve provided a few screenshots taken from Half-Life 2 illustrating the image quality improvement adaptive anti-aliasing brings:
You can probably see the difference just scrolling through the screenshots, but we’ve included a crop to really highlight the difference adaptive AA brings:
RADEON X1800 XT 4xMSAA
RADEON X1800 XT 4xMSAA w/Adaptive AA
And now let’s see how ATI’s adaptive anti-aliasing compares to NVIDIA’s transparency AA:
GeForce 7800 GTX with 4xTransparency AA Supersampling
RADEON X1800 XT 4xMSAA w/Adaptive AA
Both cards do a really good job of smoothing out the jaggies, although the edges on the RADEON X1800 XT appear a bit softer. The bar that runs across the tree has fewer jaggies on the GeForce board, however.
We also ran a quick performance comparison:
Pacific Fighters - OpenGL
Far Cry – Direct3D
IL-2: FB – OpenGL
Half-Life 2 – Direct3D
Battlefield 2 – Direct3D
F.E.A.R. Beta – Direct3D
After shocking the graphics world by introducing the world’s first DirectX 9 graphics processor, ATI’s finally released a brand new part that’s designed from the ground up to take advantage of the latest shader model 3.0 games. And one thing is for sure, just looking at the paper specs of ATI’s latest high-end offerings, it’s pretty clear that ATI was shooting for something big.
ATI starts with TSMC’s 90-nanometer manufacturing process. ATI had previously played it conservative, sticking with TSMC’s proven but larger 0.15-micron process for the RADEON 9700 and moving its high-end to 0.13-micron only after first testing the process on a variety of mainstream parts. This time around, ATI shot for it all, manufacturing a top-to-bottom lineup of graphics parts on 90-nanometer. Since the RADEON X800 days, ATI has been adamant that it couldn’t do shader model 3.0 properly until it hit 90-nanometer. ATI felt that shader model 3.0’s increased precision (among other things) required too many additional transistors to affordably produce a top-to-bottom lineup of parts on a larger manufacturing process, particularly for the mainstream and value segments. Clearly NVIDIA felt differently on the matter, releasing a slew of shader model 3.0 products last year.
It’s because of this smaller manufacturing process that ATI also shot high on their clock speeds. The X1800 XT runs at 625MHz, roughly 200MHz higher than NVIDIA’s GeForce 7800 GTX, and the chip is paired with 1.5GHz memory. It’s rumored that ATI shot for even higher clocks, but ultimately had to settle for the slower speeds in order to go into full production.
So do all these new technologies add up to a better product?
Clearly the RADEON X1800 XT and XL shined in many of our benchmarks. Half-Life 2, Far Cry, and Battlefield 2 all tended to favor the RADEON cards; on the other hand, the OpenGL titles we tested, including the flight sims and DOOM 3, all performed better on GeForce. ATI’s new adaptive AA mode is a worthy competitor to NVIDIA’s transparency AA, with both cards using the same method to accomplish the same task. In our opinion, the most critical deciding factor is going to come down to price and availability.
It’s here that NVIDIA’s GeForce cards excel. Not only are the GeForce 7800 GTX and GeForce 7800 GT already available on the market, they’re also selling for prices well below MSRP. The GeForce 7800 GT can already be found online for around $370, while the 7800 GTX sells for roughly $100 more. Keep in mind, too, that the GeForce 7800 GT and GTX boards we tested today were running at NVIDIA’s stock clock speeds: most of NVIDIA’s board partners aren’t running their cards at stock clocks, instead overclocking them by 10% or more. And here’s where we run into the big unknown: when will the X1800 XT and X1800 XL be available to the general public?
According to ATI, the X1800 XL should be hitting shelves now, but a quick scan of online retailers reveals nothing. Meanwhile, the X1800 XT won’t be available for another month. Considering ATI’s recent track record in regards to availability, we’ll believe all this when we actually see it. We’re also eager to see how the 256MB RADEON X1800 XT performs. With its blazing clock speeds and a $500 price point, we have a feeling that it may end up being the most attractive of the X1800 offerings ATI has.
Make no mistake about it, the story isn’t over as far as we’re concerned. With CoD 2, Serious Sam 2, F.E.A.R. and Quake 4 all set for introduction in the coming weeks, we’ll be revisiting this topic in the days ahead.
© Copyright 2003 FS Media, Inc.