New clock speeds
The most prominent change between the RADEON 9800 PRO and the RADEON 9700 PRO is the improvement in clock speeds. Core clock frequency jumps from 325MHz in RADEON 9700 PRO to 380MHz in RADEON 9800 PRO. This enhancement was made possible by optimizations to the core’s internal structure. Timings have been tightened and signal integrity has been improved, allowing the core to scale to higher clock speeds without excessive noise and heat.
Meanwhile, memory has been boosted from 310MHz to 340MHz in the case of the 128MB 9800 PRO. To account for the additional latency of its larger frame buffer, the 256MB RADEON 9800 PRO variant utilizes 350MHz DDR2 memory. Altogether, these changes boost fill-rate by roughly 17% (in line with the core clock increase from 325MHz to 380MHz), while memory bandwidth climbs from 19.8GB/sec in RADEON 9700 PRO to 21.8GB/sec in RADEON 9800 PRO (22.4GB/sec on the 256MB card).
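The bandwidth figures above can be reproduced from the memory clocks with some quick arithmetic. The sketch below is only a sanity check, and it assumes a 256-bit memory bus (four 64-bit channels) transferring data twice per clock, as DDR memory does. The function names are ours. It also prints the core clock ratio, which is what fill-rate would gain if the number of pixel pipelines is unchanged (again an assumption).

```python
# Peak memory bandwidth and core clock gain from the clocks quoted above.
# Assumptions: a 256-bit memory bus (four 64-bit channels) and two
# transfers per clock (DDR). Function names are illustrative only.

BUS_WIDTH_BITS = 256      # assumed: 4 x 64-bit memory channels
TRANSFERS_PER_CLOCK = 2   # DDR moves data on both clock edges


def bandwidth_gb_per_sec(memory_clock_mhz: float) -> float:
    """Peak bandwidth in GB/sec, with 1 GB taken as 10**9 bytes."""
    bytes_per_transfer = BUS_WIDTH_BITS // 8
    bytes_per_sec = memory_clock_mhz * 1e6 * TRANSFERS_PER_CLOCK * bytes_per_transfer
    return bytes_per_sec / 1e9


def percent_gain(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new / old - 1.0) * 100.0


if __name__ == "__main__":
    cards = [("9700 PRO", 310), ("9800 PRO 128MB", 340), ("9800 PRO 256MB", 350)]
    for name, mhz in cards:
        print(f"{name}: {bandwidth_gb_per_sec(mhz):.1f} GB/sec")
    print(f"core clock gain: {percent_gain(325, 380):.1f}%")
```

Under those assumptions the three memory clocks come out at 19.8, 21.8 and 22.4GB/sec, matching the quoted numbers, and the core clock gain works out to just under 17%.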
Besides the clock frequency changes, ATI has also made improvements to its pixel-shading engine, occlusion culling technology, and the memory controller itself. We’ll start with the updated pixel-shading engine, dubbed SMARTSHADER 2.1.
The key addition to the RADEON 9800 PRO’s pixel shading prowess is the new F-buffer present in SMARTSHADER 2.1. The F-buffer works like a form of cache memory, storing pixels that require multiple passes rather than writing them out to the frame buffer each time. This feature is meant to address RADEON 9700’s limit on shader instruction length. With RADEON 9700 limited to 64 instructions, some complex shader effects required the pixel shading engine to make multiple passes. While this produced lifelike images, it crippled performance in the process. The F-buffer eliminates some of the redundancy from the graphics pipeline, saving time and reducing memory bandwidth requirements.
ATI also likes to point out that the F-buffer allows it to support fragment shader programs of unlimited length.
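To get a feel for why multipass shading is expensive, here is a toy model of the idea. It is not a description of ATI's hardware: it simply assumes a long shader is split into passes of at most 64 instructions, and that without an F-buffer every pass writes each fragment to the frame buffer, while with one the intermediate results sit in a FIFO and only the final pass is written out. All names are hypothetical.

```python
import math
from collections import deque

MAX_INSTRUCTIONS_PER_PASS = 64  # per-pass limit quoted for RADEON 9700


def passes_needed(shader_length: int) -> int:
    """Number of passes when a shader is split into 64-instruction chunks."""
    return math.ceil(shader_length / MAX_INSTRUCTIONS_PER_PASS)


def count_frame_buffer_writes(shader_length: int, fragments: int,
                              use_f_buffer: bool) -> int:
    """Toy model: count frame buffer writes for a multipass shader."""
    passes = passes_needed(shader_length)
    pending = deque(range(fragments))  # fragments waiting for the next pass
    writes = 0
    for pass_index in range(passes):
        is_last = pass_index == passes - 1
        next_pending = deque()
        while pending:
            fragment = pending.popleft()
            if is_last or not use_f_buffer:
                writes += 1  # result is written to the frame buffer
            if not is_last:
                # Intermediate result is carried into the next pass. With the
                # F-buffer this FIFO is the only place it is stored.
                next_pending.append(fragment)
        pending = next_pending
    return writes


if __name__ == "__main__":
    print(passes_needed(200))
    print(count_frame_buffer_writes(200, 1000, use_f_buffer=False))
    print(count_frame_buffer_writes(200, 1000, use_f_buffer=True))
```

In this model a 200-instruction shader needs four passes, so 1,000 fragments cost 4,000 frame buffer writes without the FIFO and 1,000 with it. Real savings depend on details the model ignores, such as how intermediate results are read back.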
HYPERZ is the term ATI uses for its occlusion culling technology, which is meant to prevent the graphics core from rendering objects that are hidden from the end user’s view. Take a poster placed on a wall, for example. Rather than rendering the entire wall and then the poster, with HYPERZ the graphics core renders only the poster and the visible area of the wall. The area behind the poster is not rendered, making more efficient use of the graphics core and, more importantly, its precious memory bandwidth.
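The poster example can be sketched with a toy depth buffer. This is a generic illustration of rejecting hidden pixels before they are shaded, not ATI's implementation. The scene sizes, the function name, and the convention that a smaller z is closer to the viewer are all assumptions made for the example. The poster is drawn first so the wall pixels behind it can be rejected.

```python
# Toy depth-buffer model of the poster-on-a-wall example.
# Assumed convention: smaller z means closer to the viewer.

def count_shaded_pixels(surfaces, width, height, early_reject):
    """Return how many pixels get shaded when drawing surfaces in order.

    Each surface is (x0, y0, x1, y1, z): an axis-aligned rectangle at depth z.
    """
    depth = [[float("inf")] * width for _ in range(height)]
    shaded = 0
    for x0, y0, x1, y1, z in surfaces:
        for y in range(y0, y1):
            for x in range(x0, x1):
                visible = z < depth[y][x]
                if early_reject and not visible:
                    continue  # hidden pixel: skipped before any shading work
                shaded += 1   # pixel is shaded, even if it ends up hidden
                if visible:
                    depth[y][x] = z
    return shaded


poster = (3, 3, 7, 7, 9.0)    # 4x4 poster, closer to the viewer
wall = (0, 0, 10, 10, 10.0)   # 10x10 wall behind it

without_culling = count_shaded_pixels([poster, wall], 10, 10, early_reject=False)
with_culling = count_shaded_pixels([poster, wall], 10, 10, early_reject=True)
print(without_culling, with_culling)  # 116 100
```

Without the early rejection all 116 pixels are shaded. With it, the 16 wall pixels behind the poster are skipped and only 100 are shaded.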
HYPERZ III+ maintains the 24:1 lossless Z-buffer compression, Fast Z-buffer clear and 3-level Hierarchical Z-buffer first introduced in RADEON 9700, and adds an enhanced Z cache that has been optimized to work better with stencil buffers. This addition is meant to enhance RADEON 9800 PRO’s performance in next-generation games that will use real-time shadow volumes extensively. Doom III is the most notable example.
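For readers unfamiliar with the hierarchical Z-buffer idea, the sketch below shows the general principle with a single coarse level: each tile stores the farthest depth it contains, so an incoming surface that lies behind that depth can be discarded a whole tile at a time. The hardware described here uses three levels. The tile size, names and depth convention (smaller z is closer) are assumptions for illustration only.

```python
TILE = 4  # hypothetical tile size; the hardware's actual tile size is not given here


def build_coarse_level(depth, tile=TILE):
    """Coarse level: the farthest (largest) depth stored in each tile."""
    height, width = len(depth), len(depth[0])
    coarse = []
    for ty in range(0, height, tile):
        row = []
        for tx in range(0, width, tile):
            row.append(max(
                depth[y][x]
                for y in range(ty, min(ty + tile, height))
                for x in range(tx, min(tx + tile, width))
            ))
        coarse.append(row)
    return coarse


def count_rejected_tiles(coarse, incoming_nearest_z):
    """Tiles where the incoming surface lies behind everything already drawn."""
    return sum(
        1
        for row in coarse
        for tile_max in row
        if incoming_nearest_z >= tile_max
    )


FAR = float("inf")
# 8x8 depth buffer: left half already covered at depth 5.0, right half empty.
depth = [[5.0 if x < 4 else FAR for x in range(8)] for _ in range(8)]
coarse = build_coarse_level(depth)
rejected = count_rejected_tiles(coarse, incoming_nearest_z=9.0)
print(rejected)  # 2
```

Here the two left-hand tiles are rejected with one comparison each, rather than sixteen per-pixel depth tests apiece.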
The final piece of the RADEON 9800 puzzle is its enhanced memory controller. If you recall, the RADEON 9700 core utilized four 64-bit memory controllers. Each of the controllers can simultaneously be writing data to memory or reading data back into the graphics processor. RADEON 9800 PRO’s controller has been optimized for greater efficiency, resulting in greater performance in 4x and especially 6x AA modes.