FiringSquad: Home of the Hardcore Gamer - Games, Hardware, Reviews and News

  
AMD Radeon 6950/6970 Performance Preview
December 17, 2010   Darren Clowns Polkowski
Dual Graphics Engines, Tessellation, ROPs and More



AMD is introducing asynchronous dispatch for GPU compute with the 6900 series. The GPU can execute multiple compute kernels simultaneously, with each kernel having its own protected virtual address space and its own command queue. Nvidia’s Fermi architecture can also handle parallel kernels, but it must switch between them in order to process them. With asynchronous dispatch, AMD can not only spawn multiple kernels from a single thread but also manage multiple different applications completely independently at the same time. This is of interest because the GPU could run several applications at once. It is NOT something covered under Direct3D 11 (DX11), but Baumann stated that “AMD will look to add extensions to expose this through OpenCL.”
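To make the difference between switching and true concurrency concrete, here is a toy host-side model in Python. It is only a sketch under loud assumptions: the function names, the queue names, and the "tick" durations are all hypothetical, and the concurrent case assumes the GPU always has enough free resources to drain every command queue in parallel, which real hardware will not always manage.

```python
# Toy model only: each application owns one command queue, and every
# entry in a queue is the (made-up) run time of one kernel in ticks.

def serialized_ticks(queues):
    """Total time if the GPU runs one kernel at a time, switching
    between queues; switching overhead is ignored."""
    return sum(sum(kernels) for kernels in queues.values())


def concurrent_ticks(queues):
    """Idealized time if every queue drains independently and in
    parallel, so the longest queue sets the total."""
    return max((sum(kernels) for kernels in queues.values()), default=0)


if __name__ == "__main__":
    # Hypothetical workloads for two unrelated applications.
    queues = {"app_a": [4, 2], "app_b": [3]}
    print("one kernel at a time:", serialized_ticks(queues), "ticks")
    print("independent queues:  ", concurrent_ticks(queues), "ticks")
```

In this simplified picture the benefit comes entirely from overlap: the more evenly the work is spread across independent queues, the closer the concurrent total gets to the length of the single longest queue.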




AMD is utilizing two bidirectional DMA engines on the PCI Express interface. This means that the 6900 series cards can perform multiple concurrent data transfers across the PCI Express bus. AMD lists data rates of 5.5Gbps for the HD 6970 and 5.0Gbps for the HD 6950. Additionally, AMD has improved the way each SIMD can store data locally. In the Radeon HD 48xx (RV770), AMD introduced a Local Data Store (LDS), or Local Data Share in AMD’s own terms, for each SIMD array. This allowed a thread to store information that other threads in the same array could access. RV770 also had a Global Data Store for sharing data between arrays. In Cayman, AMD has enlarged the LDS to 32KB and has also given the arrays the ability to fetch directly from the LDS.

[Chart: AMD Radeon 6950/6970 Performance Preview]



Looking at the chart above, you can see that the 6970 has the same number of ROPs as the HD 5970, but AMD has modified them to process INT8 and FP16 operations faster. The ROPs handling color can now process INT8 16-bit (unorm and snorm) operations up to two times faster, and FP16 32-bit single- and double-component operations up to four times faster. AMD also made the way ROPs coalesce and then write data more efficient. What this means is that the ROPs take fragments and blocks of data and put them together in one write operation instead of spreading them across multiple writes. AMD applies coalescing enhancements to ALU read operations as well.
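The idea of write coalescing can be sketched in a few lines of Python. This is a software analogy, not AMD's hardware logic: the function name `coalesce_writes`, the (address, data) tuple layout, and the rule of merging only exactly contiguous fragments are all assumptions made for illustration.

```python
def coalesce_writes(fragments):
    """Merge contiguous (address, data) fragments into fewer writes.

    fragments: iterable of (start_address, bytes) pairs.
    Returns a list of (start_address, bytes) pairs, sorted by address,
    where fragments that touch end to start have been joined.
    """
    merged = []
    for address, data in sorted(fragments, key=lambda frag: frag[0]):
        if merged and merged[-1][0] + len(merged[-1][1]) == address:
            # This fragment starts where the previous write ends,
            # so fold it into that write instead of issuing a new one.
            prev_address, prev_data = merged[-1]
            merged[-1] = (prev_address, prev_data + data)
        else:
            merged.append((address, data))
    return merged


if __name__ == "__main__":
    # Three small fragments; the first two are adjacent in memory.
    pending = [(2, b"cd"), (0, b"ab"), (8, b"ef")]
    writes = coalesce_writes(pending)
    print(len(pending), "fragments become", len(writes), "writes:", writes)
```

Fewer, larger writes generally make better use of a memory bus than many small ones, which is the efficiency the paragraph above describes.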

Previously, AMD added a second rasterizer to improve performance. This time around, AMD duplicated the entire geometry block, so Cayman has two. This means that the 6900 series cards can process two primitives per clock. Theoretically, that equates to twice the performance for transform and backface culling (eliminating the workload for geometry that is facing away from the camera). With two rasterizers, AMD can again process up to 32 pixels per clock.
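As a rough illustration of what backface culling does and what the per-clock figures imply, here is a minimal Python sketch. Several things in it are assumptions rather than facts from this article: the function names are hypothetical, front faces are taken to be counter-clockwise in a y-up screen space, and the 880MHz core clock used for the peak-rate arithmetic is the commonly listed HD 6970 reference clock, which this page does not state.

```python
def signed_area_2x(v0, v1, v2):
    """Twice the signed area of a 2D triangle; positive when the
    vertices run counter-clockwise in a y-up coordinate system."""
    return (v1[0] - v0[0]) * (v2[1] - v0[1]) - (v2[0] - v0[0]) * (v1[1] - v0[1])


def is_back_facing(v0, v1, v2):
    """Assumed convention: counter-clockwise triangles face the camera.
    Zero-area (degenerate) triangles are culled too."""
    return signed_area_2x(v0, v1, v2) <= 0


def cull_back_faces(triangles):
    """Keep only triangles that face the camera."""
    return [tri for tri in triangles if not is_back_facing(*tri)]


def peak_rate(per_clock, clock_hz):
    """Theoretical peak throughput: units per clock times clocks per second."""
    return per_clock * clock_hz


if __name__ == "__main__":
    front = ((0, 0), (1, 0), (0, 1))   # counter-clockwise
    back = ((0, 0), (0, 1), (1, 0))    # same triangle, clockwise
    print(len(cull_back_faces([front, back])), "of 2 triangles survive culling")

    ASSUMED_CLOCK_HZ = 880_000_000     # assumption, not from this article
    print("primitives/s:", peak_rate(2, ASSUMED_CLOCK_HZ))
    print("pixels/s:    ", peak_rate(32, ASSUMED_CLOCK_HZ))
```

The peak rates are theoretical ceilings only; real workloads rarely keep both geometry blocks and both rasterizers fully fed on every clock.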

AMD also improved the tessellation unit in each of the geometry blocks. AMD states that the improved tessellation units, combined with the fact that there are now two of them, should provide three times the overall tessellation performance. When we get to the Unigine Heaven 2.1 benchmark results, you will see a dramatic improvement in tessellation.




© 1998-2013 FS Media, Inc. All Rights Reserved