NVIDIA 8800GTX EXPOSED AND EXPLOITED
by: indigo196 (258) | Posted in cluster Round 3 Editors Challenge Sponsored by Intel
Posted 74 months ago ( edited 74 months ago ) in category DEFAULT
Media: GPU vs CPU Floating Point Operations; GPU vs CPU; 2D Complex FFT; European Option Pricing Black-Scholes; 7900 GPU Diagram; 8800 GPU Diagram; CUDA software stack
The G80 series of cards being produced by NVIDIA provides raw processing power that was previously only available in server clusters or mainframe computers. This power is most often used to produce visually stunning and detailed environments for games, but recent advancements by companies such as RapidMind, PeakStream and Havok prove that these GPUs can be used for a great variety of math-intensive computing. I appreciate gorgeous graphics as much as the next gamer, but I would love to see advances made in AI, environmental physics and other elements that improve the immersive quality of the games I play.
General-purpose computation on GPUs (GPGPU) has recently come to the forefront of technical news, despite getting its start back in the late 1970s. In fact, some experts have labeled GPGPU as one of the “5 Disruptive Technologies To Watch in 2007”. Companies such as PeakStream, Acceleware and RapidMind have achieved astonishing results on 7900GTX cards, the predecessor to the G80 series, with implementations running 120x faster than CPU code. Havok announced a partnership with NVIDIA in early 2006 to produce Havok FX, which leverages Shader Model 3.0 class GPUs to enable collisions of thousands of objects in real time using the GPU instead of the CPU.
To demonstrate the potential that the 8800GTX holds to improve games, I have included some detailed information on the GPGPU achievements that were made using the 7900 series of cards. These achievements are astounding in their own right, but when you compare the architecture of the 7900GTX to that of the 8800GTX you may find it hard to contain your enthusiasm. The fact that these examples are all non-gaming applications should make even the most dedicated gamer proud that their hobby may assist man in solving medical mysteries.
Acceleware was established in 2004 and provides solutions that leverage the power of GPUs to increase performance and processing power. Their intended markets are cell phone manufacturing, energy, seismic, biomedical, fluid dynamics, pharmaceuticals, industrial, and military companies. They created a solution for Boston Scientific that sped up their simulations by a factor of 25 compared to CPU-based simulations. These simulations allow Boston Scientific “to investigate the influence and mutual dependency of several design variables”. The result is improved MRI devices that will help doctors diagnose patients.
RapidMind is a company based in Waterloo, Canada, that is built on over five years of advanced research and development. The company was formed in 2004 to commercialize the research on Sh that was started at the University of Waterloo. Sh is a library that acts as a language embedded in C++ and allows programmers to use GPUs for general-purpose computations. RapidMind has taken the knowledge gained from the development of Sh and created the RapidMind Development Platform, which aims to make parallel programming as easy as single-threaded, single-core programming. To show the strength of their solution, RapidMind produced three benchmarks: a BLAS SGEMM routine, a 2D complex-to-complex FFT routine and a quasi-Monte Carlo evaluation of the Black-Scholes option pricing model. These benchmarks were run on a 7900 GT based GPU and on high-end workstation or server-class CPUs. The most impressive result was obtained in the European Option Pricing benchmark, which showed the RapidMind GPU implementation to be 120x faster than the original CPU code. RapidMind itself claims that “RapidMind–enabled applications have achieved performance increases of 3x to 30x”.
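To make the Black-Scholes benchmark more concrete, here is a minimal sketch of a quasi-Monte Carlo pricer for a European call, checked against the closed-form Black-Scholes price. It is not RapidMind's code: the van der Corput sequence, the function names and the sample parameters (spot 100, strike 100, 5% rate, 20% volatility, one year) are all assumptions chosen for illustration, and it runs on the CPU in plain Python. The point is that every sample is computed independently of the others, which is what makes this kind of workload a natural fit for the many ALUs on a GPU.

```python
import math
from statistics import NormalDist


def van_der_corput(i, base=2):
    """Return the i-th point of the van der Corput low-discrepancy sequence in (0, 1)."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call (used as the reference)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = NormalDist().cdf
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)


def qmc_call(s0, k, r, sigma, t, n=65536):
    """Quasi-Monte Carlo estimate of the same price.

    Each loop iteration is independent of the others; on a GPU each one
    could be handled by its own thread (assumption: illustrative only).
    """
    inv_cdf = NormalDist().inv_cdf
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for i in range(1, n + 1):
        z = inv_cdf(van_der_corput(i))        # uniform point -> standard normal draw
        s_t = s0 * math.exp(drift + vol * z)  # simulated price at expiry
        total += max(s_t - k, 0.0)            # call payoff
    return math.exp(-r * t) * total / n       # discounted average payoff


# Hypothetical sample parameters, not taken from the benchmark.
closed_form = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
estimate = qmc_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(f"closed form: {closed_form:.4f}, quasi-Monte Carlo: {estimate:.4f}")
```

The loop in qmc_call is the part a GPU implementation would spread across its ALUs, with a final summation step to combine the per-sample payoffs.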
Example: Havok FX
Havok was founded in 1998 in Dublin, Ireland, and provides software and services for digital media creators in both the games and movie industries. At GDC06, Havok FX was announced jointly by Havok and NVIDIA. Havok FX is an add-on that allows programmers to leverage the power of GPUs supporting Shader Model 3.0 to produce stunning effects that behave correctly. At GDC06 NVIDIA claimed that “Havok FX running on a pair of GeForce 7900GTX graphics cards in SLI is more than ten times faster than software physics calculations running on a Pentium Extreme Edition 955”. Havok FX was released in Q2 of 2006. The list of titles that use Havok software includes The Elder Scrolls IV: Oblivion, F.E.A.R. and Age of Empires III.
The G80 in perspective
All of the above examples were based on GPGPU running on GeForce 7900 graphics cards, and the results are nothing short of astounding. GPGPU computation makes use of the ALUs in the GPU. The 7900 GT cards had 96 ALUs clocked at 450MHz [7900 GPU Diagram] while the 8800GTX has 128 ALUs clocked at 1.35GHz.[thread processor] Let that sink in slowly - 1.3x the number of ALUs, each running at 3x the speed. The GeForce 8800GTX actually divides those 128 processors up into 16 multiprocessors [8800 GPU Diagram]. The 8800GTS has 96 ALUs clocked at 1.2GHz, grouped into 12 multiprocessors.
I found some benchmarks, published by Mike Houston of Stanford University, that compare the NVIDIA 7900GTX (G71), NVIDIA 8800GTX (G80) and the ATI X1900XTX (R580). These benchmarks are very technical, but they do show that the 8800GTX is more powerful than either the 7900GTX or the X1900XTX.
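The arithmetic behind that comparison can be written out as a quick sketch. The ALU counts, clocks and multiprocessor counts below are the ones quoted above; multiplying ALUs by clock gives only a crude upper bound on relative ALU throughput (it assumes every ALU does the same amount of work per clock, which is an assumption and not something measured here), so treat the result as back-of-the-envelope rather than a benchmark. The function and variable names are made up for illustration.

```python
# ALU counts, clocks (MHz) and multiprocessor counts as quoted in the article.
CARDS = {
    "7900": {"alus": 96, "clock_mhz": 450},
    "8800GTX": {"alus": 128, "clock_mhz": 1350, "multiprocessors": 16},
    "8800GTS": {"alus": 96, "clock_mhz": 1200, "multiprocessors": 12},
}


def alu_clock_product(card):
    """ALU count times clock: a rough proxy for peak ALU throughput (assumption)."""
    return card["alus"] * card["clock_mhz"]


def compare(newer, older):
    """Return the ALU, clock and combined ratios of one card over another."""
    return {
        "alu_ratio": newer["alus"] / older["alus"],
        "clock_ratio": newer["clock_mhz"] / older["clock_mhz"],
        "combined_ratio": alu_clock_product(newer) / alu_clock_product(older),
    }


gtx_vs_7900 = compare(CARDS["8800GTX"], CARDS["7900"])
gts_vs_7900 = compare(CARDS["8800GTS"], CARDS["7900"])

# ALUs per multiprocessor for the cards where a multiprocessor count is quoted.
alus_per_mp = {
    name: card["alus"] // card["multiprocessors"]
    for name, card in CARDS.items()
    if "multiprocessors" in card
}

print(gtx_vs_7900)
print(gts_vs_7900)
print(alus_per_mp)
```

Under those assumptions the 8800GTX comes out at 4x the 7900 figure (about 1.33x the ALUs times 3x the clock), and both G80 cards work out to 8 ALUs per multiprocessor.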
Thanks DirectX 10!
The reason for the explosion in the number of usable shaders on the 8800GTX is DX10's requirement of unified shaders and a geometry shader, along with the removal of fixed-function components. This resulted in GPUs that are not divided up into ‘x’ number of vertex shaders and ‘y’ number of pixel shaders. The elimination of capability bits will also force vendors to produce cards that meet the same basic requirements, removing the variations in floating-point formats that existed under DX9. This consistency will reduce the confusion that developers faced in utilizing the previous generation of hardware.
CUDA: A New Architecture for GPU Computing
CUDA stands for Compute Unified Device Architecture and is a new hardware and software architecture that enables the GPU to be used as a data-parallel computing device without the need to map to the graphics API. CUDA code is written in an extension of the C programming language, which should keep the learning curve for developers to a minimum. CUDA is available on the GeForce 8800 series and future products.
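As a rough illustration of what "data-parallel" means here, the sketch below emulates the CUDA execution model in plain Python: one small kernel function is run once per thread, and each thread uses its block index and thread index to pick the single element it works on. This is not CUDA code and does not touch the GPU; launch and saxpy_kernel are hypothetical names, and the nested loops stand in for threads that real hardware would run in parallel. Only the index arithmetic (block index times block size plus thread index) mirrors how CUDA kernels are conventionally written.

```python
def saxpy_kernel(block_idx, block_dim, thread_idx, a, x, y, out):
    """Work done by ONE emulated thread: out[i] = a * x[i] + y[i]."""
    i = block_idx * block_dim + thread_idx  # global index of this thread
    if i < len(x):                          # guard: the last block may overhang the data
        out[i] = a * x[i] + y[i]


def launch(kernel, n, block_dim, *args):
    """Emulate a kernel launch over n elements (hypothetical helper).

    A GPU would run these threads concurrently; here they run one by one.
    """
    grid_dim = (n + block_dim - 1) // block_dim  # enough blocks to cover n elements
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 10.0, 10.0, 10.0, 10.0]
result = [0.0] * len(x)
launch(saxpy_kernel, len(x), 4, 2.0, x, y, result)
print(result)
```

Because no thread reads what another thread writes, the order in which the emulated threads run does not matter, which is the property that lets a GPU execute them side by side.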
Game Development Potential
The GPGPU results from Acceleware and RapidMind, coupled with the work of Havok in the arena of games, prove that there is potential in harnessing the power of the GPU beyond making games visually stunning. Havok has already started to improve the implementation of physics in game environments, but that is only one part of a game. The next part is speculation on my part: I will suggest areas in which some of today’s games could be improved by tapping into the power of data-parallel programming on the GPU.
Neverwinter Nights 2 and other single player games
The single-player experience in Neverwinter Nights 2 is hampered by the poor AI that controls your companions. The path-finding AI works for most of the open areas, but fails miserably when your party is exploring dungeons, underground caverns or building interiors. Computer-controlled companions often get stuck on terrain or simply get lost, leaving you to get trounced by encounters created for a party of four. While you could simply pause the game and make individual adjustments, that process breaks the immersion in the game. The AI also has problems controlling spell-casters: it lets your companions burn through their offensive spells in situations that do not require them, and fails to have them use healing spells when party members are on the brink of death. Given the performance improvements that were shown above, I have to wonder how much more realistic the AI could have been if the developer had been able to make use of the computational power in the 8800GTX.
F.E.A.R. and other FPS games
F.E.A.R. is a game that relies heavily on the spooky factor. As a player you are immersed in creepy atmospheric effects such as steam, smoke and particles floating in beams of light. The AI in F.E.A.R. was some of the best seen in recent shooters, as were the physics effects from shooting bad guys. Injury, death and environmental damage were handled elegantly. So why do I bring this game up? Simple. More could still be done. Imagine using the power of the GPU to generate a dynamic map as the result of chemical spills or burning liquids applied in real time. The immersion level in the game would be greatly increased.
World of Warcraft and other MMOs
Economy is always an issue in MMOs, and no one ever seems happy about how an in-game economy is modeled. Certainly game developers struggle to achieve a realistic economic system that can react to unanticipated fluctuations caused by players. In this context I think about the RapidMind benchmark for European Option Pricing that ran 120x faster on a 7900 based GPU than the original benchmark did on a CPU. Apply this muscle to controlling the actions of NPC traders, and MMO economies would take on a complex life of their own, reacting to player-induced trading frenzies.
• The 8800GTX has 128 ALUs vs 96 ALUs on the 7900GTX and they are also clocked 3x higher
• NVIDIA has released the CUDA SDK to assist developers in exploiting the power of the GPU in GPGPU programming
• Companies like RapidMind, Acceleware and Havok are making it easier to implement GPGPU strategies
• GPUs have far outstripped CPUs in processing Floating Point Operations
• GPUs have a large installed base that add-on cards would have to build
• GPUs would be cheaper to use on server-side implementations than buying server clusters
• GPGPU programming remains difficult and requires programmers to think differently about their applications
• The power of DX10 compatible parts is crucial to expanding GPGPU implementations due to the explosion of shaders required to meet the specification, but the installed base of DX10 cards in the near future will be low
Final Verdict – 100% excitement about possibilities
GPGPU implementations show greatly improved processing capabilities over CPU solutions, and the introduction of DX10 compatible parts should increase that. Companies like RapidMind, Acceleware and Havok are making it easier for traditional programmers to leverage GPUs in their applications. NVIDIA and ATI, with CUDA and CTM respectively, are building tools to expose their GPUs to a greater extent to GPGPU programmers. The 8800GTX is a tremendous leap forward in computational power for GPGPU applications both in the world of computer games and in real-world simulations. It gives me a warm fuzzy feeling knowing that the power of my GPU, which so often sits wasted while I perform common tasks like reading Firingsquad.com, could be used in programs similar to Folding@home to cure diseases.
 History of GPGPU -- http://www.gpgpu.org/data/history.shtml
 5 Disruptive Technologies To Watch In 2007 by David Strom -- http://www.informationweek.com/internet/showArticle.jhtml?articleID=196800208
 Acceleware and Boston Scientific -- http://www.nvidia.com/object/acceleware_boston_scientific_success.html
 RapidMind GPU Evaluation -- http://rapidmind.net/case-gpu.php
 RapidMind -- http://rapidmind.net/product.php
 The Tech Report -- http://www.techreport.com/onearticle.x/9610
 Understanding GPUs Through Benchmarking -- http://www.cse.ohio-state.edu/~kerwin/GPGPUPerformance.pdf
|25 User Comment(s) • 13 root comment(s)|
| Arturo02 (5) Apr 05, 2007 - 02:03 am|
|I think the writing in this article is the best so far Buc has written. It flows well, and has a good synergy. |
If I knew more about computers I could be more technical but my forte is writing. That part you did well on, the tech stuff I defer to others to comment on. :)
| GrapeApe (36) Apr 04, 2007 - 11:30 pm | Edited on Apr 06, 2007 - 06:12 pm|
|» Good style just needs a little tweak.|
Good writing style, but needs some minor fact checking, of which I'm sure Brandon et al. will gladly help in the future.
Just some quick points for FYI;
- Havok FX is part of the Havok 4 engine, and there are no games out yet based on Havok 4 (although some UBI demos were shown in CA last month). All the examples you listed are based on Havok 3, which does not support Havok FX (although support could be added with a lot of effort). Crysis is supposed to use its own physics engine that's supposed to support VPU physics, which would've been a good example to go just outside the Havok realm.
- The statement "The 8800GTX is a tremendous leap forward in computational power for GPGPU applications both in the world of computer games and in real-world simulations." actually is not true in overall general computational power as the R580 has more power when using CTM and the MADD + ADD scenario (about 5-10% more).
But overall good article, good flow and nice combination of examples and good projection of future application of the technology.
Edited because after re-reading, the opening statement looked harsher than I intended it to be. It's not a big criticism, since often we have to remind FS and other reviewers of things to tweak/correct, and they update their articles. Too bad you guys can't do FS approved changes to update stuff in the same manner they do after their reviews have launched. I know you and Dave would both like to just tweak stuff.... like adding more flair. >B~)