As the entire world knows, nVidia unveiled the NV100, i.e. the “Fermi” architecture, on September 30, 2009. At the time, nVidia’s representatives were telling press and analysts that the GeForce parts would see the light of day in November, with Quadro to follow in April 2010 and Tesla launching in June 2010. However, it took almost six months from Jen-Hsun’s keynote for the first product to reach our hands, and it will take another two weeks to reach global markets.
Given that we have already covered the GF100 in our Deep Dive Article in January 2010, we’ll just address the major changes that have happened between then and now.
Regardless of how deep into 3D architecture you are, the birth of the GeForce GTX 480 was a troublesome one. First, we owe you a bit of an explanation. There were rumors that Fermi was late because in 2008, Jen-Hsun went into a war of words with Intel and their then-upcoming Larrabee architecture. However, the Fermi architecture belongs to the NV30-NV50/G80-NV100/Fermi cadence. NV30 or GeForce FX was the cornerstone architecture for its follow-ups: GeForce 6800, 7800, 7900 and RSX, the GPU inside Sony PlayStation 3 [i.e. GeForce 7900 with a cut-down 128-bit memory controller]. NV50 or G80 was the base architecture for the GT200/Tesla architecture, and now NV100 or Fermi will be the base upon which nVidia will build its parts in 2010, 2011, 2012 and maybe even 2013, depending on when DirectX 12 arrives on the market. This cadence is practically set in stone by both nVidia and AMD, and the only time it changes is when the parts are late, such as ATI’s R520 “Fudo” [Radeon X1800], which delayed the development of R600 [Radeon HD2900]. However, both AMD’s GPU subsidiary and nVidia are home to the world’s brightest engineers, who seriously push the envelope with each and every architecture.
Getting back to NV100/Fermi/GF100, nVidia had numerous issues with getting sustainable yields from TSMC and its troubled 40nm process node. Manufacturing a 529mm2 die [23x23mm] is a massive undertaking, but with 40nm GPUs, we saw both nVidia and AMD breaking their habits: AMD said that its die sweet spot is 256mm2, only to go to 334mm2 with the Cypress architecture, while even a three-billion-transistor monster wasn’t enough for nVidia to break its previous die record: the original 65nm NV60/GT200 chip featured a massive 576mm2 die.
GF100 packs 16 core clusters, each consisting of four special units and 32 cores, for a grand total of 512 cores. However, TSMC was only able to produce a limited number of dies with all 512 cores enabled, so nVidia decided to split the line-up into three GPUs. The first is the 448-core variant, which serves as Tesla C2050 3GB and C2070 6GB, the upcoming Quadro FX 4900 1.5GB and GeForce GTX 470. The second is the 480-core variant, which is now debuting as GeForce GTX 480 and will debut as Quadro FX 5900 in only a few weeks’ time. The third, the fully fledged 512-core variant, will remain unused for now. As chip production ramps up, there is a possibility of a GeForce “GTX 490” and a Quadro “FX 5950”. The third product line, the Teslas, will remain unchanged for the 40nm generation.
The decision to go with only 480 cores resulted in a large number of qualified dies, as nVidia should have in excess of 30,000 GTX 480 chips alone for the hard launch in April. By comparison, when ATI launched the Radeon HD 5800 series in late September 2009, the company only had around 20,000 boards worldwide. Still, the number of nVidia boards available at launch tonight is equal to zero, and ATI wins DirectX 11 availability hands down. Unfortunately, the way nVidia’s management and representatives handled this whole timeline is a textbook example of how not to treat partners, analysts and media. We’ll see if the company will grow into a trustworthy one or not. For now, the answer is unfortunately negative for nVidia.
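The core counts above map neatly onto whole clusters. The following is a minimal sketch of that arithmetic, assuming that cores are disabled one full 32-core cluster at a time; the text does not state this explicitly, and the function name is purely illustrative.

```python
# Figures from the text: a full GF100 die has 16 clusters of 32 cores each.
CORES_PER_CLUSTER = 32
TOTAL_CLUSTERS = 16


def enabled_clusters(cores: int) -> int:
    """Return how many whole clusters a given core count corresponds to.

    Assumption (not stated in the article): cores are fused off in
    whole-cluster steps, so the count must divide evenly by 32.
    """
    if cores % CORES_PER_CLUSTER != 0:
        raise ValueError("core count is not a whole number of clusters")
    return cores // CORES_PER_CLUSTER


for name, cores in [("448-core variant", 448),
                    ("480-core variant", 480),
                    ("full 512-core die", 512)]:
    clusters = enabled_clusters(cores)
    disabled = TOTAL_CLUSTERS - clusters
    print(f"{name}: {clusters} clusters enabled, {disabled} disabled")
```

Under that assumption, the 448-core and 480-core parts correspond to two and one disabled clusters respectively.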
GeForce GTX 480
Moving on from all the intricacies of finally bringing the Fermi chip to life, the subject of today’s review is the GeForce GTX 480. We managed to acquire two boards and will be taking a very good look at the scaling part, as nVidia claims to have 90% scaling in most cases.
First and foremost, GTX 480 is a bit shorter than its direct competitor, the Radeon HD 5870. The board is 26.7 centimeters [10.5 inches] long – about the same as its predecessor, GeForce GTX 280/285. The whole card is covered with one interesting heatsink. For the GTX 480, nVidia decided to ditch the plastic cover over the GPU and created a nickel-plated aluminum cover that looks spectacular – our photographer told us that this is the most photogenic piece of hardware he has seen in years. From a technical perspective, nVidia definitely needed every square millimeter for heat dissipation. We also have to warn you not to touch the metal part of the heatsink after the board has worked for a while – you’re guaranteed a burnt finger. The interesting bit about the fan is that the original card took in air both through holes in the PCB and from the regular front side. However, nVidia changed this design literally at the last minute – only a few weeks ago, the company mounted a larger fan, which rendered the back holes useless. You can expect these holes to be closed on PCB Rev C [the tested board was a PCB Rev B with a GF100-375-A3 chip].
Looking at the official spec sheet, nVidia states that the GPU ticks at 700 MHz, the 480 cores tick at 1.4 GHz and the 1.5GB of GDDR5 memory works at… 7.4 “GHz”. Say what? In reality, this is a way for nVidia to beat the “1.2 GHz” on HD 5870’s spec sheet.
Looking at the PCB, we notice the clean design – there is no NVIO chip present, as nVidia decided to sacrifice a portion of the die for display connectivity. This is also the reason why even Tesla C20 Series boards come with a DVI connector. The board features 12 memory chips which, interestingly, carry no indication of who manufactured them. We know that nVidia contracted Samsung to manufacture custom-built ECC GDDR5 memory, but whether this memory comes from Samsung or somebody else, like the now-defunct Qimonda, remains to be seen. Sadly, the DRAM cells tick at only 924 MHz. nVidia claims the DRAM speed is “1848 MHz”, but in reality, GDDR5 memory is no longer a DDR memory – data moves four times a cycle, two bits up and two bits down. Thus, we weren’t surprised to see AMD using “Gbps” and “Gtransfer/second” monikers in the past, while board manufacturers mostly default to “GHz”. GQDR is, sadly, a bit hard to pronounce. In reality, nVidia has 1536MB of GDDR5 memory ticking at 924 MHz in Quad Data Rate, or 3,696 million transfers per second [3.696 GT/s]. Combined with a 384-bit memory controller, we come to a video bandwidth figure of 177.41 GB/s. Thus, GeForce GTX 480 is a step up from the stock GeForce GTX 285, which combined a 512-bit memory interface with GDDR3 memory at a 1.24 GHz DDR clock [158.98 GB/s], but it falls just short of overclocked cards such as the EVGA GTX 285 FTW, whose 1.39 GHz DDR memory results in 177.92 GB/s, which is still a record for a single-GPU board. Looking at the other camp, ATI’s Radeon HD 5870 packs 1GB of GDDR5 clocked at 1.2 GHz QDR [4800 MT/s, “GHz” etc.], offering 153.6GB/s of video bandwidth.
Opening the heatsink is quite easy compared to the GTX 280 – when you dig inside it, you’ll find five heatpipes, four of which exit the heatsink, while one remains hidden. nVidia decided to use direct heatpipe contact with the IHS [Integrated Heat Spreader], as first seen on OCZ CPU coolers – a world’s first for GPUs. Truth be told, we cannot stop wondering why nVidia didn’t go with a Vapor Chamber design, as that design saved the ATI Radeon HD 5970 from overheating. Then again, since the NV30 fiasco, nVidia really likes to play it conservative with the design elements of the graphics card. We can’t say the same for the silicon gang, given that they really push the envelope of what is physically possible with each and every architecture.
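The bandwidth figures quoted above all follow from the same arithmetic: memory clock times transfers per clock times bus width in bytes. Here is a small sketch of that calculation; the function name is ours, and the 1242 MHz value for the stock GTX 285 is an assumption chosen because it reproduces the 158.98 GB/s figure (the text rounds it to 1.24 GHz).

```python
def bandwidth_gb_s(clock_mhz: float, transfers_per_clock: int,
                   bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# (board, memory clock in MHz, transfers per clock, bus width in bits)
BOARDS = [
    ("GeForce GTX 480 (GDDR5, quad data rate)", 924, 4, 384),
    ("GeForce GTX 285 (GDDR3, double data rate)", 1242, 2, 512),  # 1242 MHz is an assumption
    ("EVGA GTX 285 FTW (GDDR3, double data rate)", 1390, 2, 512),
    ("Radeon HD 5870 (GDDR5, quad data rate)", 1200, 4, 256),
]

for name, clock, rate, width in BOARDS:
    print(f"{name}: {bandwidth_gb_s(clock, rate, width):.2f} GB/s")
```

Treating GDDR5 as four transfers per clock and GDDR3 as two is what makes the 924 MHz and 1.24 GHz figures directly comparable.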
There is one sad part about the final heatsink design, though. A week ago, we spoke with Tom Petersen, Director of Technical Marketing at nVidia, about all the changes nVidia made in the past three weeks. One very sad change was caused by an LED vendor that will remain unnamed here [we might do a separate story, though] and that left nVidia in the lurch. If you recall, the original GF100 boards came with a lit “GeForce” logo next to the PEG connectors, but the final cards just have a silk-screened GeForce logo. nVidia decided to leave the power header open though, so case modders and AIC vendors can add their own logo or a GeForce logo and light it up. Additionally, we spoke with several board partners who told us that this change is not possible for the initial batch of heatsinks/cards, but there will be some LED-backlit designs coming to Computex Taipei 2010.
The board features one eight-pin and one six-pin PEG [PCI Express Graphics] connector for a total delivery of 225 Watts. Add 75W from the motherboard and you reach a peak of 300W. As a comparison, the only other single-GPU nVidia board that had this power-hungry configuration was the GeForce GTX 280. From ATI’s side, only the Radeon 2900XT and the upcoming Radeon HD 5870 Eyefinity6 feature the same 8+6 configuration.
The back of the card features two Dual-Link DVI-I connectors and a single mini-HDMI 1.4 connector. We haven’t received the adapter to regular HDMI, so we couldn’t test the setup on our reference 46″ Panasonic VIERA plasma TV, but HDMI 1.4 may be the key differentiator between the current generation of AMD and nVidia hardware: this summer will see the arrival of a large number of 3DTV panels, both plasma and LCD ones. In order for a 120Hz refresh rate to work, you need either a dual-HDMI or Dual-Link DVI TV, or HDMI 1.4. Given that AMD has no HDMI 1.4 capable models [for now], the advantage for the 3D experience remains on nVidia’s side.
You can connect the cards in 2-Way, 3-Way or 4-Way SLI. You’ve read it correctly: according to Tom, nVidia will officially support 4-Way SLI designs coming from EVGA and ASUS. Given that EVGA is preparing its GTX 480 Classified and ASUS is burning the midnight oil to create a single-PCB Dual-Fermi board, Computex could really be the place to see all the 4-Way SLI designs. In my mind, there was one other number: four GTX 480 boards would result in over 1000W, or one kilowatt of power consumed just for the GPUs!
We pitted the ATI Radeon HD 5870 1GB against a single GeForce GTX 480. While there is a $100 difference in price between the two, this is the battle that the two vendors will fight. AMD hasn’t disclosed the pricing of the Radeon HD 5870 2GB Eyefinity6, and it will be interesting to see the performance delta between 1GB and 2GB of video memory [if any, if history is anything to go by].
All of the applications used here had their detail settings cranked to the maximum. If the option was between High and Ultra, we would select the Ultra level of detail. We made sure the same settings were used for all cards tested. We tested with no filtering [No AA/AF] and with 8xAA/16xAF, with Anti-Aliasing in Multi-Sampling mode, to ensure an apples-to-apples comparison. From game to game, we also tested the highest level of filtering available, which in the case of nVidia is the new 32xAA – a combination of traditional filtering [8x] and computation [24x], while on ATI we used 24xAA by enabling the Edge Detect mode [8xAA plus 16x samples].
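The 300W ceiling above is simply the sum of what each power source may deliver. As a hedged sketch: the per-connector limits used here (75W for the slot, 75W for a six-pin and 150W for an eight-pin plug) are the PCI Express specification values rather than figures taken from the article, which only quotes the 225W and 300W totals. The names are illustrative.

```python
# Specification limits per power source, in watts. These are not quoted in
# the article; they are the PCI Express limits assumed for this sketch.
CONNECTOR_WATTS = {"6-pin": 75, "8-pin": 150}
SLOT_WATTS = 75


def board_power_ceiling(connectors: list[str]) -> int:
    """Maximum in-spec board power: slot power plus all auxiliary plugs."""
    return SLOT_WATTS + sum(CONNECTOR_WATTS[c] for c in connectors)


gtx480 = board_power_ceiling(["8-pin", "6-pin"])
print(f"8+6 configuration: {gtx480 - SLOT_WATTS} W from plugs, {gtx480} W total")
print(f"Four such boards: up to {4 * gtx480} W")
```

Note that this is a ceiling on what the connectors may supply, not a measurement of what the card actually draws.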
Battlefield: Bad Company 2
DICE’s hit title brings advanced Havok physics and DirectX 11 features. Yet, GTX 480 speeds ahead as soon as AA/AF are deployed.
There are no big secrets here – Battlefield: Bad Company 2 brings fantastic gameplay and a level of immersion that puts you right in the action. We tested the game with all the bells and whistles turned on. Our test consisted of an 11-minute run through the Cold War level, from the beginning of the mission until the helicopter crash in the tunnel.
Worth noting is that if you max out Anti-Aliasing – with 32xAA and 16xAF, GeForce GTX 480 scores 45.98 frames per second, while HD 5870 at 24xAA/16xAF achieves “only” 31.66fps.
Call of Duty: Modern Warfare 2
We test Call of Duty: Modern Warfare 2 by playing the first mission from the beginning [action on the bridge] to the end, where you’re supposed to enter the helicopter with General Shepherd. There is not much to report here – this console port achieves very high framerates regardless of the settings you throw at it. Even forcing the maximum level of AA/AF did not result in a significant drop in frames, as both cards churned through this title.
Colin McRae may be gone, but his legacy lives on. AMD had a good nose for this title, and we were not surprised when the company sent a check for several million dollars to bundle DiRT 2 with its Radeon HD 5800 and HD 5700 Series cards. The game packs DirectX 11 effects such as the omnipresent Tessellation. We test the game by running a race on the London Auto-Cross track, measuring from the beginning of the 3D scene to the end of the race. As you can see for yourself, GTX 480 scored consistently better than HD 5870, and even the Catalyst 10.3A drivers, which caused us to re-test the HD 5870 at the last minute, didn’t level the playing field. With no AA/AF, the difference was 26% for the green team. With 8xAA and 16xAF, the difference melted to less than 10%, only to grow back again after the highest levels of filtering were applied. At 24xAA/16xAF, HD 5870 records 42.2fps, while nVidia scores 69.5fps running its 32xAA.
Metro 2033 shows that ATI and nVidia are neck and neck in this next-gen title
According to nVidia representatives, Metro 2033 should utilize nVidia’s technology to the max to enable the next-gen gaming experience. Personally, I viewed this as S.T.A.L.K.E.R. with even better graphics and more annoying scripted scenes. The blood line with S.T.A.L.K.E.R. isn’t here by accident; Metro 2033 is powered by 4A’s own engine, which originally served as the engine behind GSC’s Stalker – however, some team members left and as a result, we have two post-apocalyptic games.
We tested the title with all the settings at the max, and for some reason, there was no difference in performance between forcing 8xAA and the built-in 4xAA mode. Still, if you want playability at 50-60fps, either get a second card for our level of detail or tune the details down.
Need For Speed: SHIFT
After spending several years in Fast & Furious mode, Need For Speed: SHIFT returns the franchise to its roots of arcade racing. In order to test NFS: SHIFT, we cranked all the visuals to the max, selected legendary Spa Francorchamps race track, placed 15 cars ahead of us, locked the camera on the hood and started FRAPS. Due to the fact that we have driven the lap several thousand times through multiple games, we can consider this to be quite a repeatable task – AI level was set to Serious, btw. Each race was repeated three times, upon which an average was calculated.
As you can see for yourself, there is a large difference in the minimum framerate achieved. In many games, occasional slowdowns and dropped frames are invisible, but here the slowdown occurred on the AMD card in one specific place as soon as we enabled any form of Anti-Aliasing: on the short hill drop from Malmedy to the Rivage hairpin. There was noticeable stuttering on HD 5870 1GB and 2GB boards, and we hope that AMD can work out a fix in its drivers. After all, in a single-lap race, the exit from Les Combes and the short run down to Rivage can mean the difference between finishing first and third. We would usually take the lead by Les Fagnes, leaving the whole third sector of the track for getting those framerates up.
In the case of max AA, nVidia scored 89.5fps while running 32xAA/16xAF, while ATI dropped from 76.58 to 64.51, the smallest drop of them all. In terms of minimum framerates, 24xAA meant a repeat of the stuttering on Malmedy-Rivage, but the minimum framerate remained at 24 frames per second, which in theory means our eye should be fooled. It wasn’t.
S.T.A.L.K.E.R.: Call of Pripyat
In order to benchmark S.T.A.L.K.E.R.: Call of Pripyat, we used the separate benchmark version of the code, featuring Day, Night, Rain and Sun Shafts tests.
SLI: Is this scaling for real?
During all the briefings, nVidia touted SLI scaling as “finally resolved”, offering “up to 90% across the board”. Naturally, that level of scaling depends on the CPU you use in your system, because it has to feed two GPUs rather than one. According to nVidia, they solved that issue and should offer surprisingly high scaling figures.
Looking at 3DMark Vantage in Xtreme settings [1920×1200, Max. length shader programs], a single GPU scores X9155, while the SLI setup gives you X16665 – an 82% boost. This was just the beginning of a very impressive experience. The SLI setup does indeed consume twice the power of a single card, but the noise level remained the same, given that the fan on GPU1 spins at a lower RPM than the one on GPU0.
GeForce GTX 480 Single to 2-Way SLI Scaling in Unigine Heaven 2.0 Benchmark
Unigine’s Heaven 2.0 Benchmark shows you just how good the scaling gets: in DirectX 9 mode, SLI scores 86.2 fps against 65.6 fps on a single card. However, when you switch to DirectX 11 mode and turn Tessellation on, the result changes to 74.9 fps on SLI and 39.4 fps on a single card – scaling went from a 31% gain to 90%! Thus, it is more than obvious that resolution is no longer the bottleneck of the video rendering system. Today, computational power is what matters, and don’t expect that to change anytime in the future. The confirmation of how nVidia’s software engineers pulled a rabbit out of a [green] hat came in Metro 2033. Personally, I don’t like the game because it is filled with scripted sequences that, frankly, jerk my chain during those 10-15 hours on weekends that I can play games.
SLI scaling in Metro 2033
But in Metro 2033 at 1920×1200 with all the details cranked to the max, a single GTX 480 returned 26 frames a second. A pair of GTX 480s in SLI returns 51 fps, i.e. a 96% boost on these rounded figures. We don’t know where the last few percent disappear, but this is seriously impressive. When we removed the Adaptive Anti-Aliasing and replaced it with 4xAA, the framerate dropped to 25fps on a single card [kinda leaves us wondering what’s the purpose of an AAA/4xAA mode when there is no tangible difference in framerate on a GTX 480], while the GTX 480 SLI setup returned 46 frames a second – 84% scaling.
Do you want to see “Beyond 100% SLI Scaling”? Then just start S.T.A.L.K.E.R.: Call of Pripyat
Using 1920×1200 with the Ultra preset and Enhanced Full Dynamic Lighting in the DirectX 11 path, we experienced exactly 100% scaling in the Night test – an average of 60 vs. 120 frames per second. This was no fluke – the minimum frame rate was 46 vs. 51fps, but the maximum was 111 versus 246 for the SLI setup. The Rain scene gave out 64fps for the single and 139fps for the dual-GPU setup, a 117% boost! We ran the test no less than eight times, including two reboots, to make sure we weren’t on some “reality distortion” medication. Even the Sun Shafts test confirmed “beyond 100% scaling”, with a maximum framerate of 150 fps, compared to 65 on a single GPU. As you might imagine, the average framerate was 44fps for single and 89fps for 2-Way SLI.
The numbers seen in three DirectX 11 applications leave us wondering whether this can become the norm in DirectX 11 and other computation-heavy applications. Judging by our week of experience with GTX 480 in SLI mode, the future is bright indeed. After all, AMD also saw the future in multi-GPU configs, given the fact that its high-end boards in the past three generations were all Dual-GPU boards [HD 3870X2, HD 4870X2, HD 5970].
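The scaling percentages in this section are the gain of the two-card result over the single-card result. A minimal sketch of that calculation, using scores and framerates quoted in the text (the function name and the labels are ours):

```python
def sli_gain_percent(single: float, sli: float) -> float:
    """Gain of the two-card result over the single-card result, in percent."""
    return (sli / single - 1.0) * 100.0


# (test, single GTX 480, GTX 480 SLI), as quoted in the text.
RESULTS = [
    ("3DMark Vantage Xtreme score", 9155, 16665),
    ("Heaven 2.0, DirectX 11 + Tessellation (fps)", 39.4, 74.9),
    ("Call of Pripyat, Night average (fps)", 60, 120),
    ("Call of Pripyat, Rain (fps)", 64, 139),
    ("Call of Pripyat, Sun Shafts average (fps)", 44, 89),
]

for name, single, sli in RESULTS:
    gain = sli_gain_percent(single, sli)
    note = "beyond linear" if gain > 100.0 else "at or below linear"
    print(f"{name}: {gain:.0f}% ({note})")
```

Anything above 100% means the pair outperformed twice the single-card result, which is what makes the Rain and Sun Shafts figures stand out.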
I want 3D Vision Surround!
When ATI launched Eyefinity on the USS Hornet, we met with several nVidia executives on the deck and discussed what AMD had announced. Our sources were shocked by Eyefinity, because nobody in the industry saw it coming. Yes, you’ve read that correctly – nVidia executives showed up at an ATI launch party [it also works the other way] – even though PR managers from both companies might get goose bumps. I often read words of hate in the comments on our articles and, sadly for most fanboys, most AMD and nVidia engineers and executives are actually very open-minded individuals who try to do their very best at delivering unimaginable experiences.
Getting back on subject, it took three months for nVidia’s software team to come up with an answer to ATI’s Eyefinity. After even Samsung decided that six-display setup would have a marginal revenue impact and slowed/stopped the introduction of their six-display narrow-bezel LCD setup, a focus was put on three displays. However, nVidia cannot enable three displays on a single card – thus, you have to use two cards in SLI. Under the name nVidia Surround and 3D Vision Surround, nVidia is touting three display gaming with all the bells and whistles turned on. Given that we had a three-display setup, I was genuinely looking forward to putting three displays into action.
Unfortunately, I overlooked the slide below:
Want to play with three displays, regardless of them being 60 or 120Hz? Well, you can’t – until Release 256 shows up.
We spoke with Lars Weinand, who told us that the Release 256 driver will arrive in the next two to three weeks. Given that retail availability of GTX 470 and GTX 480 begins in the week starting April 5th in Asia and the week starting April 12th in EMEA/North America, we’d say nVidia is timing the two together.
Originally, nVidia priced the GeForce GTX 480 at 1799 Lithuanian Litas with tax. Yes, you’ve read that correctly. We’ll deliberately leave you guessing who is the gentleman in the dark that came up with that price. Translated into convertible currencies, GeForce GTX 480 will set you back $499.99 in the United States, or 479.99 Euro inside the EMU [European Monetary Union]. If you’re wondering where the difference in price comes from, the reason is simple: these are Manufacturer Suggested Retail Prices which differ in one major thing: sales tax is added on top in the United States [where applicable], while 480 Euro has to be the selling price in the Euro-zone – tax included! Thus, two of these will set you back $999.98 or 959.98 Euro.
Looking at the GeForce GTX 480 as a single product, it is not hard to say whether it is worth your money or not. If you want to play the latest titles at playable framerates with Anti-Aliasing set at 32x and Anisotropic Filtering set at 16x, there are no competitors that can challenge it. The price that you have to pay is the noise that the board outputs, and we would suggest liquid cooling. Then again, we also suggest liquid cooling for AMD’s Radeon boards, because neither can be classified as a “silent” gaming board. However, it is much easier to dampen the sound on AMD GPUs.
When it comes to SLI, we witnessed some downright incredible scaling, including “beyond linear” in S.T.A.L.K.E.R.: Call of Pripyat, Metro 2033 and ElcomSoft’s password cracking. We have never before seen two GPUs achieving higher scores than double the score of a single card – and this is exactly what happened in a few cases. Given that those “few cases” were the DirectX 11, effects-laden Metro 2033, S.T.A.L.K.E.R.: Call of Pripyat and, more importantly, a GPGPU application which loads the GPUs to 100% – there is a bright future ahead for SLI scaling.
Conclusion: GeForce GTX 480
If we talk in automotive terms, one might argue that GTX 480 is akin to a Ferrari Enzo: fast, extreme, loud and expensive. In contrast, ATI Radeon HD 5870 should be considered a Mercedes SLS: still fast, in some places even faster, and more comfortable to ride. Even though AMD sponsors Ferrari, we often feel that combination is not exactly balanced out, especially after talking to executives from both companies.
nVidia delivered one truly controversial part: it consumes a lot of power, it is noisy and in some tests it loses to a six-month-old ATI Radeon HD 5870. However, when you crank the Anti-Aliasing settings to the maximum, you can play today’s hit titles with no hitches: Battlefield: Bad Company 2, DiRT 2, Need For Speed: SHIFT – not a single slowdown, and the result of the Unigine Heaven 2.0 DirectX 11 benchmark speaks for itself.
We’ll leave the conclusion for SLI mode until we’re able to test ATI Radeon HD 5870 in CrossFire with the new drivers. As far as GeForce GTX 480 goes, the question is – does this product deserve an innovation award, based on all the innovations that nVidia placed into the chip? The answer is No. We review this product as a whole and while the chip is revolutionary, the product itself has drawbacks we cannot overlook.
Does the GeForce GTX 480 deserve our Editor’s Choice award? The answer is, without any reservation – Yes. Because if you want to play games at Max. AA/AF – there is now a single card that enables you to do so. If you want to do the same on an ATI setup, you need two cards or have to shell out at least $100 more for a card that looks to be made out of Unobtanium – at its MSRP, HD5970 is currently just as available as GTX 480 is, and that product has been on the market for how many months now?
However, if you don’t want to cope with the quirks of a supercar, the compromises that owning a Ferrari will put in front of you, there is an easy way to stay elegant and yet have bragging rights. AMD Radeon HD 5870 is indeed the Mercedes SLS of the GPU world.
Ferrari or Mercedes, it is completely up to you. I’d take both.
Original Author: Theo Valich