In Theory: Can next-gen Nvidia tech offer Titan X power for GTX 970 money?
Early spec analysis on the next-gen Pascal architecture - and how it could translate to mainstream GPUs.
If high-performance graphics cards like Titan X, Fury X and GTX 980 Ti aren't enough to satisfy your lust for top-tier PC hardware, this year will see the arrival of new hardware with the potential to take gaming visuals and performance to the next level. Historically, AMD and Nvidia have worked hard to push PC graphics year after year, but the arrival of 14nm and 16nm chip fabrication technology using 3D FinFET transistors offers GPU vendors the first real innovation in manufacturing technology for five years. And recent data released by Nvidia suggests we're in for something really special with its upcoming Pascal architecture.
All the signs suggest that it's Nvidia that will take point with the arrival of new graphics hardware based on the 16nm process provided by long-time partner TSMC, with rumours strongly suggesting that product will be seen at Taiwan's Computex show at the end of May. There's been a range of leaks and rumours reported by the Far Eastern press in recent weeks, but the best indication we have of Pascal's make-up comes from the reveal of the Tesla P100 accelerator at Nvidia's GTC conference earlier this month, complete with an exhaustive list of specs.
The new product is aimed at large datacenters and other consumers of so-called super-computer technology, but crucially, the new Tesla is built on Pascal technology and the specs strongly suggest that this processor will eventually end up as the next generation Titan, or equivalent. The chip's name is GP100, echoing the GM200 of Titan X and the GK110 of the original Titan - and some of the raw stats buried within the data are absolutely remarkable.
First up, check out the size of the chip itself. There have been concerns that the 16nm process may require time to mature, and that larger, more difficult to make processors may take years to appear. However, GP100 is actually larger than GM200 - 610mm2 vs 601mm2. 16nm's manufacturing advantage is also confirmed by a 15.3bn transistor count - up from 8bn in today's top-tier product. Perhaps most surprising of all is the boost clock - the peak speed of the chip. It's rated for 1480MHz, which is actually higher than what you can reasonably expect to achieve from a Titan X pushed to its absolute limits. And this is for an industrial product, which usually has quite conservative clocks compared to consumer graphics cards.
Spec | Tesla M40 | Tesla P100 |
---|---|---|
GPU | GM200 Maxwell | GP100 Pascal |
SMs | 24 | 56 |
Base Clock | 948MHz | 1328MHz |
Boost Clock | 1114MHz | 1480MHz |
Texture Units | 192 | 224 |
Memory Interface | 384-bit GDDR5 | 4096-bit HBM2 |
L2 Cache | 3072KB | 4096KB |
Transistor Count | 8bn | 15.3bn |
Die Size | 601mm2 | 610mm2 |
Process | 28nm | 16nmFF |
TDP | 250W | 300W |
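To put the table in perspective, here's a minimal back-of-envelope sketch that turns the raw figures into ratios. The numbers are copied straight from the table; the dictionary layout and the helper names (`density_m_per_mm2`, `ratio`) are our own and purely illustrative.

```python
# Figures copied from the Tesla M40 / Tesla P100 spec table.
SPECS = {
    "GM200": {"transistors_bn": 8.0, "die_mm2": 601, "boost_mhz": 1114, "tdp_w": 250},
    "GP100": {"transistors_bn": 15.3, "die_mm2": 610, "boost_mhz": 1480, "tdp_w": 300},
}


def density_m_per_mm2(chip):
    """Millions of transistors per square millimetre of die."""
    spec = SPECS[chip]
    return spec["transistors_bn"] * 1000 / spec["die_mm2"]


def ratio(field):
    """GP100's value divided by GM200's value for one table row."""
    return SPECS["GP100"][field] / SPECS["GM200"][field]


if __name__ == "__main__":
    for chip in SPECS:
        print(f"{chip}: {density_m_per_mm2(chip):.1f}M transistors per mm2")
    print(f"Transistor count: {ratio('transistors_bn'):.2f}x")
    print(f"Die size:         {ratio('die_mm2'):.2f}x")
    print(f"Boost clock:      {ratio('boost_mhz'):.2f}x")
    print(f"TDP:              {ratio('tdp_w'):.2f}x")
```

On these figures, density works out at roughly 13.3M transistors per mm2 for GM200 against about 25.1M for GP100, with a boost clock around a third higher. These are paper ratios only and say nothing about real-world gaming performance.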
On paper, GP100's leap over GM200 is striking. Processing power typically scales with transistor count, and not only has the 16nm process almost doubled that count, but the overall speed of the processor has increased too. And there are other reasons to believe we're in for a huge boost in performance - many believed that the Pascal architecture would be a die-shrunk version of Maxwell. That isn't the case: the CUDA cores have been restructured and there's another big boost in L2 cache. How that translates into enhanced performance remains to be seen, of course.
The Tesla P100 uses 16GB of HBM2 memory too, accessed via an ultra-wide 4096-bit bus - a vast improvement in memory bandwidth compared to the 384-bit GDDR5 utilised in Titan X. We'd expect a next-gen Titan to retain the HBM2 (it's already confirmed for the AMD competitor, codenamed Vega), but the question is how much VRAM we will see in the inevitable cut-down version of the card aimed at the gaming audience - today's equivalent to the GTX 980 Ti.
What's fascinating about Nvidia's GTC announcement is just how much the firm shared, to the point where we are seemingly getting an extremely early preview of a top-tier consumer GPU that's unlikely to arrive until well into 2017. With no GeForce GP100-based product expected this year, what will we be getting instead? It's at this point that the rumours from the Far Eastern press come into focus.
The leaks suggest that we'll see Pascal gaming cards in July this year, showcased at Computex in Taipei the month before. At least two cards are mooted - seemingly called GTX 1070 and GTX 1080 - designed to replace their Maxwell equivalents. The naming may seem rather odd, but another leak - showing 1070 and 1080 casing on the actual production line - does seem to be compelling. Now, here's the thing - each of these products is said to be derived from another, smaller Pascal chip: GP104.
Nvidia has demonstrated that its next-gen 'smaller' chip can outperform its last-gen 'big' chip - exactly what we saw when the GTX 980 outperformed GTX 780 Ti (the ultimate iteration of the original Titan). The real question is just how small GP104 actually is. Another leak, purporting to show the actual die, suggests that it's smaller than the GTX 980 equivalent, GM204 - anything from around 317mm2 to 330mm2, compared to the older chip's 398mm2.
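As a rough illustration of what those leaked die sizes could mean, the sketch below works out the shrink versus GM204 and then estimates a transistor budget. The big assumption, and it is only an assumption, is that GP104 would match the transistor density of GP100 (15.3bn in 610mm2); nothing in the leaks confirms that, and the function names are ours.

```python
# Known figures quoted in this article.
GP100_TRANSISTORS_BN = 15.3
GP100_DIE_MM2 = 610
GM204_DIE_MM2 = 398                # GTX 980's chip
GP104_DIE_RANGE_MM2 = (317, 330)   # leaked, unconfirmed estimates


def shrink_pct(new_mm2, old_mm2):
    """Percentage reduction in die area going from old_mm2 to new_mm2."""
    return (1 - new_mm2 / old_mm2) * 100


def est_transistors_bn(die_mm2):
    """ASSUMPTION: the chip has the same transistors-per-mm2 as GP100."""
    return GP100_TRANSISTORS_BN / GP100_DIE_MM2 * die_mm2


if __name__ == "__main__":
    for die in GP104_DIE_RANGE_MM2:
        print(
            f"{die}mm2: {shrink_pct(die, GM204_DIE_MM2):.1f}% smaller than GM204, "
            f"~{est_transistors_bn(die):.1f}bn transistors at GP100 density"
        )
```

If that density assumption held, a die in the 317mm2 to 330mm2 range would land at around 8bn transistors, the same ballpark as GM200, while being 17 to 20 per cent smaller than GM204. Treat it as a sanity check on the rumours rather than a prediction.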
But it's the GTX 1070 that is almost certain to be the volume card in the line-up. The question is, just how daring will Nvidia be with it? When the GTX 970 was released, the green team redefined the high-end GPU market. It could overclock up to and beyond stock GTX 980 performance. It handily beat everything AMD had to offer - products that were over £200 more expensive at the time. The gambit paid off with phenomenal sales success, to the point where at its peak, the GTX 970 commanded over five per cent of the entire Steam userbase. Indeed, the March 2016 hardware survey still has it at 4.93 per cent overall. Bearing in mind the vast number of GPUs on the market both old and new, that's a remarkable statistic. Will Nvidia aim to pull off the same trick a second time? Could a factory overclocked GTX 1070 outperform GTX 980 Ti in the same way that the 970 could beat the 780 Ti - the ultimate iteration of the first-gen Titan?
We'd like to think that Nvidia would aim to be just as audacious this time around. As phenomenal as GTX 970 has been, AMD's Radeon R9 390 has made a big comeback for the red corner. Dark Souls 3 aside, it's run most of the big games released this year on par with or faster than the GTX 970, with titles like Quantum Break and Far Cry Primal in particular posting some highly significant increases in performance. And there remain question marks over Nvidia's DX12 performance too - we've seen big gains on AMD, but Nvidia's DX12 showing in titles like Hitman and Ashes of the Singularity hasn't exactly been overwhelming.
There are plenty of other question marks we hope to see addressed soon too. For example, we know that GP100 - the 'big Pascal' chip - is designed for next-gen HBM2 memory, but what's the score with the upcoming consumer cards? Titan X and GTX 980 Ti took GDDR5 memory pretty much to its limits with a 384-bit bus paired with 7Gbps modules. Will Nvidia stick with tried and tested technology, or will it go for Micron's new, higher bandwidth GDDR5X? Recently leaked PCB shots of GP104, which show it paired with currently unidentifiable Micron chips, strongly suggest that at least one of the consumer level Pascal cards will ship with the upgraded RAM [UPDATE 28/4/15 11:33am: Looks like the Micron chips have indeed been identified as GDDR5X].
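For context on why memory speed matters as much as bus width, peak theoretical bandwidth is simply bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. The sketch below applies that to the 384-bit, 7Gbps setup of Titan X and GTX 980 Ti. The 256-bit bus and 10Gbps figures in the second half are hypothetical values chosen for illustration only, not leaked GP104 specs, and the function names are ours.

```python
def bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    """Peak theoretical memory bandwidth in GB/s (8 bits per byte)."""
    return bus_width_bits * gbps_per_pin / 8


def required_gbps_per_pin(bus_width_bits, target_gb_s):
    """Per-pin data rate a given bus width needs to hit a target bandwidth."""
    return target_gb_s * 8 / bus_width_bits


if __name__ == "__main__":
    # From the article: 384-bit GDDR5 at 7Gbps (Titan X, GTX 980 Ti).
    titan_x = bandwidth_gb_s(384, 7.0)
    print(f"384-bit @ 7Gbps: {titan_x:.0f}GB/s")

    # HYPOTHETICAL: a narrower 256-bit bus with faster 10Gbps memory.
    narrow_fast = bandwidth_gb_s(256, 10.0)
    print(f"256-bit @ 10Gbps (hypothetical): {narrow_fast:.0f}GB/s")

    # Data rate a 256-bit bus would need to match the 384-bit figure.
    needed = required_gbps_per_pin(256, titan_x)
    print(f"256-bit needs {needed:.1f}Gbps per pin to match")
```

The 384-bit GDDR5 configuration comes out at 336GB/s. The point of the hypothetical case is that a narrower bus can get close to that figure if the memory itself runs fast enough, which is where a part like GDDR5X would come in.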
Only time will tell, but assuming the rumours and leaks of a June Computex reveal and a July release turn out to be true, we shouldn't have that long to wait - and we'll be on hand to review any and all Pascal products that come our way, with a revised benchmark suite encompassing newer titles running across both DirectX 11 and DX12.