AMD architecture performance analysis: memory bandwidth scaling and conclusion
What have we learned?
Before summing up the findings here, I wanted to include one further test. In attempting to chart performance improvements across AMD hardware generations, it seemed logical to equalise core clock frequencies and memory bandwidth in order to get the closest possible lock on overall efficiency gains. However, this ignores the reality that Navi has access to vastly improved GDDR6 memory, up against the GDDR5 found in GCN's closest counterparts.
The Radeon RX 580 is the closest prior-gen equivalent we have to the RX 5700, but AMD's new product ships with a big 75 per cent uplift to bandwidth, with the new card's 448GB/s outstripping the 256GB/s found in Polaris. And this led me to wonder what kind of improvement to overall performance we'd see if we left core clocks equalised but allowed Navi to tap into the full potential of its GDDR6 - the same memory we should expect to see in the next-gen consoles, remember.
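That 75 per cent figure falls straight out of the two cards' rated bandwidths. A quick sketch of the calculation (the helper name is my own, not anything from AMD's specs):

```python
def bandwidth_uplift(new_gbps: float, old_gbps: float) -> float:
    """Percentage bandwidth uplift of a newer card over an older one."""
    return (new_gbps / old_gbps - 1) * 100

# RX 5700 (448GB/s GDDR6) vs RX 580 (256GB/s GDDR5)
print(bandwidth_uplift(448, 256))  # -> 75.0
```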
Ultimately though, results from a huge uptick in bandwidth proved a little underwhelming - with improvements ranging from just six per cent to around 15 per cent at the maximum. As a representative result, our old friend Crysis 3 gives you a good idea of how things pan out. Here we can see three results at two resolutions - a 1.0GHz core clock paired with underclocked and stock memory, plus the fully enabled RX 5700, which broadly operates in 1.7GHz territory. We don't see performance scale entirely with frequency, but I suspect that the extra bandwidth gets more of a workout with the Navi processor running at higher clocks.
Crysis 3: Very High, SMAA T2X, DX11
GCN vs RDNA conclusions: what have we learned?
There's a lot of data to wade through and results vary drastically. The best-case scenario I found of generous scaling across the generations comes from Ghost Recon Wildlands. The very high preset there is a challenging workout with an emphasis on GPU compute and the end result sees a 25 per cent increase in performance at 1080p between GCN 1.0 Tahiti to GCN 4.0 Polaris at 4.1TF, with a further leap of 27 per cent between Polaris and Navi at 4.6TF. What we're seeing here is 60 per cent improved performance overall from the same level of compute power. Crysis 3 also saw good scaling - a 23 per cent increase at 1080p between Tahiti and Polaris at 4.1TF, and a further 22 per cent between Polaris and Navi at 4.6TF. Compounding the architectural steps, a GCN 1.0 teraflop goes 50 per cent further with Navi at 1080p.
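The generational gains above compound multiplicatively rather than adding together, which is how two steps in the mid-20s land at roughly 60 per cent overall. A minimal sketch of that arithmetic, using the figures quoted above (the function name is my own):

```python
def compound_gain(*step_gains: float) -> float:
    """Compound successive per-step performance gains (percentages)
    into a single overall percentage improvement."""
    factor = 1.0
    for gain in step_gains:
        factor *= 1.0 + gain / 100.0
    return (factor - 1.0) * 100.0

# Ghost Recon Wildlands, 1080p: Tahiti -> Polaris (+25%), Polaris -> Navi (+27%)
print(round(compound_gain(25, 27)))  # -> 59, i.e. circa 60 per cent overall

# Crysis 3, 1080p: +23 per cent, then +22 per cent
print(round(compound_gain(23, 22)))  # -> 50
```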
Results elsewhere were less impressive under DX11, but even in our worst-case scenario - a mere 30 per cent of additional performance - we proved that a GCN 1.0 teraflop is considerably less potent than an RDNA 1.0 equivalent. There's also evidence here that AMD's architecture has evolved considerably over time in other directions, with geometry processing in particular delivering vastly improved results.
Perhaps the biggest disappointment was the lack of meaningful results under DX12, meaning that our testing on some of the most modern gaming engines failed to deliver any decent data from GCN 1.0 - perhaps not surprising when you consider that GCN's origins predate all of the lower-level APIs used today, and even their predecessor, Mantle. However, clock for clock, Navi's performance boosts up against Polaris still look good - and with a circa 20 to 35 per cent uplift depending on the game, there is consistency here with the Navi vs Polaris DX11 results.
Ghost Recon Wildlands: Ultra, TAA
In PC terms, the equalised results aren't bad at all for Navi, but it's AMD's push for higher frequencies that really helps deliver meaningful boosts to overall performance - and that's where the Ghost Recon benchmark above comes in. A stock RX 5700 easily bests the RX Vega 64, despite the last-gen Radeon possessing vastly higher theoretical compute power. In addition to the improved architecture and feature set, Navi also benefits from increased core clocks, which means that it's not just compute power that improves - all aspects of the GPU run faster.
As for where this leaves the next generation consoles, it's all good news. On the face of it, the results may suggest that a six teraflop GPU like Xbox One X's would translate to an 'equivalent' 8.1 to 9.5TF GPU in a Navi-based successor. It's an overly simplistic calculation to make when there is so much more to the make-up of a GPU than its compute power alone - and the uptick we've measured here may actually be significantly higher depending on the workload. However, it would certainly go a long way towards explaining why Microsoft is considering a Project Scarlett box - codenamed Lockhart - rated at 'just' 4TF of GPU compute. And we may yet see this machine hit the market: Microsoft's PR focus is all on the higher-spec Anaconda box, but I've yet to get any firm confirmation that Lockhart has been cancelled.
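For what the simplistic version of that calculation looks like: the 8.1 to 9.5TF figures come from scaling the 6TF baseline by the measured per-teraflop uplift range. A quick sketch, with the uplift bounds (roughly 35 to 58 per cent) back-derived from the quoted figures and the function name my own:

```python
def navi_equivalent_tf(gcn_tf: float,
                       uplift_low: float = 0.35,
                       uplift_high: float = 0.58) -> tuple:
    """Scale a GCN-era compute figure by an assumed per-teraflop
    architectural uplift range to get a rough 'Navi-equivalent' span."""
    return gcn_tf * (1 + uplift_low), gcn_tf * (1 + uplift_high)

# Xbox One X's 6TF GPU, expressed in rough Navi-equivalent terms
low, high = navi_equivalent_tf(6.0)
print(f"{low:.1f}TF to {high:.1f}TF")  # -> 8.1TF to 9.5TF
```

Worth stressing again: this is a back-of-the-envelope illustration only - real performance depends on clocks, bandwidth and workload, not compute alone.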
As things stand, a 40 to 60 per cent architectural improvement in performance is impressive, especially when further amplified by the inevitable increase in GPU frequency and the big increase in memory bandwidth. And this is just talking about hardware specs, when much of the magic comes from game developers. It's remarkable to consider that the likes of Uncharted 4, God of War and Horizon are essentially running on a customised Radeon HD 7850, while Forza Horizon 4 and Gears 5 are delivering a phenomenal return from what is basically an underclocked R7 360. And with that in mind, no matter what configuration the next-gen consoles end up with, Navi-based silicon should be phenomenal.
AMD RDNA vs GCN Analysis:
- Introduction, video analysis, synthetic benchmarks
- Gaming benchmarks DX11: AC Unity, Crysis 3, Ghost Recon Wildlands, Far Cry 5
- Gaming benchmarks DX12: Rise/Shadow of the Tomb Raider, Strange Brigade, Wolfenstein 2
- Gaming benchmark problem children: Battlefield 1, Forza Horizon 4, The Witcher 3
- AMD architecture: Navi memory bandwidth scaling and conclusion [This Page]