Unreal Engine 5.2 features analysed: is this the answer to #StutterStruggle?
Plus: how procedural generation saves development time and boosts performance.
Nearly three years after Unreal Engine 5 was first revealed, we're on the cusp of the first major UE5 game releases, including Immortals of Aveum, The Lords of the Fallen and Stalker 2: Heart of Chernobyl. With the release of Unreal Engine 5.2, the time seems right to take another look at what new features have been added in the latest revision and how these additions will colour the games of the future - including titles from developers such as CD Projekt Red and Crystal Dynamics that previously built their own engines.
Procedural generation is the headline addition of UE5.2, as showcased in the Electric Dreams demo back in March. If you recall, the spaces featured in the original Unreal Engine 5 reveals - Lumen in the Land of Nanite and Valley of the Ancient - were built in a very particular way. Artists manually placed and arranged every piece of the environment from prefabricated assets, often copying and pasting the same assets with rotation and scale changes to build the sparsely populated rocky environments seen in those demos. While this kind of “kit bashing”, as it's called, can be an effective way to build smaller-scale projects like these reveals, it's arguably impractical for a full video game production. It demands plenty of manual labour and is limiting on the engine side too, as the wasteful overlapping of many meshes tanks performance for hardware-accelerated Lumen ray tracing.
In the later Matrix Awakens demo, Epic showcased a procedural tool for populating urban environments, but with 5.2 they've released another system for natural outdoor environments, like the one seen in the Electric Dreams demo. Here, Nanite is used not only for opaque objects like rocks, but also for objects like leaves and bushes that use alpha-masked transparencies. Based on my first hands-on time with the demo and in the editor itself, the technique seems effective at generating convincing, high-quality environments from a limited number of assets with little artist intervention. This should make it far easier to populate large worlds, with Nanite supplying the requisite geometric detail.
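To give a sense of what this kind of tooling automates, here's a minimal, engine-agnostic sketch of rule-based scattering. This isn't Epic's PCG framework or its API - every name and parameter below is invented for illustration - but it captures the copy, rotate and scale busywork that previously fell to artists:

```cpp
// Engine-agnostic sketch of rule-based scattering - not Epic's PCG API, just an
// illustration of the placement work the new tools automate. All names and
// parameters here are invented for the example.
#include <cstdio>
#include <random>
#include <vector>

struct Placement {
    float x, y;        // position on a 2D ground plane
    float yawDegrees;  // random rotation to hide asset reuse
    float scale;       // random scale within an artist-chosen range
    int   assetIndex;  // which prefab mesh from a small library to instance
};

std::vector<Placement> ScatterAssets(unsigned seed, int assetCount,
                                     float areaSize, float density)
{
    std::mt19937 rng(seed);  // fixed seed -> deterministic, tweakable result
    std::uniform_real_distribution<float> pos(0.0f, areaSize);
    std::uniform_real_distribution<float> yaw(0.0f, 360.0f);
    std::uniform_real_distribution<float> scale(0.6f, 1.4f);
    std::uniform_int_distribution<int>    asset(0, assetCount - 1);

    const int instances = static_cast<int>(areaSize * areaSize * density);
    std::vector<Placement> out;
    out.reserve(instances);
    for (int i = 0; i < instances; ++i)
        out.push_back({pos(rng), pos(rng), yaw(rng), scale(rng), asset(rng)});
    return out;
}

int main() {
    // Populate a 100x100 metre patch from a small library of rock/foliage meshes.
    auto placements = ScatterAssets(/*seed*/ 42, /*assetCount*/ 8,
                                    /*areaSize*/ 100.0f, /*density*/ 0.05f);
    std::printf("Placed %zu instances procedurally\n", placements.size());
}
```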
Another positive side effect of this more systematic placement method comes in performance terms, where hardware Lumen can now run measurably better at its default Epic setting. In one example, I measured a 14 percent frame-rate improvement over software Lumen at the same resolution, alongside a noticeable uptick in reflection detail - individual leaves are resolved rather than the more amorphous blobs of the software version. Diffuse lighting quality is also better in the hardware implementation, as the software solution tended to over-darken shadowed regions. This is a marked turnaround from kit-based environments like Valley of the Ancient, where hardware Lumen ran significantly worse, making it essentially unusable despite its higher quality.
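For anyone wanting to reproduce this kind of A/B comparison in their own UE5 project, hardware Lumen can be toggled at runtime via the r.Lumen.HardwareRayTracing console variable. The snippet below is a minimal sketch assuming a UE5 project with hardware ray tracing support enabled in the project settings - and console variable names can of course shift between engine versions:

```cpp
// Quick A/B toggle between software and hardware Lumen from game code, assuming
// a UE5 project with "Use Hardware Ray Tracing when available" enabled in the
// project settings. Console variable names are as of 5.x and may change.
#include "HAL/IConsoleManager.h"

void SetHardwareLumenEnabled(bool bEnabled)
{
    if (IConsoleVariable* CVar =
            IConsoleManager::Get().FindConsoleVariable(TEXT("r.Lumen.HardwareRayTracing")))
    {
        // 1 = hardware ray tracing for Lumen, 0 = software (distance field) tracing
        CVar->Set(bEnabled ? 1 : 0);
    }
}
```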
While this is impressive, it's important to note that these generalisations tend to hold true when GPU-limited - CPU-limited scenarios can show different results. For example, running the UE5.2 demo at a lower resolution, software Lumen provides a not-insignificant 10 percent performance increase over hardware Lumen. CPU requirements are likely to be steep too, as even with a Core i9 12900K and 6400MT/s DDR5 RAM, the demo runs at just over 60fps on average. When traversing the world at higher speed, the demo becomes increasingly CPU-limited and performance suffers - and stutters occur - as a result.
Interestingly, despite being a modern engine, UE5 doesn't yet seem to scale well on CPUs with higher core and thread counts - echoing results from last year. For example, going from six to eight cores on the 12900K increases CPU-limited performance by only six percent, while turning on hyper-threading adds a further four percent in this test sequence. Enabling the eight Efficient cores doesn't improve frame-rates either.
Given how commonplace UE5 seems likely to become over the next few years, this is a bit disappointing - especially as average CPU core counts continue to climb. For context, in Cyberpunk 2077 we see an 88 percent increase in frame-rate when going from four cores to 16 cores on the 12900K, whereas in the Electric Dreams demo we see only a 30 percent improvement. Based on this, UE5 still has a lot of room to grow in terms of taking advantage of modern multi-threaded processors.
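To put those figures in context, here's some rough scaling arithmetic using the numbers above. The 'ideal' column assumes naive linear scaling with core count (and treats E-cores as equal to P-cores), which no real game workload achieves - treat it purely as a yardstick:

```cpp
// Rough scaling-efficiency arithmetic for the figures quoted above. The
// "ideal" values assume naive linear scaling with core count, which real game
// workloads never reach - they are only a yardstick.
#include <cstdio>

int main() {
    struct Step { const char* change; double coresBefore, coresAfter, measuredGainPct; };
    const Step steps[] = {
        {"12900K, 6 -> 8 P-cores (Electric Dreams)", 6, 8, 6.0},
        {"12900K, 4 -> 16 cores (Electric Dreams)",  4, 16, 30.0},
        {"12900K, 4 -> 16 cores (Cyberpunk 2077)",   4, 16, 88.0},
    };

    for (const Step& s : steps) {
        const double idealGainPct = (s.coresAfter / s.coresBefore - 1.0) * 100.0;
        const double efficiency   = s.measuredGainPct / idealGainPct * 100.0;
        std::printf("%-44s ideal +%5.1f%%  measured +%4.1f%%  efficiency %4.1f%%\n",
                    s.change, idealGainPct, s.measuredGainPct, efficiency);
    }
}
```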
If you are on an Ada Lovelace (Nvidia RTX 40-series) GPU though, DLSS 3 Frame Generation can be an effective countermeasure, and it's easy for developers to implement, taking only 11 clicks in total after finding the plugin on the Unreal Engine Marketplace. With frame generation enabled, I measured a 97 percent frame-rate improvement in this CPU-limited scenario. I think that makes DLSS 3 (and future equivalents from AMD and Intel) a no-brainer inclusion for developers creating UE5 games.
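As a back-of-the-envelope illustration of why frame generation sidesteps a CPU limit - each CPU-rendered frame is followed by a GPU-generated one, so the presented frame-rate approaches double minus some generation overhead - here's a quick calculation. The overhead figure is an assumption for illustration, not a measurement:

```cpp
// Back-of-the-envelope arithmetic for frame generation in a CPU-limited scene.
// Each CPU-rendered frame is followed by one GPU-generated frame, so the
// presented frame-rate approaches 2x minus generation overhead. The overhead
// value below is an assumption for illustration only, not a measurement.
#include <cstdio>

int main() {
    const double cpuLimitedFps    = 60.0;   // roughly the CPU-limited average quoted above
    const double renderedFrameMs  = 1000.0 / cpuLimitedFps;
    const double generationCostMs = 0.25;   // assumed GPU cost per generated frame

    // Two presented frames per rendered frame, spread over the rendered
    // interval plus the cost of producing the extra frame.
    const double presentedFps = 2000.0 / (renderedFrameMs + generationCostMs);
    std::printf("%.0f fps CPU-limited -> ~%.0f fps presented (+%.0f%%)\n",
                cpuLimitedFps, presentedFps,
                (presentedFps / cpuLimitedFps - 1.0) * 100.0);
}
```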
Another key performance update in UE5.2 is an improvement to shader compilation behaviour - something regular readers know is a constant bugbear for me. The only way for developers to prevent shader compilation stutter in UE4 and UE5.0 is to run a pre-compilation step before the game starts. This approach is used in a handful of UE4 titles, but it requires developers to play through the game methodically to build a complete library of all the shaders players will encounter - and if anything is missed, stutters still occur.
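Conceptually, that offline approach boils down to something like the sketch below. The types and function names are hypothetical stand-ins, not Unreal's actual PSO cache API, but the shape of the problem is clear: everything in the recorded manifest gets compiled behind a loading screen, and anything missing from it can still hitch at runtime.

```cpp
// Minimal sketch of the "precompile everything up front" approach. The types
// and functions here are hypothetical stand-ins, not Unreal's PSO cache API.
#include <cstdio>
#include <string>
#include <vector>

struct ShaderPermutation {
    std::string materialName;
    std::string vertexFactory;   // e.g. static mesh vs skeletal mesh vs Nanite
};

// Stand-in for the expensive driver-side compile that causes hitches.
void CompilePipelineState(const ShaderPermutation& p) {
    std::printf("Compiling %s / %s...\n", p.materialName.c_str(), p.vertexFactory.c_str());
}

void PrecompileFromManifest(const std::vector<ShaderPermutation>& manifest) {
    // Runs during a loading screen, before the player gains control.
    for (const ShaderPermutation& p : manifest)
        CompilePipelineState(p);
}

int main() {
    // Manifest recorded by playing through the game ahead of release.
    const std::vector<ShaderPermutation> manifest = {
        {"M_Rock_Lichen", "Nanite"},
        {"M_Foliage_Fern", "StaticMesh"},
        {"M_Character_Skin", "SkeletalMesh"},
    };
    PrecompileFromManifest(manifest);
    std::printf("Done - any permutation missing from the manifest still compiles (and stutters) in-game.\n");
}
```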
With UE5.1 and a matching Fortnite update, Epic added an asynchronous shader compilation scheme that works in real time, pre-compiling shaders in the background on the CPU during play in the hope of preventing stutter. The technique isn't quite perfect, though: if a shader was needed for drawing but hadn't finished compiling, the game would still stutter. In UE5.2, this asynchronous system is more accurate and, critically, adds the ability for developers to delay drawing an object until its shaders have fully compiled, potentially eliminating shader-related stutter entirely - with the caveat that a visual effect or material could appear a little later than it otherwise would.
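Here's a standalone sketch of the difference between those two behaviours, using standard C++ futures in place of Unreal's actual precaching scheduler - purely illustrative, but it shows why skipping a draw for a frame or two beats blocking the whole frame on a compile:

```cpp
// Standalone sketch of the two behaviours described above, using std::async in
// place of Unreal's actual PSO precaching scheduler. If a pipeline state isn't
// ready when an object needs drawing, the old path blocks the frame (a hitch),
// while the new skip-draw path leaves the object out until the background
// compile finishes, so it appears a few frames late instead.
#include <chrono>
#include <cstdio>
#include <future>
#include <thread>

struct PipelineState { int id; };

// Stand-in for the expensive driver compile that causes hitches when done
// synchronously on the render thread.
PipelineState CompilePipelineState(int id) {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    return PipelineState{id};
}

int main() {
    using namespace std::chrono;

    // Kick off the compile in the background as soon as the material is known.
    std::future<PipelineState> pending =
        std::async(std::launch::async, CompilePipelineState, 1);

    // Simulate a few 16ms frames that want to draw the object.
    for (int frame = 0; frame < 20; ++frame) {
        const auto frameStart = steady_clock::now();

        if (pending.wait_for(seconds(0)) == std::future_status::ready) {
            std::printf("frame %2d: pipeline ready, object drawn\n", frame);
        } else {
            // UE 5.1-style behaviour would be pending.get() here -> a ~200ms hitch.
            // UE 5.2-style skip-draw: omit this object and carry on at full frame-rate.
            std::printf("frame %2d: pipeline not ready, draw skipped\n", frame);
        }

        std::this_thread::sleep_until(frameStart + milliseconds(16));
    }
}
```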
This improved asynchronous shader pre-caching and the new skip-draw feature in 5.2 have a transformative effect based on my testing, eliminating the biggest (~500ms) stutters and dramatically improving fluidity. However, it doesn't completely eliminate stutters, with some 30-50ms examples persisting that don't occur with a completely 'warm' shader cache. Some of these could be attributable to traversal stutter, which UE5 has inherited from UE4 - and which can still be found in the latest version of Fortnite running Unreal Engine 5.2.
In terms of stutter, Unreal Engine 5.2 is certainly an improvement then - but traversal stutter needs work, and even the new asynchronous shader caching system is not a silver bullet that developers can wholly rely on for a smooth player experience. For one, it doesn't seem to be on by default, which some developers might miss; for another, it still produces some stutters that the more traditional offline shader caching method does fix. Therefore, it probably makes sense to combine this new asynchronous system with the older offline pre-caching approach to produce the smoothest experience on PC.
It'll be fascinating to see how these two new features from Unreal Engine 5.2 are deployed in shipping third-party games, from Immortals of Aveum in August to The Lords of the Fallen in October and Stalker 2 in December. 2023 hasn't quite shaped up into the year I'd hoped for PC gaming - but there's still time for that to change, and UE5.2 might play a key role.