Tech Interview: Trials Evolution
The making of RedLynx's latest Xbox 360 classic.
In an age of cross-platform development, console exclusives are somewhat thin on the ground, and especially so on Xbox 360, where Microsoft has scaled back first-party studios, occasionally paring back on in-house engine development in favour of third-party middleware - so much so that the next Fable title is running on Unreal Engine.
A glorious exception to the trend is RedLynx's phenomenal Trials Evolution - a game designed from the ground up for the 360 hardware by some of the most respected developers in the business. It is available now on Xbox Live Arcade and earned an impressive 9/10 Eurogamer review score.
In this special deep-dive Digital Foundry tech interview, we go head-to-head with RedLynx's Sebastian Aaltonen (aka sebbbi) to discuss the lessons learned from Trials HD, to get the thinking behind the ambitious Track Central editor, and to uncover the full story on the rendering improvements that power one of the most technologically advanced titles we've seen on the Xbox 360.
Virtual texturing, lighting, resolution, anti-aliasing, post-processing technology, GPGPU techniques - read on to discover why Trials Evolution isn't just a gameplay masterpiece, but also a technological showcase for cutting-edge Xbox 360 game development.
There were actually two things related to DLC that we wanted to improve on. We didn't have any scripting system in Trials HD, and so we couldn't add new skill games to the DLCs (DLCs are just content packages in Xbox Live). Trials HD loaded everything to system RAM at start-up, and this limited the amount of new objects and textures we could add in our DLC content.
In the sequel, we wanted to solve these two problems. Data streaming solved the limited memory issue and we implemented a very complex visual scripting system to solve the skill game issue. Needless to say, the visual scripting system quickly expanded into a whole game-creation system that allows you to create a wide variety of games (such as first-person shooters and car racing games).
User feedback has always been important for us. Luckily for us, we often find that our own ideas match our players' ideas very well. Multiplayer, global track-sharing and outdoor environments have all been frequently requested by our user base since Trials 2 Second Edition. The huge success of Trials HD made it possible for us to finally create the game of our dreams.
We didn't have automated metric collection in Trials HD, but we analysed data from the track leaderboards and from our user forums. This information was crucial in learning how difficult the game was and which levels ended up being too difficult for the majority of the players. The most important thing we learned was that the game required proper tutorials and a smoother learning curve.
Increasing the draw distance from 40 metres to 2000 metres meant that we had to render over five times more objects per frame than we did in Trials HD. Many things in the engine got completely overhauled to cope with the vastly increased object count. For example, our graphics engine now uses a 'close to hardware' low-level GPU interface instead of the higher-level DirectX API to submit draw calls and the GPU state.
We fully optimised our particle engine with VMX128 instructions, and this freed up one of the six hardware threads just for visibility culling purposes, while still allowing us to double our particle counts. We now have a dynamic depth buffer pyramid-based occlusion culling system that discards all occluded objects very quickly, and gives a nice boost of performance for complex scenes. We also implemented object and terrain geometry LOD (level of detail) systems to scale down polygon counts based on distance to the camera.
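As a rough illustration of depth-pyramid occlusion culling (a simplified CPU sketch, not RedLynx's engine code), the idea can be expressed as follows. It assumes a square, power-of-two depth buffer in which larger values are farther away; the function names and the screen-rectangle interface are hypothetical.

```python
def build_depth_pyramid(depth):
    """Build mip levels; each texel stores the FARTHEST depth of its 2x2 children.

    `depth` is a square, power-of-two grid (list of rows), larger = farther.
    """
    levels = [depth]
    while len(levels[-1]) > 1:
        src = levels[-1]
        half = len(src) // 2
        levels.append([
            [max(src[2 * y][2 * x], src[2 * y][2 * x + 1],
                 src[2 * y + 1][2 * x], src[2 * y + 1][2 * x + 1])
             for x in range(half)]
            for y in range(half)
        ])
    return levels


def is_occluded(levels, x0, y0, x1, y1, nearest_depth):
    """Conservative test for an object covering pixels (x0, y0)..(x1, y1) inclusive.

    The object is reported as occluded only if its nearest point lies behind
    the farthest occluder depth in every pyramid texel that it overlaps.
    """
    # Pick the coarsest level needed so the rectangle spans at most 2 texels per axis.
    level = 0
    while level < len(levels) - 1 and (
        (x1 >> level) - (x0 >> level) > 1 or (y1 >> level) - (y0 >> level) > 1
    ):
        level += 1
    mip = levels[level]
    for ty in range(y0 >> level, (y1 >> level) + 1):
        for tx in range(x0 >> level, (x1 >> level) + 1):
            if nearest_depth <= mip[ty][tx]:
                return False  # some part of the object may be visible
    return True
```

Because each coarse texel stores the farthest depth beneath it, a handful of texel reads is enough to reject an object, which is what makes this kind of test cheap for large object counts.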
The shadow mapping system was also improved. The new system calculates very tight bounds for shadow map cascades based on depth buffer analysis (inspired by the SDSM algorithm by Lauritzen, Salvi and Lefohn), which allowed us to reach the required shadow map quality for the large-scale terrain without much extra cost.
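The core of depth-buffer-driven cascade fitting can be sketched in a few lines. This is an assumption-laden illustration rather than the shipped algorithm: it takes linear view-space depths (in metres, all positive), finds the range actually visible, and places logarithmic cascade splits inside that range. The function name and the choice of logarithmic spacing are assumptions.

```python
def tight_cascade_splits(view_depths, cascade_count=4):
    """Return cascade_count + 1 split distances that tightly bound the visible depths.

    view_depths: linear view-space depths (> 0) read back from the depth buffer.
    Splits are spaced logarithmically between the nearest and farthest sample.
    """
    near = min(view_depths)
    far = max(view_depths)
    ratio = far / near
    return [near * ratio ** (i / cascade_count) for i in range(cascade_count + 1)]


# Example: nothing visible closer than 2m or farther than 32m.
splits = tight_cascade_splits([2.0, 7.5, 32.0, 11.0])
```

Fitting the cascades to the visible range, rather than to the full camera near and far planes, means no shadow map resolution is spent on depth ranges that contain no pixels.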
We now stream everything: meshes (triangles and vertices), terrain heightfield, vegetation map and textures. For efficient mesh-streaming we had to compress our vertex formats to be as small as possible. Pixel shader derivative-based tangent calculation was used for skinned models. It not only saved lots of bandwidth and memory, but made the skinning vertex shaders much faster as well. For other meshes we experimented with various methods including quaternion-based tangents, but in the end we settled on a compact 16-byte vertex format (that included some clever bit-packing). These modifications also made the rendering slightly faster, because of the reduced GPU memory bandwidth usage, so it was really a win-win situation for us.
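The interview does not describe the actual 16-byte layout, so the following is purely a hypothetical example of how bit-packing can fit a position, normal and texture coordinate into 16 bytes. The field sizes (16-bit positions quantised to the mesh bounding box, a 10:10:10 normal, 16-bit UVs) are assumptions chosen for illustration.

```python
import struct


def quantise(value, lo, hi, bits):
    """Map value in [lo, hi] to an unsigned integer with the given bit count."""
    scale = (1 << bits) - 1
    t = (value - lo) / (hi - lo)
    return max(0, min(scale, round(t * scale)))


def pack_vertex(pos, normal, uv, bounds_min, bounds_max):
    """Pack one vertex into 16 bytes (hypothetical layout, little-endian).

    bytes 0-5   : position, 3 x 16 bits, relative to the mesh bounding box
    bytes 6-7   : unused padding
    bytes 8-11  : normal, 10 bits per axis packed into one 32-bit word
    bytes 12-15 : texture coordinate, 2 x 16 bits in [0, 1]
    """
    px, py, pz = (quantise(p, lo, hi, 16)
                  for p, lo, hi in zip(pos, bounds_min, bounds_max))
    nx, ny, nz = (quantise(n, -1.0, 1.0, 10) for n in normal)
    packed_normal = nx | (ny << 10) | (nz << 20)
    u, v = (quantise(t, 0.0, 1.0, 16) for t in uv)
    return struct.pack("<4HI2H", px, py, pz, 0, packed_normal, u, v)


vertex = pack_vertex((0.0, 5.0, 10.0), (0.0, 1.0, 0.0), (0.0, 1.0),
                     (0.0, 0.0, 0.0), (10.0, 10.0, 10.0))
```

A conventional vertex with full 32-bit floats for the same attributes would take 32 bytes, so a layout of this kind halves the data that has to be streamed and fetched.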
"[Virtual texturing] has allowed us to texture many of our objects with very large... textures and has completely freed our artists of any memory budgets when designing the game world."
For texture streaming we use our own virtual texture system. Unlike id Software's virtual texturing system that is designed for unique texture-mapping everywhere, our system is designed to use storage space sparingly while still offering a good blend of texture variation and resolution.
Virtual texturing has really changed the way we deal with textures. The system does fine-grained analysis of the visible scene and determines which texture areas should be loaded into memory. It is designed to keep only the texture pixels in memory that are actually required to render the current scene. Because there's always a constant number of pixels on screen (720p = 921K pixels), the memory footprint of virtual texturing is always the same, no matter how many textures the game world contains or how large they are. This has allowed us to texture many of our objects with very large 2048x2048 (and some even with 4096x4096) textures and has completely freed our artists of any texture memory budgets when designing the game world.
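The constant-footprint property can be illustrated with a small sketch. This is not RedLynx's system: the 128-texel page size, the feedback tuple format and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict

SCREEN_PIXELS = 1280 * 720  # 921,600 pixels at 720p, regardless of world size


def visible_pages(feedback, page_size=128):
    """Reduce per-pixel feedback (texture, mip, u, v in texels) to unique pages."""
    return {(tex, mip, u // page_size, v // page_size)
            for tex, mip, u, v in feedback}


class PageCache:
    """Fixed-capacity page cache: memory use never grows past `capacity` pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page id -> resident flag, oldest first

    def request(self, page_id):
        """Mark a page as needed; return True if it had to be streamed in."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # recently used, keep it resident
            return False
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        self.pages[page_id] = True
        return True
```

However many source textures exist on disk, only the pages touched by on-screen pixels are requested, and the cache capacity puts a hard ceiling on resident texture memory.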
As we load virtual texture data from the hard drive, it must be decompressed quickly: all decals must be blended over the base data and it must be recompressed to a GPU format on the fly. We implemented a fast GPGPU-based texture compressor (and combiner) to offload the majority of this workload to the GPU. Other streaming tasks are implemented as CPU jobs, and are scheduled to cores that finish their main jobs first (filling the holes in the execution).
We now have a fully gamma-correct (linear space) lighting pipeline, so the rendering looks much more natural compared to the old pipeline. We have also added a fully artist-controlled colour grading system that allows them to pile up any amount of Photoshop filters and bake the filters to one big 3D texture lookup table that is sampled at the end of our post-processing pipeline. This lookup also includes an Xbox PWL gamma repair ramp (to make the image look as much like real sRGB as possible).
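Baking a chain of colour filters into one 3D lookup table can be sketched as below. The filters, the 16x16x16 table size and the function names are illustrative assumptions; a GPU would perform the trilinear lookup in hardware with a single 3D texture fetch.

```python
def bake_lut(filters, size=16):
    """Apply every filter in order to each grid colour and store the result."""
    lut = {}
    for r in range(size):
        for g in range(size):
            for b in range(size):
                colour = (r / (size - 1), g / (size - 1), b / (size - 1))
                for apply_filter in filters:
                    colour = apply_filter(colour)
                lut[r, g, b] = colour
    return lut


def sample_lut(lut, colour, size=16):
    """Trilinearly interpolate the baked table at an RGB colour in [0, 1]."""
    coords = [min(max(c, 0.0), 1.0) * (size - 1) for c in colour]
    base = [min(int(c), size - 2) for c in coords]
    frac = [c - b for c, b in zip(coords, base)]
    out = [0.0, 0.0, 0.0]
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                weight = ((frac[0] if dr else 1.0 - frac[0]) *
                          (frac[1] if dg else 1.0 - frac[1]) *
                          (frac[2] if db else 1.0 - frac[2]))
                entry = lut[base[0] + dr, base[1] + dg, base[2] + db]
                for i in range(3):
                    out[i] += weight * entry[i]
    return tuple(out)


# Two stand-in "filters": invert, then darken by half.
invert = lambda c: tuple(1.0 - x for x in c)
darken = lambda c: tuple(0.5 * x for x in c)
lut = bake_lut([invert, darken])
graded = sample_lut(lut, (0.3, 0.6, 0.9))
```

The runtime cost is one lookup per pixel no matter how many filters were stacked during baking, which is why the number of filters the artists pile up does not affect frame time.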
The smoke and dust are basically just alpha-blended particles with slight background blurring enabled (a new feature). Our newly optimised particle system is able to run more particles, so we utilised it as much as possible. We also added proper physically correct exponential fog and a post-process 'god ray' filter that adds a slight volumetric feeling to the lighting and fog effects.
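Exponential fog follows the standard attenuation law, in which the fraction of scene light that survives falls off as exp(-density x distance). A minimal sketch, with a hypothetical function name and parameter values:

```python
import math


def apply_exponential_fog(colour, fog_colour, distance, density):
    """Blend a linear-space scene colour towards the fog colour with distance."""
    transmittance = math.exp(-density * distance)  # 1 at the camera, 0 far away
    return tuple(c * transmittance + f * (1.0 - transmittance)
                 for c, f in zip(colour, fog_colour))


near = apply_exponential_fog((1.0, 0.0, 0.0), (0.5, 0.5, 0.5), 0.0, 0.01)
far = apply_exponential_fog((1.0, 0.0, 0.0), (0.5, 0.5, 0.5), 2000.0, 0.01)
```

Blending of this kind only behaves correctly when it is done in linear space, which ties in with the gamma-correct pipeline described earlier.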
We improved our old particle-rendering system further. It still uses our (rather funky) front-to-back premultiplied destination alpha blending with stencil counting to reject extra layers of particles (that would not be visible because of heavy overdraw). The stencil-counting trick is working well (to improve fillrate), and the premultiplied alpha blending equation allows us to render all our particles (both additive and percentage blended) with a single draw call. We optimised our radix sorter (that is used to sort our particles and objects). It's partially vectorised and cache optimised very well.
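The reason premultiplied alpha lets additive and percentage-blended particles share one blend equation, and why front-to-back order works, can be shown with a small sketch. The exact blend states used on the Xbox 360 are not given in the interview, so the accumulation below (destination alpha holding accumulated opacity) is an assumption; the helper names are hypothetical.

```python
def premultiplied(rgb, alpha):
    """Percentage-blended particle: colour is pre-scaled by its alpha."""
    return (rgb[0] * alpha, rgb[1] * alpha, rgb[2] * alpha, alpha)


def additive(rgb):
    """Additive particle: contributes colour but blocks nothing (alpha = 0)."""
    return (rgb[0], rgb[1], rgb[2], 0.0)


def blend_front_to_back(layers):
    """Accumulate premultiplied layers, nearest first, using destination alpha."""
    acc = [0.0, 0.0, 0.0, 0.0]
    for r, g, b, a in layers:
        visible = 1.0 - acc[3]  # fraction not yet covered by nearer layers
        acc[0] += visible * r
        acc[1] += visible * g
        acc[2] += visible * b
        acc[3] += visible * a
    return tuple(acc)


def composite_over(acc, background):
    """Place the accumulated particle layer over the opaque scene colour."""
    visible = 1.0 - acc[3]
    return tuple(acc[i] + visible * background[i] for i in range(3))


def blend_back_to_front(layers, background):
    """Reference: conventional back-to-front 'over' blending."""
    colour = tuple(background)
    for r, g, b, a in reversed(layers):
        colour = (r + (1.0 - a) * colour[0],
                  g + (1.0 - a) * colour[1],
                  b + (1.0 - a) * colour[2])
    return colour
```

Once the accumulated alpha approaches 1, further layers contribute almost nothing, which is the property that makes rejecting deep overdraw (here done with stencil counting, according to the interview) a fillrate win rather than a visible loss.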
"Thousands and thousands of (manual) cache optimisations and CPU stall optimisations were introduced and we VMX128 vectorised almost every bit of code that was suitable for vectorisation."
The whole engine underwent a year-long optimisation process. The Xbox PPC CPU is an older in-order design, so it's crucial to optimise your code very well if you want to get decent performance out of it. Thousands and thousands of (manual) cache optimisations and CPU stall optimisations were introduced and we VMX128-vectorised almost every bit of code that was suitable for vectorisation. So the code is now specially optimised for the PPC processor architecture of the console.
We used the GPU memexport a bit more, as GPGPU has always been near to my heart. In Trials Evolution we do terrain foliage generation, particle processing and texture compression using the GPU. Our deferred lighting and anti-aliasing shaders use Xbox-specific GPU microcode for "warp wide" branching. This technique can be used to reduce the cost of incoherent dynamic branching (but depends on GPU warp size and is thus not available on most PC GPGPU platforms except for CUDA).
We did some research and the conclusion was that basically no native 1280x720p television sets were ever sold worldwide. 1366x768 was/is the most common HD-ready "720p" TV resolution, and 1080p sets have become much more common during the two and a half years since the launch of Trials HD. We didn't see any reason to support native 1280x720 rendering anymore, as basically all TV sets would scale the image up, and nobody would see the game unscaled at a perfect 1:1 pixel ratio. So we went slightly sub-HD (but at proper 16:9 this time) and let the high-quality Xbox 360 scaler hardware do the upscaling to the TV native resolution.
We use a modified version of FXAA. It originated from FXAA 2, but our version causes significantly less blurring to textures. We again use the Xbox-specific microcode branching trick to get extra performance out of the shader (limiting the effect to areas that have high-contrast edges). Our version runs at 0.8ms, less than five per cent of the 16.6ms frame.
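Limiting the filter to high-contrast areas can be sketched as a local luma contrast test, in the spirit of the early-exit check in the public FXAA shaders. RedLynx's modified version is not public, so the threshold values, the luma weights and the function names here are assumptions.

```python
def luma(rgb):
    """Approximate perceived brightness of an RGB colour in [0, 1]."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def needs_antialiasing(centre, north, south, east, west,
                       edge_threshold=0.125, edge_threshold_min=0.0625):
    """Return True only where local contrast is high enough to suggest an edge."""
    lumas = [luma(p) for p in (centre, north, south, east, west)]
    contrast = max(lumas) - min(lumas)
    return contrast >= max(edge_threshold_min, max(lumas) * edge_threshold)


# The quoted cost as a share of a 60 frames per second frame budget.
fxaa_share = 0.8 / 16.6  # about 0.048, i.e. just under five per cent
```

Pixels that fail the test skip the expensive part of the shader entirely, so smooth and flat-textured regions cost very little.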
I did some stereo 3D reprojection tests early in the production, but in the end we didn't have time to focus on technology features that would only benefit a narrow user base. Getting the huge game world to run at a stable 60 frames per second and fine-tuning all the new streaming techniques took all the technology programmers' time.
Physics changes weren't as radical this time. We upgraded to a newer version of Bullet Physics. Unfortunately that meant we had to write all our Xbox physics optimisations again. This time, however, we had much more knowledge of the hardware specifics and that provided better results in the end.
Physics modifications were required because the bike was moving faster in the outdoor game world, and the jumps were bigger. Bike suspension needed to be modified so that heavy impacts wouldn't cause bike parts to get stuck in the ground. In Trials Evolution, we also give users more options for controlling the physics properties of the objects. Everything from mass and buoyancy to surface friction can now be changed in the Editor.
Trials Evolution was our first Xbox Live multiplayer game, so it was naturally a learning process. Fortunately our game modes are really latency-tolerant because bikes cannot collide with each other. In Supercross each bike has its own lane, and in Trials you will see opponents as real-time ghosts. This allowed us to have perfect control precision in multiplayer as well. If the network is slow, only opponents' bikes show any signs of lag. Your own bike always plays perfectly.
Each console simulates only its own physics world. We only send an optimised visual representation over the network. We do, however, use the physics engine to do extrapolation to better predict where the opposing bikes would be, as the data received from the network connection is always slightly out of date.
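The game uses its physics engine for this prediction. As a much simpler stand-in, the general idea of extrapolating a stale network snapshot forward by the measured latency can be sketched with plain ballistic motion. The state layout, function name and gravity constant are assumptions for illustration only.

```python
def extrapolate_bike(state, latency, gravity=-9.81):
    """Predict where a remote bike is now, given a snapshot `latency` seconds old.

    state: (x, y, vx, vy) in metres and metres per second, as last received.
    Uses constant horizontal velocity and constant vertical acceleration.
    """
    x, y, vx, vy = state
    return (x + vx * latency,
            y + vy * latency + 0.5 * gravity * latency * latency,
            vx,
            vy + gravity * latency)


# A snapshot that is 100ms old: the bike was 10m up, moving forward at 5m/s.
predicted = extrapolate_bike((0.0, 10.0, 5.0, 0.0), 0.1)
```

Because the prediction only affects how opponents are drawn, any error shows up as visual lag on other riders and never touches the local player's simulation.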
Track Central is a huge thing for us. It brings "YouTube-style" content sharing, rating and searching functionality to the users. The visual scripting system allows players to basically define any game rules. We have already seen first-person shooters, 3D car racing games, top-down shooters, and tributes to games such as Pac-Man, Angry Birds and Super Mario. And the game has only been out for two days as I write this! It's basically a 3D game creation platform for the Xbox 360.
We haven't yet thought about the next Trials game. We are still very excited about the launch that just happened, and we have our hands full configuring the servers and maintaining the forums. When the smoke clears we will have a chance to focus on our next projects.
We do of course follow the trends of future technology closely. We bought DirectX 11 graphics cards for our workstations as soon as they were available, so we could do tests with new features such as DirectCompute and tessellation. GPUs nowadays have evolved into parallel computation monsters that can be used for much more than just graphics rendering. It will be interesting to see how future games will benefit from technological advances like this...