Tech Analysis: Metal Gear Solid 5's FOX Engine
Kojima's quest for photo-realism.
At last week's GDC, Kojima Productions revealed its fifth entry in the Metal Gear Solid series, The Phantom Pain, also incorporating the previously announced MGS title Ground Zeroes. Precisely how the two fit together is currently uncertain, although Hideo Kojima himself has suggested on Twitter that Ground Zeroes acts as a prologue to The Phantom Pain and that the two are bridged by a period where the game's protagonist is left in a coma. Then again, Kojima is a master of misdirection, so who knows? The one link between the two we are sure of, however, is that they are both running on the ambitious new FOX Engine, initially unveiled with Ground Zeroes last year, and that the game is set to feature a radical overhaul of the traditional MGS gameplay set-up: Snake's new adventures take place in an ambitious series of open-world environments.
The trailer and gameplay demonstration for The Phantom Pain were followed by an extensive presentation which cut right into the heart of this new engine. Titled "Photo-realism through the eyes of a fox", the statement of intent is ambitious: Kojima is searching for ultra life-like imagery, the Holy Grail of graphics technology. But it's important to stress that it's not just about the rendering. His games are defined by the convergence and balance of both art and technology.
"The more that technology evolves, the more we have to understand our physical surroundings," he said. "To make a quality product the artist's eye is essential. Simply recreating reality would result in a traced image. This is what we mean by photo-realism through the eyes of a fox."
Looking at the footage displayed at GDC and the workflow revealed by his team, Kojima's FOX may not be that far off the scent: the engine's approach to rendering, lighting and asset creation is highly innovative and exciting, to say the least, but it's going to be the application of the tech by the team that truly defines the experience.
In the event, the GDC presentation concentrated much more on the technical side of the equation. It's high-end stuff that perhaps sailed over the heads of many of the GDC attendees and live-stream viewers, so the aim of this article is pretty straightforward: we're going to break down that information, detail the specific areas the developers are targeting, and try to figure out what Kojima is smoking - and whether we want some.
The foundation of MGS5's entire approach to photo-realism is its use of deferred rendering. While deferred rendering was initially seldom used due to a variety of limitations it brought with it - such as problems rendering transparent surfaces and incompatibility with traditional anti-aliasing solutions - it has gradually grown in popularity over the lifespan of the current generation of consoles. Crytek shifted CryEngine 3 to a deferred model in order to enable it to run on consoles, but the technique has been used on a multitude of titles in various forms, starting with Xbox 360 launch title Perfect Dark Zero, encompassing other games like the Killzone series, Uncharted, GTA4 and a great many modern releases including BioShock Infinite and Tomb Raider.
Deferred rendering differs from traditional (forward) rendering in that it adds an extra step into the rendering process between the gathering of information about how materials, shading and so forth should be positioned, and the actual rendering of these features in the environment. During that extra step the information is stored in textures known as the geometry buffer (G-buffer), called forth as and when the system requires.
The most important effect of deferred rendering is that it separates the rendering of geometry from the application of lighting. Traditional forward renderers can take different forms - typically single or multi-pass. Single-pass has issues with a potentially huge number of shader permutations (different lights, different materials all combining differently). In multi-pass, an object is rendered once for every light that affects it. So ten lights equals ten renders, which makes lighting computationally expensive. In deferred rendering, lights are rendered as geometry, which then call the lighting information cast onto objects (colour, depth etc) from the G-buffer, meaning objects only have to be rendered once. Thus deferred rendering allows for a much greater number of lights at a much lower cost to performance.
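To make the cost argument concrete, here is a minimal Python sketch of the two approaches. It is a toy model under our own assumptions, not FOX Engine code: the names (Light, build_gbuffer, lighting_pass), the flat grey floor and the linear falloff are all hypothetical, chosen only to show objects being written once into a G-buffer and lights being applied afterwards.

```python
from dataclasses import dataclass


@dataclass
class Light:
    # Hypothetical 2D point light, used for illustration only.
    x: float
    y: float
    radius: float
    intensity: float


def forward_multipass_draws(num_objects, lights_per_object):
    """Multi-pass forward: every object is re-drawn once per light touching it."""
    return num_objects * lights_per_object


def deferred_draws(num_objects, num_lights):
    """Deferred: each object is drawn once into the G-buffer,
    then each light is drawn once as its own piece of geometry."""
    return num_objects + num_lights


def build_gbuffer(width, height):
    """Geometry pass: store per-pixel surface data (here a flat grey floor)."""
    return [
        [{"albedo": 0.5, "pos": (x + 0.5, y + 0.5)} for x in range(width)]
        for y in range(height)
    ]


def lighting_pass(gbuffer, lights):
    """Lighting pass: read stored surface data and accumulate every light.
    No scene geometry is touched again at this stage."""
    image = []
    for row in gbuffer:
        out_row = []
        for texel in row:
            px, py = texel["pos"]
            total = 0.0
            for light in lights:
                dist = ((px - light.x) ** 2 + (py - light.y) ** 2) ** 0.5
                if dist < light.radius:
                    # Simple linear falloff, an assumption made for brevity.
                    falloff = 1.0 - dist / light.radius
                    total += texel["albedo"] * light.intensity * falloff
            out_row.append(total)
        image.append(out_row)
    return image


if __name__ == "__main__":
    # 100 objects, each touched by 10 lights.
    print(forward_multipass_draws(100, 10))
    print(deferred_draws(100, 10))
```

In this toy model the forward path grows with objects multiplied by lights, while the deferred path grows with objects plus lights, which is the saving the paragraph above describes.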
For MGS5 this is absolutely vital, because its photo-realistic strategy relies largely on lighting. FOX uses a lighting technique known as linear-space lighting. In a way, its implementation is analogous to deferred rendering in that it takes a slightly longer route through the graphics pipeline in order to produce an effect which has significant advantages over its traditional counterpart.
In a nutshell, linear-space lighting takes into account the higher-than-natural light/dark contrast of the average monitor screen, and provides gamma correction to create more natural-looking lights and shadows. The hospital demo of MGS5 demonstrates this clearly: because the environment is fairly consistent in terms of both colour and brightness, the subtlety of FOX's shadowing is given centre-stage. The resultant lighting actually looks a little scaled back from many current-gen games because it isn't trying to conceal a lack of detail with in-your-face bloom or depth-of-field effects.
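A short Python sketch shows the idea of working in linear space. It assumes the standard sRGB transfer curve as the monitor encoding; the presentation did not say which curve FOX uses, and the two shade functions are hypothetical helpers written for this comparison.

```python
def srgb_to_linear(c):
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Encode a linear-light value in [0, 1] for display on an sRGB monitor."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * (c ** (1.0 / 2.4)) - 0.055


def shade_gamma_space(texture_value, light):
    """Naive approach: scale the encoded texture value directly."""
    return min(texture_value * light, 1.0)


def shade_linear_space(texture_value, light):
    """Linear-space approach: decode, apply the light, then re-encode."""
    linear = srgb_to_linear(texture_value) * light
    return linear_to_srgb(min(linear, 1.0))


if __name__ == "__main__":
    # A mid-grey texel lit at half intensity, shaded both ways.
    print(shade_gamma_space(0.5, 0.5))
    print(shade_linear_space(0.5, 0.5))
```

The naive path darkens the texel more than the linear path does, which is the kind of exaggerated contrast that gamma-correct lighting is meant to avoid.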
While linear-space lighting is largely responsible for the game's more subtle visual quality, there are many sub-components which go into the final lighting render. Some of these have a general effect: light attenuation determines the intensity of light given its focus on and distance from an object. Meanwhile, skylight in outdoor environments accurately simulates atmospheric scattering, a complicated task that affects both how the sky is lit and how that light affects the environment at ground level.
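Attenuation is easy to sketch in Python. The version below assumes inverse-square falloff with distance and a smooth spotlight cone for the "focus" part; neither was confirmed as FOX's model in the presentation, and every function name and default cone angle here is our own invention.

```python
import math


def distance_attenuation(distance, min_distance=0.01):
    """Inverse-square falloff, clamped so it never divides by zero."""
    return 1.0 / max(distance, min_distance) ** 2


def spot_attenuation(cos_angle, cos_inner, cos_outer):
    """1 inside the inner cone, 0 outside the outer cone, smooth in between."""
    t = (cos_angle - cos_outer) / (cos_inner - cos_outer)
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)  # smoothstep


def light_intensity(power, distance, angle_deg, inner_deg=20.0, outer_deg=30.0):
    """Intensity reaching a point at the given distance and angle off the
    spotlight's axis. The cone angles are illustrative defaults."""
    cos_angle = math.cos(math.radians(angle_deg))
    cone = spot_attenuation(
        cos_angle,
        math.cos(math.radians(inner_deg)),
        math.cos(math.radians(outer_deg)),
    )
    return power * distance_attenuation(distance) * cone


if __name__ == "__main__":
    for d in (1.0, 2.0, 4.0):
        print(d, light_intensity(100.0, d, 0.0))
```

Doubling the distance quarters the intensity, and a point outside the outer cone receives nothing at all.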
Other lighting techniques are geared toward how light reacts with certain surfaces. With MGS5 the emphasis seems to be on the nuances of non-reflective surfaces, a continuation of a trend in graphics technology that began with screen-space ambient occlusion (SSAO) and has lately resulted in technologies like Crytek's modified Single Bounce GI solution which simulates light on glossy surfaces. The FOX Engine's shaders support translucency, important for accurate simulation of light's reaction with soft surfaces such as skin, hair and cloth.
'Roughness' is a term that peppers the presentation. The roughness of a surface in FOX can be tweaked to give it a slick, damp look, useful when creating a rainy scene to minimise complex and power-hungry liquid simulation. Most interesting, however, is view-dependent roughness - a brand new feature in FOX that affects the reflectiveness of a surface depending upon the angle from which it is viewed. So a plastered wall appears brighter and more reflective at the far end where the viewing angle is narrow than at the near end where the angle is wider and the wall's contours more visible. This sounds like a small and insignificant detail, but the resulting effect is really quite striking, especially in the hospital demo where there are lots of long corridors being directly lit by overhead strip lights.
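The presentation did not spell out the maths behind view-dependent roughness, but the general effect of surfaces turning more reflective at glancing angles can be sketched with Schlick's approximation of Fresnel reflectance. This is a swapped-in, well-known technique, not FOX's confirmed method, and the base reflectance of 0.04 is simply a value commonly used for non-metals.

```python
import math


def schlick_fresnel(cos_theta, f0=0.04):
    """Schlick's approximation: reflectance rises from f0 when viewed head-on
    towards 1.0 at grazing angles. cos_theta is the cosine of the angle
    between the view direction and the surface normal."""
    cos_theta = max(0.0, min(1.0, cos_theta))
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5


def reflectance_at_angle(angle_from_normal_deg, f0=0.04):
    """Convenience wrapper taking the viewing angle in degrees."""
    return schlick_fresnel(math.cos(math.radians(angle_from_normal_deg)), f0)


if __name__ == "__main__":
    # Near end of a corridor wall (steeper view) versus far end (glancing view).
    for angle in (0, 30, 60, 85):
        print(angle, reflectance_at_angle(angle))
```

Reflectance stays low when the wall is viewed close to face-on and climbs sharply as the view becomes more glancing, which matches the corridor example described above.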
In the same way that deferred rendering acts as a foundation for linear-space lighting, linear-space lighting in turn forms the support structure for the third and final strand of the FOX Engine's push toward photo-realism, which Kojima Productions has termed "physically-based rendering". While this may sound like heavyweight graphical terminology, all it means is rendering textures, models and materials using as much real-world data as is physically possible.
Don't let the simple explanation fool you, however. The practical implementation of this method of rendering involves a huge amount of work. Kojima Productions is taking the 'photo' aspect of photo-realism very seriously. Wherever possible the development team is using 3D photo capture, laser capture and motion capture to create assets that are as detailed and realistic as possible. In-game objects are photographed in real life from a wide range of angles, and then those photographs are compiled into a 3D model using a program called Photoscan.
Textures, meanwhile, are photographed with high exposure in order to preserve their linear light information, which gives a more accurate representation of how the human eye sees objects as opposed to how a camera lens sees objects, which if we take the term literally is actually beyond photo-realism. These textures are then cleaned up by the studio's artists and imported into programs such as Marvelous Designer, a complex clothing design tool used to build character attire in layers, including accurate and malleable rendering of creases, folds, stitching and so forth.
That is why linear-space lighting is so important. Using assets so heavily based in reality requires lighting which is simulated realistically, because otherwise these textures would look completely out of place. The problem arises in the diffuse and specular reflections of light off surfaces. Specular reflection occurs when light hits a very smooth surface, such as a mirror, and bounces off in a single direction, resulting in a near-perfect reflection. Diffuse reflection, on the other hand, happens when light hits a rougher surface like a brick wall and is scattered, or "diffused", in various directions. The balance between the two determines whether a surface looks glossy or matte.
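The two kinds of reflection can be sketched with a pair of textbook shading terms: Lambert for diffuse and Blinn-Phong for specular. We are assuming these classic models purely for illustration; the presentation did not name the formulas FOX uses, and the helper names and the shininess exponent are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lambert_diffuse(normal, to_light):
    """Diffuse term: depends only on the angle between surface and light,
    so it looks the same from every viewpoint."""
    return max(0.0, dot(normalize(normal), normalize(to_light)))


def blinn_phong_specular(normal, to_light, to_viewer, shininess):
    """Specular term: peaks when the viewer sits in the mirror direction.
    Higher shininess gives a tighter, glossier highlight."""
    n = normalize(normal)
    l = normalize(to_light)
    v = normalize(to_viewer)
    if dot(n, l) <= 0.0:
        return 0.0  # light is behind the surface
    half = tuple(a + b for a, b in zip(l, v))
    if all(abs(c) < 1e-12 for c in half):
        return 0.0  # light and viewer exactly opposed, no half vector
    return max(0.0, dot(n, normalize(half))) ** shininess


if __name__ == "__main__":
    up = (0.0, 0.0, 1.0)
    light = (1.0, 0.0, 1.0)
    # Viewer in the mirror direction, then viewer next to the light.
    print(blinn_phong_specular(up, light, (-1.0, 0.0, 1.0), 64))
    print(blinn_phong_specular(up, light, (1.0, 0.0, 1.0), 64))
    print(lambert_diffuse(up, light))
```

The specular term collapses as soon as the viewer leaves the mirror direction, while the diffuse term ignores the viewer entirely.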
The more realistic the surface, the more complex the diffuse and specular maps become. Of course, this in turn is where deferred rendering comes into its own. The FOX Engine stores the diffuse and specular information separately in the G-buffer, then combines that information along with other lighting parameters such as the types of light being used (e.g. ambient light or sunlight) to produce the final rendering without having to render that information over and over.
Clearly Kojima's claim about chasing photo-realism isn't an idle one. There's a gameplan here, and it is very well thought out. At the same time, it's important to emphasise this is still very early days and there are an awful lot of questions yet to be answered. The demonstration was played on PC, but the game has thus far been announced for X360, PS3 and PS4, with no official confirmation of a PC version. Hence we don't know how the console versions compare, and what compromises, if any, will have to be made for them. The deferred rendering solution suggests the gap between the PC footage and the consoles won't be cavernous, but the gap between PC and current consoles has been growing and it is only going to widen further. Previous demonstrations of FOX have run on a PC said to be equivalent to current-gen console spec, so there's reason for optimism here.
Furthermore, we equally have no idea if such scrupulous attention to detail in every texture of every rock and tree and wall and building can be applied to what Kojima himself describes as an open-world game - especially bearing in mind the RAM and streaming limitations of the current-gen platforms. It's one thing to create a staggeringly realistic render of a sparsely furnished conference room and clinically clean hospital corridor. It's quite another to apply that same pinpoint philosophy to a large, diverse and detailed world. That said, although Kojima may be an acquired taste when it comes to things like storytelling, the technology behind the Metal Gear Solid games has rarely failed to deliver on its promises. If Kojima is targeting the PC as a launch platform this time around - the first time he will have done so - this completely mad shot for the moon may well land close to the mark.