Microsoft to unlock more GPU power for Xbox One developers
Kinect and app reservations will be accessible to game-makers.
Xbox One reserves 10 per cent of graphics resources for Kinect and apps functionality, Digital Foundry can confirm, with Microsoft planning to open up this additional GPU power for game development in the future. This, along with further graphics and performance-based information, was revealed during our lengthy discussions with two of the architects behind the Xbox One silicon.
"Xbox One has a conservative 10 per cent time-sliced reservation on the GPU for system processing. This is used both for the GPGPU processing for Kinect and for the rendering of concurrent system content such as snap mode," Microsoft technical fellow Andrew Goossen told us.
"The current reservation provides strong isolation between the title and the system and simplifies game development - strong isolation means that the system workloads, which are variable, won't perturb the performance of the game rendering. In the future, we plan to open up more options to developers to access this GPU reservation time while maintaining full system functionality."
Once you get over the initial surprise that the background system takes up quite so much GPU time in the first place, the notion of being able to give developers access to this resource while not compromising functionality may sound rather like having your cake and eating it, but Microsoft points to particular aspects of the GPU hardware that make this scenario possible.
"In addition to asynchronous compute queues, the Xbox One hardware supports two concurrent render pipes," Goossen pointed out. "The two render pipes can allow the hardware to render title content at high priority while concurrently rendering system content at low priority. The GPU hardware scheduler is designed to maximise throughput and automatically fills 'holes' in the high-priority processing. This can allow the system rendering to make use of the ROPs for fill, for example, while the title is simultaneously doing synchronous compute operations on the compute units."
With Microsoft having attempted to comprehensively address questions about the ESRAM and system memory bandwidth of the architecture, the issue of the Xbox One's fill-rate and ROPs deficit compared to PlayStation 4 is now under the microscope. ROPs are the elements of the GPU that physically write the final image from pixel, vector and texel information: PlayStation 4's 32 ROPs are generally acknowledged as overkill for a 1080p resolution (the underlying architecture from AMD was never designed exclusively for full HD but for other resolutions such as 2560x1440/2560x1600 too), while Xbox One's 16 ROPs could theoretically be overwhelmed by developers.
In our interview, Microsoft revealed research it had carried out that suggested that the 6.6 per cent increase to GPU clock speed was more beneficial to the system than two additional AMD Radeon Graphics Core Next compute units. Our question was straightforward enough - were the results of these tests skewed by the code saturating the ROPs?
"We've chosen to let title developers make the trade-off of resolution vs. per-pixel quality in whatever way is most appropriate to their game content. A lower resolution generally means that there can be more quality per pixel."
"Yes, some parts of the frames may have been ROP-bound. However, in our more detailed analysis we've found that the portions of typical game content frames that are bound on ROP and not bound on bandwidth are generally quite small. The primary reason that the 6.6 per cent clock speed boost was a win over additional CUs was because it lifted all internal parts of the pipeline such as vertex rate, triangle rate, draw issue rate etc," Goossen explained.
"The goal of a 'balanced' system is by definition not to be consistently bottlenecked on any one area. In general with a balanced system there should rarely be a single bottleneck over the course of any given frame - parts of the frame can be fill-rate bound, other can be ALU bound, others can be fetch bound, others can be memory bound, others can be wave occupancy bound, others can be draw-setup bound, others can be state change bound, etc. To complicate matters further, the GPU bottlenecks can change within the course of a single draw call!"
It stands to reason, though, that having more ROPs on call is the preferable scenario, even if they remain largely unused - and that's what PlayStation 4 offers. Microsoft's pitch is that its hardware set-up wouldn't necessarily be able to make use of them even if they were there.
"The relationship between fill-rate and memory bandwidth is a good example of where balance is necessary. A high fill-rate won't help if the memory system can't sustain the bandwidth required to run at that fill-rate," said Goossen.
"For example, consider a typical game scenario where the render target is 32bpp [bits per pixel] and blending is disabled, and the depth/stencil surface is 32bpp with Z [depth] enabled. That amount to 12 bytes of bandwidth needed per pixel drawn (eight bytes write, four bytes read). At our peak fill-rate of 13.65GPixels/s that adds up to 164GB/s of real bandwidth that is needed which pretty much saturates our ESRAM bandwidth. In this case, even if we had doubled the number of ROPs, the effective fill-rate would not have changed because we would be bottlenecked on bandwidth. In other words, we balanced our ROPs to our bandwidth for our target scenarios. Keep in mind that bandwidth is also needed for vertex and texture data as well, which in our case typically comes from DDR3."
Our take on the ROPs situation is that while these figures make perfect sense, there are many other scenarios that could be potentially challenging - depth-only passes, shadows, alpha test and Z pre-pass for example. But from a user perspective, the fact is that native 1080p isn't supported on key first-party titles like Ryse and Killer Instinct. Assuming this isn't a pixel fill-rate issue as Microsoft suggests, surely at the very least, this impacts the balanced system argument?
"In the future, we plan to open up more options to developers to access this GPU [system] reservation time while maintaining full system functionality."
"We've chosen to let title developers make the trade-off of resolution vs. per-pixel quality in whatever way is most appropriate to their game content. A lower resolution generally means that there can be more quality per pixel. With a high quality scaler and anti-aliasing and render resolutions such as 720p or '900p', some games look better with more GPU processing going to each pixel than to the number of pixels; others look better at 1080p with less GPU processing per pixel," replied Goossen.
"We built Xbox One with a higher quality scaler than on Xbox 360, and added an additional display plane, to provide more freedom to developers in this area. This matter of choice was a lesson we learned from Xbox 360 where at launch we had a Technical Certification Requirement mandate that all titles had to be 720p or better with at least 2x anti-aliasing - and we later ended up eliminating that TCR as we found it was ultimately better to allow developers to make the resolution decision themselves. Game developers are naturally incented to make the highest quality visuals possible and so will choose the most appropriate trade-off between quality of each pixel vs. number of pixels for their games."
A well-placed insider with an established background in AAA multi-platform development, currently working with next-gen hardware, was rather more pragmatic in his assessment of the 1080p situation.
"We will probably see a lot of sub-1080p games (with hardware upscale), but this is probably because there is not enough time to learn the GPU when the development environment, and sometimes clock speeds, are changing underneath you," our source said, referring to the Xbox One's evolving "mono driver" and last-minute hardware tweaks.
"If a studio releases a sub-1080p game then is it because they can't make it run at 1080p? Is it because they don't possess the skills or experience in-house? Or is it a design choice to make their game run at a stable frame-rate for launch?"
We'll be publishing the entirety of our interview with the Xbox One architects this weekend, covering off topics including the Xbox 360 post-mortem, Microsoft's approach to GPU compute, the innovative approach to virtualisation, the choice of CPU architecture and much, much more. Over 7,500 words in total, and essential reading for anyone interested in the technological make-up of Microsoft's next-generation console.
Update: Just some clarification here - does the release of Kinect and app reservation add to the overall 1.31TF of GPU compute power in Xbox One? The answer there is no - that 1.31TF is the theoretical limit of the GPU before reservations. The point is that more GPU resource will be available in future to game developers by giving them access to the hitherto reserved GPU allocation. As we understand it, the PlayStation 4 also reserves some GPU time for the background system - but it's highly unlikely to be anything as high as 10 per cent.
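For a rough sense of scale, the clarification above can be put into numbers. This is a back-of-the-envelope Python sketch of our own, with hypothetical function names: it assumes the 10 per cent time-sliced reservation can be treated as a flat 10 per cent of theoretical throughput, which is a simplification, and the fraction of the reservation that Microsoft might eventually release is not something we have been told.

```python
THEORETICAL_TFLOPS = 1.31    # theoretical GPU limit before reservations
SYSTEM_RESERVATION = 0.10    # 10 per cent time-sliced system reservation


def title_share_tflops(theoretical, reservation):
    """Throughput left for the game with the reservation fully in place."""
    return theoretical * (1.0 - reservation)


def reclaimed_tflops(theoretical, reservation, fraction_released):
    """Extra throughput if a given fraction of the reservation is handed back.

    fraction_released is purely hypothetical; no figure has been announced.
    """
    return theoretical * reservation * fraction_released


current = title_share_tflops(THEORETICAL_TFLOPS, SYSTEM_RESERVATION)
upper_bound = current + reclaimed_tflops(THEORETICAL_TFLOPS, SYSTEM_RESERVATION, 1.0)

print(round(current, 3), round(upper_bound, 2))
```

Even if the whole reservation were released, the total could never exceed the 1.31TF theoretical limit, which is why the change does not add to the headline figure.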