What is NVIDIA Adaptive Shading? Demystifying The Turing Feature That Boosts FPS Up To 15%

By Lei Yang, Senior GPU Graphics Architect, on August 26, 2019

Games are pushing GPUs to render more pixels, faster, to deliver those amazing visuals we all want at ultra-high resolutions and framerates.

With the Turing architecture, we introduced support for a new feature called Variable Rate Shading (VRS), which is broadly accessible to developers because it is built into DirectX 12, Vulkan, OpenGL, and DirectX 11 (NvAPI). With VRS, portions of the screen can be rendered at different levels of detail, focusing your GPU on the most important parts of the scene.

Using VRS, we created NVIDIA Adaptive Shading (NAS), which combines two forms of VRS into one content-aware option, as seen in MachineGames’ Wolfenstein: Youngblood. In this fast first-person shooter, it can accelerate performance by up to 15%, with no perceivable visual quality loss.

 

Below, we show an image quality comparison between NAS (Performance mode) and a lower render resolution across the entire scene. Both techniques shade a similar number of pixels on screen, but the NAS image looks better because its shading rate is distributed where it matters most.

In the shading rate visualization image, blue tiles have a 2x lower shading rate, green tiles 4x lower, and yellow tiles 8x lower. The majority of tiles in this example receive an 8x reduction in shading rate.

Reference (NAS off)

NAS on (Performance preset)

Uniform 2x2 low-res rendering

Reference vs. NAS (performance) and NAS vs. uniform 2x2 low-res rendering

API Support and Usage of VRS

Today, VRS is built into major graphics APIs, including DirectX 12, Vulkan, OpenGL, and DirectX 11 (NvAPI). The diagram below shows the typical flow in engine code to enable VRS using one of these APIs.

Our Turing-based GPUs allow dividing the screen pixels into fixed-sized tiles (16x16), where each tile can apply one shading rate. To specify shading rate per tile, the engine generates a shading rate map, encoding either a static or dynamically adaptive pattern into it. The map can then be bound to the pipeline to enable variable rate shading for subsequent draws. There is no need to make other changes to the engine code, shaders or assets. This greatly simplifies the integration effort.
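To make the tile layout concrete, here is a minimal Python sketch of a static shading rate map. It only models the data on the CPU side: the rate codes, the function names, and the center-weighted pattern are assumptions made for illustration, and binding the map to the pipeline is left to whichever graphics API the engine uses.

```python
import math

TILE_SIZE = 16  # Turing VRS tile size in pixels, per the text above

# Hypothetical rate codes for this sketch only; real APIs define their own enums.
RATE_1X1, RATE_2X2, RATE_4X4 = 0, 1, 2


def tile_grid(width, height, tile=TILE_SIZE):
    """Number of tile columns and rows needed to cover a width x height target."""
    return math.ceil(width / tile), math.ceil(height / tile)


def static_center_map(width, height):
    """Build a static map: full rate near the screen center, coarser toward the edges."""
    cols, rows = tile_grid(width, height)
    cx, cy = (cols - 1) / 2, (rows - 1) / 2
    rate_map = []
    for ty in range(rows):
        row = []
        for tx in range(cols):
            # Normalized distance from the center: 0 at the center, about 1 at the edge.
            d = max(abs(tx - cx) / (cols / 2), abs(ty - cy) / (rows / 2))
            row.append(RATE_1X1 if d < 0.5 else RATE_2X2 if d < 0.8 else RATE_4X4)
        rate_map.append(row)
    return rate_map


cols, rows = tile_grid(3840, 2160)
rate_map = static_center_map(3840, 2160)
print(cols, rows, cols * rows)  # 240 135 32400
```

At 4K this gives 240 x 135 = 32,400 tiles, each of which can carry its own shading rate.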

On DirectX 12, an alternative mode is to specify shading rate per geometry primitive (triangle), or per group of geometry (draw call). This mode can be used alone or combined with per-tile shading rate through a configurable combiner function. It is also supported on our Turing architecture.
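The idea of combining shading rates from several sources can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not API code: rates are written as (width, height) coarse-pixel sizes, the mode names are illustrative, and the two-stage order (per-draw with per-primitive first, then the result with the per-tile rate) is an assumption of this sketch.

```python
def combine(a, b, mode):
    """Combine two shading rates, each a (width, height) coarse-pixel size."""
    if mode == "passthrough":   # ignore b, keep a
        return a
    if mode == "override":      # ignore a, take b
        return b
    if mode == "min":           # finer of the two, per axis
        return (min(a[0], b[0]), min(a[1], b[1]))
    if mode == "max":           # coarser of the two, per axis
        return (max(a[0], b[0]), max(a[1], b[1]))
    raise ValueError(f"unknown combiner mode: {mode}")


def final_rate(per_draw, per_primitive, per_tile, combiners):
    """Two-stage chain: (draw, primitive) first, then that result with the tile rate."""
    stage1 = combine(per_draw, per_primitive, combiners[0])
    return combine(stage1, per_tile, combiners[1])


# A draw call requests 2x2, its primitives request 1x1,
# and the shading rate map allows 2x4 for this tile.
print(final_rate((2, 2), (1, 1), (2, 4), ("passthrough", "max")))  # (2, 4)
print(final_rate((2, 2), (1, 1), (2, 4), ("passthrough", "min")))  # (2, 2)
```

Choosing "max" lets the coarsest request win, which favors performance, while "min" lets any source force finer shading, which favors quality.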

The image below shows the range of shading rate options supported on Turing. For illustration purposes, the images are shown without antialiasing applied. In practice, VRS is fully compatible with MSAA and with other antialiasing techniques such as Temporal AA. Note that the thin black lines are geometry features, which are always sampled at full resolution when VRS is enabled. Only details in texture and lighting are reduced.

Content Adaptive Shading

NAS comprises two adaptive shading techniques: Content Adaptive Shading and Motion Adaptive Shading.

These techniques intelligently reduce the shading rate without affecting image quality. In a nutshell, Content Adaptive Shading predicts the impact of reduced-rate shading, and only applies it where the result is expected to be imperceptible. Thanks to the VRS feature on Turing GPUs, the shading rate can vary independently on each 16x16 screen tile, allowing a fine-grained distribution of the shading budget over the tens of thousands of tiles on the entire screen.

Generally speaking, the less contrast and variation the objects in the scene have, the lower the shading rate can be without reducing visual quality. To give gamers the freedom to customize their experience, options are available to fine-tune the result to their liking.

To determine a safe shading rate for each image tile before rasterizing the current frame, we reproject the tile location to the previous frame and analyze the content in the tile footprint there. More precisely, we estimate the quality loss that each shading rate mode would cause, and use a threshold to find the lowest shading rate that satisfies the quality requirement. To keep this analysis cheap, we designed an algorithm that efficiently estimates the loss of multiple shading rate modes in a single pass. Details can be found in our I3D 2019 paper.
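The estimate-then-threshold step can be illustrated with a small Python sketch. The error metric below is a deliberately crude stand-in (it reuses the top-left pixel of each coarse block), not the estimator from the I3D 2019 paper, and the function names, candidate list, and threshold value are assumptions for illustration.

```python
def estimated_loss(tile, step_x, step_y):
    """Mean absolute error if one shaded value is reused for each step_x x step_y block.

    `tile` is a list of rows of luminance values taken from the previous frame.
    Each block is approximated by its top-left pixel (a crude stand-in for coarse shading).
    """
    h, w = len(tile), len(tile[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            ref = tile[y - y % step_y][x - x % step_x]
            total += abs(tile[y][x] - ref)
    return total / (w * h)


# Candidate rates from coarsest to finest: (block width, block height).
CANDIDATES = [(4, 2), (2, 4), (2, 2), (2, 1), (1, 2), (1, 1)]


def pick_rate(tile, threshold):
    """Return the coarsest candidate whose estimated loss stays within the threshold."""
    for step_x, step_y in CANDIDATES:
        if estimated_loss(tile, step_x, step_y) <= threshold:
            return (step_x, step_y)
    return (1, 1)


flat = [[0.5] * 16 for _ in range(16)]                            # no detail at all
stripes = [[float(x % 2) for x in range(16)] for _ in range(16)]  # detail along X only

print(pick_rate(flat, 0.1))     # (4, 2)
print(pick_rate(stripes, 0.1))  # (1, 2)
```

The flat tile tolerates the coarsest rate, while the striped tile keeps full rate along X and drops to half rate only along Y, where it has no detail.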

Care must be taken when estimating and thresholding the loss to determine a safe shading rate. In the human visual system, sensitivity to image differences depends not only on the magnitude of the predicted loss, but also on the statistics of the region, such as the local luminance level. Our algorithm takes these extra factors into account in order to distribute shading resources fairly, based on the true visibility of the error. This process also exposes two parameters that users can fine-tune based on their visual error tolerance.
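One plausible way to make a threshold depend on local luminance is sketched below in Python. The formula and both parameter names (`sensitivity`, `base_luminance`) are assumptions chosen for illustration; they stand in for the two user parameters mentioned above, whose actual definitions are not given in this article.

```python
def visibility_threshold(mean_luminance, sensitivity=0.15, base_luminance=0.1):
    """Error tolerance that grows with local brightness (a Weber-style assumption)."""
    return sensitivity * (mean_luminance + base_luminance)


def is_rate_safe(predicted_loss, mean_luminance, **params):
    """True if the predicted loss is expected to stay below the visibility threshold."""
    return predicted_loss <= visibility_threshold(mean_luminance, **params)


# The same predicted loss is tolerated in a bright region but not in a dark one.
print(is_rate_safe(0.02, mean_luminance=0.80))  # True
print(is_rate_safe(0.02, mean_luminance=0.02))  # False
```

Raising `sensitivity` in this sketch permits coarser shading everywhere, while `base_luminance` controls how much error is tolerated in very dark regions.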

Motion Adaptive Shading

Motion Adaptive Shading takes advantage of motion blur-type effects to reduce the shading rate while the player is moving and turning in-game.

One inspiration for Motion Adaptive Shading is a phenomenon called LCD persistence blur. The animation below compares how we see a moving object in the real world versus on the screen. Because the display can only refresh at fixed intervals, the displayed object moves in discrete jumps. Our eyes, however, still try to follow the object in a smooth linear motion. As a result, the image shakes rapidly on the retina, and we perceive it as blurry. Aside from display blur, many game engines also add motion blur to the rendering, which compounds the blurring effect.
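As a rough rule of thumb, the width of the persistence blur is the distance the tracked object travels while a single frame stays lit. The Python sketch below applies that relationship; the function and parameter names are illustrative, and `persistence_fraction` (the share of the refresh interval during which the frame is visible) is an assumption of this sketch.

```python
def persistence_blur_px(speed_px_per_s, refresh_hz, persistence_fraction=1.0):
    """Approximate width of the smear on the retina, in pixels.

    While the eye tracks smoothly, each frame stays lit for
    persistence_fraction / refresh_hz seconds, so the image slides
    across the retina by speed * hold time.
    """
    return speed_px_per_s * persistence_fraction / refresh_hz


print(persistence_blur_px(960, 60))   # 16.0
print(persistence_blur_px(960, 120))  # 8.0
```

An object crossing a quarter of a 4K screen per second (960 pixels per second) is smeared over about 16 pixels at 60 Hz under this model, which is wide enough to hide the difference between neighboring shading rates.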

LCD persistence blur and motion blur hide image imperfections caused by lowered shading rates. Below, the three images on the left show full-, half-, and quarter-rate shading in the X direction without motion. After adding a mild motion blur (middle column), the half- and full-rate images look almost identical. With an even wider blur (right column), all three images look the same, except for a few high-contrast edges.

Notice that the blur does not blindly hide every level of image imperfection, but instead scales the loss down proportionally. We use signal processing theory and run simulations to make sure we apply the best scaling factors at every possible motion speed.

So, with Motion Adaptive Shading (MAS), we can reduce the shading rate of blurred game elements without your eyes noticing any difference, giving you faster performance.
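A toy version of such a simulation can be written in Python. This is a one-dimensional sketch under simplifying assumptions, not the simulation we actually ran: half-rate shading is approximated by averaging each pixel pair, and blur is modeled as a circular box filter whose width stands in for motion speed.

```python
def half_rate(signal):
    """Shade once per pair of pixels; approximate the coarse sample by the pair average."""
    out = []
    for i in range(0, len(signal), 2):
        pair = signal[i:i + 2]
        out.extend([sum(pair) / len(pair)] * len(pair))
    return out


def box_blur(signal, width):
    """Circular box filter of `width` pixels, a simple model of motion/persistence blur."""
    n = len(signal)
    return [sum(signal[(i + k) % n] for k in range(width)) / width for i in range(n)]


def visible_error(signal, blur_width):
    """Mean absolute difference between full-rate and half-rate images after blurring."""
    full = box_blur(signal, blur_width)
    coarse = box_blur(half_rate(signal), blur_width)
    return sum(abs(a - b) for a, b in zip(full, coarse)) / len(signal)


texture = [0.0, 1.0, 1.0, 0.0] * 16  # 64-pixel scanline with fine detail
for width in (1, 2, 4):
    print(width, visible_error(texture, width))
# 1 0.5
# 2 0.25
# 4 0.0
```

For this test pattern, the error left after blurring shrinks as the blur widens, which is the scaling behavior described above. Real content has a richer spectrum, so the attenuation is gradual rather than reaching exactly zero.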

Quality and Performance

NAS in MachineGames’ Wolfenstein: Youngblood offers three quality presets: Quality, Balanced, and Performance. Each has different trade-offs in terms of quality and performance.

In general, there is hardly any perceptible quality loss when using Balanced or Quality. With the more aggressive Performance mode, a trained eye may see differences in a few spots, but most likely only in static image comparisons.

With a debug visualization enabled, we demonstrate the distribution of shading rates used by the three presets. In this visualization, tiles that are not colored receive full-rate shading. Blue tiles have a 2x lower shading rate, green tiles 4x lower, and yellow tiles 8x lower. With the more aggressive Performance mode, the majority of screen tiles benefit from an 8x reduction in shading cost.

The table below shows frame rates (in FPS) and the percentage improvement over NAS Off in two scenes. All numbers are measured in stationary scenes, rendering at 4K (3840x2160) on a GeForce RTX 2080 Ti.
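To relate a shading rate distribution to the amount of pixel shading work, the Python sketch below weights each reduction factor by the fraction of tiles that use it. The example distribution is hypothetical, not measured from the game, and the function name is illustrative.

```python
def relative_shading_cost(tile_fractions):
    """Fraction of full-rate shading work, given {reduction_factor: fraction_of_tiles}."""
    if abs(sum(tile_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("tile fractions must sum to 1")
    return sum(frac / factor for factor, frac in tile_fractions.items())


# Hypothetical distribution: 10% full rate, 10% at 2x, 20% at 4x, 60% at 8x reduction.
example = {1: 0.10, 2: 0.10, 4: 0.20, 8: 0.60}
print(round(relative_shading_cost(example), 3))  # 0.275

# Uniform 2x2 low-resolution rendering shades one sample per four pixels everywhere.
print(relative_shading_cost({4: 1.0}))  # 0.25
```

Under these assumed numbers, the adaptive distribution and the uniform 2x2 rendering perform a similar amount of shading. Note that this counts only shading invocations in passes where VRS is active, which is just one part of total frame time.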

Scene      NAS Off   Quality        Balanced       Performance
Garage     98.4      104.4 (+6%)    107.2 (+9%)    110.8 (+13%)
Interior   98.7      106.9 (+8%)    110.7 (+12%)   115.5 (+17%)
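The percentages in the table follow directly from the measured frame rates. The short Python sketch below recomputes them and also converts each gain into frame time saved, which is a derived figure not reported in the table; the function names are illustrative.

```python
FPS = {
    "Garage":   {"NAS Off": 98.4, "Quality": 104.4, "Balanced": 107.2, "Performance": 110.8},
    "Interior": {"NAS Off": 98.7, "Quality": 106.9, "Balanced": 110.7, "Performance": 115.5},
}


def gain_percent(fps_on, fps_off):
    """Frame rate improvement over the baseline, rounded to a whole percent."""
    return round((fps_on / fps_off - 1.0) * 100.0)


def frame_time_saved_ms(fps_on, fps_off):
    """Milliseconds saved per frame relative to the baseline."""
    return 1000.0 / fps_off - 1000.0 / fps_on


for scene, row in FPS.items():
    base = row["NAS Off"]
    for preset in ("Quality", "Balanced", "Performance"):
        print(scene, preset, f"+{gain_percent(row[preset], base)}%",
              f"{frame_time_saved_ms(row[preset], base):.2f} ms saved per frame")
```

Thinking in milliseconds is often more useful when budgeting a frame, because time saved by shading can be spent directly on other rendering work.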

More images and detailed comparisons can be found in our comprehensive Wolfenstein: Youngblood Graphics and Performance Guide.

Wrapping Up

NAS can be enabled in modern game engines to take advantage of the VRS feature on Turing GPUs, giving gamers higher FPS with barely any visual quality loss. If you are interested in learning more about the technical details of NAS and VRS, check out our GDC presentation and I3D 2019 paper.
