Accelerating The Real-Time Ray Tracing Ecosystem: DXR For GeForce RTX and GeForce GTX

By Andrew Burnes on March 18, 2019 | Featured Stories GDC 2019 GeForce GTX GPUs NVIDIA RTX Ray Tracing

From engine updates to exciting games to developer tools, NVIDIA and its partners are making a number of ray tracing announcements at GDC 2019 to drive the ecosystem forward around this exciting new technology.

For decades, NVIDIA has been working towards the dream of real-time videogame ray tracing. It required millions of hours of research and development, focusing on everything from GPU hardware and software, to updated APIs and game engines, to development tools and denoisers. In 2018, all that hard work came to fruition with the launch of GeForce RTX GPUs, the world’s first consumer graphics cards with dedicated RT Core ray tracing hardware, enabling realistic ray-traced effects to run in real-time in high-fidelity and at high resolutions.

In the time since, our software and developer teams have kept working, allowing us to optimize our ray tracing technology, make new software advancements, and help developers further accelerate ray tracing performance in games. Because of this work, we have dramatically sped up ray tracing performance for GeForce RTX GPUs, and can now enable DirectX Raytracing ( DXR ) on GeForce GTX 1060 6GB and higher graphics cards via a Game Ready Driver update, expected in April.

The much larger install base of RT-capable GPUs will fuel developer adoption of ray tracing technology, bringing more games for both GeForce RTX and GeForce GTX users to experience. GeForce GTX gamers will have an opportunity to use ray tracing at lower RT quality settings and resolutions, while GeForce RTX users will experience up to 2-3x faster performance thanks to the dedicated RT Cores on their GPUs, enabling the use of higher-quality settings and resolutions at higher framerates.

Breaking Down Ray Tracing Performance

Ray tracing introduces several new workloads for the GPU to perform. The first is determining which triangle in the game scene a ray will intersect. A computationally-intensive technique called Bounding Volume Hierarchy, or BVH, is used to calculate this. After the rays are calculated, a denoising algorithm is applied to improve the visual quality of the resulting image, so that fewer total rays can be cast, allowing the process to be possible in real-time.

RT Cores on GeForce RTX GPUs provide dedicated hardware to accelerate BVH and ray / triangle intersection calculations, dramatically accelerating ray tracing. On GeForce GTX hardware, these calculations are performed on the shader cores, a resource shared with many other graphics functions of the GPU.

The Turing architecture that GeForce RTX GPUs use was designed from the start for DXR -type workloads. Pascal, on the other hand, was launched in 2016 and was designed for DirectX 12.

To see what this means in practice, let’s examine one frame of gameplay from Metro Exodus, which features ray traced global illumination:

The graphs look at GPU utilization on Pascal, Turing with RT cores disabled (via a special software setting to show GeForce RTX 2080 performance without the use of RT Cores), and Turing with RT cores and DLSS enabled.

On Pascal-architecture GPUs we see that ray tracing and all other graphics rendering tasks are handled by FP32 Pascal shader cores. This takes longer to perform, translating to a lower FPS.

The Turing architecture introduced INT32 Cores that operate simultaneously alongside FP32 Cores. These are found on GeForce RTX graphics cards, and GeForce GTX 1660 Ti and GTX 1660 graphics cards, helping dramatically reduce frame time.

When we enable the dedicated RT Cores on GeForce RTX GPUs, a substantial load is lifted from the shader cores, further improving performance. And when we examine just the ray tracing subset of the graphics workload, RTX 2080 is upwards of 10x faster.

When we factor in DLSS, an AI-accelerated technology powered by Turing’s Tensor Cores, Metro Exodus’ framerate is roughly 3x faster on the GeForce RTX 2080 than on the fastest Pascal-architecture consumer graphics card, GeForce GTX 1080 Ti.

Below you can see performance across several ray tracing titles, run on Pascal (GTX 1080 Ti) and Turing (RTX 2080) at maximum in-game quality and ray tracing settings, at 2560x1440. The charts show the gains in FPS broken down by each component of the Turing architecture -- shader core improvements such as Concurrent Floating Point and Integer Operations, RT Cores, and Tensor Cores (DLSS). And to show RTX 2080 performance both with and without RT Cores we’ve used special software settings.

Metro Exodus uses real-time ray-traced global illumination to render dynamic scenes with more-realistic indirect lighting that updates in real-time as lighting changes and events occur in the game world. In other words, the light bounces naturally, illuminating and brightening surrounding detail. And if the sun moves or window shades open, the room’s lighting realistically changes, presenting the room in an entirely new light.

This process requires a ray be cast for each pixel, making ray-traced global illumination far more performance-intensive than the rendering of simpler effects, such as limited ray-traced reflections. In this case, RT cores provide a big boost, with RTX 2080 performing up to 3x faster than GTX 1080 Ti.

Shadow of the Tomb Raider is implementing ray-traced shadows in an upcoming patch, enhancing existing shadows, and introducing new, immersive, real-time shadows that are beyond the capabilities of traditional rasterization techniques. In this instance, RTX 2080 performs up to 2x faster than GTX 1080 Ti.

BattlefieldTM V’s DXR reflections are applied exclusively to reflective surfaces, such as water, glass, mirrors, and metal. By using this approach, fewer rays are needed, enabling faster performance across GPUs.

3DMark Port Royal employs ray tracing for both reflections and shadows. Its use of ray tracing for multiple effects places a significantly-heavier RT workload on the GPU, necessitating the use of RT Cores to run the benchmark at over 30 FPS.

As seen from the charts above, ray tracing performance varies from game to game and effect to effect. In Battlefield V, ray traced reflections are computationally less demanding than the global illumination effects in Metro Exodus and shadow effects in Shadow of the Tomb Raider. If a game utilizes multiple ray tracing effects, as in Atomic Heart and 3DMark Port Royal, specialized RT Cores result in larger performance benefits.

Driver Support for GeForce GTX: Coming Soon

When DXR is enabled by a Game Ready Driver, targeted for April, the supported GeForce GTX graphics cards will work without game updates because ray-traced games are built on DirectX 12’s DirectX Raytracing API, DXR. This industry standard API uses compute-like ray tracing workloads that are compatible with both dedicated hardware units, such as the RT Cores, and the GPU’s general purpose shader cores.

Laptops with equivalent Pascal and Turing GPUs will also be enabled

Note that only GeForce GTX GPUs with sufficient performance and memory for basic ray tracing capabilities are being enabled by the update, as without specialized hardware such as the RT Cores, ray tracing workloads run concurrently with all other graphics rendering tasks, putting substantial load on the GPU.

Real-time ray tracing has reinvented graphics. And now, engine support is available in Unreal Engine and Unity, exciting content is shipping, several announcements are being made at this week’s GDC, and there’s far more in the works; the momentum around real-time ray tracing continues to grow.

Stay tuned to for news of the driver’s release, check out our other GDC 2019 articles for more gaming announcements, and if you have questions and feedback, please provide it here and it will be routed to our team.