Graphics Reinvented: Ray Tracing, AI, and Advanced Shading Deliver A Whole New Way To Experience Games

GeForce RTX graphics cards are the fastest ever made, delivering 4K 60 FPS experiences in today’s games. In addition, they include new forward-thinking technologies that enable game-changing techniques that will increase visual fidelity, introduce lifelike graphics, enable new experiences and further increase performance.

Broadly speaking, these techniques fall under three pillars: NVIDIA RTX Ray Tracing, Artificial Intelligence, and Advanced Shaders. Below, we’ll walk you through everything gaming-related made possible by the new Turing architecture, and where applicable, explain how they improve upon today’s industry-standard techniques.

Real-Time Ray Tracing For Games

Ray-tracing has been the holy grail of graphics for decades, because it is uniquely capable of delivering lifelike fidelity to everything happening on-screen. The only problem is performance: millions of ray tracing operations are needed per frame for high-quality results, and each ray-tracing operation has to calculate what was hit and how it affects the appearance of the scene.

These calculations hammer CPUs, to the extent that entire server farms of processing power are required for the creation of complex effects and virtual worlds, the likes of which you see every day in TV shows and movies (in the highest-quality productions, short snippets and scenes can take days to render).

Now though, we’ve created the GPU-accelerated NVIDIA RTX Platform, and have launched our GeForce RTX and Quadro RTX GPUs with dedicated RT Cores, rapidly accelerating the creation of Hollywood’s ray-traced VFX shots and worlds. But in the realm of games, things work very differently: games are played in real-time, at a minimum of 30 frames per second, and the player can change the camera or what’s on-screen at any moment.

A different approach is therefore required, one where ray-tracing can have a significant impact on fidelity, without making a game unplayably slow. Our solution: hybrid rendering, where select effects are ray-traced to add lifelike lighting, shadows and reflections to the high-fidelity rasterized worlds that developers have lovingly handcrafted for years.

The result is a massive upgrade in image quality, making for more immersive and realistic gameplay that can still be enjoyed at smooth, playable framerates, even in Battlefield V’s chaotic 64-player multiplayer matches.

Ray-Traced Reflections

Ray-Traced Reflections are the showcase NVIDIA RTX effect; the one you excitedly show your friends to demonstrate what your new GeForce RTX GPU can do. At the time of writing, they’re coming to Assetto Corsa Competizione, Atomic Heart, Battlefield V, Control, and MechWarrior 5: Mercenaries, plus numerous other games yet to be unveiled. In each, ray-traced reflections will dramatically increase image quality and immersion game-wide.

Before now, the best reflection tech available was Screen Space Reflections (SSR) combined with static environment maps, and it was virtually impossible for any developer, no matter how good, to overcome their limitations.

For instance, Screen-Space Reflections can only show reflections of on-screen objects (those that are already visible in other parts of the frame). Anything off-screen (including behind the camera) that should be reflected is in fact faked with a low-resolution cubemap, which is a pre-generated reflective image applied to surfaces (like the skyscraper windows in open world action games).

The limitation with cubemaps is they only represent the reflected environment as viewed from a single point in the scene, and can therefore give physically implausible results on large reflective objects (in addition, you commonly see the same reflection baked into multiple windows in different locations, and when you turn around the actual scene is completely different). And while some techniques do exist to update cubemaps at regular intervals, based on the appearance of the nearby world, they won’t overcome the technique’s inherent issues, and still therefore lack the ability to show people, vehicles, or effects moving in real-time behind the camera or player.

Another major issue with SSR is that the reflections rendered can’t display occluded detail, like a large item just behind another, which in reality would be visible in the reflection. Furthermore, reflections disappear when the player moves or the camera is turned, or when the player is too close to the surface or object being reflected; all because the detail to be reflected is off-screen.

Enter NVIDIA RTX Ray Tracing, which fixes all those issues, and more, to generate high-resolution, full-scene, real-time reflections that reflect detail in front of, behind, above and below the player or camera.

 

To generate these new ultra-quality reflections, “rays” are “cast” from surfaces visible from the player’s camera. When they intersect objects and surfaces, the detail at that point of intersection is used along with many other pieces of info, from many other rays and points of intersection, to create reflections.

To create photorealistic results, however, the properties of each object and surface must be considered to determine if they’re reflective, if they absorb light, if they are rough or smooth, if they’re transparent, and so on.

Then, if a developer wishes to show reflections within reflections, an additional “bounce” is used, sending the rays off in new directions based on the properties of what they initially hit. As they hit new game elements, the info at that point of intersection is subsequently applied to reflections, and the appearance of the scene updated accordingly.

And finally, the Fresnel Effect is applied to give objects and surfaces natural-looking reflections (in reality, the edges of a glass ball are 100% reflective, but the center of the sphere reflects only a small degree of light). Without this, every reflection would be unnaturally bright at all points.
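To make the math concrete, here is a minimal, self-contained C++ sketch of a single reflection sample weighted by Schlick’s Fresnel approximation. It illustrates the idea only, not any game’s or driver’s actual implementation; the traceScene query and every constant in it are placeholder assumptions.

    // Sketch: one ray-traced reflection sample with a Schlick Fresnel weight.
    #include <cmath>
    #include <cstdio>

    struct Vec3 { float x, y, z; };
    Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
    Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
    Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
    float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

    // Mirror the incoming view direction about the surface normal.
    Vec3 reflect(Vec3 v, Vec3 n) { return v - n * (2.0f * dot(v, n)); }

    // Schlick's Fresnel approximation: reflectance climbs toward 1 at grazing
    // angles, which is why the edge of a glass ball reflects far more light
    // than its center does.
    float fresnelSchlick(float cosTheta, float f0) {
        return f0 + (1.0f - f0) * std::pow(1.0f - cosTheta, 5.0f);
    }

    // Hypothetical scene query standing in for a real ray trace through the
    // scene's acceleration structure; here it just returns a sky color.
    Vec3 traceScene(Vec3 origin, Vec3 dir) { (void)origin; (void)dir; return {0.4f, 0.5f, 0.7f}; }

    // One reflection sample: cast a ray along the mirrored view direction and
    // blend what it sees into the surface color using the Fresnel weight.
    Vec3 shadeReflection(Vec3 point, Vec3 normal, Vec3 viewDir, Vec3 baseColor, float f0) {
        Vec3 reflDir = reflect(viewDir, normal);
        Vec3 reflCol = traceScene(point, reflDir);
        float cosTheta = std::fmax(-dot(viewDir, normal), 0.0f);
        float f = fresnelSchlick(cosTheta, f0);
        return baseColor * (1.0f - f) + reflCol * f;
    }

    int main() {
        Vec3 c = shadeReflection({0, 0, 0}, {0, 1, 0}, {0.7071f, -0.7071f, 0}, {0.2f, 0.2f, 0.2f}, 0.04f);
        std::printf("shaded color: %.3f %.3f %.3f\n", c.x, c.y, c.z);
    }

In a real engine, traceScene would be a hardware-accelerated ray query against the scene’s acceleration structure, and the result would feed into the full material model rather than a simple blend.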

Two examples of the incredible reflections visible in Battlefield V on GeForce RTX graphics cards

That entire process happens millions of times every millisecond to create realistic real-time reflections across the entire scene, and it’s only possible at smooth, playable framerates because of GeForce RTX real-time ray-tracing units, called RT Cores, which were specifically crafted for this task (to learn more, see page 30 of the Turing Whitepaper). The improvement is breathtaking, raising the bar for fidelity for all other games.

To see more amazing NVIDIA RTX Real-Time Ray-Tracing Reflections screenshots, videos and trailers, check out our articles for Assetto Corsa Competizione, Atomic Heart, Battlefield V, Control, and MechWarrior 5: Mercenaries.

Ray-Traced Diffuse Global Illumination

The real-time ray tracing capabilities of GeForce RTX GPUs also enable us to more accurately model how light bounces off surfaces in a scene, giving developers the power to add Ray-Traced Diffuse Global Illumination to their games.

If you’re unfamiliar with the term “Global Illumination”, it describes the process of computing all the light interactions in a scene, including the indirect contributions resulting from direct light bouncing from one surface to another. Before now, it was commonly achieved with precomputed lightmaps, Image-Based Light Probes, Spherical Harmonics, and Reflective Shadowmaps, plus artist-placed lights to help force illumination where the aforementioned techniques fail.

These techniques had several shortcomings, the biggest being that dynamic lighting failed to bounce or illuminate beyond the area that the light hit.

For example, imagine a dark room with bright light shining through a window. With traditional techniques, everything that is directly hit by the light is illuminated, but the illuminated areas themselves do not bounce light, and do not illuminate surrounding game elements, when in reality they would.

With ray tracing, we now have the ability to more accurately model the dynamic indirect diffuse lighting reflected by one or more indirect bounces off of surfaces in the scene, enabling developers to craft dynamic scenes with more-realistic indirect lighting that updates in real-time as lighting changes, and events occur in the game world. In other words, the light bounces naturally, illuminating and brightening surrounding detail. And if the sun moves or the window shades open, the room’s lighting realistically changes; enabling you to see the room in an entirely new light.
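As a rough sketch of what a single diffuse bounce involves, the self-contained C++ below gathers indirect light by firing rays over the hemisphere above a surface point. The radianceFromHit stub, the sampling scheme, and all values are illustrative assumptions, not how any shipping title implements it.

    // Sketch: one-bounce diffuse global illumination gathered by hemisphere rays.
    #include <cmath>
    #include <cstdio>
    #include <random>

    struct Vec3 { float x, y, z; };
    Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
    Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }

    // Hypothetical scene query: radiance arriving from the first surface hit
    // along the ray (in a real engine, that surface's direct light times its albedo).
    Vec3 radianceFromHit(Vec3 origin, Vec3 dir) { (void)origin; (void)dir; return {0.2f, 0.15f, 0.1f}; }

    // Random direction in the hemisphere around the normal (crude rejection
    // sampling; production code would use cosine-weighted sampling).
    Vec3 sampleHemisphere(Vec3 n, std::mt19937& rng) {
        std::uniform_real_distribution<float> u(-1.0f, 1.0f);
        for (;;) {
            Vec3 d{u(rng), u(rng), u(rng)};
            float len2 = d.x * d.x + d.y * d.y + d.z * d.z;
            if (len2 < 1e-4f || len2 > 1.0f) continue;
            d = d * (1.0f / std::sqrt(len2));
            if (d.x * n.x + d.y * n.y + d.z * n.z > 0.0f) return d; // keep upper hemisphere
        }
    }

    // Average the indirect light gathered by `numRays` bounce rays.
    Vec3 diffuseGI(Vec3 point, Vec3 normal, int numRays, std::mt19937& rng) {
        Vec3 sum{0, 0, 0};
        for (int i = 0; i < numRays; ++i)
            sum = sum + radianceFromHit(point, sampleHemisphere(normal, rng));
        return sum * (1.0f / float(numRays));
    }

    int main() {
        std::mt19937 rng(42);
        Vec3 gi = diffuseGI({0, 0, 0}, {0, 1, 0}, 64, rng);
        std::printf("indirect light: %.3f %.3f %.3f\n", gi.x, gi.y, gi.z);
    }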

Here’s an example from Metro Exodus, which launches in 2019:

 

To approximate high-quality ray-traced global illumination, some developers utilize a process called “baking”, whereby the effects of light on static objects in a scene are calculated and written to textures that are overlaid on a game’s geometry. The results can be excellent in games with fixed lighting and static geometry, or fixed times of day, and baking improves gameplay performance because global illumination doesn’t have to be computed at runtime. But baked lighting is not able to handle dynamic changes in the world that dramatically affect lighting, like the opening of a shade in the Metro Exodus video above.

Furthermore, baking is time-consuming, requiring a re-run each time a game build is compiled, and iterating on baked lighting can be a real drag on the content creation process.

Now, developers can use NVIDIA RTX and ray-tracing to instantly preview their baked lighting and any updates, greatly accelerating development time, and giving developers more opportunities to iterate and tweak for maximum fidelity. And when they are happy with the results, they can toggle ray-tracing off and bake the lighting for older GPUs and other platforms.

 

An early look at Remedy Entertainment’s Control, which features improved ray-traced Global Illumination lighting

With real-time Ray-Traced Diffuse Global Illumination, developers can now, for the first time, craft realistic, properly-lit worlds that react dynamically to lighting changes, and changes to geometry. This is particularly noticeable in Control, where the shape-shifting levels radically affect the lighting of a scene, completely changing its look. And as in the case of Ray-Traced Reflections, the impact to the look and feel of a game is immense, making you wish every game featured NVIDIA RTX effects.

For more screenshots and videos of Ray-Traced Diffuse Global Illumination in action, check out our Control and Metro Exodus articles.

Ray-Traced Ambient Occlusion

Ambient Occlusion (AO) is a technique used to add contact shadows where objects or surfaces occlude ambient light, around the base of a rock, for instance. This prevents objects from appearing as if they’re floating on surfaces, and makes a scene appear more grounded and realistic.

Before now, the go-to solution for best-in-class AO was our own HBAO+, which improved upon SSAO considerably in terms of performance and fidelity. Ultimately though, it’s still a screen-space technique, and as such couldn’t account for off-screen detail and hidden detail.

With Ray-Traced Ambient Occlusion (RTAO), quality and accuracy are greatly improved, and off-screen detail can be accounted for, as can hidden detail, such as the underside of a table, resulting in a more realistic scene.
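Conceptually, RTAO asks how much of the hemisphere above a point is blocked by nearby geometry. The following self-contained C++ sketch illustrates that idea only; the rayOccluded stub, the sample counts and the scene are placeholder assumptions rather than a real implementation.

    // Sketch: ambient occlusion from short rays cast over the hemisphere.
    #include <cmath>
    #include <cstdio>
    #include <random>

    struct Vec3 { float x, y, z; };

    // Hypothetical scene query: does a ray from `origin` along `dir` hit anything
    // within `maxDist`? A real engine would ask its acceleration structure.
    bool rayOccluded(Vec3 origin, Vec3 dir, float maxDist);

    // Random direction in the hemisphere around the normal (rejection sampling).
    Vec3 sampleHemisphere(Vec3 n, std::mt19937& rng) {
        std::uniform_real_distribution<float> u(-1.0f, 1.0f);
        for (;;) {
            Vec3 d{u(rng), u(rng), u(rng)};
            float len2 = d.x * d.x + d.y * d.y + d.z * d.z;
            if (len2 < 1e-4f || len2 > 1.0f) continue;
            float inv = 1.0f / std::sqrt(len2);
            d = {d.x * inv, d.y * inv, d.z * inv};
            if (d.x * n.x + d.y * n.y + d.z * n.z > 0.0f) return d;
        }
    }

    // AO term in [0,1]: 1 = fully open, 0 = fully occluded. Unlike screen-space
    // AO, the rays can hit geometry that is off-screen or hidden from the camera.
    float rayTracedAO(Vec3 point, Vec3 normal, float radius, int numRays, std::mt19937& rng) {
        int blocked = 0;
        for (int i = 0; i < numRays; ++i)
            if (rayOccluded(point, sampleHemisphere(normal, rng), radius)) ++blocked;
        return 1.0f - float(blocked) / float(numRays);
    }

    // Stub: treat everything below y = 0 as solid ground so the sketch runs.
    bool rayOccluded(Vec3 origin, Vec3 dir, float maxDist) {
        return dir.y < 0.0f && origin.y + dir.y * maxDist <= 0.0f;
    }

    int main() {
        std::mt19937 rng(7);
        // A point low on a wall, right next to the floor, facing along +x.
        std::printf("AO near ground: %.2f\n", rayTracedAO({0, 0.1f, 0}, {1, 0, 0}, 1.0f, 128, rng));
    }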

 

The Metro Exodus video also includes examples of Real-Time Ray-Traced Ambient Occlusion

Ambient Occlusion’s importance to the immersion and realism of a scene is often overlooked - it’s not big and flashy like a reflection or an explosion, nor is it immediately noticeable like a player’s shadow. But without it, everything looks 'off'. Creases in armor would lack shadowing, there’d be no dark corners where two walls meet, and absolutely everything would appear to float on top of other surfaces.

With Ray Tracing, we can make this crucial effect even better, enhancing detail in every part of every game that uses it. For more examples of it in action check out our Assetto Corsa Competizione, Atomic Heart, MechWarrior 5: Mercenaries, and Metro Exodus articles.

Ray-Traced Shadows

Video game shadows have come a long way in recent years, adding many features that have brought them closer to replicating the look of shadows in the real world. Until the advent of real-time ray-tracing, our own Hybrid Frustum Traced Shadows (HFTS) technique got closest, but ultimately it and all other rasterized techniques were a collection of cleverly-programmed tricks to try and emulate what a shadow should look like.

Without HFTS, developers need to carefully balance the properties of their traditional shadow maps to maximize accuracy, detail and view distances, whilst avoiding shadow aliasing, shadow acne (erroneous self-shadowing) and shadow detachment (an inability to fully connect with the base of the shadow caster to ground it).

With ray tracing, those issues are no longer a concern. Instead, we cast millions of rays across a scene to realistically account for characters, objects and foliage that block light, resulting in genuinely-lifelike shadows for the very first time. And, beyond the addition of accurate shadows, we can for the first time support large complex interactions, real-time translucent shadowing, and a plethora of other techniques, at a level of detail far beyond what was previously possible or seen.

In particular, ray tracing enables developers to use area light sources, resulting in larger penumbras and physically correct contact hardening, such that shadows become sharper as they get closer to the object casting them. The benefits of this can be clearly seen in Shadow of the Tomb Raider, one of the first games to add ray-traced shadows:
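The soft penumbra falls out of the math naturally: visibility is simply the fraction of shadow rays that reach random points on the area light. Below is a minimal, self-contained C++ sketch of that idea, with a stubbed occluder and made-up scene values; it is not how Shadow of the Tomb Raider or any other game implements it.

    // Sketch: soft shadows from a rectangular area light via shadow-ray sampling.
    #include <cstdio>
    #include <random>

    struct Vec3 { float x, y, z; };

    // Hypothetical scene query: is the segment from `from` to `to` unobstructed?
    // A real engine would trace a shadow ray through its acceleration structure.
    bool segmentVisible(Vec3 from, Vec3 to);

    struct AreaLight { Vec3 corner, edgeU, edgeV; }; // rectangular light source

    // Fraction of the light visible from `point`, in [0,1]. Sampling many points
    // on the light is what produces a soft penumbra instead of a hard edge.
    float shadowVisibility(Vec3 point, const AreaLight& light, int samples, std::mt19937& rng) {
        std::uniform_real_distribution<float> u(0.0f, 1.0f);
        int visible = 0;
        for (int i = 0; i < samples; ++i) {
            float a = u(rng), b = u(rng);
            Vec3 target{light.corner.x + light.edgeU.x * a + light.edgeV.x * b,
                        light.corner.y + light.edgeU.y * a + light.edgeV.y * b,
                        light.corner.z + light.edgeU.z * a + light.edgeV.z * b};
            if (segmentVisible(point, target)) ++visible;
        }
        return float(visible) / float(samples);
    }

    // Stub occluder so the sketch runs: a wall in the plane x = 0 spanning y in [0,1].
    bool segmentVisible(Vec3 from, Vec3 to) {
        if ((from.x < 0.0f) == (to.x < 0.0f)) return true; // no crossing of the wall plane
        float t = -from.x / (to.x - from.x);               // where the segment crosses x = 0
        float y = from.y + (to.y - from.y) * t;
        return y > 1.0f;                                   // only rays passing above the wall get through
    }

    int main() {
        std::mt19937 rng(1);
        AreaLight light{{2.0f, 2.0f, -0.5f}, {0, 0, 1}, {0, 1, 0}}; // at x = 2, y in [2,3]
        // Shaded points behind the wall: visibility rises smoothly across the penumbra.
        const float xs[] = {-0.55f, -0.75f, -0.95f};
        for (float x : xs)
            std::printf("x = %5.2f  light visibility = %.2f\n",
                        x, shadowVisibility({x, 0.5f, 0.0f}, light, 256, rng));
    }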

 

If you wish to see additional examples of ray-traced shadows, read our articles for Assetto Corsa Competizione, Atomic Heart and MechWarrior 5: Mercenaries.

Real-Time Ray-Tracing, Coming Soon To Games

Reflections, shadows, ambient occlusion, and global illumination are all best-realized with ray tracing. Now, we’re finally able to bring them to games and gamers, delivering unprecedented fidelity, realism and immersion. As such, real-time ray tracing on GeForce RTX graphics cards truly reinvents graphics, replacing the tricks and approximations of current techniques with realistic calculations based on the real-world behavior of light.

See ray tracing in action in our game articles, and experience these effects for yourself later this year when playing NVIDIA RTX games on a GeForce RTX graphics card.

AI-Accelerated Features In Gaming

In addition to the “RT Cores” that power NVIDIA RTX Ray Tracing, our new GeForce RTX graphics cards also feature “Tensor Cores”, for Artificial Intelligence tasks (for technical detail, see page 21 in the Turing Whitepaper).

Deep Learning Super Sampling (DLSS)

In previous years, NVIDIA’s founder and CEO, Jensen Huang, liked to show a Deep Learning neural network demo where artificial intelligence would decide whether a photo of a pet was showing a cat or a dog. As the machine got more images and made more correct picks it would get better and better, before eventually picking the correct answer each time.

Using that basis for machine learning, we’ve created a new rendering technique called Deep Learning Super Sampling (DLSS). It boosts performance by a significant degree, can improve image quality, and its anti-aliasing has better temporal stability and image clarity compared to commonly-used Temporal Anti-Aliasing (TAA) techniques.

To make it work, there are a few steps, each leveraging our cutting-edge tech, and expertise with Deep Learning and AI.

First, we show a neural network thousands of screen captures from each DLSS game that are up to 64x Super Sampled (64xSS), and then another set of images captured without anti-aliasing. We then task the network with reviewing and comparing the shots, to learn how to approximate the quality of 64xSS without its immense performance cost.

As the network repeats the process, its algorithms are tweaked, and eventually it learns to automate the process, delivering anti-aliasing approaching the quality of 64xSS, whilst avoiding the issues associated with TAA, such as screen-wide blurring, motion-based blur, ghosting, and artifacting on transparencies.

Of course, not everyone owns a supercomputer, so we package the trained data from our neural network into a small file that’s either included with a Game Ready driver, or transmitted to your system via GeForce Experience. With that, your GeForce RTX GPU will automatically know how to best render each DLSS game, ensuring optimum results each time.

Finally, because Deep Learning Super Sampling’s vastly-superior, 64xSS-esque output and our high-quality filters preserve so much detail, we can reduce the game’s internal rendering resolution. This greatly accelerates performance, without a noticeably negative impact on image quality, as you can see in the Final Fantasy XV: Windows Edition DLSS benchmark.

This performance boost, combined with the improved rasterization performance of GeForce RTX graphics cards, sees framerates increase by up to 2X at 4K, compared to 10-Series Pascal-architecture graphics cards. And with this extra performance you can crank up the settings, whilst still enjoying a super smooth experience at 60 FPS or more, giving you the definitive gaming experience.

At the time of writing, 25 games are adding DLSS, including popular titles such as Final Fantasy XV: Windows Edition and PUBG:

  • Ark: Survival Evolved from Studio Wildcard
  • Atomic Heart from Mundfish
  • Darksiders III from Gunfire Games / THQ Nordic
  • Dauntless from Phoenix Labs
  • Deliver Us The Moon: Fortuna from KeokeN Interactive
  • Fear The Wolves from Vostok Games / Focus Home Interactive
  • Final Fantasy XV: Windows Edition from Square Enix
  • Fractured Lands from Unbroken Studios
  • Hellblade: Senua's Sacrifice from Ninja Theory
  • Hitman 2 from IO Interactive / Warner Bros.
  • Islands of Nyne from Define Human Studios
  • Justice from NetEase
  • JX3 from Kingsoft
  • KINETIK from Hero Machine Studios
  • Mechwarrior 5: Mercenaries from Piranha Games
  • Outpost Zero from Symmetric Games / tinyBuild Games
  • Overkill's The Walking Dead from Overkill Software / Starbreeze Studios
  • PlayerUnknown’s Battlegrounds from PUBG Corp.
  • Remnant: From The Ashes from Arc Games
  • SCUM from Gamepires / Devolver Digital
  • Serious Sam 4: Planet Badass from Croteam / Devolver Digital
  • Shadow of the Tomb Raider from Square Enix / Eidos-Montréal / Crystal Dynamics / Nixxes
  • Stormdivers from Housemarque
  • The Forge Arena from Freezing Raccoon Studios
  • We Happy Few from Compulsion Games / Gearbox

Stay tuned to GeForce.com for announcements about other new games adding DLSS, and news of DLSS’s release in these games.

AI Up-Res

Enlarging a picture stretches existing pixels, losing detail and clarity in the process. Using AI and a new process called “AI Up-Res”, we can create new pixels by interpreting the contents of the image, before intelligently placing new data. This results in a sharper enlargement that correctly preserves depth of field effects and other artistic treatments.

In NVIDIA Ansel, we’re leveraging this technology to enable the capture of higher-resolution in-game photos in 200 games.

To learn more, head on over to our dedicated GeForce Experience article.

Advanced Shading

A shader is a program running on the GPU that controls various aspects of the rendering process: how vertices are positioned relative to the current camera view, how larger objects are broken down into smaller triangles as they get closer to the viewer, or how light is reflected at a given point on an object’s surface. Shaders give developers the control necessary to produce the amazing visuals we see in games today.

NVIDIA pioneered programmable shading with the introduction of the world’s first programmable GPU, GeForce 3, in 2001. And over the course of the past 17 years we’ve witnessed a dazzling rate of advancement in the shading technology, resulting in increasingly-richer, more-detailed and more-realistic games.

The Turing architecture represents the pinnacle of these advancements, introducing four new shading technologies that will unlock innovative rendering approaches, taking real-time graphics in games to new levels of visual fidelity and performance.

Variable Rate Shading

With broader use of 4K monitors and VR displays, developers demand more control over how they spend GPU cycles shading pixels on the screen. For example, in a VR headset, pixels on the periphery of the field of view don’t require as much shading as pixels in the center of the image. In a similar fashion, surfaces with smoothly changing textures or soft diffuse illumination can reuse shading information across neighboring pixels, rather than shading each pixel from scratch.

Another example is fast-moving objects in the scene: a motion-blurred car screeching across your display doesn’t need every pixel shaded, as it’s moving so quickly that you can’t clearly perceive all of its details and effects anyway.

To address these scenarios, Turing implements advanced pixel shader scheduling technology called Variable Rate Shading, or VRS for short. With VRS, developers have full control over how much shading work is performed in different regions on the screen.

Objects in the center of a VR display, or highly detailed surfaces on a monitor, can be shaded at full rate, or even multiple times per pixel, while at the same time pixels distorted by the lens in the VR headset, or pixels corresponding to fast moving objects on the screen, can be shaded at lower rate, with multiple pixels sharing results of a single shading calculation.

As a result, games can more intelligently allocate shading work and spend more time shading important objects in the scene, effectively “borrowing” shading resources from other objects that do not need as much shading.

Using VRS, we have also created Motion Adaptive Shading (MAS), Content Adaptive Shading (CAS) and Foveated Rendering, three further techniques that help accelerate performance in specific ways.

Motion Adaptive Shading (MAS)

Changing the shading rate based on the degree of motion present in a particular region on the screen is one of the most effective applications of Variable Rate Shading. This technique is called Motion Adaptive Shading (MAS).

Motion Adaptive Shading works by first calculating how objects are moving across the screen. For example, in a third-person racing game, the car will appear mostly static and as such will have to be shaded at full rate to preserve important detail. In contrast to that, objects on the periphery of the screen, such as road signs or lane markings, will be moving very fast as they approach the camera, and thus can be shaded less frequently.

Without VRS, each pixel would be shaded individually (1x1). With VRS, a developer has up to seven options to choose from for each 16x16 pixel region, including having one shading result be used to color four pixels (2 x 2), or 16 pixels (4 x 4), or non-square footprints like 1 x 2 or 2 x 4. The colored overlay on the screenshot shows a possible application of VRS—perhaps the car could be shaded at full rate (blue region) while the area near the car could be shaded once per four pixels (green), and the motion-blurred road to the left and right could be shaded once per eight pixels (yellow).

Based on this motion information, the game calculates appropriate shading rates for each screen-space region and feeds them to Turing’s Variable Rate Shading hardware, which controls pixel shader scheduling. From this point onwards, the rest of the game engine can remain largely unaware of what is happening under the hood, making the technique relatively easy to integrate into existing games. And of course, it gives the gamer improved performance with a barely-perceptible impact on image quality.
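To illustrate the kind of decision MAS makes, here is a small, self-contained C++ sketch that maps a tile’s screen-space motion to one of the coarse shading rates described above. The thresholds, enum names and tile data are illustrative assumptions, not values from any driver or game.

    // Sketch: picking a per-tile shading rate from screen-space motion.
    #include <cstdio>
    #include <vector>

    enum class ShadingRate { Rate1x1, Rate1x2, Rate2x2, Rate2x4, Rate4x4 };

    // Map a tile's screen-space motion (pixels per frame) to a shading rate.
    ShadingRate rateFromMotion(float pixelsPerFrame) {
        if (pixelsPerFrame < 2.0f)  return ShadingRate::Rate1x1; // near-static: full rate
        if (pixelsPerFrame < 8.0f)  return ShadingRate::Rate1x2;
        if (pixelsPerFrame < 16.0f) return ShadingRate::Rate2x2; // one result colors 4 pixels
        if (pixelsPerFrame < 32.0f) return ShadingRate::Rate2x4;
        return ShadingRate::Rate4x4;                              // heavy blur: 1 result per 16 pixels
    }

    // Build the per-tile rate image that would be handed to the VRS hardware.
    std::vector<ShadingRate> buildRateImage(const std::vector<float>& tileMotion) {
        std::vector<ShadingRate> rates;
        rates.reserve(tileMotion.size());
        for (float m : tileMotion) rates.push_back(rateFromMotion(m));
        return rates;
    }

    int main() {
        // Motion magnitudes for a row of 16x16 tiles: a static car in the center,
        // fast-moving road and signs toward the edges of the screen.
        std::vector<float> motion = {40.0f, 24.0f, 10.0f, 1.0f, 1.0f, 12.0f, 28.0f, 45.0f};
        for (ShadingRate r : buildRateImage(motion)) std::printf("%d ", int(r));
        std::printf("\n"); // prints: 4 3 2 0 0 2 3 4  (0 = 1x1 ... 4 = 4x4)
    }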

Content Adaptive Shading (CAS)

With Content Adaptive Shading, the shading rate is lowered by considering factors like spatial and temporal color coherence. In other words, in areas of comparatively low detail that remain unchanged from frame to frame, such as skyboxes and walls, the shading rate can be lowered in successive frames.

In the example below, the static detail around the animated control panels has its shading rate lowered, improving performance:

For even greater performance gains, developers can utilize both CAS and MAS simultaneously.
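A sketch of one possible CAS heuristic is shown below: measure how much a tile’s color changed since the previous frame and lower its shading rate where the change is negligible. The thresholds and structure are assumptions for illustration only, in self-contained C++.

    // Sketch: lowering the shading rate where a tile's color is temporally stable.
    #include <cmath>
    #include <cstdio>
    #include <vector>

    enum class ShadingRate { Full, Half, Quarter }; // 0 = Full, 1 = Half, 2 = Quarter

    // Mean absolute per-channel difference between a tile's pixels this frame
    // and last frame (values in [0,1]).
    float tileTemporalDelta(const std::vector<float>& curr, const std::vector<float>& prev) {
        float sum = 0.0f;
        for (size_t i = 0; i < curr.size(); ++i) sum += std::fabs(curr[i] - prev[i]);
        return sum / float(curr.size());
    }

    ShadingRate rateFromDelta(float delta) {
        if (delta < 0.01f) return ShadingRate::Quarter; // visually static content
        if (delta < 0.05f) return ShadingRate::Half;
        return ShadingRate::Full;                       // animated panels, effects, etc.
    }

    int main() {
        std::vector<float> wallPrev(256, 0.42f),  wallCurr(256, 0.421f); // static wall
        std::vector<float> panelPrev(256, 0.10f), panelCurr(256, 0.60f); // blinking control panel
        std::printf("wall  -> rate %d\n", int(rateFromDelta(tileTemporalDelta(wallCurr, wallPrev))));
        std::printf("panel -> rate %d\n", int(rateFromDelta(tileTemporalDelta(panelCurr, panelPrev))));
    }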

Foveated Rendering

The third application of VRS is Foveated Rendering, which adjusts shading rates based on what a gamer is looking at when eye-tracking technology is utilized in VR, or with a desktop monitor eye-tracker like a Tobii 4C.

Foveated Rendering is based on the observation that the resolution that our eye can perceive depends on viewing angle. We have maximum visual resolution for objects in the center of our field of view, but much lower visual resolution for objects in the periphery. Therefore, if the viewer’s eye position is known, we can shade at lower rates in the periphery, and higher rates in the center of the field of view, maximizing image quality at all times.
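As a simple illustration, the self-contained C++ sketch below picks a shading rate from the angular distance between a pixel and the tracked gaze point; the angles and rates are illustrative assumptions, not figures from any eye-tracking SDK.

    // Sketch: full-rate shading near the gaze point, coarser toward the periphery.
    #include <cmath>
    #include <cstdio>

    enum class ShadingRate { Rate1x1, Rate2x2, Rate4x4 }; // 0, 1, 2

    // Angle (degrees) between the gaze direction and the direction to a pixel,
    // given their angular offsets from the display center.
    float angleFromGaze(float pixelYawDeg, float pixelPitchDeg, float gazeYawDeg, float gazePitchDeg) {
        float dy = pixelYawDeg - gazeYawDeg;
        float dp = pixelPitchDeg - gazePitchDeg;
        return std::sqrt(dy * dy + dp * dp);
    }

    ShadingRate foveatedRate(float degreesFromGaze) {
        if (degreesFromGaze < 10.0f) return ShadingRate::Rate1x1; // fovea: full detail
        if (degreesFromGaze < 25.0f) return ShadingRate::Rate2x2; // near periphery
        return ShadingRate::Rate4x4;                               // far periphery
    }

    int main() {
        float gazeYaw = 5.0f, gazePitch = -2.0f; // reported by the eye tracker
        const float pixelYaws[] = {5.0f, 20.0f, 45.0f};
        for (float yaw : pixelYaws)
            std::printf("pixel at yaw %4.1f deg -> rate %d\n", yaw,
                        int(foveatedRate(angleFromGaze(yaw, 0.0f, gazeYaw, gazePitch))));
    }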

Multi-View Rendering

The Pascal architecture introduced Single Pass Stereo, a technique that accelerated VR performance by submitting geometry only once for both of a headset’s displays, before simply offsetting the geometry’s position to the left or right, instead of rendering each eye independently.

With the Turing architecture’s Multi-View Rendering (MVR), we can now render up to four views, which can be positioned arbitrarily in the scene. For gamers, this opens the door to canted (non-coplanar) ultra-wide, wrap-around headsets with extreme fields of view, and higher-resolution setups that use stacked screens on the left and right eye. And it could also accelerate rendering to holographic displays where multiple points of view need to be taken into account.

In non-VR scenarios, multi-view rendering can be used to accelerate the rendering of multiple shadowmaps, allowing developers to generate crisper, higher-resolution shadows (if they’re not using ray tracing).

Mesh Shading

The real world is a visually-rich, geometrically complex place, with outdoor scenes in particular being composed of hundreds of thousands of elements (rocks, trees, leaves, etc).

Today’s graphics pipelines, with vertex, tessellation, and geometry shaders, are very effective at rendering the details of a single object, but still have limitations when trying to replicate the detail of the real world. In particular, each object requires its own unique draw call from the CPU, and the shader model is a per-thread model, which limits the types of algorithms that can be used.

Real world scenes like these have too many unique complex objects to render in real time with today’s graphics pipelines.

Efficient processing of geometry is one of the cornerstones of real-time computer graphics, and NVIDIA has a long history of innovation on this front: our very first GPU, GeForce 256, offloaded this computationally expensive task from the CPU; GeForce 3 was the world’s first programmable GPU, giving developers unprecedented control over geometry processing; and the Fermi architecture introduced tessellation - a technique commonly used in film to represent geometry detail at its finest scale.

Now, with VR, Ray Tracing, and complex games chasing photorealism, the demand for finer geometric fidelity is higher than ever before. To accommodate these ever-increasing requirements for more efficient geometry handling, the Turing architecture introduces Mesh Shading, an entirely new programming model for geometry, merging the flexibility of compute shaders with the efficiency of the hardware-scheduled graphics pipeline. And best of all, Mesh Shading’s new, more flexible model enables developers to eliminate CPU draw call bottlenecks through the use of more efficient algorithms for producing triangles.

With Mesh Shading, developers no longer think about processing individual vertices, or individual primitives. Instead, they can leverage the power of a new programming model, where a set of hardware-scheduled threads can cooperate to compute a result much more efficiently than if they were to perform the computation independently, on their own.

For example, a single vertex is typically shared by a number of triangles. To determine whether a given triangle is visible or not, a game engine will have to perform some computation at each of the triangle’s vertices and combine the results. In the traditional pipeline, each triangle is handled independently, and hence shared vertices will be processed redundantly multiple times.

Mesh Shading Pipeline vs. Traditional Pipelines

With Mesh Shading, the developer can structure the computation so that the per-vertex part of it can be performed exactly once, and then shared across all triangles referencing that vertex.
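The CPU-side C++ sketch below illustrates that reuse on a tiny “meshlet”: per-vertex work runs once per unique vertex, and every triangle then reads the cached results. The structures, the stand-in vertex work and the crude culling test are all illustrative assumptions, not the actual Turing programming model or API.

    // Sketch: do per-vertex work once and share it across the triangles of a meshlet.
    #include <cstdio>
    #include <vector>

    struct Vertex { float x, y, z; };

    struct Meshlet {
        std::vector<Vertex>   vertices; // unique vertices of this small patch
        std::vector<unsigned> indices;  // 3 indices per triangle into `vertices`
    };

    // Stand-in for expensive per-vertex work (transform, skinning, etc.).
    float transformedDepth(const Vertex& v) { return v.z * 0.5f + 1.0f; }

    // "Mesh shader"-style flow: one pass over unique vertices, then per-triangle
    // decisions that simply reuse the cached per-vertex results.
    int countVisibleTriangles(const Meshlet& m, float nearPlane) {
        std::vector<float> depth(m.vertices.size());
        for (size_t i = 0; i < m.vertices.size(); ++i)   // each vertex processed exactly once
            depth[i] = transformedDepth(m.vertices[i]);

        int visible = 0;
        for (size_t t = 0; t + 2 < m.indices.size(); t += 3) {
            float d0 = depth[m.indices[t]], d1 = depth[m.indices[t + 1]], d2 = depth[m.indices[t + 2]];
            if (d0 > nearPlane || d1 > nearPlane || d2 > nearPlane) ++visible; // crude cull
        }
        return visible;
    }

    int main() {
        // Two triangles sharing an edge: 4 unique vertices, 6 indices.
        Meshlet quad{{{0, 0, 1}, {1, 0, 1}, {1, 1, 3}, {0, 1, 3}}, {0, 1, 2, 0, 2, 3}};
        std::printf("visible triangles: %d\n", countVisibleTriangles(quad, 1.0f));
    }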

Another aspect of Mesh Shading is that it allows for far richer geometry refinement. With tessellation, a given patch, or triangle, can be broken down into a fixed pattern of smaller triangles. With Mesh Shading, developers have complete control over how that refinement is done: for example, they can program the hardware to generate an entire tree, with fully detailed bark and leaves, from a very simple description of where its trunk and its branches are.

In a nutshell, Mesh Shading will allow for extremely detailed worlds, the likes of which have never been seen before. Think detailed vegetation and undergrowth, clutter, dirt and other small-scale detail, significant in making game scenes look far less “synthetic”, and therefore closer to what we see in the real world.

To demonstrate, we created the Asteroids Level of Detail (LoD) demo, which shows how Mesh Shading can dramatically improve performance and image quality when rendering hundreds of thousands of asteroids simultaneously.

If you’re unfamiliar with LoD, it’s a technique used by virtually all games to scale the quality of an object based on its distance from the player’s camera. Take a tree for example: there’s no point rendering every leaf and branch at a distance where you can’t see them, so the tree’s detail is reduced to instead show its overall size and shape, maintaining performance and giving the game the resources to render nearby trees at maximum fidelity. This process is applied to every model and object on-screen, throughout a game, so it’s easy to imagine how Mesh Shading will help with image quality and performance.
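For a feel of what an LoD decision looks like, here is a minimal, self-contained C++ sketch that picks a detail level from an object’s projected size on screen. In the Asteroids demo this kind of logic runs on the GPU inside the mesh shading pipeline; the function, thresholds and numbers below are purely illustrative assumptions.

    // Sketch: select a Level of Detail from an object's projected size on screen.
    #include <cmath>
    #include <cstdio>

    // Approximate screen-space radius in pixels for a bounding sphere.
    float projectedRadiusPixels(float worldRadius, float distance,
                                float verticalFovRadians, float screenHeightPx) {
        float angular = std::atan(worldRadius / distance);        // half-angle subtended
        return angular / (verticalFovRadians * 0.5f) * (screenHeightPx * 0.5f);
    }

    // Smaller on screen -> higher LoD index -> fewer triangles.
    int selectLod(float radiusPx) {
        if (radiusPx > 200.0f) return 0;   // full-detail mesh
        if (radiusPx > 50.0f)  return 1;
        if (radiusPx > 5.0f)   return 2;
        return 3;                          // tiny speck: a handful of triangles
    }

    int main() {
        float fov = 1.0f, screenH = 2160.0f; // ~57 degree vertical FOV, 4K display
        const float distances[] = {5.0f, 50.0f, 500.0f, 5000.0f};
        for (float dist : distances) {
            float px = projectedRadiusPixels(2.0f, dist, fov, screenH);
            std::printf("distance %6.0f m -> %7.1f px -> LOD %d\n", dist, px, selectLod(px));
        }
    }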

In the Asteroids demo, mesh shaders dynamically adjust the Level of Detail of up to 350,000 individual asteroids to maintain sub-pixel geometric detail, at a level of performance that would be otherwise impossible to achieve.

With traditional rendering, the CPU is heavily involved with LoD management, creating a large bottleneck in the rendering pipeline. With Mesh Shading, Dynamic LoD management runs autonomously on the GPU. In the demo, we enable users to toggle the Mesh Shading’s Dynamic LoD system on and off, demonstrating vastly superior image quality and performance when it is enabled.

Below, an interactive comparison shows the wireframe rendering mode of the Asteroids demo, where you can clearly see this improvement in image quality from using higher LODs (note, tessellation is not utilized).

Asteroids NVIDIA Mesh Shading Demo Interactive Comparisons

With programmable mesh shading, the GPU can dynamically reduce and cull the trillions of triangles in the field of view of the player down to only the few million triangles necessary to cover the pixels on screen. By applying these simple techniques efficiently, we were able to boost performance significantly whilst still improving image quality substantially where it matters most: in areas close to the player’s point of view.

With these benefits, Mesh Shading can enable the creation of massive, detailed worlds, unlike anything experienced before.

Texture-Space Shading

Nearly all games today render new frames “from scratch”, meaning that they don’t use any calculations made prior to that frame (unless they use temporal anti-aliasing). But in most games - like in the real world - relatively little changes from frame to frame. If you look outside your window, you might see trees blowing in the wind, pedestrians passing by, or birds flying in the distance. But the majority of the scene is “static” or unchanged. The main thing that changes is your point of view.

Now, some objects will indeed change appearance as you change your point of view - notably those which are glossy or shiny. But most objects will actually change appearance very little as you move your head, and as such, it’s a waste of precious GPU cycles to keep recalculating the same exact colors that make up those objects every frame. It’d be far more efficient to shade those objects at a lower rate (say, every third frame, or perhaps even lower than that) and reuse the object’s colors (or “texels” as they’re referred to) calculated in the past. This notion of work reuse becomes particularly important in ray tracing and especially in the case of global illumination, which is a very common example of a slow-changing and very expensive shading computation.

This technique is what’s referred to as Texture-Space Shading, in that the calculations are not performed frame-to-frame in screen space (i.e. from the point of view of the gamer), but rather calculated at a different shading rate in texture space (essentially from the point of view of the object itself). Why texture space?  Because all objects in games these days have textures and they’re independent of the gamer’s point of view, making them a perfect choice for storing shaded objects’ colors and “carrying” them from frame to frame.

This same technique can be effectively applied to VR, too: because our eyes are fairly close together, the vast majority of objects that you see with one eye are also seen by the other eye. The main difference is not the shading of those objects (e.g. your left eye sees the pencil on your desk as being the same yellow as the right eye), but rather the orientation of them. As such, with Texture-Space Shading you can “borrow” the shading calculations from one eye and use them with the other eye, essentially halving your shading workload. And if your performance is limited by pixel shading in your game, then Texture-Space Shading could theoretically double your framerates.

With texture-space shading, a game engine will not shade all rasterized pixels immediately. Instead, it will first identify which texels are referenced by the pixels it rasterized. This operation is very similar to what the texture unit does when it finds texels needed for a given texturing operation. This set of texels is then queued up for shading, which will happen at a later point in time. Note that during this process, the same texel may be referenced by multiple pixels but for efficiency, we of course don’t want to shade texels redundantly. So the game engine will have to perform what is referred to as deduplication of shade requests, isolating unique texel references and ensuring that each texel is only shaded once.

Once the set of unique texel references is identified, the game will shade those texels, storing results in corresponding textures for later reuse. This is analogous to pixel shading, except that what is being shaded are not pixels on the screen but rather texels within a texture.

Finally, the computed texels can be used to calculate their corresponding pixel’s colors - in the exact same way as static textures are used today. This step is extremely cheap, since it merely performs a single texturing operation.
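Putting those steps together, the self-contained C++ sketch below mimics the flow on the CPU: collect texel references, deduplicate them, shade each unique texel once, then resolve pixels with cheap lookups. The names, the hash map standing in for Turing’s deduplication support, and the toy shading function are all illustrative assumptions.

    // Sketch: shade unique texels once, then reuse them to resolve the pixels.
    #include <cstdio>
    #include <unordered_map>
    #include <vector>

    struct TexelId { int u, v; };
    bool operator==(TexelId a, TexelId b) { return a.u == b.u && a.v == b.v; }
    struct TexelHash {
        size_t operator()(TexelId t) const { return size_t(t.u) * 73856093u ^ size_t(t.v) * 19349663u; }
    };

    // Stand-in for an expensive shading computation (e.g. slow-changing GI).
    float shadeTexel(TexelId t) { return 0.5f + 0.01f * float((t.u + t.v) % 10); }

    int main() {
        // Texels referenced by this frame's rasterized pixels; neighbouring pixels
        // often map to the same texel, so duplicates are common.
        std::vector<TexelId> referenced = {{3, 4}, {3, 4}, {3, 5}, {7, 2}, {3, 4}, {7, 2}};

        // 1) Deduplicate shade requests (the step Turing accelerates in hardware).
        std::unordered_map<TexelId, float, TexelHash> shadedCache;
        for (TexelId t : referenced) shadedCache.emplace(t, 0.0f);

        // 2) Shade each unique texel exactly once and store it in texture space.
        for (auto& entry : shadedCache) entry.second = shadeTexel(entry.first);

        // 3) Resolve pixels with a simple lookup; the cached values could also be
        //    reused by later frames, or by the second eye in VR.
        for (TexelId t : referenced)
            std::printf("pixel -> texel (%d,%d) = %.2f\n", t.u, t.v, shadedCache[t]);
        std::printf("shaded %zu texels for %zu pixels\n", shadedCache.size(), referenced.size());
    }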

The process of finding the set of visible texels and isolating unique ones is a computationally expensive process, and for that reason applications couldn’t afford texture-space shading in the past. The Turing architecture addresses this problem by introducing hardware acceleration for this key step: Turing’s texture unit is now capable of providing texel address information directly to the shader, and new data-parallel intrinsics make the deduplication step of the process very efficient.

Though Texture-space shading might seem like a very straightforward and logical thing to do, there are some “gotchas” that have kept the technique from becoming mainstream.

First of all, modern games reuse a lot of objects, including their textures. If you see a forest within a game, it’s highly unlikely that each leaf has its own texture. Most likely, the leaves are all taken from a small set of leaf models and textures. These objects won’t work for Texture-Space Shading, because all visible objects must have their own textures - no sharing allowed! Imagine if you didn’t enforce this - that all the leaves on a tree continued to share the same leaf texture - and then you used your hand to put one of them into shadow. Then all the leaves on the tree would simultaneously go into shadow as well! So this is the first thing developers have to deal with - ensuring that every object has its own texture which can be shaded independently of all other objects.

Texture-space shading will therefore require fairly significant rethinking of how game engines are built, but once this is done, the possibilities are endless - a door is opened to entirely new ways of doing rendering, building on the benefits of both ray-tracing and rasterization to generate photo-realistic images at high framerates, without compromises. The Turing architecture’s hardware support for texture-space shading makes such hybrid approaches finally practical.

Reinventing Graphics For The Next-Generation of Games

NVIDIA GeForce RTX graphics cards deliver the ultimate PC gaming experience. Powered by the new NVIDIA Turing GPU architecture and the revolutionary RTX platform, RTX graphics cards bring together real-time ray tracing, artificial intelligence, and programmable shading. This is a whole new way to experience games.

For news of games integrating these game-changing GeForce RTX technologies, stay tuned to GeForce.com.
