Alain Galvan · 6/20/2019 8:30 AM
An overview of various ray operations, their similarities, differences, and how to implement each type of ray operation in real time rendering.
Tags: blog, shader, glsl, hlsl, vulkan, directx, opengl, spir-v, metal
Real time ray-tracing has been a goal graphics software engineers have pursued ever since the first papers on light simulation were published back in the late 1970s and early 1980s [Whitted 1980] [Cook et al. 1984] [Veach 1998]. This is no longer a dream, but a reality that is available commercially in modern graphics processing units (GPUs) and game consoles such as the PS5 and Xbox Series X.

Real time ray-tracing continues to be a heavily researched topic. From early papers and projects from the demoscene using shader model 3 to perform simple ray tracing [Quílez 2005], to recent papers tackling branch coherency during acceleration structure traversals [Benthin et al. 2018], attempts to reduce per-ray sample counts by means of denoising algorithms, GPU architecture papers such as those covering NVIDIA's Pascal architecture, viewport based quad-tree systems that reduce samples in unimportant areas, BRDF papers for analytically modeling a variety of materials, and so much more, the subject of ray tracing contains a variety of complex sub-domains.
While GPUs were originally meant for storing frame buffers, simple rasterization and shading operations, they've since become powerful co-processors that can perform non-specialized calculations, from compute to tensor and ray-tracing routines with varying levels of hardware acceleration. [Arafa 2019]
Thanks to an immense effort on the part of hardware designers, researchers, and industry manufacturers, real time hardware accelerated ray tracing is now possible on modern high end graphics cards, with compute based fallbacks available on select devices.
Ray tracing has become an ambiguous term [Shirley 2018] over the years, so let's define terms based on the research papers in which they were first used:
Ray Casting - a simple collision query: a ray is tested against an acceleration data structure over a discrete shape or set of triangles, and returns the data regarding that hit point.
Ray Marching - a method of ray casting that works by "marching" towards a density function's threshold "solid" value [Perlin et al. 1989], typically by means of a sphere tracing algorithm [Hart et al. 1989]. Similarly, a hierarchical Z-buffer can also be marched, such as in pixel-projected reflections [Cichocki 2017]. Ray marching has recently been made popular by the demoscene and ShaderToy.
Ray Tracing - otherwise known as Whitted ray tracing [Whitted 1980] or recursive ray casting: rays can continue past their original collision point, thus allowing for surface behaviors like reflections. Cook style ray tracing (distribution ray tracing) [Cook et al. 1984] involves the use of Monte Carlo integration (introducing randomness) on surfaces to produce effects such as soft shadows. The billiard table figure above was rendered with this technique in 1984!
Path Tracing - Kajiya expanded on ray tracing with the inclusion of diffuse inter-reflection (global illumination) and formalized it with his light transport integral [Kajiya 1986]. Veach took this even further with his Ph.D. dissertation on path tracing, which introduced ideas like Multiple Importance Sampling (MIS), Next Event Estimation (NEE), and Russian Roulette [Veach 1998].
Ray tracing is a subject that can encompass entire books and courses, so it's important to make clear what we won't be discussing in detail:
Existing Renderers - There's Mitsuba 2, NVIDIA OptiX, Intel Embree, Blender Cycles, Autodesk Arnold, Pixar RenderMan, PBRT's reference renderer. The differences between each of these backends are complex enough to warrant their own blog post or paper. That being said, it's highly valuable to review the source of any one of these projects!
Acceleration Structures - While KD-Trees and Bounding Volume Hierarchies are important things to know, these are abstracted by real-time Ray Tracing APIs [Stich 2018] (though you could always implement your own, Scratch A Pixel has a brilliant article on the subject, and the subject is detailed in Physically Based Rendering Chapter 4).
Linear Algebra - You should know how to multiply matrices to describe a camera, scene hierarchy, etc. Here's a refresher on the subject.
The pattern that any ray based algorithm follows is (sketched in code right after this list):
For Each Pixel that you intend to get radiance (light value) information from, cast a ray based on a given camera matrix. This ray can be calculated by taking the NDC coordinate of that pixel, then multiplying it by the inverse view-projection matrix to form a world space ray from the camera into the scene.
Test for Collisions with simple math functions for simple shapes, or with more complex methods such as a bounding volume hierarchy for traversing the scene and a lower level acceleration structure such as a KD-Tree for testing each triangle of the objects in the scene.
Calculate Radiance by testing that surface for its material function (or Bidirectional Reflectance/Transmission Distribution Function, BxDF).
Cast Rays by bouncing off that surface according to the behavior expected by that material or what you intend to record (Global Illumination, Reflections, etc.).
Stop, Average, and Write at some pre-determined level of quality such as 4096 samples, then average out the radiance values of each of those samples to get a final color. Finally, write that to your output memory, be it a frame buffer attachment in a graphics API or a .png image, etc.
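In shader pseudocode, that loop looks roughly like the following. This is only a sketch: generateCameraRay and traceRadiance are hypothetical helpers standing in for the camera setup and scene traversal covered in the rest of this post.
struct Ray
{
  vec3 origin;
  vec3 direction;
};
Ray generateCameraRay(vec2 pixelCoord, vec2 resolution, int sampleIndex); // 🤞 hypothetical
vec3 traceRadiance(Ray ray);                                              // 🤞 hypothetical
const int sampleCount = 4096;
vec3 renderPixel(vec2 pixelCoord, vec2 resolution)
{
  vec3 accumulated = vec3(0.0);
  for (int s = 0; s < sampleCount; ++s)
  {
    // 1. 📷 Cast a camera ray, jittered per sample for anti-aliasing
    Ray ray = generateCameraRay(pixelCoord, resolution, s);
    // 2.-4. 💥 Test for collisions, evaluate the BxDF, bounce, and gather radiance
    accumulated += traceRadiance(ray);
  }
  // 5. 🧮 Average the samples; the caller writes the result to its output memory
  return accumulated / float(sampleCount);
}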
At the start of every ray tracing routine is the initial camera ray. This can be computed using a view and projection matrix. If you want to add effects such as anti-aliasing or depth of field to your routine, you can adjust the ray with a random offset to its origin and direction.
These matrices can easily be found in linear algebra libraries that include a lookAt and perspective matrix calculation, but the basic idea is to build a basis matrix based on the camera orientation, and a scaling matrix based on its field of view and near/far planes.
Here's an example that uses GLM:
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp> // lookAt, perspective
using namespace glm;
// 📷 Declare Image Size
const unsigned width = 1920;
const unsigned height = 1080;
// 👧 First Person Camera
// 🧍♀️ Standing at 1.65 meters
// 👆 Pointing at the zenith at PI/2
float theta = 1.570796f, phi = -.1f;
vec3 origin = vec3(-10.0f, 1.65f, -0.5f);
vec3 firstPersonDirection =
vec3(sin(theta) * cos(phi), sin(phi), cos(theta) * cos(phi));
vec3 firstPersonUp = vec3(sin(theta) * sin(phi) * -1.0f, cos(phi),
cos(theta) * sin(phi) * -1.0f);
mat4 view = lookAt(origin, origin + firstPersonDirection, firstPersonUp);
// 🦍 approx. 103 degrees horizontal FoV like Overwatch
float fovVertical = 1.24f;
float aspectRatio = static_cast<float>(width) / static_cast<float>(height);
float near = .1f, far = 300.0f;
mat4 projection = perspective(fovVertical, aspectRatio, near, far);
// 🌎 Unproject camera for world space ray
mat4 invProjectionView = inverse(projection * view);
Then you can use that data as a uniform in a compute shader, ray dispatch shader, fragment shader, etc:
#version 430
precision mediump float;
/**************************************************************
* 🎛️ Inputs
**************************************************************/
layout(local_size_x = 1, local_size_y = 1) in;
/**************************************************************
* 🖼️ Outputs
**************************************************************/
layout(rgba16f, binding = 0) uniform image2D outColor;
/**************************************************************
* 👔 Uniforms
**************************************************************/
layout(binding = 1) uniform FragUBO
{
mat4 invProjectionView;
float near;
float far;
} ubo;
/**************************************************************
* 👋 Main
**************************************************************/
struct Ray
{
vec3 origin;
vec3 direction;
};
void main()
{
// 🟥 Pixel Coordinates
ivec2 pixelCoords = ivec2(gl_GlobalInvocationID.xy);
ivec2 dims = imageSize(outColor); // fetch image dimensions
// 🌄 Texture Coordinates
vec2 texCoord = vec2(float(pixelCoords.x) / float(dims.x),
float(pixelCoords.y) / float(dims.y));
// 🎥 Normalized Device Coordinates (NDC)
vec2 ndc = (2.0 * texCoord - 1.0);
float ndcDepth = ubo.far - ubo.near;
float ndcSum = ubo.far + ubo.near;
// 🌏 World Space Ray
vec4 camRay = ubo.invProjectionView * vec4(ndc * ndcDepth, ndcSum, ndcDepth);
vec4 camOrigin = ubo.invProjectionView * vec4(ndc, -1.0, 1.0);
// 🧪 Test ray with scene...
Ray ray;
ray.origin = camOrigin.xyz / camOrigin.w; // 🏠 perspective divide back to world space
ray.direction = normalize(camRay.xyz);
// vec4 radiance = calculateRadiance(ray); // 👈 your scene traversal goes here
vec4 radiance = vec4(ray.direction * 0.5 + 0.5, 1.0); // 🧪 placeholder: visualize ray directions
// ✍ Write to final color output
imageStore(outColor, pixelCoords, radiance);
}
Ray Casting means casting a ray from the camera to a collision point and stopping the operation there. Early iterations of ray casting were extremely simple, opting to draw simple lines or lighting effects.
This is generally a simple operation, and it functions as the basic primitive on which all other ray tracing methods are built.
There's a variety of intersection routines for triangles and basic primitives, along with acceleration structures like bounding volume hierarchies, KD-Trees, etc., available here.
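As a concrete example, here's a minimal ray/sphere intersection test in GLSL; it's only a sketch (it assumes a normalized ray direction), but triangle, box, and other primitive tests follow the same idea of solving for the ray parameter t:
// ⚪ Analytic ray/sphere intersection. Returns the distance along the ray to
// the closest hit, or -1.0 on a miss. Assumes `rayDirection` is normalized.
float intersectSphere(vec3 rayOrigin, vec3 rayDirection, vec3 center, float radius)
{
  vec3 oc = rayOrigin - center;
  // Solve |rayOrigin + t * rayDirection - center|^2 = radius^2 for t
  float b = dot(oc, rayDirection);
  float c = dot(oc, oc) - radius * radius;
  float discriminant = b * b - c;
  if (discriminant < 0.0)
    return -1.0; // 🙅 no real roots, the ray misses the sphere
  float t = -b - sqrt(discriminant); // nearest root
  return t > 0.0 ? t : -1.0;         // ignore hits behind the ray origin
}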
Ray Marching is a form of ray casting that iterates over discrete steps to determine collisions, with sphere marching taking that idea a step further, using a distance field function to determine both collisions and step sizes (rather than a collision acceleration data structure such as a KD-Tree or Bounding Volume Hierarchy (BVH)). [Hart et al. 1989] [Hart 1996] [Keinert et al. 2014]
Ray marching and sphere marching (both terms are used interchangeably) are very useful for rendering real-time volumes; Epic Games' Fortnite was the subject of a post on FxGuide detailing their technique, and prior to that, Sebastien Hillaire published a paper describing ray marching through voxels for volumetric rendering [Hillaire 2015]. Practically every ShaderToy and demoscene project uses some form of ray marching.
Note: As ray marching is more of a method of testing collisions, it is possible to use ray marching to drive a ray caster or a complex path tracer.
const int marchSteps = 64;
const float epsilon = 1.0e-5;
// 🌐 Example scene distance field: a single unit sphere at the origin (swap in your own)
float scenedf(vec3 p)
{
return length(p) - 1.0;
}
// ⭕ Sphere Marching
float raymarch(vec3 rayOrigin, vec3 rayDirection, float near, float far)
{
// 🏁 Starting integrated distance
float dist = near + epsilon;
// 📏 Final distance Value
float t = 0.0;
// 🥁 March
for (int i = 0; i < marchSteps; i++)
{
// 🎯 Stop on a hit (within epsilon of a surface) or past the far plane
if (abs(dist) < epsilon || t > far)
break;
// 🥊 Advance by the distance returned by the last lookup;
// `dist` approaches 0 as the ray approaches a surface.
t += dist;
vec3 p = rayOrigin + t * rayDirection;
dist = scenedf(p); // 👈 your scene distance field
}
return t;
}This can easily be done in a compute or fragment shader with no ray tracing cores necessary.
Inigo Quilez (@iquilezles) has an entire blog full of articles on ray marching here.
Ray Tracing is when a ray continues past its original collision point in a recursive manner. The term tends to be used to describe everything here: recursive ray casts, Cook style ray tracing, path tracing, etc.
This is useful to calculate information based on an object's neighbors such as Reflections, Global Illumination or Ambient Occlusion.
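Since GLSL has no recursion, the "recursive" ray is usually unrolled into a loop over bounces. Here's a minimal sketch of tracing mirror reflections that way, where Hit, intersectScene, and shadeSurface are hypothetical helpers standing in for your intersection tests and material evaluation:
struct Ray
{
  vec3 origin;
  vec3 direction;
};
struct Hit
{
  bool didHit;
  vec3 position;
  vec3 normal;
  vec3 reflectance; // 🪞 mirror tint
};
Hit intersectScene(Ray ray);         // 🤞 hypothetical
vec3 shadeSurface(Hit hit, Ray ray); // 🤞 hypothetical
const int maxBounces = 4;
vec3 traceReflections(Ray ray)
{
  vec3 radiance = vec3(0.0);
  vec3 throughput = vec3(1.0);
  for (int bounce = 0; bounce < maxBounces; ++bounce)
  {
    Hit hit = intersectScene(ray);
    if (!hit.didHit)
      break;
    // ☀️ Accumulate direct shading, attenuated by the mirrors hit so far
    radiance += throughput * shadeSurface(hit, ray);
    throughput *= hit.reflectance;
    // 🔁 Continue the ray past its collision point along the mirror direction
    ray.origin = hit.position + hit.normal * 1.0e-4; // offset to avoid self-intersection
    ray.direction = reflect(ray.direction, hit.normal);
  }
  return radiance;
}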
Path Tracing is the process of tracing a ray's path, such as from the camera to a light source. Path tracing is basically ray tracing that uses material functions (BxDFs) to model light interactions and the behavior of rays that interact with those materials.
There are several types of path tracing techniques described in the literature; here are a few (with a sketch of the backwards variant after this list):
Backwards Path Tracing - tracing a scene by casting rays from the camera into the scene until they hit light objects. Also called Unidirectional Path Tracing, though that title can also apply to tracing only from the lights.
Bidirectional Path Tracing (BPT) - sub-paths are started at both the camera and the lights (in our case the environment map), and vertices from the sub-paths are connected to form a full path.
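Here's a minimal sketch of the backwards (camera-first) variant under the assumption that the only light is an environment map. intersectScene, sampleEnvironment, and sampleBxDF are hypothetical helpers, where sampleBxDF is assumed to return an outgoing direction along with a weight of BxDF * cos(theta) / PDF:
struct Ray
{
  vec3 origin;
  vec3 direction;
};
struct Hit
{
  bool didHit;
  vec3 position;
  vec3 normal;
};
Hit intersectScene(Ray ray);                                                   // 🤞 hypothetical
vec3 sampleEnvironment(vec3 direction);                                        // 🤞 hypothetical
vec3 sampleBxDF(Hit hit, vec3 incoming, inout uint rngState, out vec3 weight); // 🤞 hypothetical
const int maxBounces = 8;
vec3 pathTrace(Ray ray, inout uint rngState)
{
  vec3 radiance = vec3(0.0);
  vec3 throughput = vec3(1.0);
  for (int bounce = 0; bounce < maxBounces; ++bounce)
  {
    Hit hit = intersectScene(ray);
    if (!hit.didHit)
    {
      // 💡 The path escaped the scene: it "hit" the environment light
      radiance += throughput * sampleEnvironment(ray.direction);
      break;
    }
    // 🎲 Pick the next direction by sampling the surface's BxDF
    vec3 weight;
    vec3 outgoing = sampleBxDF(hit, -ray.direction, rngState, weight);
    throughput *= weight;
    // ↪️ Continue the path from the hit point
    ray.origin = hit.position + hit.normal * 1.0e-4; // offset to avoid self-intersection
    ray.direction = outgoing;
  }
  return radiance;
}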
When sampling an environment map, light, or BxDF, some regions are far more likely to contribute bright, concentrated energy than others, and if you focus your rays on those regions, you'll see far less variance.
By varying the local sample density according to the importance function, the error of the approximation can be significantly reduced. The decision whether to place an ink dot is the result of a threshold comparison of each image pixel with the corresponding dither matrix element. [Cornel 2014]
How this is implemented can vary, taking into account Next Event Estimation (NEE), Russian Roulette, and more (a sketch of a couple of these follows this list):
Multiple Importance Sampling (MIS) - when estimating an integral, draw samples from multiple distributions, hoping at least one matches the integrand reasonably well, and weight the samples from each technique to reduce variance spikes.
Probability Density Function (PDF) - denotes the probability density of a given interaction, which for most of path tracing is the outgoing direction of a ray according to a BxDF.
Cumulative Distribution Function (CDF) - usually denoted \( P(x) \), it's the probability that a random sample falls below \( x \), defined as \( P(x) = \int_{-\infty}^{x} p(y)\,dy \). It integrates the density over all possible values up to \( x \), and approaches 1.0 over the variable's full domain.
Russian Roulette - a random chance that a path whose luminance falls below a given \( \epsilon \) is discarded; surviving paths are boosted to compensate, so stronger rays are kept more often without biasing the estimate.
Next Event Estimation (NEE) - at each bounce, shoot a shadow ray directly towards a light source; if the shadow ray is unoccluded, add that light's contribution to the path. This connects paths to lights explicitly instead of waiting for a bounce to hit one by chance.
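To make a couple of these concrete, here's a sketch of cosine weighted hemisphere sampling (importance sampling a Lambertian BxDF, whose PDF is cos(theta) / pi) and a throughput-driven Russian Roulette test; rand is a hypothetical [0, 1) random number generator:
float rand(inout uint rngState); // 🤞 hypothetical
const float PI = 3.14159265359;
// 🎲 Importance sample a direction around `normal` with pdf = cos(theta) / pi,
// which cancels the cosine term of a Lambertian BRDF.
vec3 cosineSampleHemisphere(vec3 normal, inout uint rngState)
{
  float u1 = rand(rngState);
  float u2 = rand(rngState);
  float r = sqrt(u1);
  float phi = 2.0 * PI * u2;
  // 🧭 Build a local frame around the normal
  vec3 tangent = normalize(abs(normal.x) > 0.5 ? cross(normal, vec3(0.0, 1.0, 0.0))
                                               : cross(normal, vec3(1.0, 0.0, 0.0)));
  vec3 bitangent = cross(normal, tangent);
  return normalize(r * cos(phi) * tangent + r * sin(phi) * bitangent + sqrt(1.0 - u1) * normal);
}
// 🇷🇺 Russian Roulette: probabilistically kill dim paths, and boost the
// survivors by 1 / p so the estimator stays unbiased.
bool russianRoulette(inout vec3 throughput, inout uint rngState)
{
  float p = clamp(max(throughput.r, max(throughput.g, throughput.b)), 0.05, 1.0);
  if (rand(rngState) > p)
    return false;  // ☠️ terminate the path
  throughput /= p; // ⚖️ compensate the surviving paths
  return true;     // ✅ keep tracing
}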
The differences between ray casting, ray marching, ray tracing, and path tracing are somewhat subtle. It can be useful to imagine each in that order of complexity, with ray casting serving as the core operation, ray marching as a method of handling collisions with density functions, and ray tracing serving as an overall description of all of these techniques.
For more information on material models, check out this blog post where I review the state of the art in materials.
The set of books recommended by Real Time Rendering
Professor Morgan McGuire (@CasualEffects) released The Graphics Codex, a massive repository of articles, equations, diagrams, and programming projects.
Alan Wolfe (@Atrix256) released a series of articles introducing the concept of path tracing in ShaderToy.
Jacco Bikker (@j_bikker)'s articles on Probability Theory for Physically Based Rendering.
Kai Burjack wrote a series of articles on ray tracing with OpenGL compute shaders using LWJGL, Java's most popular OpenGL binding.
Károly Zsolnai-Fehér's (@karoly_zsolnai) TU Wien Rendering Course
Alexander Keller et al.'s Siggraph 2020 course Advances in Monte Carlo Rendering: The Legacy of Jaroslav Křivánek.
Chris Wyman of NVIDIA's Getting Started with RTX Ray Tracing tutorials.
NVIDIA's Vulkan Ray Tracing Tutorials now support the standard Vulkan KHR extensions.
For CPU based ray tracing:
[Whitted 1980] An Improved Illumination Model for Shaded Display. Communications of the ACM, 1980.
[Cook et al. 1984] Distributed Ray Tracing. SIGGRAPH 1984.
[Veach 1998] Robust Monte Carlo Methods for Light Transport Simulation. Ph.D. dissertation, Stanford University.
[Quílez 2005]
[Benthin et al. 2018]
[Arafa 2019]
[Shirley 2018]
[Perlin et al. 1989] Hypertexture. SIGGRAPH 1989.
[Hart et al. 1989] Ray Tracing Deterministic 3-D Fractals. SIGGRAPH 1989.
[Cichocki 2017] Optimized Pixel-Projected Reflections for Planar Reflectors. SIGGRAPH 2017, Advances in Real-Time Rendering. advances.realtimerendering.com
[Kajiya 1986] The Rendering Equation. SIGGRAPH 1986.
[Stich 2018]
[Hart 1996] Sphere Tracing: A Geometric Method for the Antialiased Ray Tracing of Implicit Surfaces. The Visual Computer, 1996.
[Keinert et al. 2014] Enhanced Sphere Tracing. STAG 2014.
[Hillaire 2015]
[Cornel 2014]