Alain Galvan · 6/27/2019 8:30 AM · Updated 1 year ago
A review of the state of the art in real-time renderer architectures.
Tags: blog, shader, glsl, hlsl, vulkan, directx, opengl, spir-v, metal
There are a variety of architectures in 3D real-time rendering, each applicable to different use cases and each with its own benefits and trade-offs in performance and capability. The following is a summary of each, as well as a detailed description of their implementation.
Though each architecture can be implemented in any graphics API, all code in this post will use Vulkan with no dependencies, as Vulkan offers the most explicit view of how modern graphics APIs work and the widest compatibility across operating systems. In addition, though some concepts also apply to 2D renderers (in particular the immediate mode model often used in UI frameworks like Omar Cornut's (@ocornut) imgui and Wenzel Jakob's NanoGUI), this post will primarily focus on writing a renderer for a 3D application.
The glTF Specification [Cozzi et al. 2016] offers a useful perspective on the organization of data in a real time rendering application. Data in a real-time application can be described as being composed of:
A tree of Nodes, each with their own transformation matrix.
A collection of Primitives that those nodes point to, each of which represents 1 draw call in a graphics API. Each Primitive points to mesh buffers and a technique with its own parameters.
A collection of Material Implementations (Techniques), each with their own parameters set and each pointing to a reference material that handles the shader logic, uniforms, etc.
A collection of Materials with shaders programs and aspects of the graphics pipeline such as blend mode, stencil tests, etc. set for that material.
A collection of Mesh Data Buffers that can contain CPU and/or GPU accessible data buffers that will be used when rendering that primitive.
As well as state that is used by the renderer that all draw calls will share such as:
Camera Matrices (Perspective and View matrices to generate the current modelViewProjection matrix for that draw call)
View Surface Data such as the position the output buffer takes on the rendering window, in the case of split-screen games or Picture in Picture (PiP).
Light Data such as their position, direction, type, brightness, and color.
Skybox Data for irradiance/radiance maps.
Postprocessing Parameters such as depth of field focus points, tone mapping algorithm, etc.
And of course any other arbitrary information such as terrain data, voxel data for GI, ray-marched signed distance fields for clouds, etc.
Most rendering architectures organize their data in this format, and indeed most model file formats organize themselves this way as well. This consistent data design makes it easier for applications to interoperate and work together to help design, model, and render a given scene.
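As a rough sketch of this organization (the structure and field names below are illustrative assumptions, not taken from the glTF specification or any particular engine), the data might look like the following in C++:

#include <array>
#include <cstdint>
#include <vector>

// Hypothetical structures sketching the organization described above; real
// engines and the actual glTF schema differ in the details.
struct Node
{
    std::array<float, 16> transform;    // this node's transformation matrix
    std::vector<uint32_t> children;     // indices of child nodes
    std::vector<uint32_t> primitives;   // primitives this node points to
};

struct Primitive                        // 1 draw call
{
    uint32_t meshBuffer;                // index into Scene::meshBuffers
    uint32_t technique;                 // material implementation to use
};

struct Technique                        // material implementation
{
    uint32_t material;                  // reference material with the shader logic
    std::vector<float> parameters;      // this implementation's parameter set
};

struct Material                         // reference material
{
    std::vector<uint32_t> shaders;      // shader program handles
    bool alphaBlend = false;            // blend mode, stencil tests, etc.
};

struct Scene
{
    std::vector<Node> nodes;
    std::vector<Primitive> primitives;
    std::vector<Technique> techniques;
    std::vector<Material> materials;
    std::vector<std::vector<uint8_t>> meshBuffers;  // CPU- and/or GPU-accessible data
};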
Immediate Mode Context Rendering uses explicit draw calls every frame to render a scene.
/**
* Example of ImGui rendering a debug window:
*/
unsigned frameRate;
float ms;
void renderDebugData()
{
    ImGui::Text("Frame Rate: %u fps", frameRate);
    ImGui::Text("Frame Time: %.3f ms", ms);
}

// Later in an Unreal Engine 4 Actor:
void AMyActor::Tick(float DeltaTime)
{
    // Execute every frame
    renderDebugData();
}

Immediate Mode Context Rendering normally uses a singleton stack containing draw calls; when updating the frame, it pops that stack until it is empty, recording each draw command. Afterwards the Immediate Mode Context Renderer has a list of command buffers that it executes in a given graphics API. Due to the nature of this architecture, it can be less performant than an interrupt-driven, subscriber-based architecture where elements are explicitly added to or removed from a persistent graphics state. As such, this architecture is normally relegated to UIs or 2D applications where such performance concerns aren't an issue.
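A minimal sketch of that draw-call stack (the class and method names are hypothetical, not from imgui or any particular library):

#include <functional>
#include <stack>

struct CommandContext { /* wraps a command buffer and current render state */ };

class ImmediateRenderer
{
public:
    // Called from user code every frame, e.g. inside a tick function.
    void draw(std::function<void(CommandContext&)> command)
    {
        mCommands.push(std::move(command));
    }

    // Called once per frame by the renderer: pop the stack until it's empty,
    // recording each draw command into the API-specific command buffer.
    void flush(CommandContext& ctx)
    {
        while (!mCommands.empty())
        {
            mCommands.top()(ctx);
            mCommands.pop();
        }
    }

private:
    std::stack<std::function<void(CommandContext&)>> mCommands;
};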
Forward Rendering is arguably the first and most common implementation of real-time rendering. It is capable of practically every effect you could ask for: transparency, refraction, reflection, MSAA, and much more.
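For instance, a forward pass in Vulkan might be recorded along these lines (a sketch; the Drawable struct and its contents are assumptions, and the light list would typically be bound through the descriptor set):

#include <vulkan/vulkan.h>
#include <vector>

// Hypothetical per-draw data, created during scene load.
struct Drawable
{
    VkPipeline pipeline;            // shaders plus blend/depth/stencil state
    VkPipelineLayout layout;
    VkDescriptorSet descriptorSet;  // camera matrices, lights, material parameters
    VkBuffer vertexBuffer;
    VkBuffer indexBuffer;
    uint32_t indexCount;
};

// A single forward pass: every drawable is shaded against the full light list
// (available through its descriptor set) in one rasterization pass.
void recordForwardPass(VkCommandBuffer cmd, const std::vector<Drawable>& drawables)
{
    for (const Drawable& d : drawables)
    {
        vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, d.pipeline);
        vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, d.layout,
                                0, 1, &d.descriptorSet, 0, nullptr);
        const VkDeviceSize offset = 0;
        vkCmdBindVertexBuffers(cmd, 0, 1, &d.vertexBuffer, &offset);
        vkCmdBindIndexBuffer(cmd, d.indexBuffer, 0, VK_INDEX_TYPE_UINT32);
        vkCmdDrawIndexed(cmd, d.indexCount, 1, 0, 0, 0);
    }
}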
Deferred Rendering is arguably the most intuitive. Instead of rendering lighting information on a per-mesh level, you render it on a per-pixel level of the current surface. This can result in easier-to-architect post-processing effects and faster lighting for scenes with a lot of lights. At the same time, it can be somewhat slower than forward rendering, since there can be more divergence in an image, resulting in stalls.
This involves the use of a Geometric Buffer (G-Buffer) [Saito et al. 1990], though such a buffer can also be used in forward rendering as a prepass.
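One possible G-Buffer layout is sketched below; which attributes are stored and how they are packed varies widely between engines, so treat the formats here as illustrative assumptions:

#include <vulkan/vulkan.h>

// Example attachment formats for a G-Buffer pass (illustrative only).
struct GBufferLayout
{
    VkFormat albedo   = VK_FORMAT_R8G8B8A8_UNORM;            // base color + occlusion
    VkFormat normal   = VK_FORMAT_A2B10G10R10_UNORM_PACK32;  // packed world-space normal
    VkFormat material = VK_FORMAT_R8G8B8A8_UNORM;            // roughness, metalness, etc.
    VkFormat emissive = VK_FORMAT_R16G16B16A16_SFLOAT;       // HDR emissive color
    VkFormat depth    = VK_FORMAT_D32_SFLOAT;                // used to reconstruct position
};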
Deferred Rendering tends to be executed using raster-based screen-space buffers, with the blend mode set to add between lights to accumulate lighting information in the scene:
Render Diffuse Irradiance information (Global Illumination, irradiance skybox texture/spherical harmonics/Ambient Dice, etc.)
Render Specular Radiance Information (Reflections, Light specular terms, etc.)
Repeat for each light.
Anti-Aliasing is limited to techniques such as SSAA, FXAA, and TAA, since hardware MSAA doesn't interact well with the G-Buffer.
Transparency is much more difficult to compute.
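The additive accumulation between lights maps to a blend state along the following lines in Vulkan (a sketch; a real renderer would apply this to each light pass's pipeline):

#include <vulkan/vulkan.h>

// Additive blending for the light accumulation pass: each light's
// contribution is summed into the lighting buffer.
VkPipelineColorBlendAttachmentState additiveBlend()
{
    VkPipelineColorBlendAttachmentState blend = {};
    blend.blendEnable = VK_TRUE;
    blend.srcColorBlendFactor = VK_BLEND_FACTOR_ONE;
    blend.dstColorBlendFactor = VK_BLEND_FACTOR_ONE;
    blend.colorBlendOp = VK_BLEND_OP_ADD;
    blend.srcAlphaBlendFactor = VK_BLEND_FACTOR_ONE;
    blend.dstAlphaBlendFactor = VK_BLEND_FACTOR_ONE;
    blend.alphaBlendOp = VK_BLEND_OP_ADD;
    blend.colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                           VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;
    return blend;
}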
Tiled Deferred takes advantage of the fact that GPUs execute shaders in "tiles", and leverages that to accumulate lighting within each tile in a single pass rather than blending one light at a time. Keeping this detail away from the shader author and making it automatic has the benefit of reducing complexity while at the same time increasing performance.
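Conceptually, the per-tile work looks like the binning below (shown CPU-side for clarity; in practice this usually runs in a compute shader, and the tile size and names here are assumptions):

#include <algorithm>
#include <cstdint>
#include <vector>

constexpr uint32_t kTileSize = 16; // pixels per tile, an illustrative choice

struct ScreenLight
{
    float x, y;    // projected light position in pixels
    float radius;  // projected radius of influence in pixels
};

// For each screen tile, collect the indices of the lights that may affect it.
// The lighting pass then only loops over that tile's list instead of blending
// every light over the whole screen.
std::vector<std::vector<uint32_t>> binLights(uint32_t width, uint32_t height,
                                             const std::vector<ScreenLight>& lights)
{
    const uint32_t tilesX = (width + kTileSize - 1) / kTileSize;
    const uint32_t tilesY = (height + kTileSize - 1) / kTileSize;
    std::vector<std::vector<uint32_t>> tileLights(tilesX * tilesY);

    for (uint32_t i = 0; i < lights.size(); ++i)
    {
        const ScreenLight& l = lights[i];
        const uint32_t minTx = (uint32_t)(std::max(0.0f, l.x - l.radius) / kTileSize);
        const uint32_t minTy = (uint32_t)(std::max(0.0f, l.y - l.radius) / kTileSize);
        const uint32_t maxTx = std::min(tilesX - 1, (uint32_t)(std::max(0.0f, l.x + l.radius) / kTileSize));
        const uint32_t maxTy = std::min(tilesY - 1, (uint32_t)(std::max(0.0f, l.y + l.radius) / kTileSize));

        for (uint32_t ty = minTy; ty <= maxTy; ++ty)
            for (uint32_t tx = minTx; tx <= maxTx; ++tx)
                tileLights[ty * tilesX + tx].push_back(i);
    }
    return tileLights;
}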
In the same vein as the days when per-vertex Gouraud shading was the state of the art, it's now possible to limit shading of an object to texture space, reducing the amount of expensive lighting calculations performed on that object.
By using Sampler Feedback alongside other modern graphics techniques such as compute shaders, mesh shaders, and variable rate shading, this technique can allow for more efficient processing of expensive lighting operations, distributing that work across a variety of tasks in your graphics application. It was featured in DirectX 12 Ultimate's announcement.
Stack.gl has a number of great examples of rendering architectures, including a great WebGL Forward Rendering example here.
University of Pennsylvania's CIS 565 course has an example of WebGL Deferred Shading with glTF.
Joel Yliluoma wrote a number of articles and YouTube videos discussing offline rasterization, a useful reference for what a software renderer might look like.
Here's a really nice Spherical Harmonics summary with an example ShaderToy.
[Cozzi et al. 2016] Patrick Cozzi et al. 2016. glTF: Runtime 3D Asset Delivery. The Khronos Group.
[Saito et al. 1990] Takafumi Saito and Tokiichiro Takahashi. 1990. Comprehensible Rendering of 3-D Shapes. SIGGRAPH '90.