<!-- OpenGraph image: https://media.githubusercontent.com/media/wrightwriter/wrightwriter.github.io/master/blog/preview_pathtracing_b.png -->
<!-- OpenGraph description: Rendering global illumination with GPU atomics and forward pathtracing. -->

<!-- OpengGraph image: https://media.githubusercontent.com/media/wrightwriter/wrightwriter.github.io/master/images/generative_static/unknown%20(4).png -->

<BlogTitle title={title} date={date} />

<article>

<brb />  
<brb />  

## **Introduction**

Forward pathtracing is a fundamental technique in [lightmapping](https://en.wikipedia.org/wiki/Lightmap) 
and [bidirectional pathtracing](https://www.pbr-book.org/3ed-2018/Light_Transport_III_Bidirectional_Methods/Bidirectional_Path_Tracing). At its core, you shoot rays
from the light sources in the scene instead of from the camera, as a traditional backward pathtracer would.
    
I'll present how I used it to render the following motion graphic:  
<brb />
<brb />
<Video src={`/images/generative_mograph/OUT (2).mp4`}/>
	<!-- <div class="play-button" style={`
		width: 4rem;
		height: 4rem;
		position: absolute;
		top: 50%;
		right: 50%;
		transform: translate(50%, -50%);
		pointer-events: none;
`}>
	{@html play_pause_icon}
	</div> -->
<!-- <h1 style="margin-left: auto; margin-right: auto; margin-top:0.5rem;"> -->
<SubTitle>
	Forward traced <a href="https://en.wikipedia.org/wiki/Mandelbox">mandelbox</a>, in my Kotlin OpenGL engine 
</SubTitle>
<brb />

There are two directional, cone-shaped light sources whose rays refract through a fractal [implicit surface](https://en.wikipedia.org/wiki/Implicit_surface).
Rays are shot stochastically and rendered as lines. 

<!-- Forward traced [mandelbox](https://en.wikipedia.org/wiki/Mandelbox), in my Kotlin OpenGL engine -->
<!-- </h1> -->
    

<brb />
<brb />
    

<Image src={`/blog/lyc-sketch.gif`} />

    
<SubTitle>
	2d pathtracing sketch by <a href="https://www.deviantart.com/lyc">lycium</a>
</SubTitle>
<brb />

What's being rendered is fluence¹. This is what 2d pathtracing displays.  
Possibly the first [article](https://web.archive.org/web/20040131013212/http://lycium.cfxweb.net/) on it, and the first renders, came from the graphics engineer <a href="https://www.deviantart.com/lyc">lycium</a>.
   
What our eyes and camera sensors register is irradiance². That is what traditional pathtracers render.

   
<wbr />


<div class="footnotes">
  <hr />
  <ol>
      1. <a href="https://cs.dartmouth.edu/wjarosz/publications/jarosz12theory.html">Fluence</a> - integral of radiance incoming in a region, over all angles. Measures radiance passing through a point.  <brb />
      2. <a href="https://en.wikipedia.org/wiki/Irradiance">Irradiance</a> - radiant power received per unit surface area. Measures radiance hitting a surface. Irradiance over a region on the camera sensor is what you integrate in a pathtracer.
  </ol>
  <hr />
</div>

<wbr />

## **The algo**

One day it occurred to me to translate the technique to 3d, while also using forward pathtracing. That resulted in the following sketch:   

<brb />
<Video src={`/images/generative_mograph/OUT (3).mp4`}/>
<!-- <h1 style="margin-left: auto; margin-right: auto; margin-top:0.5rem;"> -->
<SubTitle>
	Artistic liberty taken with refraction physics and line drawing.
</SubTitle>
<brb />

The fundamental algorithm is:


1. Two images are created: one for accumulation and one for post-processing.
2. A compute shader is dispatched. Each thread picks a random light and raytraces from it.
3. For every bounce, the thread draws a line into the accumulation image using atomic adds.
4. A post-processing shader "tonemaps" the result by scaling brightness relative to the number of raytracing threads.
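
Step 4 could look something like the fullscreen fragment shader below. This is only a minimal sketch: the `ray_count` and `exposure` uniforms, the Reinhard curve, and the layout with R, G and B stored in three neighbouring `r32ui` texels are assumptions made for illustration, not the code from the engine.

```glsl
#version 430

// Assumed layout: R, G and B of a screen pixel live in three neighbouring
// r32ui texels, and every splat added throughput * 255.
layout(r32ui) uniform readonly uimage2D accum_image;
uniform float ray_count; // hypothetical: total number of rays traced so far
uniform float exposure;  // hypothetical artistic control

out vec4 frag_color;

void main(){
	ivec2 px = ivec2(gl_FragCoord.xy);

	vec3 sum = vec3(0);
	for (int c = 0; c < 3; c++)
		sum[c] = float(imageLoad(accum_image, ivec2(px.x * 3 + c, px.y)).r);

	// Undo the 8-bit quantization, then normalize by the ray count so the
	// brightness stays stable as more rays get accumulated.
	vec3 col = sum / 255.0 / max(ray_count, 1.0) * exposure;

	// Simple Reinhard curve plus gamma. Any tonemapper would do here.
	col = col / (1.0 + col);
	frag_color = vec4(pow(col, vec3(1.0 / 2.2)), 1.0);
}
```

Dividing by the ray count is what keeps the image from blowing out as more threads are dispatched. The exposure factor is there because the useful brightness range also depends on the resolution and on how long the lines are.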


Pseudocode for the compute shader:  
  
<wbr />

  
```glsl
// In core GLSL, imageAtomicAdd only works on r32ui / r32i images.
layout(r32ui) uniform coherent restrict uimage2D accum_image;

struct Hit { vec3 pos; vec3 normal; vec3 albedo; };

// The helpers below are only declared here, bring your own definitions.
// Your implicit surface signed distance function.
float sdf(vec3 p);
// Marches the sdf from `origin` along `dir` and returns the first hit.
Hit march(vec3 origin, vec3 dir);
// Some projection, can use p/=p.z for perspective projection.
vec3 projParticle(vec3 p);
vec3 pickRandomLight();
// Can be spherically uniform random normalized vector for an omnidirectional light source.
vec3 getDirFromLight();
// Importance sampling, cancels out the cosine term.
vec3 getCosineWeightedHemisphereSample(vec3 normal);
// Explained later
void drawLine(vec3 p_start, vec3 p_end, vec3 throughput);

void main(){
	const int bounces = 3; // Arbitrary
	vec3 light_pos = pickRandomLight();

	// Throughput is the electromagnetic energy carried by the ray.
	// It's RGB, but would ideally be spectrally defined.
	vec3 throughput = vec3(1); 

	// Not const: both get updated at every bounce.
	vec3 p = light_pos;
	vec3 ray_dir = getDirFromLight();

	for (int i=0; i<bounces; i++){
		Hit hit = march(p, ray_dir);
		vec3 normal = hit.normal;
		vec3 hit_pos = hit.pos;
		vec3 albedo = hit.albedo;

		drawLine(p, hit_pos, throughput);

		// Simplified for a perfectly rough surface with no fresnel.
		throughput *= albedo;

		ray_dir = getCosineWeightedHemisphereSample(normal);
		p = hit_pos;
	}
}
```
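
The two sampling helpers left empty in the pseudocode could be filled in along these lines. It's a minimal sketch: the PCG-style `rand01()` generator, the `rng_state` seed and the `uniformSphereSample()` helper are stand-ins assumed here, not the random functions used in the rest of the article.

```glsl
// Hypothetical per-thread RNG state. Seed it at the top of main(), for example
// from gl_GlobalInvocationID and a frame counter.
uint rng_state;

// PCG hash, returns a float in [0, 1).
float rand01(){
	rng_state = rng_state * 747796405u + 2891336453u;
	uint word = ((rng_state >> ((rng_state >> 28u) + 4u)) ^ rng_state) * 277803737u;
	word = (word >> 22u) ^ word;
	return float(word >> 8) / 16777216.0;
}

// Uniformly distributed direction on the unit sphere.
vec3 uniformSphereSample(){
	float z = 1.0 - 2.0 * rand01();
	float r = sqrt(max(0.0, 1.0 - z * z));
	float phi = 6.28318530718 * rand01();
	return vec3(r * cos(phi), r * sin(phi), z);
}

// Omnidirectional light: every direction is equally likely.
vec3 getDirFromLight(){
	return uniformSphereSample();
}

// A point on the unit sphere offset by the normal is cosine-distributed
// around that normal once normalized.
vec3 getCosineWeightedHemisphereSample(vec3 normal){
	vec3 d = normal + uniformSphereSample();
	// Guard against the rare sample that lands exactly opposite the normal.
	return dot(d, d) > 1e-8 ? normalize(d) : normal;
}
```

Because the directions are already cosine-distributed, the cosine term of the diffuse bounce cancels against the sampling density, which is why the main loop only multiplies the throughput by the albedo.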

<brb />

## **Clarifications**
   
<wbr />

```glsl
layout(r32ui) uniform coherent restrict uimage2D accum_image;
```


- "coherent" is needed when different [shader invocations](https://www.khronos.org/opengl/wiki/Shader#Execution_and_invocations) 
(a single pixel in a fragment shader, a single thread in a compute shader) read or write locations that other invocations may also read or write. 
It enforces [coherent memory access](https://www.khronos.org/opengl/wiki/Memory_Model#Ensuring_visibility). It can look cool if you don't enable it :)
    
- "restrict" is a hint to the compiler, which allows some performance optimizations. 
You are promising that, within the shader, this variable is the only one through which the underlying memory gets accessed. The compiler then doesn't have to assume that a write through some other variable B changed what variable A reads.
<!-- - "restrict" is a hint to the compiler, which allows some performance optimizations. that you will only use a single variable to read from the memory buffer and you won't use another varaible to read the first one.  -->
<!-- - "restrict" is a performance optimization, related  -->
    
- "writeonly" doesn't let you read from the memory. I'm not sure if it leads to any optimizations.
   
More info on the [khronos wiki](https://www.khronos.org/opengl/wiki/Type_Qualifier_(GLSL)).
    
<wbr />

## **Drawing lines**

<p>  
Okay, how do you draw a line?    
You could use hardware rasterization to render quads, but I opted to draw directly to the screen.  

<brb />
  
Ideally you would use <a href="https://www.shadertoy.com/view/4dX3zl">DDA</a> - both accurate and cheap. 
What I did instead was naively march along the line and atomically splat to the screen at every step. I think it ended up looking pretty nice!

</p>
   
<brb />  

```glsl
// float_hash(), randc(), rand2(), seed and resolution come from the rest of the shader.
void drawLine(vec3 p_start, vec3 p_end, vec3 throughput){
	float min_res = min(resolution.x, resolution.y);
	vec3 ray_dir = normalize(p_end - p_start);
	float dither = float_hash()/min_res;
	float line_len = length(p_end - p_start) - dither;
	// Roughly one step per pixel, capped at 550 steps.
	float steps = min(line_len*min_res, 550.);
	float step_sz = line_len / steps;

	for(float i = 0.; i < steps; i++){
		vec3 p = projParticle(p_start + ray_dir*i*step_sz);
		vec2 uv = p.xy + 0.5;
		vec2 anti_aliasing = randc(rand2(seed))*0.55555/min_res; // horrible box filter
		ivec2 int_p = ivec2(floor((uv + anti_aliasing)*resolution));
		if(
			// inside bounds
			int_p.x >= 0 && int_p.x < int(resolution.x) &&
			int_p.y >= 0 && int_p.y < int(resolution.y)
		) { 
			// Can add dithering here or just use float accum tex.
			// In practice, it looks alright
			uvec3 throughput_quantized = uvec3(throughput * 255.);
			// An r32ui texel holds a single channel, so R, G and B go to three
			// neighbouring texels (assumed layout: the image is 3x as wide as the screen).
			for(int c = 0; c < 3; c++)
				imageAtomicAdd(accum_image, ivec2(int_p.x*3 + c, int_p.y), throughput_quantized[c]);
		}
	}
}
```
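
For comparison, here is a minimal sketch of the DDA route. It is the simple incremental line-drawing form of DDA working on already projected pixel coordinates, not the grid-traversal variant from the linked shadertoy. The function name `drawLineDDA`, the 4096 step cap and the three-texels-per-pixel layout are assumptions made for illustration.

```glsl
layout(r32ui) uniform coherent restrict uimage2D accum_image;
uniform vec2 resolution;

// Splats a 2D segment given in pixel coordinates.
void drawLineDDA(vec2 px_start, vec2 px_end, uvec3 value){
	vec2 delta = px_end - px_start;

	// One step per pixel along the longer axis of the segment.
	int steps = int(ceil(max(abs(delta.x), abs(delta.y))));
	// Safety cap for the loop. Longer lines will show gaps.
	steps = clamp(steps, 1, 4096);
	vec2 inc = delta / float(steps);

	vec2 px = px_start;
	// The end point is left out, the next bounce starts there anyway.
	for (int i = 0; i < steps; i++){
		ivec2 ip = ivec2(floor(px));
		if (all(greaterThanEqual(ip, ivec2(0))) && all(lessThan(ip, ivec2(resolution)))) {
			// Assumed layout: R, G and B in three neighbouring r32ui texels.
			for (int c = 0; c < 3; c++)
				imageAtomicAdd(accum_image, ivec2(ip.x * 3 + c, ip.y), value[c]);
		}
		px += inc;
	}
}
```

The step count now follows the length of the line on screen rather than its length in world space, so short lines cost less and long ones are sampled about once per pixel.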

<brb/>


```glsl
imageAtomicAdd(accum_image, coord, value);
```

"image[Atomic](https://en.wikipedia.org/wiki/Linearizability)Add" lets multiple threads read and write the same location without [data races](https://en.wikipedia.org/wiki/Race_condition). It looks rather cool if you add to the buffer directly, without atomics, though :)
  

If you feel like accumulating to a float buffer/image, you can do that with an [NVIDIA extension](https://registry.khronos.org/OpenGL/extensions/NV/NV_shader_atomic_float.txt),
 or with a [compare-and-swap](https://en.wikipedia.org/wiki/Compare-and-swap) loop.
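
The compare-and-swap route can be sketched as below, assuming the `r32ui` texels hold float bit patterns instead of integers. The function name `imageAtomicAddFloat` and the retry limit are hypothetical, and a sample is dropped if the limit is ever reached.

```glsl
layout(r32ui) uniform coherent uimage2D accum_image;

// Float atomic add built from imageAtomicCompSwap.
// Each texel stores the raw bits of a float (a cleared texel is 0.0).
void imageAtomicAddFloat(ivec2 coord, float value){
	uint expected = imageLoad(accum_image, coord).r;

	// Bounded retry loop, so a heavily contended texel can't stall the thread forever.
	for (int i = 0; i < 128; i++){
		uint desired = floatBitsToUint(uintBitsToFloat(expected) + value);
		// Writes `desired` only if the texel still holds `expected`,
		// and returns what was stored there before the call.
		uint found = imageAtomicCompSwap(accum_image, coord, expected, desired);
		if (found == expected) break; // nobody wrote in between, the sum is stored
		expected = found;             // lost the race, retry on the fresh value
	}
}
```

This removes the need to quantize the throughput to integers, at the cost of retries whenever many threads hit the same pixel.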

<brb/>

## **Wrapping up**
  
Hope you enjoyed reading this and feel free to send me stuff you've made on discord! :) The technique is trivial to translate to 2d, if you are interested in that.

</article>
    
     

<script lang="ts">
	import {onDestroy, onMount} from 'svelte'
  import play_pause_icon from "../../../public/play_pause.svg"
	import Video from "../utils/Video.svelte"
	import Image from "../utils/Image.svelte"
	import SubTitle from "../utils/SubTitle.svelte"
	import BlogTitle from "../utils/BlogTitle.svelte"
	// import image_a from "../../../public/images/generative_mograph/OUT (2).mp4"

  export let date = new Date("2023-08-18")
  export let title = "3D (and 2D) forward pathtracing in a compute shader"
	
	
	
	onDestroy(() => {})
	onMount(async () => {
	})
	
</script>

<style lang="scss">
</style>

