Assignment 3: PathTracer

Fanjia Yan

https://cal-cs184-student.github.io/sp22-project-webpages-Fanjia-Yan/proj3-1/index.html

In Project 3-1 PathTracer, we start by building the fundamental building blocks of ray tracing: the ray itself and ray intersection. We implemented ray-plane, ray-triangle, and ray-sphere intersection. Second, we speed up rendering by using a Bounding Volume Hierarchy (BVH) as our acceleration structure, which greatly reduces the cost of ray intersection tests. Then we build direct and indirect illumination into global illumination: we recursively bounce rays and track radiance so the image is rendered realistically. Lastly, we implement adaptive sampling to sample more frequently in areas that are harder to converge.

Part 1: Ray Generation and Intersection

To start off Part 1, we fill in the generate_ray() function. The function takes normalized image coordinates and outputs a ray in world space. I first convert the image-space coordinates into camera space using a series of transforms. Then I create a ray in camera space with origin (0,0,0) and direction (sensor point − origin). Lastly, I transform the ray into world space using the camera-to-world transform and return it.
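Below is a minimal sketch of this step, assuming the CS184 starter code's Camera fields (hFov and vFov in degrees, the camera-to-world rotation c2w, position pos, and clip distances nClip and fClip); exact names may differ.

```cpp
// Sketch of camera ray generation; field names (hFov, vFov, c2w, pos,
// nClip, fClip) are assumed from the starter code.
Ray Camera::generate_ray(double x, double y) const {
  // Half-extents of the virtual sensor plane at z = -1 in camera space.
  double w_half = tan(radians(hFov) * 0.5);
  double h_half = tan(radians(vFov) * 0.5);

  // Map normalized image coordinates in [0,1]^2 onto the sensor plane.
  Vector3D dir_cam((2.0 * x - 1.0) * w_half,
                   (2.0 * y - 1.0) * h_half,
                   -1.0);

  // Rotate the direction into world space; the ray starts at the camera.
  Ray r(pos, (c2w * dir_cam).unit());
  r.min_t = nClip;
  r.max_t = fClip;
  return r;
}
```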

After that, I implemented the raytrace_pixel() function. We are given ns_aa as the number of samples; for each sample, I shoot a ray through the pixel and accumulate the radiance. Finally, I average the radiance and update the pixel in the sample buffer.
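A minimal sketch of this loop, assuming the starter code's gridSampler, sampleBuffer, and est_radiance_global_illumination():

```cpp
// Sketch of per-pixel sampling; helper names (gridSampler, sampleBuffer,
// est_radiance_global_illumination) are assumed from the starter code.
void PathTracer::raytrace_pixel(size_t x, size_t y) {
  Vector3D radiance(0, 0, 0);
  for (size_t i = 0; i < ns_aa; i++) {
    // Jitter the sample inside the pixel, then normalize to [0,1]^2.
    Vector2D s = Vector2D(x, y) + gridSampler->get_sample();
    Ray r = camera->generate_ray(s.x / sampleBuffer.w, s.y / sampleBuffer.h);
    radiance += est_radiance_global_illumination(r);
  }
  // Monte Carlo average over all samples.
  sampleBuffer.update_pixel(radiance / (double)ns_aa, x, y);
}
```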

Next, I implemented the ray-triangle intersection using the Möller–Trumbore algorithm introduced in class. The algorithm takes the three vertices of the triangle along with the ray's origin and direction as input. It stems from Cramer's rule: after algebraic manipulation, it returns the intersection time t and two barycentric coordinates b1 and b2. For has_intersection(), we only need to check that t lies between ray.min_t and ray.max_t and that the barycentric coordinates are between 0 and 1. If both conditions are met, there is an intersection. For intersect(), when there is an intersection we also populate an Intersection struct isect with the time, the surface normal, the primitive, and the BSDF, and return true.

Screenshot of the Möller–Trumbore algorithm from the lecture slides
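Below is a minimal sketch of the test, assuming the starter code's vertex names p1, p2, p3; the intersect() variant additionally updates ray.max_t and fills isect:

```cpp
// Sketch of the Möller–Trumbore ray-triangle test; p1, p2, p3 are the
// triangle vertices (names assumed from the starter code).
bool Triangle::has_intersection(const Ray &r) const {
  Vector3D e1 = p2 - p1, e2 = p3 - p1, s = r.o - p1;
  Vector3D s1 = cross(r.d, e2), s2 = cross(s, e1);
  double denom = dot(s1, e1);
  if (denom == 0) return false;               // ray parallel to triangle
  double t  = dot(s2, e2) / denom;            // intersection time
  double b1 = dot(s1, s) / denom;             // barycentric coordinate
  double b2 = dot(s2, r.d) / denom;           // barycentric coordinate
  if (t < r.min_t || t > r.max_t) return false;
  return b1 >= 0 && b2 >= 0 && b1 + b2 <= 1;  // third coord is 1 - b1 - b2
}
```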

Lastly, I implemented the ray-sphere intersection, which is more straightforward than the ray-triangle case. We simply solve the equation

(o + t·d − c)² = R²

for t, where o and d are the ray's origin and direction, c is the sphere's center, and R is its radius; expanding it gives a quadratic in t whose roots are the entry and exit times. We then check whether t lies between min_t and max_t. For intersect(), we again populate the isect variable.
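A minimal sketch of the quadratic solve, assuming the starter code's Sphere members o (center) and r2 (radius squared):

```cpp
// Sketch of the ray-sphere test: substitute o + t*d into (p - c)^2 = R^2
// and solve the quadratic; members `o` (center) and `r2` (R^2) are
// assumed from the starter code.
bool Sphere::has_intersection(const Ray &ray) const {
  Vector3D oc = ray.o - o;
  double a = dot(ray.d, ray.d);
  double b = 2.0 * dot(oc, ray.d);
  double c = dot(oc, oc) - r2;
  double disc = b * b - 4.0 * a * c;
  if (disc < 0) return false;                 // no real roots: ray misses
  double t1 = (-b - sqrt(disc)) / (2.0 * a);  // nearer root
  double t2 = (-b + sqrt(disc)) / (2.0 * a);  // farther root
  // Accept the nearest root that lies within [min_t, max_t].
  double t = (t1 >= ray.min_t && t1 <= ray.max_t) ? t1 : t2;
  return t >= ray.min_t && t <= ray.max_t;
}
```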

Rendering of CBspheres_lambertian.dae
Rendering of CBcoil.dae
Rendering of CBgems.dae

Part 2: Bounding Volume Hierarchy

To pre-process the scene and construct the BVH, I recursively partition the primitives: compute the bounding box of all primitives in the current range and create a node for it; if the range holds at most max_leaf_size primitives, make the node a leaf; otherwise split the range into left and right children and recurse on each half.

The heuristic I implemented is to split into the left and right nodes at the median of the longest axis of the bounding box, as sketched below.
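A minimal sketch of the recursive build; the BVHNode layout and iterator-based interface are assumed from the starter code:

```cpp
#include <algorithm>  // std::nth_element

// Sketch of recursive BVH construction with a median-of-longest-axis split.
BVHNode *BVHAccel::construct_bvh(std::vector<Primitive *>::iterator start,
                                 std::vector<Primitive *>::iterator end,
                                 size_t max_leaf_size) {
  // Bounding box enclosing every primitive in [start, end).
  BBox bbox;
  for (auto p = start; p != end; p++)
    bbox.expand((*p)->get_bbox());
  BVHNode *node = new BVHNode(bbox);

  if ((size_t)(end - start) <= max_leaf_size) {  // few primitives: leaf
    node->start = start;
    node->end = end;
    return node;
  }

  // Choose the longest axis of the bounding box.
  int axis = 0;
  if (bbox.extent.y > bbox.extent.x) axis = 1;
  if (bbox.extent.z > bbox.extent[axis]) axis = 2;

  // Partition around the median centroid along that axis.
  auto mid = start + (end - start) / 2;
  std::nth_element(start, mid, end, [axis](Primitive *a, Primitive *b) {
    return a->get_bbox().centroid()[axis] < b->get_bbox().centroid()[axis];
  });

  node->l = construct_bvh(start, mid, max_leaf_size);
  node->r = construct_bvh(mid, end, max_leaf_size);
  return node;
}
```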

Rendering of cow.dae with BVH acceleration
Rendering of maxplanck.dae with BVH acceleration
Rendering of dragon.dae with BVH acceleration
Rendering of CBlucy.dae with BVH acceleration

BVH acceleration comparison chart

File | # of Primitives | Without BVH (s) | With BVH (s) | Speedup
cow.dae | 5,856 | 15.82 | 0.05 | 316x
maxplanck.dae | 50,801 | 186 | 0.11 | 1690x
dragon.dae | 105,120 | 379 | 0.06 | 6316x
CBlucy.dae | 133,796 | 478 | 0.07 | 6828x

From the chart above, we can see that cow.dae has a less complex structure, with only 5,856 primitives. Without BVH acceleration, cow.dae takes 15.82 seconds, which is not a lot, but BVH acceleration still brings the rendering time down to 0.05 seconds. maxplanck.dae, dragon.dae, and CBlucy.dae have relatively complex structures, ranging from roughly 50,000 to 134,000 primitives. Without BVH acceleration, the rendering time grows as the number of primitives increases: the method iterates through all primitives looking for an intersection, which takes O(n) time. By contrast, the BVH-accelerated method takes roughly the same time regardless of the number of primitives, because the bounding boxes are stored in a binary tree whose traversal takes O(log n) in the best case and O(n) in the worst case. The measured runtimes show that the median-of-longest-axis heuristic splits the primitives roughly evenly and provides sound acceleration, making the per-ray runtime approach O(log n).
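For reference, the traversal that produces these timings prunes any subtree whose bounding box the ray misses. A minimal sketch, assuming the starter code's bb.intersect() and isLeaf():

```cpp
// Sketch of BVH traversal for the predicate version; bb.intersect(ray, t0, t1)
// and node->isLeaf() are assumed from the starter code.
bool BVHAccel::has_intersection(const Ray &ray, BVHNode *node) const {
  double t0 = ray.min_t, t1 = ray.max_t;
  if (!node->bb.intersect(ray, t0, t1)) return false;  // prune this subtree
  if (node->isLeaf()) {
    for (auto p = node->start; p != node->end; p++)
      if ((*p)->has_intersection(ray)) return true;    // any hit suffices
    return false;
  }
  return has_intersection(ray, node->l) || has_intersection(ray, node->r);
}
```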

Part 3: Direct Illumination

In this section of the project, I implemented the direct lighting function with both uniform hemisphere sampling and importance sampling of the lights. For hemisphere sampling, we sample incoming directions uniformly over the hemisphere at the hit point, trace a ray in each sampled direction, and if it hits an emissive object we accumulate its emission weighted by the BSDF and the cosine term and divided by the pdf. For importance sampling, we instead sample directions toward each light, cast a shadow ray to test for occlusion, and accumulate the light's radiance with the same weighting; a sketch of this version follows.
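A minimal sketch of the importance-sampled estimator, assuming the starter code's sample_L(), make_coord_space(), and EPS_F:

```cpp
// Sketch of importance-sampled direct lighting; light->sample_L,
// make_coord_space, and EPS_F are assumed from the starter code.
Vector3D PathTracer::estimate_direct_lighting_importance(
    const Ray &r, const Intersection &isect) {
  Matrix3x3 o2w;
  make_coord_space(o2w, isect.n);        // local frame with n along +z
  Matrix3x3 w2o = o2w.T();
  Vector3D hit_p = r.o + r.d * isect.t;
  Vector3D w_out = w2o * (-r.d);
  Vector3D L_out(0, 0, 0);

  for (auto light : scene->lights) {
    // A delta (point) light needs one sample; area lights get ns_area_light.
    int ns = light->is_delta_light() ? 1 : ns_area_light;
    for (int i = 0; i < ns; i++) {
      Vector3D wi;
      double dist, pdf;
      Vector3D L = light->sample_L(hit_p, &wi, &dist, &pdf);
      Vector3D w_in = w2o * wi;
      if (w_in.z < 0) continue;          // light is behind the surface

      // Shadow ray: offset the endpoints to avoid self-intersection.
      Ray shadow(hit_p, wi);
      shadow.min_t = EPS_F;
      shadow.max_t = dist - EPS_F;
      if (!bvh->has_intersection(shadow))
        L_out += L * isect.bsdf->f(w_out, w_in) * w_in.z / pdf / ns;
    }
  }
  return L_out;
}
```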

CBbunny.dae: hemisphere sampling, #camera rays per pixel = 64, #samples per area light = 32
CBbunny.dae: importance sampling, #camera rays per pixel = 64, #samples per area light = 32
CBspheres_lambertian.dae: hemisphere sampling, #camera rays per pixel = 64, #samples per area light = 32
CBspheres_lambertian.dae: importance sampling, #camera rays per pixel = 64, #samples per area light = 32

From the renderings above, we observe that with many camera rays per pixel and a large number of samples per area light, both hemisphere sampling and importance sampling converge to a reasonable degree of accuracy in shading and shadows. As we shoot more rays into the scene and take more samples, both methods approach a realistic result. Contrasting the two, the hemisphere-sampled renderings have more noise despite the high sampling rate. That is because the method samples uniformly over the entire hemisphere while the light sources occupy only a tiny portion of it; many samples never reach a light, which creates the dark "noise" speckles in the rendering. Importance sampling only samples directions that can hit a light, which maximizes our ability to trace the light sources and illuminate the scene. The tradeoff is that at a low sample rate the renders still show scattered noise, which leads to inaccuracy in the result. Despite these imperfections, importance sampling is the more efficient method: it converges more quickly than uniform hemisphere sampling, so we can render the scene accurately without taking as many samples.

dragon.dae: hemisphere sampling, #camera rays per pixel = 1, #samples per area light = 1
dragon.dae: importance sampling, #camera rays per pixel = 4, #samples per area light = 1
dragon.dae: hemisphere sampling, #camera rays per pixel = 16, #samples per area light = 1
dragon.dae: importance sampling, #camera rays per pixel = 64, #samples per area light = 1

Part 4: Global Illumination

To implement global illumination, I wrote at_least_one_bounce_radiance(): starting from the direct lighting of Part 3, I sample a new incoming direction from the BSDF at the hit point, trace a ray in that direction, and recursively add the radiance it gathers, weighted by the BSDF value and cosine term and divided by the sampling pdf. Russian roulette decides whether each path continues, as sketched below.
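A minimal sketch of the recursive estimator, assuming the starter code's coin_flip(), one_bounce_radiance(), and sample_f(); the continuation probability of 0.7 matches the 0.3 termination probability discussed below:

```cpp
// Sketch of the recursive bounce; coin_flip, one_bounce_radiance, and
// bsdf->sample_f are assumed from the starter code.
Vector3D PathTracer::at_least_one_bounce_radiance(
    const Ray &r, const Intersection &isect) {
  Matrix3x3 o2w;
  make_coord_space(o2w, isect.n);
  Vector3D hit_p = r.o + r.d * isect.t;
  Vector3D w_out = o2w.T() * (-r.d);

  Vector3D L_out = one_bounce_radiance(r, isect);  // direct lighting term

  double cpdf = 0.7;                               // continue with prob 0.7
  if (r.depth > 1 && coin_flip(cpdf)) {
    Vector3D w_in;
    double pdf;
    Vector3D f = isect.bsdf->sample_f(w_out, &w_in, &pdf);

    Ray bounce(hit_p, o2w * w_in);
    bounce.min_t = EPS_F;
    bounce.depth = r.depth - 1;

    Intersection next;
    if (bvh->intersect(bounce, &next))
      // Divide by the BSDF pdf and the continuation probability so the
      // estimator stays unbiased.
      L_out += at_least_one_bounce_radiance(bounce, next)
               * f * w_in.z / pdf / cpdf;
  }
  return L_out;
}
```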

CBspheres_lambertian.dae: #camera rays per pixel = 16, #samples per area light = 1024
CBbunny.dae: #camera rays per pixel = 16, #samples per area light = 1024
CBspheres_lambertian.dae: direct illumination
CBspheres_lambertian.dae: indirect illumination

CBbunny.dae: max_ray_depth = 0
CBbunny.dae: max_ray_depth = 1
CBbunny.dae: max_ray_depth = 2
CBbunny.dae: max_ray_depth = 3
CBbunny.dae: max_ray_depth = 100

From the series of CBbunny renders with max_ray_depth 0, 1, 2, 3, and 100, we can see that as max_ray_depth increases, the scene becomes more lit and overall luminance rises. In the zero-bounce image we only see light emitted directly by the light source at the top. The one-bounce image is essentially the direct illumination of the scene: light travels from the source and bounces off an object once, the same as what we implemented in Part 3. As max_ray_depth increases further, we get a more luminous and detailed image because we trace rays bouncing between objects. The max depth of 100 is where the rendering feels richest in detail and closest to reality. One thing I noticed is that, with Russian roulette termination, rendering at depth 100 is still relatively quick: we do not trace every ray out to 100 bounces but instead terminate each path with probability 0.3 at every bounce (the Russian roulette coefficient), and the expected value of the radiance estimate remains unchanged.


CBspheres_lambertian.dae: samples per pixel = 1
CBspheres_lambertian.dae: samples per pixel = 2
CBspheres_lambertian.dae: samples per pixel = 4
CBspheres_lambertian.dae: samples per pixel = 8
CBspheres_lambertian.dae: samples per pixel = 16
CBspheres_lambertian.dae: samples per pixel = 64
CBspheres_lambertian.dae: samples per pixel = 1024

As shown in the renderings above, the noise level decreases as the number of samples per pixel increases. Since the variance of the Monte Carlo estimate falls as 1/n, the visible noise falls off roughly as 1/√n, which is why each large jump in sample count (1 → 16 → 1024) produces a clearly cleaner image.

Part 5: Adaptive Sampling

For adaptive sampling, we implemented the algorithm from the spec to detect each pixel's rate of convergence. First we maintain accumulators s1 and s2, the running sums of each sample's illuminance and of its square, and use them to compute the mean μ and variance σ² of the samples so far. From these we compute the 95% confidence interval half-width I = 1.96 · σ / √n and compare it against maxTolerance · μ. If I is small enough, the pixel has converged: we break out of the sampling loop and write the averaged radiance to the sample buffer. Otherwise we keep looping until we have taken ns_aa samples. To keep this efficient, we only test for convergence every samplesPerBatch samples. With this method we sample more in the parts of the image that are hard to converge and less in areas, such as the light source, that converge easily.
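A minimal sketch of the adaptive loop inside raytrace_pixel(), assuming the starter code's samplesPerBatch and maxTolerance, with L.illum() giving a sample's illuminance:

```cpp
// Sketch of adaptive sampling; samplesPerBatch, maxTolerance, and
// L.illum() are assumed from the starter code.
Vector3D radiance(0, 0, 0);
double s1 = 0.0, s2 = 0.0;  // running sums of illuminance and its square
size_t n = 0;
for (; n < ns_aa; n++) {
  if (n > 1 && n % samplesPerBatch == 0) {
    double mu = s1 / n;
    double var = (s2 - s1 * s1 / n) / (n - 1);
    double I = 1.96 * sqrt(var) / sqrt((double)n);  // 95% confidence width
    if (I <= maxTolerance * mu) break;              // pixel has converged
  }
  Vector2D s = Vector2D(x, y) + gridSampler->get_sample();
  Ray r = camera->generate_ray(s.x / sampleBuffer.w, s.y / sampleBuffer.h);
  Vector3D L = est_radiance_global_illumination(r);
  radiance += L;
  double x_k = L.illum();   // per-sample illuminance
  s1 += x_k;
  s2 += x_k * x_k;
}
sampleBuffer.update_pixel(radiance / (double)n, x, y);
```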

CBbunny.dae: Rendering
CBbunny.dae: Sample rate graph

As shown in the sample-rate visualization, the light source is sampled the least, followed by the walls of the Cornell box. The most heavily sampled areas are the bunny and its shadow, which are relatively hard to converge.