In this project, we implemented a simple rasterizer that creates drawings one pixel at a time.
The program can draw basic shapes, render images, antialias drawings, and perform basic transforms. It can also
interpolate across triangle faces using barycentric coordinates and map textures onto a surface using several
sampling techniques.
What we've built covers the basics a graphics pipeline needs in order to provide an interactive user
experience. Because any shape can be decomposed into a mesh of triangles, triangle rasterization appears in
nearly all graphics applications. In addition, rendering realistic surfaces often involves applying
textures. A common technique in practice is to use barycentric coordinates with trilinear filtering to map texture
coordinates onto a surface, and we implemented this in our project as well.
The key takeaway from this project for us was figuring out how to translate the math of Cartesian geometry and
transforms into code that renders pixels on the screen in the desired way. Thankfully, since we derived all the
equations in lecture and confirmed their correctness beforehand, we were able to avoid mathematical errors, and so
most of our debugging time was spent resolving implementation errors or memory access errors.
We found the implementation of the mipmap-based trilinear interpolation to be an especially challenging and
interesting task, as it involved thinking through the implementation carefully, accurately determining the mipmap
level, and identifying the appropriate finite differences to approximate the partial derivatives.
For rasterization of single-color triangles, we implemented the point-in-triangle test for each pixel that we considered. The "point" we test corresponds to the center of each pixel (i.e., for the pixel at (x, y), we test the point (x + 0.5, y + 0.5)). The point-in-triangle test consists of three line tests. A single line test determines which side of a line, formed by two vertices of the triangle, a point lies on. We create a line for each of the three pairs of vertices, then perform the line test on our desired point against each of the lines. For a point to be within the triangle, all three line tests must be greater than or equal to 0, or all three must be less than or equal to 0 (so that either vertex winding order is handled).
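A minimal sketch of this test (the helper names here are our own, not the project's actual API):

    // Line test: positive on one side of the directed edge from
    // (x0, y0) to (x1, y1), negative on the other, zero on the line.
    float line_test(float x, float y,
                    float x0, float y0, float x1, float y1) {
      return -(x - x0) * (y1 - y0) + (y - y0) * (x1 - x0);
    }

    // A point is inside the triangle when all three line tests agree
    // in sign, covering both clockwise and counterclockwise windings.
    bool inside_triangle(float x, float y,
                         float x0, float y0, float x1, float y1,
                         float x2, float y2) {
      float l0 = line_test(x, y, x0, y0, x1, y1);
      float l1 = line_test(x, y, x1, y1, x2, y2);
      float l2 = line_test(x, y, x2, y2, x0, y0);
      return (l0 >= 0 && l1 >= 0 && l2 >= 0) ||
             (l0 <= 0 && l1 <= 0 && l2 <= 0);
    }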
For each triangle, our algorithm checks only the pixels within a bounding box. The corners of this rectangle are defined by the maximum and minimum x and y values over the three triangle vertices. We also had to clamp the rectangle to the width and height of the image. Looking at only the bounding box around each triangle is more efficient than looking at every pixel in the image, which could be a very time-consuming and costly operation.
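A sketch of this traversal at sample rate 1, assuming the inside_triangle helper above, the project's Color type, and a fill_pixel(x, y, color) routine that writes a pixel's color:

    #include <algorithm>
    #include <cmath>

    void rasterize_triangle_sketch(float x0, float y0, float x1, float y1,
                                   float x2, float y2, Color color,
                                   int width, int height) {
      // Bounding box of the three vertices, clamped to the frame.
      int min_x = std::max(0, (int)std::floor(std::min({x0, x1, x2})));
      int max_x = std::min(width - 1, (int)std::ceil(std::max({x0, x1, x2})));
      int min_y = std::max(0, (int)std::floor(std::min({y0, y1, y2})));
      int max_y = std::min(height - 1, (int)std::ceil(std::max({y0, y1, y2})));
      for (int y = min_y; y <= max_y; y++) {
        for (int x = min_x; x <= max_x; x++) {
          // Sample each candidate pixel at its center.
          if (inside_triangle(x + 0.5f, y + 0.5f, x0, y0, x1, y1, x2, y2))
            fill_pixel(x, y, color);
        }
      }
    }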
Test example 4 shows that we can now render single-color triangles! However, we can see that there is still aliasing due to a low sample rate. Visible artifacts include jaggies and even some missing pixels, as shown at the top of the magenta triangle.
The data structures that we used for supersampling were the samplebuffer and the framebuffer.
The samplebuffer stores a higher-resolution version of the image (of size = sample_rate * height *
width) as we create it, while the framebuffer (of size = height * width) holds the pixels that are actually
rendered to the screen.
To generate the anti-aliased images, we first created the high-resolution version of the image and stored it in
the samplebuffer. Then, we averaged the Color values of all the samples within a given pixel and stored the result
in the framebuffer. The samplebuffer is no longer needed after the completed pixels have been resolved
to the framebuffer. This process is known as supersampling.
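A sketch of the resolve step, assuming row-major samplebuffer and framebuffer arrays of the project's Color type with += and *= operators (the function name is ours):

    void resolve_to_framebuffer(int width, int height, int sample_rate) {
      int s = (int)std::sqrt(sample_rate);  // sub-samples per axis
      for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
          // Average the sample_rate sub-samples covering pixel (x, y).
          Color avg(0, 0, 0);
          for (int sub_y = 0; sub_y < s; sub_y++)
            for (int sub_x = 0; sub_x < s; sub_x++)
              avg += samplebuffer[(y * s + sub_y) * (width * s) + (x * s + sub_x)];
          avg *= 1.0f / sample_rate;
          framebuffer[y * width + x] = avg;  // final on-screen color
        }
      }
    }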
In addition to the extra samplebuffer data structure, we had to make a couple of changes to the code to enable
supersampling. We had to split each pixel into sqrt(sample_rate) subpixels along each axis. To achieve this, we
added an additional double for-loop within the loop over the x, y indices. We called these inner indices sub_x and
sub_y. The sampled colors were then saved to the samplebuffer using the same point-in-triangle technique as
before. However, we had to keep track of the pixel location both relative to the triangle and relative to the
samplebuffer, which was scaled by a multiple of the sample rate. We had to modify fill_pixel to take sub_pixel_x
and sub_pixel_y as parameter inputs to account for the larger size of the samplebuffer.
To translate back to the original framebuffer, we took an average of the subpixels that made up each original
pixel and placed this value into the framebuffer at the original x, y location.
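A sketch of the inner sampling loop for one pixel (x, y); here fill_pixel writes into the scaled-up samplebuffer using sub-pixel coordinates, as described above (the code itself is illustrative, not our exact implementation):

    void sample_pixel(int x, int y, int sample_rate,
                      float x0, float y0, float x1, float y1,
                      float x2, float y2, Color color) {
      int s = (int)std::sqrt(sample_rate);  // subpixels per axis
      for (int sub_y = 0; sub_y < s; sub_y++) {
        for (int sub_x = 0; sub_x < s; sub_x++) {
          // Test the center of each sub-pixel against the triangle.
          float sx = x + (sub_x + 0.5f) / s;
          float sy = y + (sub_y + 0.5f) / s;
          if (inside_triangle(sx, sy, x0, y0, x1, y1, x2, y2))
            fill_pixel(x * s + sub_x, y * s + sub_y, color);
        }
      }
    }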
[Figures: the same triangles rendered at increasing sample rates]
In the examples above, we can see the aliasing from the first part slowly decrease as we increase our sampling rate. From afar, it is hard to tell that there were ever jaggies! These changes occur because without supersampling, pixels were either on or off: if the center of the pixel was within the triangle, it would be filled; otherwise it would be left blank or white. With supersampling, if a triangle covers only part of a pixel, a weighted portion of its color will show up. This gives the image its lighter red shaded pixels where only some of the pixel is within the triangle.
Cubeman learns to play basketball. In this image, we used rotation, translation, and scaling matrices to position Cubeman to play basketball.
Rotations and scaling fall under a class of transformations known as linear transformations, because they preserve superposition and scaling. To allow for translations as well (an affine transformation), we use the homogeneous coordinates representation, i.e., appending a "1" to our points so that translations can also be implemented with a matrix-vector multiplication.
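Concretely, in homogeneous coordinates a 2D translation by (t_x, t_y) becomes a single 3x3 matrix-vector product:

    \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} =
    \begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{pmatrix}
    \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} =
    \begin{pmatrix} x + t_x \\ y + t_y \\ 1 \end{pmatrix}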
Barycentric coordinates are a system for describing points in the plane with respect to the three vertices of a
specified triangle. We use alpha, beta, and gamma to denote the proportion of the distance along an altitude to
the side opposite the A, B, and C vertices respectively. Alternatively, barycentric coordinates can be understood
from the perspective of proportional areas. We create three subtriangles by connecting our point in question to
the vertices of the specified triangle, and alpha, beta, and gamma are proportional to the areas of the
subtriangles opposite the A, B, and C vertices respectively.
It is easy to see that barycentric coordinates have the property that alpha + beta + gamma = 1 for any point
(alpha, beta, gamma). So even though a point is described by a triple of values, there are only two degrees of
freedom, as the coordinates describe 2-dimensional space.
The barycentric coordinate values can be found using a formula similar to the line test above, with an additional
step of normalizing by the distance from the triangle vertex to the opposite side.
As before, the sampled (x, y) point is at the center of the pixel. We find the signed distance from the point to a
side and divide by the distance from the opposite vertex to that side. This is the equation we implemented in our
code.
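A sketch of this computation, reusing the line_test helper from above. Each line test is proportional to the signed distance from the line, and the normalization cancels the proportionality constant:

    void barycentric(float x, float y,
                     float xa, float ya, float xb, float yb,
                     float xc, float yc,
                     float &alpha, float &beta, float &gamma) {
      // alpha: distance of (x, y) from side BC, normalized so that
      // alpha = 1 at vertex A and alpha = 0 anywhere on line BC.
      alpha = line_test(x, y, xb, yb, xc, yc) /
              line_test(xa, ya, xb, yb, xc, yc);
      beta  = line_test(x, y, xc, yc, xa, ya) /
              line_test(xb, yb, xc, yc, xa, ya);
      gamma = 1.0f - alpha - beta;  // the coordinates sum to 1
    }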
Barycentric coordinate values are useful for smoothly interpolating values over a triangle. For
example, if we have pure red, green, and blue values at the corners of a triangle, we can use barycentric
coordinates to weight each color for any point inside the triangle and interpolate a smooth gradient of colors
over the surface.
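Assuming a Color type with scalar multiplication and addition, and hypothetical vertex colors c_A, c_B, and c_C, the interpolation step is just a weighted sum:

    // Color at (x, y), weighted by its barycentric coordinates.
    Color c = alpha * c_A + beta * c_B + gamma * c_C;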
In the example above, we see a triangle whose interior is interpolated from three color values at the vertices (red, green, blue). As we move toward the middle, these values are blended together using a weighted sum of the Color at each vertex times its respective alpha, beta, or gamma value. Barycentric coordinates can be used to interpolate different types of values, such as colors and texture coordinates.
The image above shows many tri-color triangles arranged in a circle. We can see how barycentric coordinates allow for seamless blending of the values.
[Figures: texture renders comparing constant level 0 with nearest and linear mipmap level sampling]
Mipmaps require higher memory usage, because many layers of the texture at
various resolutions have to be stored. However, this additional memory usage
allows for faster antialiasing. In the examples above, we see that using the linear and
nearest mipmap levels reduced some of the noise seen when using constant level 0.
Using mipmaps overall appears to be a powerful way to reduce aliasing effects without too much additional memory
overhead, though there is some additional computation involved in calculating the partial derivatives and taking
the logarithm to calculate the mipmap level. Performing linear interpolation between mipmap levels adds even more
computation, though it does not require any more memory. In the cases we tried, the improvement from trilinear
interpolation (as opposed to nearest mipmap level) seems only marginal, and so not worth the additional
computation. However, bilinear pixel sampling, compared to nearest-pixel sampling from the texture,
yields a significant improvement at a relatively low compute cost (and no additional memory use), so
it would be worth it.
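A sketch of the level computation, assuming a Vector2D type with a norm() method; uv, uv_dx, and uv_dy are the texture coordinates mapped from pixels (x, y), (x + 1, y), and (x, y + 1), already scaled by the texture's width and height:

    #include <algorithm>
    #include <cmath>

    double mipmap_level(Vector2D uv, Vector2D uv_dx, Vector2D uv_dy) {
      // Finite differences approximate the screen-space partial
      // derivatives of the texture coordinates.
      Vector2D ddx = uv_dx - uv;
      Vector2D ddy = uv_dy - uv;
      // The larger footprint measures how much texture area one pixel
      // covers; log2 converts that to a mipmap level, clamped at 0.
      double footprint = std::max(ddx.norm(), ddy.norm());
      return std::max(0.0, std::log2(footprint));
      // For trilinear filtering, sample levels floor(level) and
      // floor(level) + 1, then lerp with the fractional part of level.
    }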
We found that trilinear filtering, where we used linear-level
and linear pixel sampling, looked the smoothest. In comparison, using
zero-level and nearest pixel sampling looks as if it has more detail, but also
more artifacts and jaggies.
From an eyeball test weighing antialiasing power against compute cost, we would say that
nearest-level mipmap sampling (L_NEAREST) with bilinear pixel interpolation (P_LINEAR) is the best choice for
rendering high-quality graphics without too much wasted compute effort.