I am so glad to have written this article and developed the demo site, and there is no doubt that I could not have done this without your assistance, presence, and inspiration.
This is an article on volumetric rendering of 3D data sets using volumetric raycasting, in which I will present the ideas and techniques used to perform it.
Check this live site to see the demo.
Let’s start.
Image Order and Object Order Rendering:
Volume rendering can be performed with an image-order, object-order, or domain-based rendering technique.
In an image-order rendering algorithm, a ray is cast through the volume data from each pixel in the image plane to find the color of that pixel. In an object-order rendering algorithm, the volume data is instead mapped onto the image plane. Hence, object order uses forward mapping whereas image order uses backward mapping.
It’s worth knowing both the image-order and object-order rendering schemes well, so let me present the same basic concept again with a different approach: finding the difference between them.
The distinction between these two is what goes into the outer loop of computation.
- Image Order Rendering:
In image-order rendering, the outer loop goes pixel by pixel. It loops through every pixel, and inside that loop there is a bunch of work to color that pixel.
- Object Order Rendering:
In object-order rendering, the outer loop iterates over the data, going from object to object in the scene (more precisely, primitive by primitive, where the primitive is generally a triangle in most models), and then performs rasterization.
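As a schematic sketch of this distinction (a hypothetical 4×4 image and a two-primitive scene; the strings stand in for the real per-pixel and per-primitive work, which is of course far more involved), the two loop structures look like:

```javascript
// Image order vs. object order: only the outer loop differs.
// Hypothetical 4x4 image and a scene of two primitives.
const width = 4, height = 4;
const primitives = ["triangleA", "triangleB"];

// Image order: the outer loop visits every pixel; the work per pixel
// (casting a ray, shading) happens inside the loop body.
const imageOrderSteps = [];
for (let y = 0; y < height; y++) {
  for (let x = 0; x < width; x++) {
    imageOrderSteps.push(`cast ray through pixel (${x},${y})`);
  }
}

// Object order: the outer loop visits every primitive; each primitive
// is then rasterized (forward-mapped) onto the image plane.
const objectOrderSteps = [];
for (const prim of primitives) {
  objectOrderSteps.push(`rasterize ${prim} into the image plane`);
}

console.log(imageOrderSteps.length); // 16: one outer iteration per pixel
console.log(objectOrderSteps.length); // 2: one outer iteration per primitive
```

The outer loop is what fixes the cost model: image order scales with the number of pixels, object order with the number of primitives.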
If these basic concepts still aren’t clear, please check this great website by Prof. Steven M. LaValle.
Here is a summary of all the tasks that we are going to discuss and perform to achieve our goal.
We start with a ray originating from the camera position. Mathematically, the ray can be expressed by an equation similar to that of a straight line.
y = m*x + c
where m is the slope of the line and c is its point of intersection with the y-axis.
Likewise, the ray as a vector from the camera to fragment in world space can be expressed with an equation:
current_point = (orientation/direction) * (number of steps) + offset
In the above equation, the offset is the ray origin, i.e., the camera/eye position in world space, and the orientation is the ray’s direction vector. The number of steps is a floating-point number that lets us define the position of a point on the ray anywhere along that direction.
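A minimal sketch of that ray equation in JavaScript (the function and variable names here are mine, not from the demo code):

```javascript
// Evaluate a point on the ray: point = origin + direction * t.
// origin is the camera position in world space, direction is the ray
// vector, and t is the floating-point "number of steps" along the ray.
function rayPoint(origin, direction, t) {
  return [
    origin[0] + direction[0] * t,
    origin[1] + direction[1] * t,
    origin[2] + direction[2] * t,
  ];
}

// Example: camera at the origin, ray along +z, sampled at t = 2.5.
const p = rayPoint([0, 0, 0], [0, 0, 1], 2.5);
console.log(p); // [0, 0, 2.5]
```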
To perform ray traversal in the volume space, we need to find the point of intersection of the ray and the volume’s surface. For that operation, we assume that our bounding box is an Axis-Aligned Bounding Box (AABB), which means it is oriented along the world-space axes of the Cartesian coordinate system. An AABB moves with the object but remains aligned with the axes. To better understand this, I strongly recommend watching this great video demonstrating AABBs by Prof. Ian Parberry.
Axis-Aligned Bounding Box (AABB):
If you watch this video clip carefully, you can notice the unique behavior of the bounding box (wireframe) as the plane rotates in the scene.
Another popular type of bounding box is the Oriented Bounding Box (OBB). Though we do not use it in our case, it’s really worth knowing about for its many applications. Feel free to skip ahead if you are not interested.
Unlike an AABB, an Oriented Bounding Box rotates along with the object it is attached to. Since an OBB takes rotation into account, we need to calculate the orientation/axis directions of the box. Here is another great video on OBBs; observe the behavior of the bounding box as the object rotates.
Oriented Bounding Box (OBB) :
Task: compare the two bounding boxes in the respective videos more closely. There is one main feature that makes each of the AABB and OBB great in its own way. When you find it, you can post it in the comment section.
Tip: We can’t use AABB in medical visualization for treatment monitoring.
Sorry for driving you off-road for a while, but I think it was worth knowing. I hope that now you think the same.
An AABB can be represented with just the minimum and maximum extents along each axis, whereas an OBB representation must encode orientation along with position and width.
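To make that representation difference concrete, here is a sketch (the plain-object layouts are of my own choosing, not from the demo code):

```javascript
// An AABB needs only the minimum and maximum extents on each axis.
const aabb = {
  min: [-1, -1, -1],
  max: [1, 2, 1],
};

// An OBB must additionally encode its orientation: a center, three
// orthonormal local axes, and half-widths along those axes.
const obb = {
  center: [0, 0.5, 0],
  axes: [
    [1, 0, 0], // local x-axis in world space
    [0, 1, 0], // local y-axis in world space
    [0, 0, 1], // local z-axis in world space
  ],
  halfExtents: [1, 1.5, 1],
};

// The AABB's size falls straight out of the two extents.
const size = aabb.max.map((hi, i) => hi - aabb.min[i]);
console.log(size); // [2, 3, 2]
```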
Ray-Box Intersection:
Now that we have a ray and a bounding box, the next step is finding the points of intersection between the ray and the bounding planes.
Suppose we define the bounding box with a min point (B0x, B0y, B0z), and the origin of the ray is (Ox, Oy, Oz). Then the ray equation for the min-point component B0x becomes,
B0x = Dx * t0x + Ox
In the same way, the ray equation can be expressed for B0y and B0z, and these equations can be rearranged to find their respective values of t.
t0x = (B0x - Ox) / Dx ———(i)
t0y = (B0y - Oy) / Dy ———(ii)
t0z = (B0z - Oz) / Dz ———(iii)
Likewise, we can perform this for the max point of the bounding box which will give us t1x, t1y, and t1z respectively.
If you are struggling with these mathematical representations and expressions, I suggest you read this great article on Ray-Box Intersection by Scratchpixel.
Since we have three orthogonal bounding planes for each of the two points, there will be a total of six values. The trouble here is finding which of these points is the perfect one. The recipe is to find the “greatest min and smallest max”.
“The min of the max and the max of the min are the points touching the box surface.”
GLSL code to implement this is:
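As a stand-in sketch (the author’s actual GLSL lives in the linked GitHub repo; the function and variable names below are mine), here is the same slab method, “max of the mins and min of the maxes”, expressed in plain JavaScript:

```javascript
// Slab-method ray/AABB intersection.
// Returns [tNear, tFar], or null when the ray misses the box.
function intersectAABB(origin, dir, boxMin, boxMax) {
  let tNear = -Infinity;
  let tFar = Infinity;
  for (let i = 0; i < 3; i++) {
    // t at which the ray crosses the min and max planes of this axis.
    let t0 = (boxMin[i] - origin[i]) / dir[i];
    let t1 = (boxMax[i] - origin[i]) / dir[i];
    if (t0 > t1) { const tmp = t0; t0 = t1; t1 = tmp; } // order entry/exit
    tNear = Math.max(tNear, t0); // greatest of the mins
    tFar = Math.min(tFar, t1);   // smallest of the maxes
  }
  return tNear <= tFar ? [tNear, tFar] : null;
}

// Unit cube centered at the origin, ray shot straight down -z from z = 5:
// it enters the box at t = 4 and leaves at t = 6.
const hit = intersectAABB([0, 0, 5], [0, 0, -1], [-1, -1, -1], [1, 1, 1]);
console.log(hit); // [4, 6]
```

Note that the division happily produces ±Infinity when a direction component is zero, which the min/max comparisons then handle for free.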
Visualizing “min of max and max of min” logic in 3D:
If you can already visualize why we take the “min of max” in 3D to find the perfect point, you can skip this. When I was learning ray intersection, I had a hard time visualizing it because most of the content online deals only with 1D or 2D. So I will try to visualize and explain a 3D case here to reduce your effort and frustration.
In the 3D space above, the wireframe is the box, the black line is the ray originating from the camera/eye, and the three colored dots are the points of intersection of the ray with the min bounding planes. The red dot is the intersection with the X-axis plane, the blue one with the Z-axis plane, and the light black one with the Y-axis plane. In the image it is clearly visible that red is farthest from the cube, and blue is closer than red but still farther than light black, which lies on the surface (face) of the cube. That means red is the minimum of the three values whereas light black is the maximum. So the maximum-valued point of intersection with the min planes is the perfect point of intersection on the min side.
Likewise, for the max bounding planes, fig (b) is the same as fig (iv) but on the max-point side. Think of it the same way: we still have to find the point closest to the surface of the box, but here it is the minimum value rather than the maximum that gives us the perfect point on the surface of the box.
Finding Stepping Interval:
Once we have the point of intersection, all we need is to traverse the ray into the volume, sampling at regular intervals. Before we start the traversal, we should know the sampling step-size interval along the voxels of the volume in that ray direction.
For this, I believe a paper titled A Fast Voxel Traversal Algorithm for Ray Tracing (1987) by John Amanatides and Andrew Woo is a great read.
I will try to summarize the logic presented by the paper to calculate the step size.
The step-size interval is a measure of how far we can travel along the ray before it hits a boundary of a cell, or voxel. The minimum of these values over the three axes gives us the length we need to travel to hit the first boundary.
In fig (c), the green circle is the point of intersection between the grid and the ray. From that green point, the ray travels tMaxX before crossing the first cell boundary in x and tMaxY before crossing the first cell boundary in y. Extending these forward, the minimum of these values gives the shortest distance the ray has to travel to cross a boundary.
GLSL code to implement this is:
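Again, the actual GLSL is in the linked repo; the following is my own hedged JavaScript sketch of the step-size logic described above, assuming an axis-aligned grid of uniform cells:

```javascript
// Per-axis distance along the ray between successive voxel-boundary
// crossings (the tDelta of Amanatides & Woo). The smallest of the
// three is the distance to the first boundary the ray hits.
function stepSize(dir, cellSize) {
  const tDelta = [0, 1, 2].map(
    (i) => cellSize[i] / Math.abs(dir[i]) // Infinity when dir[i] === 0
  );
  return Math.min(...tDelta);
}

// A ray moving diagonally through unit voxels crosses an x-boundary
// sooner than a y-boundary, because it moves faster in x.
const dir = [0.8, 0.6, 0]; // normalized direction
console.log(stepSize(dir, [1, 1, 1])); // 1 / 0.8 = 1.25
```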
Since we now know the sampling step interval along the ray, we can fetch a value from the 3D volume texture at each sampling position. Using that value, we fetch the color from a 2D texture.
Compositing/Alpha Blending:
After fetching, we need to measure the color reflected at each sample point along the direction of the ray. This process is commonly called compositing, and it is defined as an accumulation of the colors of multiple surfaces or values, weighted by alpha (opacity).
There are actually two ways of representing a color value: premultiplied RGBA and straight RGBA. Straight RGBA is often called non-premultiplied RGBA. We will be using straight RGBA in our case.
Premultiplied Compositing:
Color_blended = Color_source + Color_dest * (1 - alpha_source)
alpha_blended = alpha_source + alpha_dest * (1 - alpha_source)
We apply this computation over and over until we run out of samples or we meet the conditions for early termination.
Compositing can be done using a transfer function or using a ray function. Here, we are going to use a transfer function for compositing.
There are two popular ray traversal strategies: Front to Back and Back to Front. Since we are working on raycasting, we will be traversing from front to back.
Note: The two compositing equations above assume front-to-back traversal.
In front-to-back traversal, we can leverage early ray termination: when the farther voxels no longer contribute much to the pixel color, i.e., when the accumulated opacity is high, traversal can be terminated at that point.
GLSL code to implement this:
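The demo’s GLSL is in the linked repo; as a stand-in, here is a JavaScript sketch of front-to-back compositing with early ray termination over straight-RGBA samples (the 0.95 opacity threshold and all names are my choices, not the author’s):

```javascript
// Front-to-back compositing over a list of samples along the ray, each
// a straight-RGBA value {rgb: [...], a: ...} from the transfer function.
function compositeFrontToBack(samples) {
  let accRGB = [0, 0, 0];
  let accA = 0;
  for (const s of samples) {
    // Weight the incoming sample by the transparency left so far.
    const w = (1 - accA) * s.a;
    accRGB = accRGB.map((c, i) => c + w * s.rgb[i]);
    accA += w;
    // Early ray termination: farther samples barely contribute.
    if (accA >= 0.95) break;
  }
  return { rgb: accRGB, a: accA };
}

// Two half-opaque red samples: the second one contributes only through
// the remaining 0.5 transparency left by the first.
const out = compositeFrontToBack([
  { rgb: [1, 0, 0], a: 0.5 },
  { rgb: [1, 0, 0], a: 0.5 },
]);
console.log(out.a); // 0.75
```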
3D-texture support in WebGL 2.0
WebGL 2.0 supports 3D textures in a way very similar to 2D textures: just as there is texStorage2D(), there is now texStorage3D() for 3D textures, and just as there is texSubImage2D(), there is now texSubImage3D().
So which one to choose? The situation is the same as in 2D, but nobody talks about this on the internet, even for 2D textures, so I am bringing it up here.
In the context of what to choose: texStorage3D() allocates the memory immediately on the GPU, while this isn’t the case with texImage3D(), because with it you don’t specify all the MIP levels up front. So the difference lies in when the MIP levels are specified. Think of it as a choice between efficiency and flexibility.
In my case, I have used texImage3D().
The complete code is available in this GitHub repo. For any queries, you can contact me on LinkedIn.
Thank you. Happy Learning !!!!
This is the exact thing that I was searching for; thanks for your great content. I like the way you not only present the idea but also provide other great sources to help understand it better. Kudos to your effort, and keep posting more frequently.
Thank you for reading and for your kind words. I am glad you found it and enjoyed reading it.
I always find the camera a tricky thing to understand programmatically. I know I am off-topic, but could you please help me understand how the camera rotates in the scene? And thanks for writing this rare gem of a piece.
Hello, I am sorry for the late reply, and thank you for your inspiring words.
If you are coming from a 3D library like three.js, as I did at first, it is common to ask this question. Those libraries present the camera as an object that rotates around the scene, and they make it look like you are adding a controller to the camera. This is a common way of understanding it because it helps us better visualize what we are doing. But programmatically this is not what happens. There is no such thing as a camera: the camera is just the inverse of the view matrix. If you think about it deeply, you can see that we are not actually rotating the camera; rather, we are moving the scene itself. We make it look like camera movement by applying the same relative motion. So it is not the camera that rotates; it’s the scene that rotates or translates.
To answer the question of how the camera rotates: there are multiple ways to rotate the camera or the scene; you can google them. In this project, I have implemented an arcball camera, which is an implementation of https://www.talisman.org/~erlkonig/misc/shoemake92-arcball.pdf by Ken Shoemake.