Numerical computation for the generation of photorealistic images
Maurizio Tomasi maurizio.tomasi@unimi.it
| Week | Topic |
|---|---|
| March, 2nd | Colors |
| March, 9th | HDR files |
| March, 16th | No classes! |
| March, 23rd | Tone mapping |
| March, 30th | Linear algebra |
| April, 8th | Clifford algebras (only on Wednesday!) |
| April, 13th | 3D projections |
| April, 20th | Geometrical shapes #1 |
| April, 27th | Geometrical shapes #2 |
| May, 4th | Path tracing #1 |
| May, 11th | Path tracing #2 |
| May, 18th | Compilers #1 |
| May, 25th | Compilers #2 |
Radiance (flux \Phi in Watts normalized on the projected surface per unit solid angle): L = \frac{\mathrm{d}^2\Phi}{\mathrm{d}\Omega\,\mathrm{d}A^\perp} = \frac{\mathrm{d}^2\Phi}{\mathrm{d}\Omega\,\mathrm{d}A\,\cos\theta}, \qquad [L] = \mathrm{W}/\mathrm{m}^2/\mathrm{sr}.
Rendering equation: \begin{aligned} L(x \rightarrow \Theta) = &L_e(x \rightarrow \Theta) +\\ &\int_{\Omega_x} f_r(x, \Psi \rightarrow \Theta)\,L(x \leftarrow \Psi)\,\cos(N_x, \Psi)\,\mathrm{d}\omega_\Psi. \end{aligned}
The quantities \Phi, L, etc. all depend on the wavelength \lambda (radiance → spectral radiance).
In numerical codes that simulate light propagation, we have to solve two problems:
A function f(\lambda) dependent on the wavelength has an infinite number of degrees of freedom: how can we represent it numerically?
In our case, radiance is perceived as a color: but how do you specify a color when controlling a monitor or a printer?

One number is not enough to encode a color: a single parameter suffices only for an ideal black body (where the temperature T is a sufficient descriptor)!
Emission spectra of real-world objects can be very complex (see previous lesson):

The term Spectral Power Distribution (SPD) is a generic term that indicates the functional form of a quantity dependent on λ: SPD of radiance, SPD of flux, SPD of emittance, etc.
The plots in the previous slide are in fact representations of different SPDs.
The visual perception of a color depends on the SPD of the irradiance that reaches the color-sensitive photoreceptors of the retina (cones).
There are two types of photoreceptors in the human eye:
Rods: photoreceptor cells highly sensitive to light intensity (~100 million per eye)
Cones: photoreceptor cells sensitive to the color of light (~5 million per eye)
Rods are not sensitive to the shape of the SPD (they do not discriminate colors), and are used mainly in low light conditions.
Since our focus today is on color, we will concentrate on cones.
There are three types of cones (probably):
There are many theories that explain how the brain combines the information from the three types of cones to represent a color.
In the animal world there is a lot of variety: the mantis shrimp has 12 types of cones!

Tristimulus theory of color: it is always possible to encode the color of the signal S(\lambda) perceived by the human eye using three scalar quantities related to the responses B_S(\lambda), B_M(\lambda), and B_L(\lambda) of the cones:
\begin{aligned} s &= k \int_\lambda \mathrm{d}\lambda\,S(\lambda)\,B_S(\lambda),\\ m &= k \int_\lambda \mathrm{d}\lambda\,S(\lambda)\,B_M(\lambda),\\ l &= k \int_\lambda \mathrm{d}\lambda\,S(\lambda)\,B_L(\lambda). \end{aligned}
Two different signals S_1(\lambda) \neq S_2(\lambda) can lead to the same triplet (s, m, l).
In this case, the two signals are perceived as the same color: they are indistinguishable to the human eye.
This phenomenon is called metamerism, and the two colors associated with the radiation hitting the eye are said to be metameric.
Metamerism suggests that our vision applies a “lossy compression” of the input signal, converting it into a simpler representation. This is why ray-tracing is computationally feasible!
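The three integrals above can be approximated numerically once S(\lambda) and a cone response B(\lambda) are sampled on the same wavelength grid. The following is a minimal sketch using the trapezoidal rule; the function name `cone_response`, its parameters, and the sampled values in the usage note are illustrative assumptions, not real cone data.

```cpp
#include <cstddef>
#include <vector>

// Trapezoidal estimate of k * \int S(lambda) B(lambda) dlambda.
// `lambda`, `signal` and `response` are assumed to have the same length,
// with `lambda` sorted in increasing order (hypothetical interface).
double cone_response(const std::vector<double>& lambda,
                     const std::vector<double>& signal,
                     const std::vector<double>& response,
                     double k = 1.0) {
    double sum = 0.0;
    for (std::size_t i = 1; i < lambda.size(); ++i) {
        const double left = signal[i - 1] * response[i - 1];
        const double right = signal[i] * response[i];
        // Area of the trapezoid between two consecutive samples
        sum += 0.5 * (left + right) * (lambda[i] - lambda[i - 1]);
    }
    return k * sum;
}
```

Two signals that differ only where the response is zero give the same result: with a response sampled as {0, 1, 0}, the signals {1, 1, 1} and {5, 1, 7} are indistinguishable to this (made-up) cone, which is metamerism in miniature.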
There are various color encodings, based on triplets of scalar quantities: XYZ, HSV, HSL, RGB…
Widely used encodings are RGB (monitors) and CMYK (printers)
In this course we will only deal with RGB encoding
RGB encoding uses three scalar quantities to identify a color: red (R), green (G), and blue (B).
It is based on the additive synthesis of colors, which is well suited to monitors (printers rely on subtractive synthesis and use CMYK encoding).
It is inherited from the operation of cathode ray tube (CRT) televisions and replicated on modern LED and LCD screens.
There are various types of screens (cathode ray tubes, LEDs, etc.), and the emission spectra of the three RGB channels can be different:
We will not cover this in detail for lack of time.
[Figure: three panels labeled Red, Green, and Blue]
Rendering equation expressed for L_\lambda: \begin{aligned} L_\lambda(x \rightarrow \Theta) = &L_{e,\lambda}(x \rightarrow \Theta) +\\ &\int_{\Omega_x} f_{r,\lambda}(x, \Psi \rightarrow \Theta)\,L_\lambda(x \leftarrow \Psi)\,\cos(N_x, \Psi)\,\mathrm{d}\omega_\Psi. \end{aligned}
We want to convert the equation in L_\lambda into three equations that provide the three components R, G, B that “drive” each pixel on the monitor.
If f_{r,\lambda} = f_{r, X} is constant in the band X(\lambda) (big approximation!), then
\begin{aligned} L_\lambda(x \rightarrow \Theta) = &L_{e,\lambda}(x \rightarrow \Theta) +\\ &\int_{\Omega_x}\! f_{r,\lambda}(x, \Psi \rightarrow \Theta)\,L_\lambda(x \leftarrow \Psi)\,\cos(N_x, \Psi)\,\mathrm{d}\omega_\Psi\\ \int_0^\infty\! X(\lambda)\,L_\lambda(x \rightarrow \Theta)\,\mathrm{d}\lambda = &\int_0^\infty\! X(\lambda)\,L_{e,\lambda}(x \rightarrow \Theta)\,\mathrm{d}\lambda +\\ &\iint\!\mathrm{d}\lambda\,\mathrm{d}\omega_\Psi\,X(\lambda)\,L_\lambda(x \leftarrow \Psi)\,f_{r,X}(x, \Psi \rightarrow \Theta)\,\cos(N_x, \Psi)\\ L_X(x \rightarrow \Theta) = &L_{X,e}(x \rightarrow \Theta) +\\ &\int_{\Omega_x}\! f_{r,X}(x, \Psi \rightarrow \Theta)\,L_X(x \leftarrow \Psi)\,\cos(N_x, \Psi)\,\mathrm{d}\omega_\Psi. \end{aligned}
If we denote with R, G and B the integrated and converted radiance in the RGB system, the rendering equation translates into a system of three equations.
These can be rewritten as a “vector” equation on \vec c = (R, G, B): \begin{aligned} \vec c(x \rightarrow \Theta) = &\vec c_{e}(x \rightarrow \Theta) +\\ &\int_{\Omega_x} \vec f_r(x, \Psi \rightarrow \Theta)\otimes \vec c(x \leftarrow \Psi)\,\cos(N_x, \Psi)\,\mathrm{d}\omega_\Psi.\\ \end{aligned}
where \vec v \otimes \vec w indicates the “vector” obtained by multiplying \vec v and \vec w component by component: (v_R w_R, v_G w_G, v_B w_B).
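In code, the \otimes product is simply a per-channel multiplication. A minimal sketch, assuming a floating-point color type; the name `Color` and its fields `r`, `g`, `b` are hypothetical and not prescribed by these notes:

```cpp
// Hypothetical floating-point RGB color, one value per band.
struct Color {
    float r, g, b;
};

// Component-wise product: the \otimes of the vector rendering equation.
inline Color operator*(const Color& v, const Color& w) {
    return Color{v.r * w.r, v.g * w.g, v.b * w.b};
}

// Sum of two colors, needed to add the emitted term \vec c_e.
inline Color operator+(const Color& v, const Color& w) {
    return Color{v.r + w.r, v.g + w.g, v.b + w.b};
}

// Scaling by a scalar, e.g. the cosine factor in the integrand.
inline Color operator*(const Color& c, float s) {
    return Color{c.r * s, c.g * s, c.b * s};
}
```

With these operators, one term of the integrand reads `f_r * c_in * cos_theta`, where `f_r` and `c_in` are `Color` values and `cos_theta` is a `float`.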
A monitor can be considered a matrix of emitting points (pixels, short for picture elements).
Each point is controlled by an RGB triplet of values.
The possible values are constrained within a finite interval.
A monitor therefore cannot, in general, reproduce the actual radiance L of a scene.
Today all monitors and graphics cards support the so-called “24-bit color depth” (about 16 million colors!).
An RGB triplet is encoded by a computer using three 8-bit integer values; for example, in C++ one could use a type like the following:
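A minimal sketch of such a type, assuming the name `Rgb24` and the field names `r`, `g`, `b` (none of which is given here):

```cpp
#include <cstdint>

// Hypothetical 24-bit color: one unsigned 8-bit integer per channel,
// so each component ranges from 0 to 255.
struct Rgb24 {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};
```

For example, `Rgb24{255, 255, 0}` would describe a fully saturated yellow.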
The total number of RGB combinations is 2^8 \times 2^8 \times 2^8 = 2^{24} = 16\,777\,216.
[Figure: three panels labeled Red, Green, and Blue]
The power emitted by the points of a screen does not vary linearly with the requested emission level.
The relationship between the requested emission level I and the flux \Phi actually emitted by a pixel is usually of the form \Phi = \Phi_0 + \bigl(\Phi_\text{max} - \Phi_0\bigr) \left(\frac{I}{I_\text{max}}\right)^\gamma\ \text{for R, G and B},
where I \in [0, I_\text{max}], and \gamma is a characteristic parameter of the device.
With 24-bit color depth, I_\text{max} = 255 and I is an integer number.
We assume here that \Phi_0 \approx 0.
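Setting I = I_\text{max}/2 and calling \text{value} the resulting relative flux, \gamma can be estimated as follows: \text{value} = \frac{\Phi}{\Phi_\text{max}} \stackrel{\Phi_0 \approx 0}{\approx} \left(\frac12\right)^\gamma \quad\Rightarrow\quad \gamma = \frac{\log(\text{value})}{\log 1/2}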
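The power law and its inversion can be sketched as follows, assuming \Phi_0 = 0 and taking \text{value} to be the relative flux \Phi/\Phi_\text{max} measured at I = I_\text{max}/2. The function names `relative_flux` and `estimate_gamma` are hypothetical.

```cpp
#include <cmath>

// Relative flux Phi / Phi_max for a requested level `level` in [0, i_max],
// under the assumption Phi_0 = 0.
double relative_flux(double level, double gamma, double i_max = 255.0) {
    return std::pow(level / i_max, gamma);
}

// Estimate of gamma from the relative flux measured at level i_max / 2:
// value = (1/2)^gamma  =>  gamma = log(value) / log(1/2).
double estimate_gamma(double value_at_half) {
    return std::log(value_at_half) / std::log(0.5);
}
```

As a consistency check, feeding `relative_flux(127.5, 2.2)` into `estimate_gamma` gives back a value very close to 2.2.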
Standard image file formats (PNG, JPEG, TIFF…) typically use sRGB encoding.
If we want our program to produce easy-to-use images, we must therefore convert the result of the rendering equation from RGB to sRGB.
Tone mapping is the process through which an RGB image is converted into an sRGB image, where by image we mean a matrix of RGB colors.
Two categories of images are relevant for this course: LDR (low dynamic range) and HDR (high dynamic range).
Both LDR and HDR images are encoded by a color matrix; each color is usually an RGB triplet.
The file usually has this content:
PPM is an LDR format, very common on Unix systems.
You can read and write it using NetPBM or ImageMagick. The latter is the more common of the two; it can be installed under Ubuntu with
$ sudo apt install imagemagick
You can convert images with the command
$ convert input.png output_p6.ppm # P6 Format
$ convert input.jpg -compress none output_p3.ppm # P3 Format
PPM is a format designed to be written and read easily.
A PPM file in P3 format is a plain-text file that can be opened with any editor.
Header:
P3;
ncol nrows (number of columns and rows);
the maximum value of a color component (255 for 8-bit channels).
Color matrix: the R, G, B triplets must be listed as integers, starting from the top left corner and ending at the bottom right, proceeding row by row.
P3
3 2
255
255 0 0
0 255 0
0 0 255
255 255 0
255 255 255
0 0 0
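Producing a P3 file takes only a few lines. The sketch below builds the file content as a string; the names `Pixel` and `to_ppm_p3` are hypothetical, the maximum value is fixed at 255, and the pixels are assumed to be stored row by row starting from the top left corner.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical 8-bit RGB pixel.
struct Pixel {
    std::uint8_t r, g, b;
};

// Returns the text of a P3 file. `pixels` is assumed to hold exactly
// width * height entries, row by row, starting from the top left corner.
std::string to_ppm_p3(const std::vector<Pixel>& pixels, int width, int height) {
    std::ostringstream out;
    out << "P3\n" << width << " " << height << "\n255\n";
    for (const Pixel& p : pixels) {
        // Cast to int: streaming a uint8_t directly would print a character.
        out << static_cast<int>(p.r) << " "
            << static_cast<int>(p.g) << " "
            << static_cast<int>(p.b) << "\n";
    }
    return out.str();
}
```

Called with the six pixels of the example above and a 3 × 2 size, it returns the same text, one triplet per line; writing the string to a file gives an image that viewers supporting PPM should open.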
PFM is a file format inspired by PPM, but it is an HDR format.
Very important for this course!
It has limited native support in standard image viewers: under Ubuntu there is only pftools, which is installed with
$ sudo apt install pftools
We will write our own tools to convert PFM files to PPM, so pftools will not be necessary.
Like PPM files in P6 format, PFM files are also partially text and partially binary.
Header:
PF, followed by the character 0x0a (newline);
ncol nrows (columns and rows), followed by 0x0a;
-1.0, followed by 0x0a.
Color matrix: the R, G, B triplets must be written as sequences of 32-bit floating-point numbers in binary form (so not text!), from left to right and from bottom to top (different from PPM!).
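A sketch of a PFM writer following this layout. It assumes that the value -1.0 marks little-endian byte order (the usual PFM convention: a negative value means little endian), that `float` is a 32-bit IEEE 754 number, and that pixels are stored in memory with the top row first, as in PPM. The names `ColorF`, `append_float_le`, and `to_pfm` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical floating-point RGB color.
struct ColorF {
    float r, g, b;
};

// Appends the four bytes of `value`, least significant byte first.
void append_float_le(std::string& out, float value) {
    static_assert(sizeof(float) == 4, "a 32-bit float is assumed");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<char>((bits >> shift) & 0xFFu));
    }
}

// Returns the bytes of a PFM file. `pixels` is assumed to hold exactly
// width * height entries, row by row, with the top row first.
std::string to_pfm(const std::vector<ColorF>& pixels, int width, int height) {
    std::string out = "PF\n" + std::to_string(width) + " " +
                      std::to_string(height) + "\n-1.0\n";
    // PFM stores rows from bottom to top, so walk the rows backwards.
    for (int row = height - 1; row >= 0; --row) {
        for (int col = 0; col < width; ++col) {
            const ColorF& c = pixels[static_cast<std::size_t>(row) * width + col];
            append_float_le(out, c.r);
            append_float_le(out, c.g);
            append_float_le(out, c.b);
        }
    }
    return out;
}
```

The result must be written to a file opened in binary mode; otherwise, on some systems, the newline bytes may be altered.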