Calcolo numerico per la generazione di immagini fotorealistiche
Maurizio Tomasi maurizio.tomasi@unimi.it
The HdrImage class must be able to load and save images to disk.
Since HdrImage uses floating-point numbers for the three color components (red, green, blue), an HDR format is required, so PNG, JPEG and the like are not suitable.
We will use the PFM format.
Writing PFM files is relatively trivial, because they have a very simple format.
A PFM file is a binary file, but it starts as if it were a text file (with ASCII characters, so we don’t have to worry about Unicode):
PF
width height
±1.0
where width and height are the width (number of columns) and height (number of rows) of the image; then the RGB values follow in binary.
The newline characters in the first three lines must always and only be encoded as 0x0a (\n). Thus, on Windows you must not write newlines using 0x0d 0x0a (\r\n).
±1.0
The third line of the file header must contain a positive (e.g., 1.0) or negative number.
This number signals how each of the RGB components of a color (a 32-bit floating-point number) is encoded:
a positive number means big endian;
a negative number means little endian.
When writing, we could choose one of the two formats and not worry too much, but when reading, we must handle both!
You must be sure to write floating-point numbers in binary! In C++, sending the value 1.3 to a stream with the << operator prints the characters «1», «.» and «3» (text encoding!).
Each language has a different approach; in Python, for example, the struct module is used:
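As a minimal sketch of the difference between text and binary encoding, the snippet below uses `struct.pack` from the Python standard library. The value 1.3 is taken from the text above; the variable names are only illustrative.

```python
import struct

value = 1.3

# Text encoding: the three ASCII characters "1", ".", "3"
as_text = str(value).encode("ascii")

# Binary encoding: always four bytes for a 32-bit float
as_le = struct.pack("<f", value)  # "<" means little endian
as_be = struct.pack(">f", value)  # ">" means big endian

print(as_text)               # b'1.3'
print(len(as_le))            # 4
print(as_le == as_be[::-1])  # True: same bytes, opposite order
```

The two binary encodings contain the same four bytes in reversed order, which is why a reader must know the endianness before decoding.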
HdrImage API
The way a data type or function should be used by the programmer is called Application Program Interface (API).
In our case, the API for writing a PFM file consists of how we would invoke a write_pfm function:
The API should also be modeled on the tests that need to be written for it.
Let’s consider the case of write_pfm. How should we write a test for this function?
If the function writes to a file, it means that we should then load the file and verify that it was written correctly.
Does this mean that until we have a parallel read_pfm routine we can’t test write_pfm?
You can think of a binary file as a vector (one-dimensional array) of bytes, one after the other. (A text file is the same, but in UTF encoding it is a sequence of code points rather than bytes, and it is a bit more complicated).
Modern languages introduce an abstraction: the stream. (Alas, D doesn’t have it in its standard libraries: if you program in D, use stream.d)
This abstraction is very useful in tests.
Simply put, a stream is an object capable of performing these operations:
These two operations are those typically performed on files, but a stream is also applicable to other contexts:
We could consider modifying our API so that it writes to a generic stream, like the write_hello example in the video:
When the program is running, we’ll make stream a real file.
When we need to run a test, we can instead make stream an in-memory variable. The bytes will not be written to a file, but kept in a byte array, on which we will perform asserts.
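The idea can be sketched in a few lines of Python. The body of `write_hello` here is an assumption (the original example is only shown in the video), and `hello.txt` is a hypothetical file name; the point is only that the same function accepts both a file and an in-memory `BytesIO` object.

```python
from io import BytesIO


def write_hello(stream):
    # Works with any object that provides a "write" method accepting bytes
    stream.write(b"hello, world!\n")


# In a test: an in-memory stream, so no file is touched
buf = BytesIO()
write_hello(buf)
assert buf.getvalue() == b"hello, world!\n"

# In the real program the same function would receive a file, e.g.:
#     with open("hello.txt", "wb") as outf:
#         write_hello(outf)
```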
write_pfm Method
def write_pfm(self, stream, endianness=Endianness.LITTLE_ENDIAN):
    endianness_str = "-1.0" if endianness == Endianness.LITTLE_ENDIAN else "1.0"
    # The PFM header, as a Python string (Unicode)
    header = f"PF\n{self.width} {self.height}\n{endianness_str}\n"
    # Convert the header into a sequence of ASCII bytes
    stream.write(header.encode("ascii"))
    # Write the image (bottom-to-top, left-to-right)
    for y in reversed(range(self.height)):
        for x in range(self.width):
            color = self.get_pixel(x, y)
            _write_float(stream, color.r, endianness)
            _write_float(stream, color.g, endianness)
            _write_float(stream, color.b, endianness)
I created two PFM files with these characteristics:
One is encoded as little endian (-1.0), the other as big endian (1.0);
Color matrix (RGB) of size 3×2 pixels:
| | #1 | #2 | #3 |
|---|---|---|---|
| #A | (10, 20, 30) | (40, 50, 60) | (70, 80, 90) |
| #B | (100, 200, 300) | (400, 500, 600) | (700, 800, 900) |
It is useful to have the files on your disk. Download them with the names reference_le.pfm and reference_be.pfm and save them in your repository, possibly in the same directory as the tests.
The first approach is to read the reference_le.pfm file and compare it with the file that would have been written by write_pfm:
img = HdrImage(3, 2)
img.set_pixel(0, 0, Color(1.0e1, 2.0e1, 3.0e1)) # Each component is
img.set_pixel(1, 0, Color(4.0e1, 5.0e1, 6.0e1)) # different from any
img.set_pixel(2, 0, Color(7.0e1, 8.0e1, 9.0e1)) # other: important in
img.set_pixel(0, 1, Color(1.0e2, 2.0e2, 3.0e2)) # tests!
img.set_pixel(1, 1, Color(4.0e2, 5.0e2, 6.0e2))
img.set_pixel(2, 1, Color(7.0e2, 8.0e2, 9.0e2))
buf = BytesIO()
img.write_pfm(buf, endianness=Endianness.LITTLE_ENDIAN)
with open("reference_le.pfm", "rb") as inpf:
    # Files opened with open(..., "rb") provide read(), not readall()
    reference_bytes = inpf.read()
assert buf.getvalue() == reference_bytes
Another approach is possible. If we run xxd on the file reference_le.pfm, we can get the sequence of byte values in C/C++ format:
$ xxd -i reference_le.pfm
unsigned char reference_le_pfm[] = {
0x50, 0x46, 0x0a, 0x33, 0x20, 0x32, 0x0a, 0x2d, 0x31, 0x2e, 0x30, 0x0a,
0x00, 0x00, 0xc8, 0x42, 0x00, 0x00, 0x48, 0x43, 0x00, 0x00, 0x96, 0x43,
0x00, 0x00, 0xc8, 0x43, 0x00, 0x00, 0xfa, 0x43, 0x00, 0x00, 0x16, 0x44,
0x00, 0x00, 0x2f, 0x44, 0x00, 0x00, 0x48, 0x44, 0x00, 0x00, 0x61, 0x44,
0x00, 0x00, 0x20, 0x41, 0x00, 0x00, 0xa0, 0x41, 0x00, 0x00, 0xf0, 0x41,
0x00, 0x00, 0x20, 0x42, 0x00, 0x00, 0x48, 0x42, 0x00, 0x00, 0x70, 0x42,
0x00, 0x00, 0x8c, 0x42, 0x00, 0x00, 0xa0, 0x42, 0x00, 0x00, 0xb4, 0x42
};
unsigned int reference_le_pfm_len = 84;
If we insert this byte sequence into our program, we can make a direct comparison in memory:
# Create "img" as in the previous case, then…
# Little-endian format
reference_bytes = bytes([
0x50, 0x46, 0x0a, 0x33, 0x20, 0x32, 0x0a, 0x2d, 0x31, 0x2e, 0x30, 0x0a,
0x00, 0x00, 0xc8, 0x42, 0x00, 0x00, 0x48, 0x43, 0x00, 0x00, 0x96, 0x43,
0x00, 0x00, 0xc8, 0x43, 0x00, 0x00, 0xfa, 0x43, 0x00, 0x00, 0x16, 0x44,
0x00, 0x00, 0x2f, 0x44, 0x00, 0x00, 0x48, 0x44, 0x00, 0x00, 0x61, 0x44,
0x00, 0x00, 0x20, 0x41, 0x00, 0x00, 0xa0, 0x41, 0x00, 0x00, 0xf0, 0x41,
0x00, 0x00, 0x20, 0x42, 0x00, 0x00, 0x48, 0x42, 0x00, 0x00, 0x70, 0x42,
0x00, 0x00, 0x8c, 0x42, 0x00, 0x00, 0xa0, 0x42, 0x00, 0x00, 0xb4, 0x42
])
# No file is being read/written here!
buf = BytesIO()
img.write_pfm(buf)
assert buf.getvalue() == reference_bytes
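To see how the reference bytes relate to the color matrix, one can decode the first twelve bytes that follow the 12-byte header of the little-endian file. This is only an illustrative check using `struct.unpack`; the variable names are not part of the course code.

```python
import struct

# The first twelve bytes after the header of "reference_le.pfm"
first_pixel = bytes([
    0x00, 0x00, 0xc8, 0x42, 0x00, 0x00, 0x48, 0x43, 0x00, 0x00, 0x96, 0x43,
])

# "<3f" means three little-endian 32-bit floats
r, g, b = struct.unpack("<3f", first_pixel)
print(r, g, b)  # 100.0 200.0 300.0
```

The first pixel stored in the file is (100, 200, 300), the one set with `img.set_pixel(0, 1, ...)`: the last row of the image is written first, in agreement with the bottom-to-top order used by write_pfm.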
Let’s now address the much more challenging problem of reading files.
Unlike writing, reading presents more difficulties:
Implementing PFM reading within a constructor is possible (C++):
It’s ok to define a read_pfm_file function:
The problem of picking which possibility is the best is related to the choice of an API.
The choice depends on personal taste and other factors:
In the previous slides, the C++ interface to read a file is through a stream (std::istream);
We saw that streams simplify the creation of tests, as they do not need to read data from disk;
Moreover, the code is more versatile: instead of reading data from disk, we could read it from an Internet connection or a compressed file:
Anyway, it would be really handy to open a file directly:
In C++, we could think of implementing an overloaded constructor:
Unfortunately this is not valid C++: the code does not compile!
The std::ifstream instance in the second constructor is temporary and cannot be passed to the first constructor:
$ g++ -c hdrimages.cpp
hdrimages.cpp: In constructor ‘HdrImage::HdrImage(const string&)’:
hdrimages.cpp:33:58: error: cannot bind non-const lvalue reference of type ‘std::istream&’ {aka ‘std::basic_istream<char>&’} to an rvalue of type ‘std::basic_istream<char>’
There are similar problems in C# and Kotlin, as secondary constructors must call the primary constructor before anything else. This problem is known as the constructor chaining issue.
The easiest solution is to implement the functionality in a function/method and call it in both constructors.
struct HdrImage {
private:
  // Put the code that reads the PFM file in a separate method
  void read_pfm_file(std::istream & stream);

public:
  // First constructor: invoke `read_pfm_file`
  HdrImage(std::istream & stream) { read_pfm_file(stream); }

  // Second constructor: again, invoke `read_pfm_file`
  HdrImage(const std::string & file_name) {
    std::ifstream stream{file_name};
    read_pfm_file(stream);
  }
};
Let’s recall the shape of the PFM format:
PF
width height
±1.0
<binary data>
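The reading code must verify the following things:
the file begins with PF\n, otherwise it is not in PFM format;
the line that specifies the endianness contains 1.0 or -1.0;
errors can be signaled by raising an exception or by returning a type such as std::expected in C++ or Result in Rust.
InvalidPfmFileFormat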
In Python, it is enough to create a class derived from Exception:
class InvalidPfmFileFormat(Exception):
    def __init__(self, error_message: str):
        super().__init__(error_message)
Note that we accept an error message to better identify which problem occurred when reading the PFM file.
If possible, follow the same strategy for your language, perhaps being careful to derive the class from a pre-existing exception suitable for the context (e.g., System.FormatException in C#, RuntimeException in Kotlin).
If we handle error conditions using exceptions, we can decide how to handle errors depending on the context.
For example, in a main that wants to open a PFM file provided by the user:
In a unit test that needs to open a file containing reference data, we would not catch the exception in a try ... except.
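The following sketch illustrates the first case. `check_magic` is a hypothetical stand-in for the full PFM reader (it only inspects the first three bytes), and `run` plays the role of a `main` that reports the problem to the user and returns an exit code instead of crashing.

```python
from io import BytesIO


class InvalidPfmFileFormat(Exception):
    pass


def check_magic(stream):
    # Hypothetical stand-in for the full PFM reader: it only checks the magic bytes
    if stream.read(3) != b"PF\n":
        raise InvalidPfmFileFormat("invalid magic in PFM file")


def run(stream):
    # What a "main" might do: print a readable message and return an exit code
    try:
        check_magic(stream)
    except InvalidPfmFileFormat as err:
        print(f"Error: {err}")
        return 1
    return 0


print(run(BytesIO(b"PF\n3 2\n-1.0\n")))  # 0
print(run(BytesIO(b"P6\n3 2\n255\n")))   # prints the error message, then 1
```

A unit test would call `check_magic` directly, so that an unexpected exception makes the test fail.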
To interpret a PFM file, we will need to call standard library functions of our language, for instance to convert a string into an integer (e.g., 320).
In case of errors, the language’s core functions can raise exceptions (e.g., ValueError in Python when trying to convert a string like hello, world! to an integer).
We need to ensure that we “catch” these exceptions and convert them into InvalidPfmFileFormat; otherwise, the code from the previous slide would no longer work.
We saw in the last lesson that it is easier to write tests for smaller functions.
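In our case, reading a PFM file could rely on the following functions:
a function that reads a 32-bit floating-point number from the stream;
a function that reads a line of text up to \n (if the language does not already provide one);
a function that parses the image size;
a function that parses the endianness.
Each of these functions can be tested in a dedicated unit test.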
import struct

_FLOAT_STRUCT_FORMAT = {
    Endianness.LITTLE_ENDIAN: "<f",
    Endianness.BIG_ENDIAN: ">f",
}

# This function is meant to be used with PFM files only! It raises an
# InvalidPfmFileFormat exception if not enough bytes are available.
def _read_float(stream, endianness=Endianness.LITTLE_ENDIAN):
    format_str = _FLOAT_STRUCT_FORMAT[endianness]
    try:
        return struct.unpack(format_str, stream.read(4))[0]
    except struct.error:
        # Capture the exception and convert it into a more appropriate type
        raise InvalidPfmFileFormat("impossible to read binary data from the file")
In Python, we can avoid implementing tests for _read_float: it is a function that simply acts as a wrapper for a standard function in the Python library.
However, if the standard library of your language does not provide a similar feature, you should test it…
…but actually we will verify its behavior when we test reading a PFM file from start to finish, so you can avoid creating a unit test for it.
def _parse_endianness(line: str):
    try:
        value = float(line)
    except ValueError:
        raise InvalidPfmFileFormat("missing endianness specification")

    if value > 0:
        return Endianness.BIG_ENDIAN
    elif value < 0:
        return Endianness.LITTLE_ENDIAN
    else:
        raise InvalidPfmFileFormat("invalid endianness specification, it cannot be zero")
def test_pfm_parse_endianness():
    assert _parse_endianness("1.0") == Endianness.BIG_ENDIAN
    assert _parse_endianness("-1.0") == Endianness.LITTLE_ENDIAN

    # We must test that the function properly raises an exception when
    # wrong input is passed. Here we use the "pytest" framework to do this.
    with pytest.raises(InvalidPfmFileFormat):
        _ = _parse_endianness("0.0")

    with pytest.raises(InvalidPfmFileFormat):
        _ = _parse_endianness("abc")
def _parse_img_size(line: str):
    elements = line.split(" ")
    if len(elements) != 2:
        raise InvalidPfmFileFormat("invalid image size specification")

    try:
        width, height = (int(elements[0]), int(elements[1]))
        if (width < 0) or (height < 0):
            raise ValueError()
    except ValueError:
        raise InvalidPfmFileFormat("invalid width/height")

    return width, height
read_pfm_image
def read_pfm_image(stream):
    # The first bytes in a binary file are usually called «magic bytes»
    # See https://hackers.town/@zwol/114155595855705796
    magic = _read_line(stream)
    if magic != "PF":
        raise InvalidPfmFileFormat("invalid magic in PFM file")

    img_size = _read_line(stream)
    (width, height) = _parse_img_size(img_size)

    endianness_line = _read_line(stream)
    endianness = _parse_endianness(endianness_line)

    result = HdrImage(width=width, height=height)
    for y in range(height - 1, -1, -1):
        for x in range(width):
            (r, g, b) = [_read_float(stream, endianness) for i in range(3)]
            result.set_pixel(x, y, Color(r, g, b))

    return result
We have implemented tests for all the functions on which read_pfm_image is built: _read_line, _parse_endianness, etc.
But how can we be sure that we have correctly combined the functions?
It is necessary to go beyond unit tests and perform a test that runs the entire machinery from start to finish.
A test on a complex function that calls already tested simple functions is called an integration test.
Specifically, our test must verify functionality on little-endian files (reference_le.pfm), big-endian files (reference_be.pfm), and also on invalid files.
read_pfm_file
# This is the content of "reference_le.pfm" (little-endian file)
LE_REFERENCE_BYTES = bytes([
0x50, 0x46, 0x0a, 0x33, 0x20, 0x32, 0x0a, 0x2d, 0x31, 0x2e, 0x30, 0x0a,
0x00, 0x00, 0xc8, 0x42, 0x00, 0x00, 0x48, 0x43, 0x00, 0x00, 0x96, 0x43,
0x00, 0x00, 0xc8, 0x43, 0x00, 0x00, 0xfa, 0x43, 0x00, 0x00, 0x16, 0x44,
0x00, 0x00, 0x2f, 0x44, 0x00, 0x00, 0x48, 0x44, 0x00, 0x00, 0x61, 0x44,
0x00, 0x00, 0x20, 0x41, 0x00, 0x00, 0xa0, 0x41, 0x00, 0x00, 0xf0, 0x41,
0x00, 0x00, 0x20, 0x42, 0x00, 0x00, 0x48, 0x42, 0x00, 0x00, 0x70, 0x42,
0x00, 0x00, 0x8c, 0x42, 0x00, 0x00, 0xa0, 0x42, 0x00, 0x00, 0xb4, 0x42
])
# This is the content of "reference_be.pfm" (big-endian file)
BE_REFERENCE_BYTES = bytes([
0x50, 0x46, 0x0a, 0x33, 0x20, 0x32, 0x0a, 0x31, 0x2e, 0x30, 0x0a, 0x42,
0xc8, 0x00, 0x00, 0x43, 0x48, 0x00, 0x00, 0x43, 0x96, 0x00, 0x00, 0x43,
0xc8, 0x00, 0x00, 0x43, 0xfa, 0x00, 0x00, 0x44, 0x16, 0x00, 0x00, 0x44,
0x2f, 0x00, 0x00, 0x44, 0x48, 0x00, 0x00, 0x44, 0x61, 0x00, 0x00, 0x41,
0x20, 0x00, 0x00, 0x41, 0xa0, 0x00, 0x00, 0x41, 0xf0, 0x00, 0x00, 0x42,
0x20, 0x00, 0x00, 0x42, 0x48, 0x00, 0x00, 0x42, 0x70, 0x00, 0x00, 0x42,
0x8c, 0x00, 0x00, 0x42, 0xa0, 0x00, 0x00, 0x42, 0xb4, 0x00, 0x00
])
def test_pfm_read(self):
    for reference_bytes in [LE_REFERENCE_BYTES, BE_REFERENCE_BYTES]:
        img = read_pfm_image(BytesIO(reference_bytes))
        assert img.width == 3
        assert img.height == 2

        assert img.get_pixel(0, 0).is_close(Color(1.0e1, 2.0e1, 3.0e1))
        assert img.get_pixel(1, 0).is_close(Color(4.0e1, 5.0e1, 6.0e1))
        assert img.get_pixel(2, 0).is_close(Color(7.0e1, 8.0e1, 9.0e1))
        assert img.get_pixel(0, 1).is_close(Color(1.0e2, 2.0e2, 3.0e2))
        assert img.get_pixel(1, 1).is_close(Color(4.0e2, 5.0e2, 6.0e2))
        assert img.get_pixel(2, 1).is_close(Color(7.0e2, 8.0e2, 9.0e2))

def test_pfm_read_wrong(self):
    # The header is valid, but the binary data stop after four bytes
    buf = BytesIO(b"PF\n3 2\n-1.0\nstop")
    with pytest.raises(InvalidPfmFileFormat):
        _ = read_pfm_image(buf)
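Implement the following functions:
read a 32-bit floating-point number from a stream (_read_float in the Python example);
read a sequence of bytes up to \n or the end of the stream (_read_line);
parse the image size (_parse_img_size);
parse the endianness (_parse_endianness).
Implement a function/method that reads a PFM file from a stream.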
Implement the same tests as in the Python example. Also, verify that your methods correctly handle errors.
For file access, C++ is not very sophisticated: you open the file for writing using std::ofstream.
In-memory streams (like BytesIO in Python) are implemented by std::stringstream (in <sstream>):
C++ does not offer many tools to decompose a float variable into its four bytes; use this implementation and study it carefully:
#include <cstdint> // It contains uint8_t and uint32_t
#include <cstring> // It contains std::memcpy
#include <ostream>

enum class Endianness { little_endian, big_endian };

void write_float(std::ostream &stream, float value, Endianness endianness) {
  // Copy the 32 bits of "value" into an integer. Using std::memcpy avoids the
  // undefined behavior of reading a float through a uint32_t pointer.
  uint32_t double_word;
  std::memcpy(&double_word, &value, sizeof(double_word));

  // Extract the four bytes in "double_word" using bit-level operators
  uint8_t bytes[] = {
      static_cast<uint8_t>(double_word & 0xFF), // Least significant byte
      static_cast<uint8_t>((double_word >> 8) & 0xFF),
      static_cast<uint8_t>((double_word >> 16) & 0xFF),
      static_cast<uint8_t>((double_word >> 24) & 0xFF), // Most significant byte
  };

  switch (endianness) {
  case Endianness::little_endian:
    for (int i{}; i < 4; ++i) // Forward loop
      stream << bytes[i];
    break;

  case Endianness::big_endian:
    for (int i{3}; i >= 0; --i) // Backward loop
      stream << bytes[i];
    break;
  }
}

// You can use "write_float" to write little/big endian-encoded floats:
// write_float(stream, 10.0, Endianness::little_endian);
// write_float(stream, 10.0, Endianness::big_endian);
On the third line of the PFM file, you must write 1.0 or -1.0 depending on the endianness.
The write_float function from the previous slide works in both cases, so you can choose one and use that.
Side note: the following function returns true when run on a little endian system, and false otherwise:
Java and Kotlin have the classes InputStream and OutputStream (in java.io) to represent a stream. These are suitable for the writeFloat and writePfm prototypes.
FileOutputStream opens a file for writing and returns a stream.
In-memory streams are created with ByteArrayOutputStream.
To open a file, operate on it, and close it, Kotlin offers the very convenient use, similar to using in C#:
Endianness is identified by the ByteOrder type in java.nio (a Java class, but you can natively use Java libraries in Kotlin).
To write/read values in binary format, there is the ByteBuffer class, also in java.nio. Example in Kotlin:
ByteBuffer
Bytes in Kotlin are signed (very strange!).
To initialize an array from hexadecimal values like those printed by xxd -i reference_be.pfm, you need a small helper function:
reference_be.pfm
val reference_be = byteArrayOfInts(
0x50, 0x46, 0x0a, 0x33, 0x20, 0x32, 0x0a, 0x31, 0x2e, 0x30, 0x0a, 0x42,
0xc8, 0x00, 0x00, 0x43, 0x48, 0x00, 0x00, 0x43, 0x96, 0x00, 0x00, 0x43,
0xc8, 0x00, 0x00, 0x43, 0xfa, 0x00, 0x00, 0x44, 0x16, 0x00, 0x00, 0x44,
0x2f, 0x00, 0x00, 0x44, 0x48, 0x00, 0x00, 0x44, 0x61, 0x00, 0x00, 0x41,
0x20, 0x00, 0x00, 0x41, 0xa0, 0x00, 0x00, 0x41, 0xf0, 0x00, 0x00, 0x42,
0x20, 0x00, 0x00, 0x42, 0x48, 0x00, 0x00, 0x42, 0x70, 0x00, 0x00, 0x42,
0x8c, 0x00, 0x00, 0x42, 0xa0, 0x00, 0x00, 0x42, 0xb4, 0x00, 0x00
)
Do the same with reference_le.pfm.
Java and Kotlin internally represent strings using UTF-16 encoding.
To convert the encoding to ASCII so it can be saved in a binary file, Kotlin offers the very convenient toByteArray() method:
In Julia, streams are represented as subtypes of IO.
Instead of defining a savepfm function, provide a new definition of write using multiple dispatch:
This way you will extend the write function (implemented by Julia for basic types) to your HdrImage type as well.
To determine if the machine is little endian or big endian, there is the constant ENDIAN_BOM:
To convert a floating-point number to an integer and vice-versa, there is reinterpret:
You can convert an integer value from big endian or little endian to the local machine format with the functions ntoh, hton, ltoh and htol.
The letter h stands for «host», and indicates the machine on which the program is running.
Obviously, on little endian machines the functions ltoh and htol correspond to the identity; on big endian machines this applies to ntoh and hton.
Strings in Julia are of type String, and are encoded as UTF-8.
Characters are of type Char, but unlike C++ they are 32-bit values: in other words, they are Unicode code points stored using UTF-32.
To convert a string to a sequence of bytes, use transcode:
In C#, a stream is of type Stream, which is a base class from which FileStream and MemoryStream derive.
To open a file for writing, use the using keyword:
The BitConverter class implements methods for converting values to and from sequences of bytes, which can then be written to or read from streams.
The following method writes a 32-bit floating-point number in binary:
The field BitConverter.IsLittleEndian exists to decide whether to write 1.0 or -1.0 in the PFM file.
C#, unlike C++, distinguishes between strings (encoded in Unicode with UTF-16) and byte sequences.
To correctly write the header, the simplest thing is to create a Unicode string and then convert it to ASCII:
where endianness_value is a double that is either 1.0 or -1.0.
Unfortunately, the most recent version of the D language does not natively support streams.
But it’s not a big deal, because you can use dynamic byte sequences like ubyte[]; for writing, an Appender is even better (more efficient), or alternatively outbuffer.
The language provides the Endian type and the std.bitmanip library, which provides the append template function:
To write a float number to an Appender, you can use this code:
Note that you cannot pass the endianness value to append! because, being a template, it requires the value to be known at compile time.
To write the header, you can send the ASCII header string with the put function: appender.put(cast(ubyte[])(header_string)).
It’s simple to interpret a byte array as a stream:
class StringStream {
    uint curidx = 0;
    const ubyte[] stream;

    this(const ubyte[] _stream) { stream = _stream; }

    bool eof() { return curidx >= stream.length; }

    char read_char() {
        if (curidx < stream.length) {
            char result = stream[curidx];
            curidx++;
            return result;
        }

        return 0; // If we are at the end of the string, return 0
    }
}
Add a read_line() method for convenience.
To read a float, use this method:
In addition to decode_pfm(const ubyte[]), you can also implement:
enum and match
To specify the endianness there is the type ByteOrder in the crate endianness.
With enums, get used to using match instead of if:
Use the Write and Read traits to define functions that read and write to a stream. For example:
You can make the code faster using BufWriter and BufReader, but it’s not necessary (it certainly won’t be the bottleneck!).