Libpolycomp  1.0
A compression/decompression library that implements the polynomial compression and other simple compression schemes
Run-Length Encoding

Functions

size_t pcomp_rle_bufsize (size_t input_size)
 Calculate an upper limit for the size of a buffer holding RLE-encoded streams. More...
 
int pcomp_compress_rle_int8 (int8_t *output_buf, size_t *output_size, const int8_t *input_buf, size_t input_size)
 Compress an array of int8_t values using the RLE compression. More...
 
int pcomp_compress_rle_int16 (int16_t *output_buf, size_t *output_size, const int16_t *input_buf, size_t input_size)
 Compress an array of int16_t values using the RLE compression. More...
 
int pcomp_compress_rle_int32 (int32_t *output_buf, size_t *output_size, const int32_t *input_buf, size_t input_size)
 Compress an array of int32_t values using the RLE compression. More...
 
int pcomp_compress_rle_int64 (int64_t *output_buf, size_t *output_size, const int64_t *input_buf, size_t input_size)
 Compress an array of int64_t values using the RLE compression. More...
 
int pcomp_compress_rle_uint8 (uint8_t *output_buf, size_t *output_size, const uint8_t *input_buf, size_t input_size)
 Compress an array of uint8_t values using the RLE compression. More...
 
int pcomp_compress_rle_uint16 (uint16_t *output_buf, size_t *output_size, const uint16_t *input_buf, size_t input_size)
 Compress an array of uint16_t values using the RLE compression. More...
 
int pcomp_compress_rle_uint32 (uint32_t *output_buf, size_t *output_size, const uint32_t *input_buf, size_t input_size)
 Compress an array of uint32_t values using the RLE compression. More...
 
int pcomp_compress_rle_uint64 (uint64_t *output_buf, size_t *output_size, const uint64_t *input_buf, size_t input_size)
 Compress an array of uint64_t values using the RLE compression. More...
 

Detailed Description

The algorithm and its applicability

Libpolycomp implements routines for compressing and decompressing streams of data using the Run-Length Encoding (RLE) scheme. This kind of compression is perfect for data streams which contain long sequences of repeated data, e.g.,

1041 1041 1041 1041 1280 1280 1041 1041 1041 1041

The RLE scheme works by encoding each sample together with the number of consecutive repeats found. Therefore, for the previous example the encoding would be

1041 4  1280 2  1041 4 

In certain cases, RLE can outperform other well-known compression schemes like gzip and bzip2.

The RLE scheme can be applied reliably only on sequences of integer data. The algorithm involves the comparison between consecutive values, and this cannot be done reliably with floating-point numbers because of round-off errors. The Libpolycomp library implements many similar functions (e.g., pcomp_compress_rle_int8) for signed and unsigned integers, with sizes of 1, 2, 4, and 8 bytes.

Libpolycomp correctly takes into account possible overflows. Suppose for instance that the input datastream contains 1000 repetitions of the 8-bit value "152". In this case, Libpolycomp produces the following output:

152 256  152 256  152 256  152 232 

as 1000 = 256 * 3 + 232.

Implementation details

The compression and decompression routines require the caller to pre-allocate the memory which will contain the output. For compression routines, one typically uses the function pcomp_rle_bufsize to get an upper estimate to the size of the output buffer, and after the compression resizes the buffer (is needed) in order to free the unused memory. See the documentation for pcomp_compress_rle_int8 and pcomp_decompress_rle_int8 for examples.

Function Documentation

int pcomp_compress_rle_int16 ( int16_t *  output_buf,
size_t *  output_size,
const int16_t *  input_buf,
size_t  input_size 
)

Compress an array of int16_t values using the RLE compression.

Refer to the documentation for pcomp_compress_rle_int8 for more information.

int pcomp_compress_rle_int32 ( int32_t *  output_buf,
size_t *  output_size,
const int32_t *  input_buf,
size_t  input_size 
)

Compress an array of int32_t values using the RLE compression.

Refer to the documentation for pcomp_compress_rle_int8 for more information.

int pcomp_compress_rle_int64 ( int64_t *  output_buf,
size_t *  output_size,
const int64_t *  input_buf,
size_t  input_size 
)

Compress an array of int64_t values using the RLE compression.

Refer to the documentation for pcomp_compress_rle_int8 for more information.

int pcomp_compress_rle_int8 ( int8_t *  output_buf,
size_t *  output_size,
const int8_t *  input_buf,
size_t  input_size 
)

Compress an array of int8_t values using the RLE compression.

The size of the output buffer is typically guessed using pcomp_rle_bufsize. After the call to the function, the buffer should be resized to claim unused space at the end. Here is an example:

int8_t input_buf[] = { 10, 10, 20, 30, 30, 30 };
size_t input_size = sizeof(input_buf) / sizeof(input_buf[0]);
size_t output_size = pcomp_rle_bufsize(input_size) *
sizeof(int8_t);
int8_t* output_buf = malloc(output_size);
pcomp_compress_rle_int8(output_buf, &output_size, input_buf,
input_size);
output_buf = realloc(output_buf, output_size * sizeof(int8_t));
Parameters
[out]output_bufThe buffer that will hold the compressed stream. It must have space for a number of elements at least equal to the value returned by pcomp_rle_bufsize.
[in,out]output_sizeOn entering the function, this must specify the number of elements (not bytes) that can be written in output_buf. On exit, it will contain the actual number of elements written.
[in]input_bufThe array of values to compress
[in]input_sizeNumber of elements (not bytes) in the array input_buf
Returns
PCOMP_STAT_SUCCESS if the encoding completed successfully. Otherwise, the error code specifies the kind of error occurred during the call.
int pcomp_compress_rle_uint16 ( uint16_t *  output_buf,
size_t *  output_size,
const uint16_t *  input_buf,
size_t  input_size 
)

Compress an array of uint16_t values using the RLE compression.

Refer to the documentation for pcomp_compress_rle_int8 for more information.

int pcomp_compress_rle_uint32 ( uint32_t *  output_buf,
size_t *  output_size,
const uint32_t *  input_buf,
size_t  input_size 
)

Compress an array of uint32_t values using the RLE compression.

Refer to the documentation for pcomp_compress_rle_int8 for more information.

int pcomp_compress_rle_uint64 ( uint64_t *  output_buf,
size_t *  output_size,
const uint64_t *  input_buf,
size_t  input_size 
)

Compress an array of uint64_t values using the RLE compression.

Refer to the documentation for pcomp_compress_rle_int8 for more information.

int pcomp_compress_rle_uint8 ( uint8_t *  output_buf,
size_t *  output_size,
const uint8_t *  input_buf,
size_t  input_size 
)

Compress an array of uint8_t values using the RLE compression.

Refer to the documentation for pcomp_compress_rle_int8 for more information.

size_t pcomp_rle_bufsize ( size_t  input_size)

Calculate an upper limit for the size of a buffer holding RLE-encoded streams.

Return the number of elements required for a buffer used to hold the RLE-compressed version of a datastream. It is typically used together with functions like pcomp_compress_rle_int8 to pre-allocate the buffer that will contain the result.

The number returned by this function might be severely overestimated. Therefore, when using this function to allocate a buffer for functions like pcomp_compress_rle_int8, it is recommended to claim any unused space at the end of the buffer. (Functions like pcomp_compress_rle_int8 inform the caller about the number of elements actually written in the buffer.)

Parameters
[in]input_sizeNumber of elements of the data stream to compress
Returns
The number of elements (not bytes) of the buffer

Definition at line 95 of file rle.c.