CFITSIO now transparently supports 2 types of image compression:
1) The entire FITS file may be externally compressed with the gzip or Unix compress algorithm, producing a *.gz or *.Z file, respectively. When reading compressed files of this type, CFITSIO first uncompresses the entire file into memory before performing the requested read operations. Output files can be directly written in the gzip compressed format if the user-specified filename ends with `.gz'. In this case, CFITSIO initially writes the uncompressed file in memory and then compresses it and writes it to disk when the FITS file is closed, thus saving user disk space. Read and write access to these compressed FITS files is generally quite fast; the main limitation is that there must be enough available memory (or swap space) to hold the entire uncompressed FITS file.
2) CFITSIO also supports a newer image compression format in which the image is divided into a grid of rectangular tiles, and each tile of pixels is individually compressed. The compressed tiles are stored in rows of a variable length array column in a FITS binary table, but CFITSIO recognizes that the binary table extension contains an image and treats it as if it were an IMAGE extension. This tile-compressed format is especially well suited for compressing very large images because a) the FITS header keywords remain uncompressed for rapid read access, and because b) it is possible to extract and uncompress sections of the image without having to uncompress the entire image. This format is also much more effective in compressing floating point images (using a lossy compression algorithm) than simply compressing the image using gzip or compress.
A detailed description of this format is available at:
http://heasarc.gsfc.nasa.gov/docs/software/fitsio/ compression/compress_image.html
The N-dimensional FITS image can be divided into any desired rectangular grid of compression tiles. By default the tiles are chosen to correspond to the rows of the image, each containing NAXIS1 pixels. For example, a 800 x 800 x 4 pixel data cube would be divided in to 3200 tiles containing 800 pixels each by default. Alternatively, this data cube could be divided into 256 tiles that are each 100 X 100 X 1 pixels in size, or 4 tiles containing 800 x 800 X 1 pixels, or a single tile containing the entire data cube. Note that the image dimensions are not required to be an integer multiple of the tile dimensions, so, for example, this data cube could also be divided into 250 X 200 pixel tiles, in which case the last tile in each row would only contain 50 X 200 pixels.
Currently, 3 image compression algorithms are supported: Rice, GZIP, and PLIO. Rice and GZIP are general purpose algorithms that can be used to compress almost any image. The PLIO algorithm, on the other hand, is more specialized and was developed for use in IRAF to store pixel data quality masks. It is designed to only work on images containing positive integers with values up to about 2**24. Other image compression algorithms may be supported in the future.
The 3 supported image compression algorithms are all 'loss-less' when applied to integer FITS images; the pixel values are preserved exactly with no loss of information during the compression and uncompression process. Floating point FITS images (which have BITPIX = -32 or -64) are first quantized into scaled integer pixel values before being compressed. This technique produces much higher compression factors than simply using GZIP to compress the image, but it also means that the original floating value pixel values may not be precisely returned when the image is uncompressed. When done properly, this only discards the 'noise' from the floating point values without losing any significant information. The amount of noise that is discarded can be controlled by the 'noise_bits' compression parameter.
No special action is required to read tile-compressed images because all the CFITSIO routines that read normal uncompressed FITS images can also read images in the tile-compressed format; CFITSIO essentially treats the binary table that contains the compressed tiles as if it were an IMAGE extension.
When creating (writing) a new image with CFITSIO, a normal uncompressed FITS primary array or IMAGE extension will be written unless the tile-compressed format has been specified in 1 of 2 possible ways:
1) At run time, when specifying the name of the output FITS file to be created at run time, the user can indicate that images should be written in tile-compressed format by enclosing the compression parameters in square brackets following the root disk file name. The `imcopy' example program that included with the CFITSIO distribution can be used for this purpose to compress or uncompress images. Here are some examples of the extended file name syntax for specifying tile-compressed output images:
myfile.fit[compress] - use the default compression algorithm (Rice) and the default tile size (row by row) myfile.fit[compress GZIP] - use the specified compression algorithm; myfile.fit[compress Rice] only the first letter of the algorithm myfile.fit[compress PLIO] name is required. myfile.fit[compress R 100,100] - use Rice compression and 100 x 100 pixel tile size myfile.fit[compress R 100,100;2] - as above, and also use noisebits = 2
2) Before calling the CFITSIO routine to write the image header keywords (e.g., fits_create_image) the programmer can call the routines described below to specify the compression algorithm and the tiling pattern that is to be used. There are 3 routines for specifying the various compression parameters and 3 corresponding routines to return the current values of the parameters:
int fits_set_compression_type(fitsfile *fptr, int comptype, int *status) int fits_set_tile_dim(fitsfile *fptr, int ndim, long *tilesize, int *status) int fits_set_noise_bits(fitsfile *fptr, int noisebits, int *status) int fits_get_compression_type(fitsfile *fptr, int *comptype, int *status) int fits_get_tile_dim(fitsfile *fptr, int ndim, long *tilesize, int *status) int fits_get_noise_bits(fitsfile *fptr, int *noisebits, int *status)3 symbollic constants are defined for use as the value of the `comptype' parameter: GZIP_1, RICE_1, or PLIO_1. Entering NULL for comptype will turn off the tile-compression and cause normal FITS images to be written.
The 'noisebits' parameter is only used when compressing floating point images. The default value is 4. Decreasing the value of noisebits will improve the overall compression efficiency at the expense of losing more information.
A small example program called 'imcopy' is included with CFITSIO that can be used to compress (or uncompress) any FITS image. This program can be used to experiment with the various compression options on existing FITS images as shown in these examples:
1) imcopy infile.fit 'outfile.fit[compress]' This will use the default compression algorithm (Rice) and the default tile size (row by row) 2) imcopy infile.fit 'outfile.fit[compress GZIP]' This will use the GZIP compression algorithm and the default tile size (row by row). The allowed compression algorithms are Rice, GZIP, and PLIO. Only the first letter of the algorithm name needs to be specified. 3) imcopy infile.fit 'outfile.fit[compress G 100,100]' This will use the GZIP compression algorithm and 100 X 100 pixel tiles. 4) imcopy infile.fit 'outfile.fit[compress R 100,100; 4]' This will use the Rice compression algorithm, 100 X 100 pixel tiles, and noise_bits = 4 (assuming the input image has a floating point data type). Decreasing the value of noisebits will improve the overall compression efficiency at the expense of losing more information. 5) imcopy infile.fit outfile.fit If the input file is in tile-compressed format, then it will be uncompressed to the output file. Otherwise, it simply copies the input image to the output image. 6) imcopy 'infile.fit[1001:1500,2001:2500]' outfile.fit This extracts a 500 X 500 pixel section of the much larger input image (which may be in tile-compressed format). The output is a normal uncompressed FITS image. 7) imcopy 'infile.fit[1001:1500,2001:2500]' outfile.fit.gz Same as above, except the output file is externally compressed using the gzip algorithm.