org.bolet.jgz
Class GZipDecoder

java.lang.Object
  extended by org.bolet.jgz.GZipDecoder

public class GZipDecoder
extends java.lang.Object

This class implements a generic decoder for gzip stream headers. A gzip stream is a concatenation of "members". Each member is compressed with a method (the only currently defined method is number 8, for the "deflate" algorithm) and is associated with some meta-information, such as the source file name or last modification time.

A GZipDecoder instance can be used to decode the member headers. Note that this class does not handle decompression itself: the decompressor must be invoked externally. The gzip format (documented in RFC 1952) assumes that compressed data is self-terminated; hence, the member header does not contain any indication as to the compressed data length.

Each member begins with a header, followed by the compressed data, and then a member footer. The member footer contains the uncompressed data length (truncated to 32 bits) and a 32-bit CRC checksum computed over the uncompressed data; both values should be used to verify that decompression succeeded. Stream processing should proceed as follows:

  1. Decode the next member header.
  2. Process the meta-information, if needed.
  3. Uncompress the data; optionally, keep a count of the uncompressed data length, and a running CRC.
  4. Decode the member footer; optionally, check the uncompressed data length and CRC value.
  5. Go back to step 1.

Checking the length and CRC is highly recommended. Neither the header nor the footer indicates whether a member is the final one. Hence, the end of the underlying stream must provide that information. A valid end-of-stream may occur only immediately after a member footer.
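
The five steps above can be sketched without this class, to show what each step involves at the byte level. This is only an illustration under stated assumptions: it works on an in-memory array, uses java.util.zip.Inflater as the external decompressor, and follows the header layout of RFC 1952. The class name GzipMemberLoopSketch and its methods are hypothetical and are not part of org.bolet.jgz.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

/** Hypothetical sketch; not part of org.bolet.jgz. */
final class GzipMemberLoopSketch {

    /** Decodes every member of an in-memory gzip stream and returns the concatenated data. */
    static byte[] decodeAll(byte[] in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int pos = 0;
        while (pos < in.length) { // end of input is only legal right after a footer
            // Step 1: fixed 10-byte header (magic 1F 8B, method 8, flags, ...).
            need(in, pos, 10);
            if ((in[pos] & 0xFF) != 0x1F || (in[pos + 1] & 0xFF) != 0x8B || in[pos + 2] != 8) {
                throw new IOException("not a gzip/deflate member header");
            }
            int flags = in[pos + 3] & 0xFF;
            pos += 10;
            // Step 2: this sketch skips the optional meta-information instead of using it.
            if ((flags & 0x04) != 0) { // FEXTRA: 16-bit little-endian length, then data
                need(in, pos, 2);
                int xlen = (in[pos] & 0xFF) | ((in[pos + 1] & 0xFF) << 8);
                need(in, pos + 2, xlen);
                pos += 2 + xlen;
            }
            if ((flags & 0x08) != 0) pos = skipZeroTerminated(in, pos); // FNAME
            if ((flags & 0x10) != 0) pos = skipZeroTerminated(in, pos); // FCOMMENT
            if ((flags & 0x02) != 0) { need(in, pos, 2); pos += 2; }    // FHCRC
            // Step 3: inflate raw deflate data while tracking length and CRC.
            Inflater inflater = new Inflater(true); // true = no zlib wrapper
            CRC32 crc = new CRC32();
            long size = 0;
            try {
                inflater.setInput(in, pos, in.length - pos);
                while (!inflater.finished()) {
                    int n = inflater.inflate(buf);
                    if (n == 0 && !inflater.finished()) {
                        throw new IOException("truncated or invalid deflate data");
                    }
                    crc.update(buf, 0, n);
                    size += n;
                    out.write(buf, 0, n);
                }
                pos = in.length - inflater.getRemaining(); // first byte after the deflate data
            } catch (DataFormatException e) {
                throw new IOException("invalid deflate data", e);
            } finally {
                inflater.end();
            }
            // Step 4: footer = CRC-32, then length modulo 2^32, both little-endian.
            need(in, pos, 8);
            if (le32(in, pos) != (int) crc.getValue() || le32(in, pos + 4) != (int) size) {
                throw new IOException("CRC or length mismatch");
            }
            pos += 8; // Step 5: loop back for the next member, if any
        }
        return out.toByteArray();
    }

    private static void need(byte[] in, int pos, int n) throws IOException {
        if (pos > in.length - n) throw new IOException("truncated gzip stream");
    }

    private static int skipZeroTerminated(byte[] in, int pos) throws IOException {
        while (pos < in.length && in[pos] != 0) pos++;
        need(in, pos, 1); // the terminating zero byte must be present
        return pos + 1;
    }

    private static int le32(byte[] in, int pos) {
        return (in[pos] & 0xFF) | ((in[pos + 1] & 0xFF) << 8)
                | ((in[pos + 2] & 0xFF) << 16) | ((in[pos + 3] & 0xFF) << 24);
    }
}
```

Because the deflate data is self-terminating, the position of the footer is known only once the inflater reports that it has finished; that is why the sketch recomputes pos from the inflater's unconsumed input.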

The get*() methods provide access to the data obtained by decoding the member header or footer. Unless otherwise specified, a value is available as soon as the header has been decoded. The uncompressed data size and CRC are decoded from the footer, so their values become available only after a call to closeMember().


Field Summary
static int DEFLATE
          Compression method: "deflate" algorithm (RFC 1951).
static int FCOMMENT
          Header flag: there is a comment field.
static int FEXTRA
          Header flag: there is some extra data.
static int FHCRC
          Header flag: header has its own CRC checksum.
static int FNAME
          Header flag: there is a source file name.
static int FTEXT
          Header flag: compressed data is probably ASCII text.
 
Constructor Summary
GZipDecoder(java.io.InputStream sub)
          Build the decoder over the provided stream for compressed data.
 
Method Summary
 void closeMember()
          Decode the member footer.
 java.lang.String getComment()
          Get the member comment (optional).
 int getCompressionMethod()
          Get the compression method.
 int getFlags()
          Get the header flags (8-bit value).
 int getFlagsExtra()
          Get the extra flags (8 bits).
 int getMTime()
          Get the source file last modification time (32-bit "unix" time, as a number of seconds since the Epoch).
 java.lang.String getOriginalFileName()
          Get the data source file name (optional).
 int getOS()
          Get the stream producer operating system (1-byte identifier).
 java.io.InputStream getSubStream()
          Get the underlying stream which this decoder uses.
 int getUncompressedCRC()
          Get the CRC over uncompressed data (32-bit value, from the member footer).
 int getUncompressedSize()
          Get the uncompressed data size for this member (truncated to 32 bits).
 java.io.InputStream nextMember()
          Decode the next member header.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

FTEXT

public static final int FTEXT
Header flag: compressed data is probably ASCII text.

See Also:
Constant Field Values

FHCRC

public static final int FHCRC
Header flag: header has its own CRC checksum.

See Also:
Constant Field Values

FEXTRA

public static final int FEXTRA
Header flag: there is some extra data.

See Also:
Constant Field Values

FNAME

public static final int FNAME
Header flag: there is a source file name.

See Also:
Constant Field Values

FCOMMENT

public static final int FCOMMENT
Header flag: there is a comment field.

See Also:
Constant Field Values

DEFLATE

public static final int DEFLATE
Compression method: "deflate" algorithm (RFC 1951).

See Also:
Constant Field Values

Constructor Detail

GZipDecoder

public GZipDecoder(java.io.InputStream sub)
Build the decoder over the provided stream for compressed data.

Parameters:
sub - the compressed data stream

Method Detail

nextMember

public java.io.InputStream nextMember()
                               throws java.io.IOException
Decode the next member header. If a member header was found and correctly decoded, then the underlying data stream is returned; the compressed data shall be read and processed from that stream. Otherwise, if the end-of-stream was reached when trying to get the first header byte, then null is returned. Otherwise (at least one byte was read but no valid header was found), an IOException is thrown.

Returns:
the compressed data stream, or null
Throws:
java.io.IOException - on I/O or format error
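
The three outcomes described above (header decoded, null at a clean end of stream, IOException otherwise) can be illustrated with a small stand-alone parser for the fixed 10-byte part of a member header. This is a sketch assuming the RFC 1952 layout (ID1, ID2, CM, FLG, MTIME, XFL, OS); FixedGzipHeaderSketch is a hypothetical name, it ignores the optional fields, and it is not how this class is implemented.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch; not part of org.bolet.jgz. */
final class FixedGzipHeaderSketch {
    final int method;
    final int flags;
    final int mtime;
    final int extraFlags;
    final int os;

    private FixedGzipHeaderSketch(int method, int flags, int mtime, int extraFlags, int os) {
        this.method = method;
        this.flags = flags;
        this.mtime = mtime;
        this.extraFlags = extraFlags;
        this.os = os;
    }

    /** Returns null at a clean end of stream, the fixed header fields otherwise. */
    static FixedGzipHeaderSketch read(InputStream in) throws IOException {
        int first = in.read();
        if (first < 0) {
            return null; // end of stream reached before the first header byte
        }
        byte[] rest = new byte[9];
        // readFully throws EOFException (an IOException) if the header is truncated.
        new DataInputStream(in).readFully(rest);
        if (first != 0x1F || (rest[0] & 0xFF) != 0x8B) {
            throw new IOException("bad gzip magic number");
        }
        // MTIME is stored little-endian in bytes 4..7 of the header.
        int mtime = (rest[3] & 0xFF) | ((rest[4] & 0xFF) << 8)
                | ((rest[5] & 0xFF) << 16) | ((rest[6] & 0xFF) << 24);
        return new FixedGzipHeaderSketch(
                rest[1] & 0xFF, rest[2] & 0xFF, mtime, rest[7] & 0xFF, rest[8] & 0xFF);
    }
}
```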

closeMember

public void closeMember()
                 throws java.io.IOException
Decode the member footer. This method shall be called after the processing of the compressed data.

Throws:
java.io.IOException - on I/O or format error

getCompressionMethod

public int getCompressionMethod()
Get the compression method.

Returns:
the compression method

getFlags

public int getFlags()
Get the header flags (8-bit value).

Returns:
the header flags
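
As an illustration of how the 8-bit flags value can be interpreted, the sketch below tests each bit and lists the flags that are set. The bit values are those assigned by RFC 1952; this page does not list the values of the FTEXT, FHCRC, FEXTRA, FNAME and FCOMMENT constants of this class, so the sketch declares its own copies and assumes they match. GzipFlagNamesSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not part of org.bolet.jgz. */
final class GzipFlagNamesSketch {
    // Bit assignments from RFC 1952 (assumed to match the constants of GZipDecoder).
    static final int FTEXT = 0x01;
    static final int FHCRC = 0x02;
    static final int FEXTRA = 0x04;
    static final int FNAME = 0x08;
    static final int FCOMMENT = 0x10;
    static final int RESERVED = 0xE0; // the three upper bits are reserved

    /** Returns the names of the flags set in the given value, lowest bit first. */
    static List<String> names(int flags) {
        List<String> out = new ArrayList<>();
        if ((flags & FTEXT) != 0) out.add("FTEXT");
        if ((flags & FHCRC) != 0) out.add("FHCRC");
        if ((flags & FEXTRA) != 0) out.add("FEXTRA");
        if ((flags & FNAME) != 0) out.add("FNAME");
        if ((flags & FCOMMENT) != 0) out.add("FCOMMENT");
        if ((flags & RESERVED) != 0) out.add("RESERVED");
        return out;
    }
}
```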

getMTime

public int getMTime()
Get the source file last modification time (32-bit "unix" time, as a number of seconds since the Epoch).

Returns:
the source file last modification time
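
Since the modification time is returned as a Java int, a caller may want to convert it to a java.time.Instant. The sketch below makes two assumptions that this page does not state: the 32-bit value is read as unsigned, and a value of 0 means that no time stamp is available, as RFC 1952 specifies for the MTIME field. GzipMTimeSketch is a hypothetical name.

```java
import java.time.Instant;
import java.util.Optional;

/** Hypothetical sketch; not part of org.bolet.jgz. */
final class GzipMTimeSketch {

    /** Converts a raw 32-bit MTIME value to an Instant, or empty if no time stamp was stored. */
    static Optional<Instant> toInstant(int mtime) {
        if (mtime == 0) {
            return Optional.empty(); // RFC 1952: 0 means no time stamp is available
        }
        // Assumption: interpret the 32 bits as an unsigned count of seconds.
        return Optional.of(Instant.ofEpochSecond(Integer.toUnsignedLong(mtime)));
    }
}
```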

getFlagsExtra

public int getFlagsExtra()
Get the extra flags (8 bits).

Returns:
the extra flags

getOS

public int getOS()
Get the stream producer operating system (1-byte identifier).

Returns:
the stream producer operating system
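
For display purposes, the 1-byte identifier can be mapped to a readable name. The table below is taken from RFC 1952, not from this class; GzipOsNameSketch is a hypothetical helper, and identifiers outside the table are reported as unassigned.

```java
/** Hypothetical sketch; not part of org.bolet.jgz. */
final class GzipOsNameSketch {

    /** Maps an RFC 1952 OS identifier to a readable name. */
    static String osName(int os) {
        switch (os) {
            case 0: return "FAT filesystem (MS-DOS, OS/2, NT/Win32)";
            case 1: return "Amiga";
            case 2: return "VMS (or OpenVMS)";
            case 3: return "Unix";
            case 4: return "VM/CMS";
            case 5: return "Atari TOS";
            case 6: return "HPFS filesystem (OS/2, NT)";
            case 7: return "Macintosh";
            case 8: return "Z-System";
            case 9: return "CP/M";
            case 10: return "TOPS-20";
            case 11: return "NTFS filesystem (NT)";
            case 12: return "QDOS";
            case 13: return "Acorn RISCOS";
            case 255: return "unknown";
            default: return "unassigned (" + os + ")";
        }
    }
}
```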

getOriginalFileName

public java.lang.String getOriginalFileName()
Get the data source file name (optional). If present, the source file name may contain any character except the null character; in particular, it may contain slashes, backslashes, dots and various control characters. Each character of the file name has a value between 1 and 255 (inclusive).

Returns:
the source file name, or null
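
Because the returned name may contain path separators, dots and control characters, it should not be used as an output path without some filtering. The following is a minimal sketch of one possible policy, not an exhaustive defence: it keeps only the last path component, drops control characters, and falls back to a caller-supplied name when nothing usable remains. GzipFileNameSketch and safeBaseName are hypothetical names.

```java
/** Hypothetical sketch; not part of org.bolet.jgz. */
final class GzipFileNameSketch {

    /** Reduces a stored file name to a plain base name, or returns the fallback. */
    static String safeBaseName(String name, String fallback) {
        if (name == null) {
            return fallback; // no FNAME field in the member header
        }
        // Keep only what follows the last slash or backslash.
        int cut = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        String base = name.substring(cut + 1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < base.length(); i++) {
            char c = base.charAt(i);
            boolean control = c < 0x20 || (c >= 0x7F && c <= 0x9F);
            if (!control) {
                sb.append(c); // drop C0 controls, DEL and C1 controls
            }
        }
        String cleaned = sb.toString();
        if (cleaned.isEmpty() || cleaned.equals(".") || cleaned.equals("..")) {
            return fallback;
        }
        return cleaned;
    }
}
```

Platform-specific rules (reserved device names, drive prefixes such as "C:", length limits) are deliberately left out of this sketch.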

getComment

public java.lang.String getComment()
Get the member comment (optional). The comment is meant for human consumption. Note that nothing in this class checks that the comment contains only printable characters. Each character of the comment has a value between 1 and 255 (inclusive).

Returns:
the member comment, or null

getUncompressedCRC

public int getUncompressedCRC()
Get the CRC over uncompressed data (32-bit value, from the member footer).

Returns:
the uncompressed data checksum

getUncompressedSize

public int getUncompressedSize()
Get the uncompressed data size for this member (truncated to 32 bits). This value is obtained from the member footer.

Returns:
the uncompressed data size
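
Both footer values are returned as Java int, while java.util.zip.CRC32.getValue() returns a long and a byte counter for large members needs a long as well. A minimal sketch of a comparison that accounts for this, assuming the caller has tracked a running CRC32 and a byte count during decompression (GzipFooterCheckSketch and its methods are hypothetical names):

```java
import java.util.zip.CRC32;

/** Hypothetical sketch; not part of org.bolet.jgz. */
final class GzipFooterCheckSketch {

    /** Compares footer values with a running CRC and byte count, both reduced to 32 bits. */
    static boolean matches(int footerCrc, int footerSize, CRC32 runningCrc, long byteCount) {
        // Casting to int keeps the low 32 bits, which is exactly what the footer stores.
        return footerCrc == (int) runningCrc.getValue() && footerSize == (int) byteCount;
    }

    /** Returns the stored size as a non-negative value (still modulo 2^32). */
    static long unsignedSize(int footerSize) {
        return Integer.toUnsignedLong(footerSize);
    }
}
```

Because the stored size is truncated, a match on the size only shows that the byte count is correct modulo 2^32; the CRC provides the stronger check.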

getSubStream

public java.io.InputStream getSubStream()
Get the underlying stream which this decoder uses. This method can always be called.

Returns:
the underlying stream for compressed data