Compress::Bzip2 - Interface to Bzip2 compression library
use Compress::Bzip2 qw(:all :constant :utilities :gzip);
($bz, $status) = bzdeflateInit( [PARAMS] );
($out, $status) = $bz->bzdeflate($buffer) ; # compress
($bz, $status) = bzinflateInit( [PARAMS] );
($out, $status) = $bz->bzinflate($buffer); # uncompress
($out, $status) = $bz->bzflush() ;
($out, $status) = $bz->bzclose() ;
$dest = memBzip($source);
alias compress
$dest = memBunzip($source);
alias decompress
$bz = Compress::Bzip2->new( [PARAMS] );
$bz = bzopen($filename or filehandle, $mode);
alternate, with $bz created by new():
$bz->bzopen($filename or filehandle, $mode);
$bytesread = $bz->bzread($buffer [,$size]) ;
$bytesread = $bz->bzreadline($line);
$byteswritten = $bz->bzwrite($buffer [,$limit]);
$errstring = $bz->bzerror();
$status = $bz->bzeof();
$status = $bz->bzflush();
$status = $bz->bzclose() ;
$status = $bz->bzsetparams( $param => $setting );
$bz->total_in() ;
$bz->total_out() ;
$verstring = $bz->bzversion();
$Compress::Bzip2::bzerrno
The
Compress::Bzip2 module provides a Perl interface to the
bzip2
compression library (see "AUTHOR" for details about where to get
Bzip2). A relevant subset of the functionality provided by
bzip2
is available in
Compress::Bzip2.
All string parameters can either be a scalar or a scalar reference.
The module can be split into two general areas of functionality, namely
in-memory compression/decompression and read/write access to
bzip2
files. Each of these areas will be discussed separately below.
NOTE
Compress::Bzip2 is just a simple
bzip2 binding, comparable to the
old Compress::Zlib library. It is not well integrated into PerlIO, use the
preferred IO::Compress::Bzip2 instead.
A number of functions are supplied in
bzlib for reading and writing
bzip2 files. Unfortunately, most of them are not suitable. So, this
module provides another interface, built over top of the low level bzlib
methods.
This function returns an object which is used to access the other
bzip2
methods.
The
mode parameter is used to specify both whether the file is opened for
reading or writing, with "r" or "w" respectively.
If a reference to an open filehandle is passed in place of the filename, it
better be positioned to the start of a compression/decompression sequence.
WARNING: With Perl 5.6 you cannot use a filehandle because of SEGV in
destruction with bzclose or an implicit close.
Create a Compress::Bzip2 object. Optionally, provide compression/decompression
parameters as a keyword => setting list. See
bzsetparams() for a description of the
parameters.
This is bzopen, but it uses an object previously created by the new method.
Other than that, it is identical to the above bzopen.
Reads
$size bytes from the compressed file into
$buffer . If
$size is not specified,
it will default to 4096. If the scalar
$buffer is not
large enough, it will be extended automatically.
Returns the number of bytes actually read. On EOF it returns 0 and in the case
of an error, -1.
Reads the next line from the compressed file into
$line.
Returns the number of bytes actually read. On EOF it returns 0 and in the case
of an error, -1.
It IS legal to intermix calls to
bzread and
bzreadline.
At this time
bzreadline ignores the variable $/ ($INPUT_RECORD_SEPARATOR
or $RS when "English" is in use). The end of a line is denoted by
the C character '\n'.
Writes the contents of
$buffer to the compressed file.
Returns the number of bytes actually written, or 0 on error.
If $limit is given and non-zero, then only that many bytes from $buffer will be
written.
Flushes all pending output to the compressed file. Works identically to the
zlib function it interfaces to. Note that the use of
bzflush can
degrade compression.
Returns "BZ_OK" if
$flush is
"BZ_FINISH" and all output could be flushed. Otherwise the bzlib
error code is returned.
Refer to the
bzlib documentation for the valid values of
$flush .
Returns 1 if the end of file has been detected while reading the input file,
otherwise returns 0.
Closes the compressed file. Any pending data is flushed to the file before it is
closed.
Change settings for the deflate stream $bz.
The list of the valid options is shown below. Options not specified will remain
unchanged.
- -verbosity
- Defines the verbosity level. Valid values are 0 through 4,
The default is "-verbosity => 0".
- -blockSize100k
- For bzip object opened for stream deflation or write.
Defines the buffering factor of compression method. The algorithm buffers
all data until the buffer is full, then it flushes all the data out. Use
-blockSize100k to specify the size of the buffer.
Valid settings are 1 through 9, representing a blocking in multiples of
100k.
Note that each such block has an overhead of leading and trailing
synchronization bytes. bzip2 recovery uses this information to pull
useable data out of a corrupted file.
A streaming application would probably want to set the blocking low.
- -workFactor
- For bzip object opened for stream deflation or write.
The workFactor setting tells the deflation algorithm how much work to invest
to compensate for repetitive data.
workFactor may be a number from 0 to 250 inclusive. The default setting is
30.
See the bzip documentation for more information.
- -small
- For bzip object opened for stream inflation or read.
small may be 0 or 1. Set "small" to one to use a slower,
less memory intensive algorithm.
Returns the
bzlib error message or number for the last operation
associated with
$bz. The return value will be the
bzlib error number when used in a numeric context and the
bzlib
error message when used in a string context. The
bzlib error number
constants, shown below, are available for use.
BZ_CONFIG_ERROR
BZ_DATA_ERROR
BZ_DATA_ERROR_MAGIC
BZ_FINISH
BZ_FINISH_OK
BZ_FLUSH
BZ_FLUSH_OK
BZ_IO_ERROR
BZ_MAX_UNUSED
BZ_MEM_ERROR
BZ_OK
BZ_OUTBUFF_FULL
BZ_PARAM_ERROR
BZ_RUN
BZ_RUN_OK
BZ_SEQUENCE_ERROR
BZ_STREAM_END
BZ_UNEXPECTED_EOF
The
$bzerrno scalar holds the error code associated with
the most recent
bzip2 routine. Note that unlike
bzerror(), the error is
not associated with a
particular file.
As with
bzerror() it returns an error number in numeric
context and an error message in string context. Unlike
bzerror() though, the error message will correspond to
the
bzlib message when the error is associated with
bzlib
itself, or the UNIX error message when it is not (i.e.
bzlib returned
"Z_ERRORNO").
As there is an overlap between the error numbers used by
bzlib and UNIX,
$bzerrno should only be used to check for the presence of
an error in numeric context. Use
bzerror() to
check for specific
bzlib errors. The
bzcat example below shows
how the variable can be used safely.
Returns the additional 5 byte header which is prepended to the bzip2 header
starting with "BZh" when using memBzip/compress.
Options: -d -c -z -f -v -k -s -1..9
While the 2.x thread forked off of 1.00, another line of development came to a
head at 1.03. The 1.03 version worked with bzlib 1.0.2, had improvements to
the error handling, single buffer inflate/deflate, a streaming interface to
inflate/deflate, and a cpan style test suite.
Alias to memBzip, this compresses string, using the optional compression level,
1 through 9, the default being 6. Returns a string containing the compressed
data.
On error
undef is returned.
Alias to memBunzip, this decompresses the data in string, returning a string
containing the decompressed data.
On error
undef is returned.
Another alias to memBunzip
Alias to bzdeflateInit. In addition to the named parameters documented for
bzdeflateInit, the following are accepted:
-level, alias to -blockSize100k
-buffer, to set the buffer size.
The -buffer option is ignored. The intermediate buffer size is not changeable.
Alias to bzinflateInit. See bzinflateInit for a description of the parameters.
The option "-buffer" is accepted, but ignored.
Add data to be compressed/decompressed. Returns whatever output is available
(possibly none, if it's still buffering it), or undef on error.
Finish the operation; takes an optional final data string. Whatever is returned
completes the output; returns undef on error.
Like the function, but applies to the current object only. Note that errors in a
stream object are also returned by the function.
Alias to total_in. Total bytes passed to the stream.
Alias to total_out. Total bytes received from the stream.
Except for the exact state and error numbers, this package presents an interface
very much like that given by the Compress::Zlib package. Mostly, if you take
the method name, state or error number from Compress::Zlib and replace the
"g" with a "b", your code should work.
To make the interoperability even easier, all the Compress::Zlib method names
have been used as aliases or cover functions for the bzip2 methods.
Therefore, most code that uses Compress::Zlib should be able to use this
package, with a one line change.
Simply change
$gz = Compress::Zlib::gzopen( "filename", "w" );
to
$gz = Compress::Bzip2::gzopen( "filename", "w" );
Some of the Compress::Zlib aliases don't return anything useful, like crc32 or
adler32, cause bzip2 doesn't do that sort of thing.
Alias for bzopen.
Alias for bzread.
Alias for bzreadline.
Alias for bzwrite.
Alias for bzflush, with return code translation.
Alias for bzclose.
Alias for bzeof.
Alias for bzerror.
This is a no-op.
Alias for bzdeflateInit, with return code translation.
All OPTS are ignored.
Alias for bzdeflate, with return code translation.
This is a no-op.
Cover function for bzflush or bzclose, depending on $flushtype.
See the Compress::Zlib documentation for more information.
This is a no-op.
This is a no-op.
Alias for bzinflateInit, with return code translation.
All OPTS are ignored.
Alias for bzinflate, with return code translation.
This is a no-op.
This is a no-op.
This is a no-op.
Alias for memBzip.
Alias for memBunzip.
Two high-level functions are provided by
bzlib to perform in-memory
compression. They are
memBzip and
memBunzip. Two Perl subs are
provided which provide similar functionality.
Compresses
$buffer. If successful it returns the compressed
data. Otherwise it returns
undef.
The buffer parameter can either be a scalar or a scalar reference.
Essentially, an in-memory bzip file is created. It creates a minimal bzip
header, which adds 5 bytes before the bzip2 specific BZh header.
Uncompresses
$buffer. If successful it returns the
uncompressed data. Otherwise it returns
undef.
The source buffer can either be a scalar or a scalar reference.
The buffer parameter can either be a scalar or a scalar reference. The contents
of the buffer parameter are destroyed after calling this function.
The Perl interface will
always consume the complete input buffer before
returning. Also the output buffer returned will be automatically grown to fit
the amount of output available.
Here is a definition of the interface available:
Initialises a deflation stream.
If successful, it will return the initialised deflation stream,
$d and
$status of "BZ_OK"
in a list context. In scalar context it returns the deflation stream,
$d, only.
If not successful, the returned deflation stream (
$d) will
be
undef and
$status will hold the exact
bzip2 error code.
The function optionally takes a number of named options specified as
"-Name=>value" pairs. This allows individual options to be
tailored without having to specify them all in the parameter list.
Here is a list of the valid options:
- -verbosity
- Defines the verbosity level. Valid values are 0 through 4,
The default is "-verbosity => 0".
- -blockSize100k
- Defines the buffering factor of compression method. The
algorithm buffers all data until the buffer is full, then it flushes all
the data out. Use -blockSize100k to specify the size of the buffer.
Valid settings are 1 through 9, representing a blocking in multiples of
100k.
Note that each such block has an overhead of leading and trailing
synchronization bytes. bzip2 recovery uses this information to pull
useable data out of a corrupted file.
A streaming application would probably want to set the blocking low.
- -workFactor
- The workFactor setting tells the deflation algorithm how
much work to invest to compensate for repetitive data.
workFactor may be a number from 0 to 250 inclusive. The default setting is
30.
See the bzip documentation for more information.
Here is an example of using the
deflateInit optional parameter list to
override the default buffer size and compression level. All other options will
take their default values.
bzdeflateInit( -blockSize100k => 1, -verbosity => 1 );
Deflates the contents of
$buffer. The buffer can either be
a scalar or a scalar reference. When finished,
$buffer
will be completely processed (assuming there were no errors). If the deflation
was successful it returns deflated output,
$out, and a
status value,
$status, of "Z_OK".
On error,
$out will be
undef and
$status will contain the
zlib error code.
In a scalar context
bzdeflate will return
$out only.
As with the internal buffering of the
deflate function in
bzip2,
it is not necessarily the case that any output will be produced by this
method. So don't rely on the fact that
$out is empty for
an error test. In fact, given the size of bzdeflates internal buffer, with
most files it's likely you won't see any output at all until flush or close.
Typically used to finish the deflation. Any pending output will be returned via
$out.
$status will have a value
"BZ_OK" if successful.
In a scalar context
bzflush will return
$out only.
Note that flushing can seriously degrade the compression ratio, so it should
only be used to terminate a decompression (using "BZ_FLUSH") or when
you want to create a
full flush point (using "BZ_FINISH").
The allowable values for "flush_type" are "BZ_FLUSH" and
"BZ_FINISH".
For a handle opened for "w" (bzwrite), the default is
"BZ_FLUSH". For a stream, the default for "flush_type" is
"BZ_FINISH" (which is essentially a close and reopen).
It is strongly recommended that you only set the "flush_type"
parameter if you fully understand the implications of what it does. See the
"bzip2" documentation for details.
Here is a trivial example of using
bzdeflate. It simply reads standard
input, deflates it and writes it to standard output.
use strict ;
use warnings ;
use Compress::Bzip2 ;
binmode STDIN;
binmode STDOUT;
my $x = bzdeflateInit()
or die "Cannot create a deflation stream\n" ;
my ($output, $status) ;
while (<>)
{
($output, $status) = $x->bzdeflate($_) ;
$status == BZ_OK
or die "deflation failed\n" ;
print $output ;
}
($output, $status) = $x->bzclose() ;
$status == BZ_OK
or die "deflation failed\n" ;
print $output ;
Here is a definition of the interface:
Initialises an inflation stream.
In a list context it returns the inflation stream,
$i, and
the
zlib status code (
$status). In a scalar
context it returns the inflation stream only.
If successful,
$i will hold the inflation stream and
$status will be "BZ_OK".
If not successful,
$i will be
undef and
$status will hold the
bzlib.h error code.
The function optionally takes a number of named options specified as
"-Name=>value" pairs. This allows individual options to be
tailored without having to specify them all in the parameter list.
For backward compatibility, it is also possible to pass the parameters as a
reference to a hash containing the name=>value pairs.
The function takes one optional parameter, a reference to a hash. The contents
of the hash allow the deflation interface to be tailored.
Here is a list of the valid options:
- -small
-
small may be 0 or 1. Set "small" to one to
use a slower, less memory intensive algorithm.
- -verbosity
- Defines the verbosity level. Valid values are 0 through 4,
The default is "-verbosity => 0".
Here is an example of using the
bzinflateInit optional parameter.
bzinflateInit( -small => 1, -verbosity => 1 );
Inflates the complete contents of
$buffer. The buffer can
either be a scalar or a scalar reference.
Returns "BZ_OK" if successful and "BZ_STREAM_END" if the end
of the compressed data has been successfully reached. If not successful,
$out will be
undef and
$status will hold the
bzlib error code.
The $buffer parameter is modified by "bzinflate". On completion it
will contain what remains of the input buffer after inflation. This means that
$buffer will be an empty string when the return status is "BZ_OK".
When the return status is "BZ_STREAM_END" the $buffer parameter will
contains what (if anything) was stored in the input buffer after the deflated
data stream.
This feature is useful when processing a file format that encapsulates a
compressed data stream.
Here is an example of using
bzinflate.
use strict ;
use warnings ;
use Compress::Bzip2;
my $x = bzinflateInit()
or die "Cannot create a inflation stream\n" ;
my $input = '' ;
binmode STDIN;
binmode STDOUT;
my ($output, $status) ;
while (read(STDIN, $input, 4096))
{
($output, $status) = $x->bzinflate(\$input) ;
print $output
if $status == BZ_OK or $status == BZ_STREAM_END ;
last if $status != BZ_OK ;
}
die "inflation failed\n"
unless $status == BZ_STREAM_END ;
Here are some example scripts of using the interface.
use strict ;
use warnings ;
use Compress::Bzip2 ;
die "Usage: bzcat file...\n" unless @ARGV ;
my $file ;
foreach $file (@ARGV) {
my $buffer ;
my $bz = bzopen($file, "rb")
or die "Cannot open $file: $bzerrno\n" ;
print $buffer while $bz->bzread($buffer) > 0 ;
die "Error reading from $file: $bzerrno" . ($bzerrno+0) . "\n"
if $bzerrno != BZ_STREAM_END ;
$bz->bzclose() ;
}
use strict ;
use warnings ;
use Compress::Bzip2 ;
die "Usage: bzgrep pattern file...\n" unless @ARGV >= 2;
my $pattern = shift ;
my $file ;
foreach $file (@ARGV) {
my $bz = bzopen($file, "rb")
or die "Cannot open $file: $bzerrno\n" ;
while ($bz->bzreadline($_) > 0) {
print if /$pattern/ ;
}
die "Error reading from $file: $bzerrno\n"
if $bzerrno != Z_STREAM_END ;
$bz->bzclose() ;
}
This script,
bzstream, does the opposite of the
bzcat script
above. It reads from standard input and writes a bzip file to standard output.
use strict ;
use warnings ;
use Compress::Bzip2 ;
binmode STDOUT; # bzopen only sets it on the fd
my $bz = bzopen(\*STDOUT, "wb")
or die "Cannot open stdout: $bzerrno\n" ;
while (<>) {
$bz->bzwrite($_) or die "error writing: $bzerrno\n" ;
}
$bz->bzclose ;
Use the tags :all, :utilities, :constants, :bzip1 and :gzip.
This exports all the exportable methods.
This exports only the BZ_* constants.
This exports the Compress::Bzip2 1.x functions, for compatibility.
compress
decompress
compress_init
decompress_init
version
These are actually aliases to memBzip and memBunzip.
This gives an interface to the bzip2 methods.
bzopen
bzinflateInit
bzdeflateInit
memBzip
memBunzip
bzip2
bunzip2
bzcat
bzlibversion
$bzerrno
This gives compatibility with Compress::Zlib.
gzopen
gzinflateInit
gzdeflateInit
memGzip
memGunzip
$gzerrno
All the
bzlib constants are automatically imported when you make use of
Compress::Bzip2.
BZ_CONFIG_ERROR
BZ_DATA_ERROR
BZ_DATA_ERROR_MAGIC
BZ_FINISH
BZ_FINISH_OK
BZ_FLUSH
BZ_FLUSH_OK
BZ_IO_ERROR
BZ_MAX_UNUSED
BZ_MEM_ERROR
BZ_OK
BZ_OUTBUFF_FULL
BZ_PARAM_ERROR
BZ_RUN
BZ_RUN_OK
BZ_SEQUENCE_ERROR
BZ_STREAM_END
BZ_UNEXPECTED_EOF
The documentation for zlib, bzip2 and Compress::Zlib.
Rob Janes, <arjay at cpan.org>
Copyright (C) 2005 by Rob Janes
This library is free software; you can redistribute it and/or modify it under
the same terms as Perl itself, either Perl version 5.8.3 or, at your option,
any later version of Perl 5 you may have available.
The
Compress::Bzip2 module was originally written by Gawdi Azem
[email protected].
The first
Compress::Bzip2 module was written by Gawdi Azem
[email protected]. It provided an interface to
the in memory inflate and deflate routines.
Compress::Bzip2 was subsequently passed on to Marco Carnut
[email protected] who shepherded it through to version 1.03, a set of
changes which included upgrades to handle bzlib 1.0.2, and improvements to the
in memory inflate and deflate routines. The streaming interface and error
information were added by David Robins
[email protected].
Version 2 of
Compress::Bzip2 is due to Rob Janes, of
[email protected]. This
release is intended to give an interface close to that of Compress::Zlib. It's
development forks from 1.00, not 1.03, so the streaming interface is not the
same as that in 1.03, although apparently compatible as it passes the 1.03
test suite.
Minor subsequent fixes and releases were done by Reini Urban,
[email protected].
See the Changes file.
2.00 Second public release of
Compress::Bzip2.