archive - Usenet article archiver
archive [-cfr] [-a archive] [-i index] [-p pattern] [input]
archive makes copies of files specified on its standard input. It is
normally run either as a channel feed under innd or by a script before
news.daily is run.
archive reads the named input file, or standard input if no file
is given. The input is taken as a sequence of lines; blank lines and lines
starting with a number sign ("#") are ignored. All other lines
should specify the token of an article to archive. Each article is retrieved
using its token, and the Xref header field is used to determine the target file
in the archive directory. You can limit the targets taken from the Xref header
field with the -p option.
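The input rules above can be sketched as a small filter. This is only an illustration, not the code archive itself uses: the function name article_tokens and the sample token strings are hypothetical, and treating whitespace-only lines as blank is an assumption.

```python
def article_tokens(lines):
    """Yield article tokens from input lines, skipping blanks and comments."""
    for line in lines:
        line = line.rstrip("\r\n")
        # Assumption: a line holding only whitespace counts as blank.
        if not line.strip():
            continue
        # Lines starting with a number sign are comments.
        if line.startswith("#"):
            continue
        yield line


# Hypothetical input; real storage tokens are opaque strings.
sample = ["# queued by a script\n", "\n", "@TOKEN-ONE@\n", "@TOKEN-TWO@\n"]
print(list(article_tokens(sample)))
```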
Files are copied to a directory within the archive directory, patharchive in
inn.conf (or some other directory given with -a). The default
is to create a hierarchy that mimics a traditional news spool storage of the
given articles; intermediate directories will be created as needed. For
example, if the input token represents article 2211 in the newsgroup
comp.sources.unix, archive will by default store the article as:
comp/sources/unix/2211
in the archive area. This can be modified with the -c and -f options.
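As a rough sketch of how a destination name could be derived from one Xref target, the following covers the default layout and the -f and -c variants described in the option list. The function archive_path and its parameters are hypothetical, and the use of local time for the "YYYYMM" name is an assumption; the text only says "the current time".

```python
import time


def archive_path(group, artnum, flatten=False, concat=False, now=None):
    """Return a path, relative to the archive root, for one Xref target."""
    if concat:
        # -c: flattened directory name plus a YYYYMM file name.
        # Assumption: local time is used; now=None means "right now".
        stamp = time.strftime("%Y%m", time.localtime(now))
        return f"{group}/{stamp}"
    if flatten:
        # -f: the group name is kept as a single directory.
        return f"{group}/{artnum}"
    # Default: mimic a traditional spool, one directory per name component.
    return f"{group.replace('.', '/')}/{artnum}"


print(archive_path("comp.sources.unix", 2211))
print(archive_path("comp.sources.unix", 2211, flatten=True))
```

The returned path would then be joined to patharchive (or the -a directory), with intermediate directories created as needed.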
- -a archive
- If the -a flag is given, its argument specifies the root of the archive
area, instead of patharchive in inn.conf.
- -c
- If the -c flag is given, directory names will be
flattened as described under the -f option. Then, additionally, all
posts will be concatenated into a single file, appending to that file if
it already exists. The file name will be "YYYYMM", formed from
the current time when archive is run. In other words, if given an
article in comp.sources.unix on December 14th, 1998, the article would be
appended to the file:
comp.sources.unix/199812
in the archive area.
Articles will be separated by a line containing only
"-----------".
- -f
- If the -f flag is used, directory names will be
flattened, replacing the slashes with periods. In other words, article
2211 in comp.sources.unix will be written to:
comp.sources.unix/2211
in the archive area.
- -i index
- If the -i flag is used, archive will append
one line to the file index for each article that it archives. This
line will contain the destination file name, the Message-ID header field,
and the Subject header field of the message, separated by spaces. If
either header is missing (normally not possible if the article was
accepted by innd), it will be replaced by "<none>".
The headers will be transformed using the same rules as are used to
generate overview data (unfolded and then with tabs, CR, and LF replaced
by spaces).
- -p pattern
- Limits the targets taken from the Xref header field to the
groups specified in pattern. pattern is a uwildmat
pattern matching newsgroups that you wish to have archive
handle.
- -r
- By default, archive sets its standard error to
pathlog/errlog. To suppress this redirection, use the -r
flag.
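The index line written for -i can be sketched as follows. This is an approximation under stated assumptions rather than the actual implementation: clean_header and index_line are hypothetical names, header values are assumed to be passed without their trailing line terminator, and the overview rules in INN may differ in detail from this simple unfold-and-replace.

```python
import re


def clean_header(value):
    """Approximate the overview-style cleanup of one header field body."""
    if value is None:
        # A missing header field is recorded as "<none>".
        return "<none>"
    # Unfold: drop a line break that is followed by continuation whitespace.
    value = re.sub(r"\r?\n(?=[ \t])", "", value)
    # Replace any remaining tab, CR, or LF with a space.
    return re.sub(r"[\t\r\n]", " ", value)


def index_line(dest, message_id, subject):
    """Build one index line: destination, Message-ID, Subject."""
    return " ".join([dest, clean_header(message_id), clean_header(subject)])


print(index_line("comp.sources.unix/2211", "<1@example.com>",
                 "v01i001:\r\n\tsample posting"))
```

Because the fields are separated by spaces and a Subject may itself contain spaces, a reader of the index would treat everything after the second field as the subject.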
If the input is exhausted, archive will exit with a zero status. If an
I/O error occurs, it will try to spool its input, copying it to a file. If
there was no input filename, the standard input will be copied to
pathoutgoing/archive and the program will exit. If an input filename
was given, a temporary file named input.bch (if input is an absolute
pathname) or pathoutgoing/input.bch (if the filename does not begin with a
slash) is created. Once the input is copied, archive will try to rename this
temporary file to be the name of the input file, and then exit.
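The choice of spool file described above can be written out as a short sketch. The function spool_path is hypothetical, and the pathoutgoing value in the example is only a placeholder; the real value comes from inn.conf.

```python
import posixpath


def spool_path(input_name, pathoutgoing):
    """Return the file that unprocessed input would be copied to."""
    if input_name is None:
        # No input filename: standard input is copied to pathoutgoing/archive.
        return posixpath.join(pathoutgoing, "archive")
    if input_name.startswith("/"):
        # Absolute pathname: the temporary file sits next to the input file.
        return input_name + ".bch"
    # Otherwise the temporary file is created under pathoutgoing.
    return posixpath.join(pathoutgoing, input_name + ".bch")


# Placeholder directory, used only for illustration.
outgoing = "/var/spool/news/outgoing"
print(spool_path(None, outgoing))
print(spool_path("/tmp/batch", outgoing))
print(spool_path("batch", outgoing))
```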
A typical newsfeeds(5) entry to archive most source newsgroups is as follows:
source-archive!\
:!*,*sources*,!*wanted*,!*.d\
:Tc,Wn\
:<pathbin>/archive -f -i <patharchive>/INDEX
Replace <pathbin> and <patharchive> with the appropriate paths.
Written by Rich $alz <[email protected]> for InterNetNews. Converted to
POD by Russ Allbery <[email protected]>.
inn.conf(5), libinn_uwildmat(3), newsfeeds(5).