dggath, dgscat, gscat - convert distributed source graphs to or from
centralized ones
dggath [options] [igfile] [ogfile]
dgscat [options] [igfile] [ogfile]
gscat [options] [igfile] [ogfile]
The
dggath program gathers distributed graphs into centralized graphs. It
reads a set of files
igfile representing fragments of a distributed
source graph, and writes them back on the form of a single centralized source
graph
ogfile.
The
dgscat program scatters centralized source graphs into distributed
graphs. It reads a centralized source graph
igfile and writes it back
on the form of a set of files
ogfile representing fragments of the
corresponding distributed source graph.
The
gscat program does exactly the same as
dgscat, but does not
require to be run in a parallel environment. Since
gscat processes the
input centralized graph file as a text stream, it does not need to load the
full graph in memory before building the distributed graph fragment files. It
is therefore much less resource consuming, but does not allow for the checking
of graph consistency, as it has no global vision of the graph structure.
When file names are not specified, data is read from standard input and written
to standard output. Standard streams can also be explicitly represented by a
dash '-'.
When the proper libraries have been included at compile time,
dggath and
dgscat can directly handle compressed graphs, both as input and output.
A stream is treated as compressed whenever its name is postfixed with a
compressed file extension, such as in 'brol.grf.bz2' or '-.gz'. The
compression formats which can be supported are the bzip2 format ('.bz2'), the
gzip format ('.gz'), and the lzma format ('.lzma').
dggath and
dgscat base on implementations of the MPI interface to
spread work across the processing elements. It is therefore not likely to be
run directly, but instead through some launcher command such as
mpirun.
In order to tell whether programs should read from, or write to, a single file
located on only one processor, or to multiple instances of the same file on
all of the processors, or else to distinct files on each of the processors, a
special grammar has been designed, which is based on the '%' escape character.
Four such escape sequences are defined, which are interpreted independently on
every processor, prior to file opening. By default, when a filename is
provided, it is assumed that the file is to be opened on only one of the
processors, called the root processor, which is usually process 0 of the
communicator within which the program is run. The index of the root processor
can be changed by means of the
-r option. Using any of the first three
escape sequences below will instruct programs to open in parallel a file of
name equal to the interpreted filename, on every processor on which they are
run.
- %p
- Replaced by the number of processes in the global
communicator in which the program is run. Leads to parallel opening.
- %r
- Replaced on each process running the program by the rank of
this process in the global communicator. Leads to parallel opening.
- %-
- Discarded, but leads to parallel opening. This sequence is
mainly used to instruct programs to open on every processor a file of
identical name. The opened files can be, according whether the given path
leads to a shared directory or to directories that are local to each
processor, either to the opening of multiple instances of the same file,
or to the opening of distinct files which may each have a different
content, respectively (but in this latter case it is much recommended to
identify files by means of the '%r' sequence).
- %%
- Replaced by a single '%' character. File names using this
escape sequence are not considered for parallel opening, unless one or
several of the three other escape sequences are also present.
For instance, filename 'brol' will lead to the opening of file 'brol' on the
root processor only, filename '%
-brol' (or even 'br%
-ol') will
lead to the parallel opening of files called 'brol' on every processor, and
filename 'brol%p-%r' will lead to the opening of files ’brol2-0' and
'brol2-1', respectively, on each of the two processors on which the program
were to run.
- -c
- For dggath and dgscat only. Check the
consistency of the input source graph after loading it into memory.
- -h
- Display some help.
- -r?pnum
- Set root process for centralized files (default is 0).
- -V
- Display program version and copyright.
Run
dgscat on 5 processing elements to scatter centralized graph file
brol.grf into 5 gzipped file fragments brol5-0.dgr.gz to brol5-4.dgr.gz.
$ mpirun -np 5 dgscat brol.grf brol%p-%r.dgr.gz
dgmap(1),
dgord(1),
dgtst(1),
gmk_hy(1).
PT-Scotch user's manual.
Francois Pellegrini <
[email protected]>