arcsub - ARC Submission
The arcsub command is used for submitting jobs to Grid enabled computing
resources.
arcsub [options] [filename ...]
- -c, --cluster=name
- select one or more computing elements: name can be
an alias for a single CE, a group of CEs or a URL
- -g, --index=name
- select one or more registries: name can be an alias
for a single registry, a group of registries or a URL
- -R, --rejectdiscovery=URL
- skip the service with the given URL during service
discovery
- -S, --submissioninterface=InterfaceName
- only use this interface for submitting (e.g.
org.nordugrid.gridftpjob, org.ogf.glue.emies.activitycreation,
org.ogf.bes)
- -I, --infointerface=InterfaceName
- the computing element specified by URL at the command line
should be queried using this information interface (possible options:
org.nordugrid.ldapng, org.nordugrid.ldapglue2, org.nordugrid.wsrfglue2,
org.ogf.glue.emies.resourceinfo)
- -e, --jobdescrstring=String
- job description string describing the job to be
submitted
- -f, --jobdescrfile=filename
- job description file describing the job to be submitted
- -j, --joblist=filename
- the file storing information about active jobs (default
~/.arc/jobs.xml)
- -o, --jobids-to-file=filename
- the IDs of the submitted jobs will be appended to this
file
- -D, --dryrun
- submit jobs as dry run (no submission to batch system)
- --direct
- submit directly - no resource discovery or matchmaking
- -x, --dumpdescription
- do not submit - dump job description in the language
accepted by the target
- -P, --listplugins
- list the available plugins
- -t, --timeout=seconds
- timeout in seconds (default 20)
- -z, --conffile=filename
- configuration file (default ~/.arc/client.conf)
- -d, --debug=debuglevel
- FATAL, ERROR, WARNING, INFO, VERBOSE or DEBUG
- -b, --broker=broker
- selected broker: Random (default), FastestQueue or custom.
Use -P to find possible options.
- -v, --version
- print version information
- -?, --help
- print help
- filename ...
- job description files describing the jobs to be
submitted
arcsub is the key command when submitting jobs to Grid enabled computing
resources with the ARC client. By default arcsub is able to submit jobs
to A-REX, CREAM and EMI ES enabled computing elements (CEs). As always,
successful submission requires that you are authenticated at the targeted
computing services. Since arcsub is built on a modular library, modules can be
installed that enable submission to other targets, e.g. the classic ARC CE
Grid-Manager.
Job submission can be accomplished by specifying a job description file to
submit as an argument. arcsub will then by default perform resource
discovery on the Grid; the discovered resources are matched to the
job description and ranked according to the chosen broker (--broker
option). If no Grid environment has been configured, please contact your
system administrator, or set one up yourself in the client configuration file
(see files section). Another option is to explicitly specify one or more
registry services to arcsub using the --index option, which
accepts a URL, alias or group. Alternatively, one or more specific CEs can
be targeted by using the --cluster option. If such a scenario is the
most common, it is worthwhile to specify those CEs in the client configuration
as default services, which makes it unnecessary to specify them as arguments.
In the same manner, aliases and groups defined in the configuration file
can be used as arguments to the --cluster or --index options.
In all of the above scenarios arcsub obtains
resource information from the services, which is then used for matchmaking
against the job description. That step can be avoided by specifying
the --direct option, in which case the job description is submitted
directly to the first specified endpoint.
The format of a classic GridFTP-based cluster URL is:
[ldap://]<hostname>[:2135/nordugrid-cluster-name=<hostname>,Mds-Vo-name=local,o=grid]
Only the hostname part has to be specified; the rest of the URL is
automatically generated.
The format of an A-REX URL is:
[https://]<hostname>[:<port>][/<path>]
Here the port is 443 by default, but the path cannot be guessed, so if it is not
specified, then the service is assumed to live on the root path.
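As an illustration of the two URL forms above, the following shell sketch expands a bare hostname into the full URLs using the stated defaults. The functions classic_cluster_url and arex_url are hypothetical helpers written for this example; they are not part of ARC, and arcsub does this expansion on its own.

```shell
# Hypothetical helpers (not part of ARC) that spell out the defaults described above.

# Expand a bare hostname into the full classic GridFTP-based cluster URL.
classic_cluster_url() {
    host="$1"
    printf 'ldap://%s:2135/nordugrid-cluster-name=%s,Mds-Vo-name=local,o=grid\n' "$host" "$host"
}

# Expand a hostname into an A-REX URL: the port defaults to 443 and a
# missing path means the service is assumed to live on the root path.
arex_url() {
    host="$1"
    port="${2:-443}"
    path="${3:-}"
    printf 'https://%s:%s/%s\n' "$host" "$port" "$path"
}

classic_cluster_url ce.example.com
arex_url ce.example.com
arex_url ce.example.com 60000 arex
```

In practice it is enough to pass the hostname (or hostname plus path for A-REX) to the --cluster option.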
Job descriptions can also be specified using the --jobdescrfile option,
which expects the file name of the description as argument, or the
--jobdescrstring option, which expects the job description itself as
a string. Both options can be specified multiple times, and one does not
exclude the other. The default supported job description languages are xRSL
and EMIES ADL.
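The following sketch shows the file and string options combined in a single invocation. The run helper, the temporary file name and the host ce.example.com are assumptions made for this example; run only prints the command unless RUN=1 is set, so the sketch can be tried on a machine without ARC installed.

```shell
# Print commands instead of executing them unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Write a minimal xRSL job description to a temporary file.
jobfile="$(mktemp "${TMPDIR:-/tmp}/hello.XXXXXX")"
cat > "$jobfile" <<'EOF'
&(executable="/bin/echo")(arguments="Hello World!")
EOF

# The same description as an inline string for --jobdescrstring.
jobstring='&(executable="/bin/echo")(arguments="Hello World!")'

# One invocation, two job descriptions: one read from the file, one from the string.
run arcsub -c ce.example.com -f "$jobfile" -e "$jobstring"
```

Single quotes around the xRSL string keep the shell from interpreting the ampersand, parentheses and double quotes.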
If the job description is successfully submitted, a job-ID is returned and
printed. This job-ID uniquely identifies the job while it is being executed.
It is also possible that no CE matches the constraints
defined in the description, in which case no submission will be done. Upon
successful submission, the job-ID along with more technical job information is
stored in the job-list file (described below). The stored information enables
the job management commands of the ARC client to manage jobs easily, and thus
the job-ID need not be saved manually. By default the job-list file is
stored in the .arc directory in the home directory of the user; however,
another location can be specified using the --joblist option, which takes the
location of this file as argument. If the --joblist option was used
during submission, it should also be specified in the subsequent commands
when managing the job. If a Computing Element has multiple job submission
interfaces (e.g. gridftp, EMI-ES, BES), then the brokering algorithm will
choose one of them. With the --submissioninterface option the requested
interface can be specified, and in that case only those Computing Elements
that have that specific interface will be considered, and only that interface
will be used to submit the jobs.
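A short sketch of keeping a non-default job-list file consistent across commands follows. The paths under $HOME/project-a and the run helper are hypothetical; the use of the same --joblist option with arcstat(1) follows the advice above and should be checked against that command's own manual page.

```shell
# Print commands instead of executing them unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Hypothetical per-project job list and job-ID file.
joblist="$HOME/project-a/jobs.xml"
idfile="$HOME/project-a/submitted-ids.txt"

# Submit: job information goes to the custom job list, IDs are appended to idfile.
run arcsub -j "$joblist" -o "$idfile" helloworld.adl

# Manage: point the follow-up command at the same job list.
run arcstat -j "$joblist" --all
```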
As mentioned above, registry or index services can be specified with the
--index option. Specifying one or multiple index servers instructs the
arcsub command to query the servers for registered CEs. The returned
CEs will then be matched against the job description, and those matching will
be ranked by the chosen broker (see below); submission will be tried in
order until it succeeds or the end of the list is reached. The returned list
of CEs may include a troublesome or undesirable CE that gets selected for
submission. In that case it is possible to reject that cluster using the
--rejectdiscovery option and providing the URL (or just the hostname)
of the CE, which removes that CE as a target for submission.
When multiple CEs are targeted for submission, the resource broker will be used
to filter out CEs which do not match the job description requirements and then
rank the remaining CEs. The broker used by default will rank the CEs randomly;
however, a different broker can be chosen by using the --broker option,
which takes the name of the broker as argument. The broker type can also be
specified in client.conf. The brokers available can be seen using
arcsub -P. By default the following brokers are available:
- Random (default)
- Chooses a random CE matching the job requirements.
- FastestQueue
- Ranks matching CEs according to the length of the job queue
at the CEs, ranking those with shortest queue first/highest.
- Benchmark
- Ranks matching CEs according to a specified benchmark,
which is selected by appending ':' and the name of the benchmark to the
broker name. If no option is given to the Benchmark broker
then CEs will be ranked according to the 'specint2000' benchmark.
- Data
- Ranks matching CEs according to the amount of input data
cached by each CE, by querying the CE. Only CEs with the A-REX BES
interface support this operation.
- Null
- Chooses a random CE with no filtering of CEs at all.
- PythonBroker
- User-defined custom brokers can be created in Python. See
the example broker SampleBroker.py or ACIXBroker.py (like Data broker but
uses the ARC Cache Index) that come installed with ARC for more details of
how to write your own broker. A PythonBroker is specified by --broker
PythonBroker:Filename.Class:args, where Filename is the file
containing the class Class which implements the broker interface. The
directory containing this file must be in the PYTHONPATH. args is optional
and allows specifying arguments to the broker.
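The broker argument forms listed above can be sketched as follows. The python_broker_spec and run helpers, the directory $HOME/arc-brokers, and the module and class names mybroker and MyBroker are hypothetical and serve only to show how the pieces of the specification fit together.

```shell
# Print commands instead of executing them unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Build a PythonBroker specification: Filename.Class, optionally followed by :args.
python_broker_spec() {
    spec="PythonBroker:$1.$2"
    if [ -n "${3:-}" ]; then spec="$spec:$3"; fi
    printf '%s\n' "$spec"
}

registry=registry.example.com

# Built-in brokers, selected by name; Benchmark takes the benchmark after ':'.
run arcsub -g "$registry" -b FastestQueue helloworld.adl
run arcsub -g "$registry" -b Benchmark:specint2000 helloworld.adl

# Hypothetical custom broker: class MyBroker in $HOME/arc-brokers/mybroker.py.
# The directory holding the file must be on PYTHONPATH.
PYTHONPATH="$HOME/arc-brokers${PYTHONPATH:+:$PYTHONPATH}"
export PYTHONPATH
run arcsub -g "$registry" -b "$(python_broker_spec mybroker MyBroker)" helloworld.adl
```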
Before submission, arcsub performs an intelligent modification of the job
description (adding or modifying attributes, even converting the description
language to fit the needs of the CE) ensuring that it is valid. The modified
job description can be printed by specifying the --dumpdescription
option. The format, i.e. job description language, of the printed job
description cannot be specified, and will be that which will be sent to and
accepted by the chosen target. Further information from arcsub can be
obtained by increasing the verbosity, which is done with the --debug
option where the default verbosity level is WARNING. Setting the level to
DEBUG will show all messages, while setting it to FATAL will only show fatal
log messages.
To validate your job description without actually submitting a job, use
the --dryrun option: it will catch possible syntax or other errors,
but will instruct the site not to submit the job for execution. Only the
grid-manager (ARC0) and A-REX (ARC1) CEs support this feature.
Submission of a job description file "helloworld.adl" to the Grid:
arcsub helloworld.adl
An information index server (registry) can also be queried for CEs to submit to:
arcsub -g registry.example.com helloworld.adl
Submission of a job description file "helloworld.adl" to
ce.example.com:
arcsub -c ce.example.com helloworld.adl
Direct submission to a CE is done as:
arcsub --direct -c ce.example.com helloworld.adl
The job description can also be specified directly on the command line as shown
in the example, using the xRSL job description language:
arcsub -c example.com/arex -e \
'&(executable="/bin/echo")(arguments="Hello World!")'
When submitting against CEs retrieved from information index servers it might be
useful to do resource brokering:
arcsub -g registry.example.com -b FastestQueue helloworld.adl
If the job has a large input data set, it can be useful to send it to a CE where
those files are already cached. The ACIX broker can be used for this:
arcsub -g registry.example.com -b \
PythonBroker:ACIXBroker.ACIXBroker:https://cacheindex.ndgf.org:6443/data/index \
helloworld.adl
Disregarding a specific CE when submitting against an information
index server:
arcsub -g registry.example.com -R badcomputingelement.com/arex \
helloworld.adl
Dumping the job description is done as follows:
arcsub -c example.com/arex -x helloworld.adl
- ~/.arc/client.conf
- Some options can be given default values by specifying them
in the ARC client configuration file. Registry and computing element
services can be specified in separate sections of the config. The default
services can be specified by adding the 'default=yes' attribute to the section
of the service; when no --cluster or --index options
are given, these will be used for submission. Each service has an alias
and can be a member of any number of groups. Specifying the alias or
the name of the group with the --cluster or --index options
will then select the given services. By using the --conffile option a
different configuration file can be used than the default. Note that some
installations also have a system client configuration file; however,
attributes in the user's client configuration file take precedence over it,
and command line options take precedence over configuration file attributes.
- ~/.arc/jobs.xml
- This is a local list of the user's active jobs. When a job is
successfully submitted it is added to this list and when it is removed
from the remote cluster it is removed from this list. This list is used as
the list of all active jobs when the user specifies the --all
option to the various NorduGrid ARC user interface commands. By using the
--joblist option a different file can be used than the default.
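To make the description of the client configuration file above more concrete, here is a sketch that writes a small configuration to a temporary file and selects it with --conffile. Only the 'default=yes' attribute and the existence of aliases, groups and separate registry and computing sections come from the text; the exact section header syntax ([registry/alias], [computing/alias]) and the url and groups key names are assumptions that should be verified against the documentation of the installed ARC version.

```shell
# Print commands instead of executing them unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

conf="$(mktemp "${TMPDIR:-/tmp}/client.conf.XXXXXX")"
cat > "$conf" <<'EOF'
# ASSUMPTION: section headers and all keys except default=yes are illustrative.
[registry/myindex]
url=registry.example.com
default=yes

[computing/myce]
url=ce.example.com
groups=favourites
EOF

# No --cluster or --index given: the default service would be used.
run arcsub -z "$conf" helloworld.adl

# Select the computing element by its alias, then by its group name.
run arcsub -z "$conf" -c myce helloworld.adl
run arcsub -z "$conf" -c favourites helloworld.adl
```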
- X509_USER_PROXY
- The location of the user's Grid proxy file. Shouldn't be
set unless the proxy is in a non-standard location.
- ARC_LOCATION
- The location where ARC is installed can be specified by
this variable. If not specified the install location will be determined
from the path to the command being executed, and if this fails a WARNING
will be given stating the location which will be used.
- ARC_PLUGIN_PATH
- The location of ARC plugins can be specified by this
variable. Multiple locations can be specified by separating them by : (;
in Windows). The default location is $ARC_LOCATION/lib/arc (with \ as the
path separator in Windows).
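The relationship between ARC_LOCATION and ARC_PLUGIN_PATH on a Unix-like system can be sketched as below. The plugin_dirs function and the paths under /opt are hypothetical; the sketch only mirrors the rule stated above and does not model the fallback where ARC derives the install location from the path of the running command.

```shell
# Sketch of the lookup rule described above; ARC's own logic may differ in detail.
plugin_dirs() {
    if [ -n "${ARC_PLUGIN_PATH:-}" ]; then
        # Colon-separated list: print one directory per line.
        printf '%s\n' "$ARC_PLUGIN_PATH" | tr ':' '\n'
    else
        # Default location relative to the install prefix.
        printf '%s\n' "${ARC_LOCATION:-}/lib/arc"
    fi
}

# Hypothetical install prefix.
ARC_LOCATION=/opt/arc
export ARC_LOCATION

# Without ARC_PLUGIN_PATH the default under ARC_LOCATION applies.
unset ARC_PLUGIN_PATH
plugin_dirs

# With ARC_PLUGIN_PATH set, each listed directory is used instead.
ARC_PLUGIN_PATH=/opt/arc/lib/arc:/opt/site-plugins/arc
export ARC_PLUGIN_PATH
plugin_dirs
```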
APACHE LICENSE Version 2.0
ARC software is developed by the NorduGrid Collaboration
(http://www.nordugrid.org). Please consult the AUTHORS file distributed with
ARC. Please report bugs and feature requests to
http://bugzilla.nordugrid.org
arccat(1), arcclean(1), arccp(1), arcget(1), arcinfo(1), arckill(1),
arcls(1), arcmkdir(1), arcproxy(1), arcrenew(1), arcresub(1), arcresume(1),
arcrm(1), arcstat(1), arcsync(1), arctest(1)