CTDConverter - Convert CTD files into Galaxy tool and CWL CommandLineTool files
CTDConverter - A project from the WorkflowConversion family
(https://github.com/WorkflowConversion/CTDConverter)
Copyright 2017, WorkflowConversion
Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You may obtain a
copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed
under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.
USAGE:
- $ convert.py [FORMAT] [ARGUMENTS ...]
FORMAT must be one of the supported output formats: cwl, galaxy.
There is one converter for each supported FORMAT, each taking a different set of
arguments. Please consult the detailed documentation for each of the
converters. Nevertheless, all converters have the following common
parameters/options:
I - Parsing a single CTD file and converting it:
- $ convert.py [FORMAT] -i [INPUT_FILE] -o [OUTPUT_FILE]
II - Parsing several CTD files and writing the converted wrappers to a given folder:
- $ convert.py [FORMAT] -i [INPUT_FILES] -o [OUTPUT_DIRECTORY]
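The two invocations above can be sketched as a small Python helper that assembles the argument list. This is only an illustration: the helper name `build_command`, the file names, and the assumption that several input files are passed as separate arguments after -i are not taken from the converter itself.

```python
import shlex

SUPPORTED_FORMATS = ("cwl", "galaxy")


def build_command(output_format, inputs, output):
    """Assemble a convert.py argument list (hypothetical helper)."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError("unsupported format: " + output_format)
    # Assumption: several input files follow -i as separate arguments.
    return ["convert.py", output_format, "-i", *inputs, "-o", output]


# Case I: one CTD file, one output file (file names are made up).
single = build_command("galaxy", ["tool.ctd"], "tool.xml")
# Case II: several CTD files, one output directory.
several = build_command("cwl", ["a.ctd", "b.ctd"], "wrappers/")

print(shlex.join(single))
print(shlex.join(several))
```

The list form can be handed to `subprocess.run` unchanged, which avoids shell quoting problems with unusual file names.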
III - Hardcoding parameters
- It is possible to hardcode parameters. This is useful if
you want to run a tool in 'quiet' mode, or if your tools support
multi-threading and accept the number of threads via a parameter, and
you do not want end users to be able to change the values of these
parameters.
- In order to generate hardcoded parameters, you need to
provide a simple text file. Each line of this file contains two or three
columns separated by whitespace. Any line starting with a '#' will be
ignored. The first column contains the name of the parameter; the second
column contains the value that will always be set for this parameter. Only
the first two columns are mandatory.
- If the parameter is to be hardcoded only for a set of
tools, then a third column can be added. This column contains a
comma-separated list of tool names for which the parameter will be
hardcoded. If a third column is not present, then all processed tools
containing the given parameter will get a hardcoded value for it.
- The following is an example of a valid file:

    ##################################### HARDCODED PARAMETERS example #####################################
    # Every line starting with a # will be handled as a comment and will not be parsed.
    # The first column is the name of the parameter and the second column is the value that will be used.

    # Parameter name            # Value                         # Tool(s)
    threads                     8
    mode                        quiet
    xtandem_executable          xtandem                         XTandemAdapter
    verbosity                   high                            Foo, Bar

    #########################################################################################################

- Using the above file will produce a command line similar
to:
- [TOOL] ... -threads 8 -mode quiet ...
- for all tools. For XTandemAdapter, however, the
command line will look like:
- XTandemAdapter ... -threads 8 -mode quiet -xtandem_executable xtandem ...
- And for tools Foo and Bar, the command line will be similar
to:
- Foo -threads 8 -mode quiet -verbosity high ...
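- The rules for the hardcoded parameters file can be sketched in Python as follows. This is a minimal illustration, not the converter's own code: the function names are hypothetical, malformed lines are silently skipped by assumption, and the sketch does not check whether a tool actually has the parameter in question.

```python
from collections import defaultdict


def parse_hardcoded_parameters(lines):
    """Return {parameter: {tool_name or None: value}} from an iterable of lines."""
    hardcoded = defaultdict(dict)
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are ignored
        # Split into at most three columns so that a tool list such as
        # "Foo, Bar" (with a space after the comma) stays in one piece.
        columns = stripped.split(None, 2)
        if len(columns) < 2:
            continue  # assumption: skip lines lacking the two mandatory columns
        name, value = columns[0], columns[1]
        if len(columns) == 3:
            for tool in columns[2].split(","):
                if tool.strip():
                    hardcoded[name][tool.strip()] = value
        else:
            hardcoded[name][None] = value  # None stands for "all tools"
    return dict(hardcoded)


def hardcoded_arguments(hardcoded, tool_name):
    """Build the command-line fragment that applies to one tool."""
    args = []
    for name, per_tool in hardcoded.items():
        if tool_name in per_tool:
            args += ["-" + name, per_tool[tool_name]]
        elif None in per_tool:
            args += ["-" + name, per_tool[None]]
    return args


EXAMPLE = """\
# Parameter name     # Value    # Tool(s)
threads              8
mode                 quiet
xtandem_executable   xtandem    XTandemAdapter
verbosity            high       Foo, Bar
"""

params = parse_hardcoded_parameters(EXAMPLE.splitlines())
print(" ".join(hardcoded_arguments(params, "XTandemAdapter")))
print(" ".join(hardcoded_arguments(params, "Foo")))
```

Splitting with a maximum of two splits keeps the whole third column intact, so the tool list may contain spaces after its commas.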
IV - Engine-specific parameters
- i - Galaxy
- a. Providing file formats, mimetypes
- Galaxy supports the concept of file formats in order to
connect compatible ports; that is, an input port of a certain data format
can receive data from a port of the same format. This
converter allows you to provide a custom file in which you
relate CTD data formats to supported Galaxy data formats. This file
consists of lines, each with either one, three or four columns
separated by any amount of whitespace. The content of each column is as
follows:
    * 1st column: file extension
    * 2nd column: data type, as listed in Galaxy
    * 3rd column: fully qualified Galaxy data type, as it will appear in datatypes_conf.xml
    * 4th column: mimetype (optional)
- The following is an example of a valid "file formats" file:

    ########################################## FILE FORMATS example ##########################################
    # Every line starting with a # will be handled as a comment and will not be parsed.
    # The first column is the file format as given in the CTD and the second column is the Galaxy data format.
    # The second, third and fourth columns can be left empty if the data type has already been registered
    # in Galaxy; otherwise, all but the mimetype must be provided.

    # CTD type    # Galaxy type      # Long Galaxy data type            # Mimetype
    csv           tabular            galaxy.datatypes.data:Text
    fasta
    ini           txt                galaxy.datatypes.data:Text
    txt
    idxml         txt                galaxy.datatypes.xml:GenericXml    application/xml
    options       txt                galaxy.datatypes.data:Text
    grid          grid               galaxy.datatypes.data:Grid
    ##########################################################################################################

- Note that each line consists of exactly one, three
or four columns. For data types already registered in Galaxy
(such as fasta and txt in the above example), only the first column is
needed. For data types that have not yet been registered in
Galaxy, the first three columns are needed (the mimetype is optional).
- For information about Galaxy data types and subclasses, see
the following page:
https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
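- The column rules for the "file formats" file can be sketched in Python as follows. This is an illustration under stated assumptions, not the converter's implementation: the names `DataType` and `parse_file_formats` are hypothetical, and for one-column lines the sketch assumes that the Galaxy data type has the same name as the CTD file extension.

```python
from typing import NamedTuple, Optional


class DataType(NamedTuple):
    extension: str              # 1st column: file extension as given in the CTD
    galaxy_extension: str       # 2nd column: data type, as listed in Galaxy
    galaxy_type: Optional[str]  # 3rd column: entry for datatypes_conf.xml
    mimetype: Optional[str]     # 4th column: optional mimetype


def parse_file_formats(lines):
    """Return {CTD extension: DataType} from an iterable of lines."""
    formats = {}
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are ignored
        columns = stripped.split()
        if len(columns) not in (1, 3, 4):
            raise ValueError(
                f"line {number}: expected 1, 3 or 4 columns, got {len(columns)}"
            )
        extension = columns[0]
        if len(columns) == 1:
            # Already registered in Galaxy. Assumption: same name on both sides.
            formats[extension] = DataType(extension, extension, None, None)
        else:
            mimetype = columns[3] if len(columns) == 4 else None
            formats[extension] = DataType(extension, columns[1], columns[2], mimetype)
    return formats


EXAMPLE = """\
# CTD type    # Galaxy type    # Long Galaxy data type           # Mimetype
csv           tabular          galaxy.datatypes.data:Text
fasta
idxml         txt              galaxy.datatypes.xml:GenericXml   application/xml
"""

formats = parse_file_formats(EXAMPLE.splitlines())
for data_type in formats.values():
    print(data_type)
```

Rejecting two-column lines early makes a missing "long" data type visible at parse time rather than when Galaxy loads the generated wrappers.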
- b. Finer control over which tools will be converted
- Sometimes only a subset of CTDs needs to be converted. It
is possible to either explicitly specify which tools will be converted or
which tools will not be converted.
- The value of the -s/--skip-tools parameter is a file
in which each line will be interpreted as the name of a tool that will not
be converted. Conversely, the value of the -r/--required-tools parameter is a
file in which each line will be interpreted as the name of a tool that is required.
Only one of these two parameters can be specified at a time.
- The format of both files is exactly the same. As stated
before, each line will be interpreted as the name of a tool. Any line
starting with a '#' will be ignored.
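- The selection logic described above can be sketched in Python as follows. The function names and the tool names in the example are hypothetical; the sketch only mirrors the stated rules (one tool name per line, '#' lines ignored, the two lists mutually exclusive).

```python
def read_tool_names(lines):
    """Collect tool names from an iterable of lines, skipping blanks and comments."""
    names = set()
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            names.add(stripped)
    return names


def select_tools(tool_names, skip=None, required=None):
    """Apply either a skip list or a required list, never both."""
    if skip is not None and required is not None:
        raise ValueError("only one of skip and required may be given")
    if required is not None:
        return [name for name in tool_names if name in required]
    if skip is not None:
        return [name for name in tool_names if name not in skip]
    return list(tool_names)


# Hypothetical tool names, for illustration only.
all_tools = ["Foo", "Bar", "XTandemAdapter"]
skip = read_tool_names(["# tools that should not be converted", "Bar", ""])

print(select_tools(all_tools, skip=skip))
print(select_tools(all_tools, required={"Foo"}))
```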
- ii - CWL
- There are, for now, no CWL-specific parameters or
options.