NAME

CTDConverter - Convert CTD files into Galaxy tool and CWL CommandLineTool files

DESCRIPTION

CTDConverter - A project from the WorkflowConversion family (https://github.com/WorkflowConversion/CTDConverter)
Copyright 2017, WorkflowConversion
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
USAGE:
$ convert.py [FORMAT] [ARGUMENTS ...]
FORMAT must be one of the supported output formats: cwl or galaxy.
There is one converter for each supported FORMAT, each taking a different set of arguments. Please consult the detailed documentation for each of the converters. Nevertheless, all converters have the following common parameters/options:
I - Parsing a single CTD file and converting it:
$ convert.py [FORMAT] -i [INPUT_FILE] -o [OUTPUT_FILE]
II - Parsing several CTD files and writing the converted wrappers to a given folder:
$ convert.py [FORMAT] -i [INPUT_FILES] -o [OUTPUT_DIRECTORY]
III - Hardcoding parameters
It is possible to hardcode parameters. This makes sense if you want to set a tool in 'quiet' mode or if your tools support multi-threading and accept the number of threads via a parameter, without giving end users the chance to change the values for these parameters.
In order to generate hardcoded parameters, you need to provide a simple file. Each line of this file contains two or three columns separated by whitespace. Any line starting with a '#' will be ignored. The first column contains the name of the parameter, the second column contains the value that will always be set for this parameter. Only the first two columns are mandatory.
If the parameter is to be hardcoded only for a set of tools, then a third column can be added. This column contains a comma-separated list of tool names for which the parameter will be hardcoded. If a third column is not present, then all processed tools containing the given parameter will get a hardcoded value for it.
The following is an example of a valid file:
##################################### HARDCODED PARAMETERS example #####################################
# Every line starting with a # will be handled as a comment and will not be parsed.
# The first column is the name of the parameter and the second column is the value that will be used.
# Parameter name        # Value        # Tool(s)
threads                 8
mode                    quiet
xtandem_executable      xtandem        XTandemAdapter
verbosity               high           Foo, Bar
#########################################################################################################
Using the above file will produce a command-line similar to:
[TOOL] ... -threads 8 -mode quiet ...
for all tools. For XTandemAdapter, however, the command-line will look like:
XTandemAdapter ... -threads 8 -mode quiet -xtandem_executable xtandem ...
And for tools Foo and Bar, the command-line will be similar to:
Foo -threads 8 -mode quiet -verbosity high ...
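The hardcoding rules described above can be sketched in Python as follows. This is only an illustration of the file layout, not the converter's actual implementation: the function names parse_hardcoded_parameters and hardcoded_arguments are hypothetical, and it is an assumption that lines with fewer than two columns are silently skipped. The third column is split off as a whole so that a tool list such as "Foo, Bar" stays in one piece.

```python
def parse_hardcoded_parameters(text):
    """Return a list of (parameter, value, tools) tuples.

    tools is None when the parameter applies to every tool, otherwise a
    set of tool names taken from the optional third column.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        # Split into at most three columns; the third keeps its spaces.
        columns = line.split(None, 2)
        if len(columns) < 2:
            continue  # assumption: malformed lines are skipped
        tools = None
        if len(columns) == 3:
            tools = {name.strip() for name in columns[2].split(",") if name.strip()}
        entries.append((columns[0], columns[1], tools))
    return entries


def hardcoded_arguments(entries, tool_name):
    """Build the hardcoded command-line arguments for one tool."""
    arguments = []
    for name, value, tools in entries:
        if tools is None or tool_name in tools:
            arguments.extend(["-" + name, value])
    return arguments


EXAMPLE = """\
# Parameter name        # Value        # Tool(s)
threads                 8
mode                    quiet
xtandem_executable      xtandem        XTandemAdapter
verbosity               high           Foo, Bar
"""

entries = parse_hardcoded_parameters(EXAMPLE)
print(" ".join(hardcoded_arguments(entries, "XTandemAdapter")))
print(" ".join(hardcoded_arguments(entries, "Foo")))
```

Run against the example file, the sketch yields the same hardcoded arguments as the command lines shown above for XTandemAdapter and Foo.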
IV - Engine-specific parameters
i - Galaxy
a. Providing file formats, mimetypes
Galaxy uses file formats to connect compatible ports: an input port of a certain data format can only receive data from a port of the same format. This converter allows you to provide a personalized file in which you relate the CTD data formats to supported Galaxy data formats. The file consists of lines, each with one, three or four columns separated by any amount of whitespace. The content of each column is as follows:
* 1st column: file extension
* 2nd column: data type, as listed in Galaxy
* 3rd column: full-named Galaxy data type, as it will appear on datatypes_conf.xml
* 4th column: mimetype (optional)
The following is an example of a valid "file formats" file:
########################################## FILE FORMATS example ##########################################
# Every line starting with a # will be handled as a comment and will not be parsed.
# The first column is the file format as given in the CTD and second column is the Galaxy data format. The
# second, third and fourth columns can be left empty if the data type has already been registered
# in Galaxy, otherwise, all but the mimetype must be provided.
# CTD type    # Galaxy type    # Long Galaxy data type             # Mimetype
csv           tabular          galaxy.datatypes.data:Text
fasta
ini           txt              galaxy.datatypes.data:Text
txt
idxml         txt              galaxy.datatypes.xml:GenericXml     application/xml
options       txt              galaxy.datatypes.data:Text
grid          grid             galaxy.datatypes.data:Grid
##########################################################################################################
Note that each line consists of exactly one, three or four columns. For data types already registered in Galaxy (such as fasta and txt in the above example), only the first column is needed. For data types that have not yet been registered in Galaxy, the first three columns are needed (the mimetype is optional).
For information about Galaxy data types and subclasses, see the following page: https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
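The column rules for the "file formats" file can be illustrated with a short Python sketch. It is not the converter's own parser: the names DataType and parse_file_formats are hypothetical, and both the choice to raise an error on a line with an unexpected number of columns and the use of None for omitted columns are assumptions made for this example.

```python
from collections import namedtuple

# Hypothetical record for one line of the file; omitted columns are None.
DataType = namedtuple("DataType", "extension galaxy_type long_type mimetype")


def parse_file_formats(text):
    """Map each CTD file extension to a DataType record."""
    formats = {}
    for number, line in enumerate(text.splitlines(), start=1):
        columns = line.split()
        # Blank lines and lines starting with '#' are ignored.
        if not columns or columns[0].startswith("#"):
            continue
        # Valid lines have exactly one, three or four columns.
        if len(columns) not in (1, 3, 4):
            raise ValueError(
                f"line {number}: expected 1, 3 or 4 columns, got {len(columns)}"
            )
        # Pad the missing optional columns with None.
        columns += [None] * (4 - len(columns))
        formats[columns[0]] = DataType(*columns)
    return formats


EXAMPLE = """\
# CTD type    # Galaxy type    # Long Galaxy data type             # Mimetype
csv           tabular          galaxy.datatypes.data:Text
fasta
ini           txt              galaxy.datatypes.data:Text
txt
idxml         txt              galaxy.datatypes.xml:GenericXml     application/xml
options       txt              galaxy.datatypes.data:Text
grid          grid             galaxy.datatypes.data:Grid
"""

formats = parse_file_formats(EXAMPLE)
print(formats["idxml"].mimetype)
print(formats["fasta"].galaxy_type)
```

In this sketch, a one-column entry such as fasta carries no Galaxy type of its own, reflecting that the data type is already registered in Galaxy.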
b. Finer control over which tools will be converted
Sometimes only a subset of CTDs needs to be converted. It is possible to either explicitly specify which tools will be converted or which tools will not be converted.
The value of the -s/--skip-tools parameter is a file in which each line is interpreted as the name of a tool that will not be converted. Conversely, the value of the -r/--required-tools parameter is a file in which each line is interpreted as the name of a tool that is required. Only one of these two parameters can be specified at a time.
The format of both files is exactly the same. As stated before, each line will be interpreted as the name of a tool. Any line starting with a '#' will be ignored.
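The selection logic can be sketched in Python as follows. The functions read_tool_names and select_tools are hypothetical names introduced for this illustration, and it is an assumption that "required" means "convert only these tools"; the converter's real behavior may differ in its details.

```python
def read_tool_names(text):
    """Return the set of tool names listed in a skip or required file."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and lines starting with '#' are ignored.
        if line and not line.startswith("#"):
            names.add(line)
    return names


def select_tools(all_tools, skip=None, required=None):
    """Filter tool names using either a skip set or a required set."""
    if skip is not None and required is not None:
        raise ValueError("only one of skip and required may be given")
    if required is not None:
        # Assumption: only the listed tools are converted.
        return [tool for tool in all_tools if tool in required]
    if skip is not None:
        return [tool for tool in all_tools if tool not in skip]
    return list(all_tools)


SKIP_FILE = """\
# tools that should not be converted
Foo
Bar
"""

tools = ["Foo", "Bar", "XTandemAdapter"]
print(select_tools(tools, skip=read_tool_names(SKIP_FILE)))
print(select_tools(tools, required={"Foo"}))
```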
ii - CWL
There are, for now, no CWL-specific parameters or options.