transmute

NAME

transmute - transform data, particularly within NCBI Entrez Direct

SYNOPSIS

transmute -x2p|-j2p

transmute -align [-a codes] [-g N] [-h N] [-w N]

transmute -a2x [-set tag] [-rec tag]

transmute -t2x|-c2x|-s2x (tbl2xml / csv2xml / scn2xml) [-set tag] [-rec tag] [-skip N] [-header] [-lower|-upper] [-indent|-flush] columnName1 ...

transmute -g2x (gbf2xml)

transmute -g2r (gbf2ref)

transmute -r2p (ref2pmid) [-options confirm|verbose|fast|slow|exact ...]

transmute -revcomp

transmute -remove [-first N] [-last N]

transmute -retain -leading N -trailing N

transmute -replace -offset N|-column N [-delete N] [-insert seq] [-lower]

transmute -extract [-1-based] [-0-based] [-lower] feat_loc

transmute -cds2prot [-code N] [-frame N] [-stop] [-trim] [-part5] [-part3] [-every]

transmute -molwt [-met]

transmute -hgvs

transmute -counts

transmute -diff

transmute -codons -nuc seq -prot seq [-frame N] [-three]

transmute -search [-protein] [-circular] [-top] pattern ...

transmute -find [-relaxed] [-sensitive] [-whole] pattern ...

transmute -encodeXML|-decodeXML|-plainXML

transmute -encodeURL|-decodeURL

transmute -encode64|-decode64

transmute -plain

transmute -upper|-lower

transmute -aa1to3|-aa3to1

transmute -relax

transmute -format [fmt] [-xml declaration] [-doctype declaration] [-comment] [-cdata] [-combine] [-self] [-unicode style] [-script style] [-mathml terse]

transmute -filter element action target

transmute -normalize database

-x2p: Reformat XML.

-j2p: Reformat JSON.

-align: Table column alignment.

-a codes: Column alignment codes:

l: Left.

c: Center.

r: Right.

n: Numeric align on decimal point.

N: Trailing zero-pad decimals.

z: Leading zero-pad integers.

m: Commas to group by 3 digits.

M: Commas plus zero-pad decimals.

-g N: Spacing between columns.

-h N: Indentation before columns.

-w N: Minimum column width.
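
The padding behaviour behind the l, c, and r codes can be pictured with a short Python sketch. This is not transmute's implementation: the function name align_rows, its parameter names, and the choice to left-align columns that have no code are assumptions made for illustration, and the numeric codes (n, N, z, m, M) are not covered.

```python
def align_rows(rows, codes, gutter=2, indent=0, min_width=0):
    """Pad each column according to its code: l (left), c (center), r (right)."""
    ncols = max(len(row) for row in rows)
    # Each column is as wide as its longest cell, but never below min_width.
    widths = [
        max([min_width] + [len(row[i]) for row in rows if i < len(row)])
        for i in range(ncols)
    ]
    pad = {"l": str.ljust, "c": str.center, "r": str.rjust}
    lines = []
    for row in rows:
        cells = []
        for i, cell in enumerate(row):
            # Assumption: columns without an explicit code are left-aligned.
            code = codes[i] if i < len(codes) else "l"
            cells.append(pad[code](cell, widths[i]))
        lines.append((" " * indent + (" " * gutter).join(cells)).rstrip())
    return lines


table = [["gene", "len"], ["BRCA1", "81189"], ["TP53", "19149"]]
for line in align_rows(table, "lr"):
    print(line)
```

Here gutter, indent, and min_width play the roles described for -g, -h, and -w respectively.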

Data Conversion

-j2x: Convert JSON stream to XML suitable for -path navigation.

-set tag: Replace set wrapper tag.

-rec tag: Replace record wrapper tag.

-nest flat|recurse|plural|singular|depth|element: Nested array naming policy.

-a2x: Convert text ASN.1 stream to XML suitable for -path navigation.

-set tag: Replace set wrapper tag.

-rec tag: Replace record wrapper tag.

-t2x, -c2x, -s2x: Convert tab-delimited table, comma-separated values file, or semicolon-delimited table, respectively, to XML.

-set tag: Replace set wrapper tag.

-rec tag: Replace record wrapper tag.

-skip N: Skip the first N lines.

-header: Use fields from first row for column names.

-lower: Convert text to lowercase.

-upper: Convert text to uppercase.

-indent: Indent XML output.

-flush: Do not indent XML output.

columnName1 ...: XML object names per column.
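
As a rough picture of the table-to-XML conversion, the following Python sketch wraps each input row in a record element and each field in an element named after its column. It is only an illustration: the function table_to_xml and the default wrapper names "Set" and "Rec" are hypothetical placeholders, not transmute's actual defaults, and the indentation shown is arbitrary.

```python
from xml.sax.saxutils import escape


def table_to_xml(text, columns, set_tag="Set", rec_tag="Rec", delimiter="\t"):
    """Convert delimited text to XML, one record element per non-empty line."""
    out = [f"<{set_tag}>"]
    for line in text.splitlines():
        if not line:
            continue
        out.append(f"  <{rec_tag}>")
        # Pair each field with the column name in the same position.
        for name, value in zip(columns, line.split(delimiter)):
            out.append(f"    <{name}>{escape(value)}</{name}>")
        out.append(f"  </{rec_tag}>")
    out.append(f"</{set_tag}>")
    return "\n".join(out)


print(table_to_xml("1\tA&B\n2\tC", ["Id", "Name"]))
```

Passing delimiter="," or delimiter=";" gives the corresponding picture for comma-separated or semicolon-delimited input, ignoring quoting rules that a real CSV reader would need.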

-g2x: Convert GenBank/GenPept flatfile format to INSDSeq XML.

-g2r: Convert GenBank/GenPept flatfile format to Reference XML.

-r2p [-options option ...]: Reference Index XML lookup to find PMIDs. Supported option values:

confirm: Recheck existing PMID claims.

verbose: Add NOTE nodes explaining reasoning.

fast: Prefilter candidates relatively heavily (default).

slow: Prefilter candidates less heavily.

exact: Require exact, unique title matches.

Sequence Editing

-revcomp: Reverse complement nucleotide sequence.
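
For readers unfamiliar with the operation, reverse complementation can be sketched in a few lines of Python. The function name revcomp is chosen here for illustration, and the handling of IUPAC ambiguity codes and lowercase input is an assumption rather than a statement about transmute's behaviour.

```python
# Complement table covering unambiguous bases and IUPAC ambiguity codes.
_UPPER = str.maketrans("ACGTRYKMBVDHSWN", "TGCAYRMKVBHDSWN")
_LOWER = str.maketrans("acgtrykmbvdhswn", "tgcayrmkvbhdswn")


def revcomp(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(_UPPER).translate(_LOWER)[::-1]


print(revcomp("ATGC"))  # GCAT
```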

-remove: Trim at ends of sequence.

-first N: Delete first N bases or residues.

-last N: Delete last N bases or residues.

-retain: Save either end of sequence.

-leading N: Keep first N bases or residues.

-trailing N: Keep last N bases or residues.

-replace: Apply base or residue substitution.

-offset N: Skip ahead by 0-based count (SPDI), or

-column N: Move just before 1-based position (HGVS).

-delete N: Delete N bases or residues.

-insert seq: Insert given sequence.

-lower: Lower-case original sequence.
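
The difference between the two coordinate conventions is easy to get wrong, so here is a minimal Python sketch. The helper names are hypothetical and the code is not transmute's implementation; it only shows that an offset of N and a column of N+1 address the same base.

```python
def replace_at(seq, start, delete=0, insert=""):
    """Delete `delete` characters at 0-based index `start`, then insert."""
    return seq[:start] + insert + seq[start + delete:]


def replace_by_offset(seq, offset, delete=0, insert=""):
    # SPDI style: offset is the number of bases skipped before the edit.
    return replace_at(seq, offset, delete, insert)


def replace_by_column(seq, column, delete=0, insert=""):
    # HGVS style: column is the 1-based position of the first edited base.
    return replace_at(seq, column - 1, delete, insert)


seq = "ACGTACGT"
print(replace_by_offset(seq, 3, delete=1, insert="G"))  # ACGGACGT
print(replace_by_column(seq, 4, delete=1, insert="G"))  # ACGGACGT
```

Both calls replace the T at 1-based position 4 with a G.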

-extract [-lower] feat_loc: Use xtract -insd ... feat_location instructions.

-1-based: GenBank feat_location convention.

-0-based: Alignment, or -insd feat_intervals.

-lower: Lower-case extracted sequence.

Sequence Processing

-cds2prot: Translate coding region into protein.

-code N: Use genetic code N (1 by default).

-frame N: Offset in sequence.

-stop: Include stop residue.

-trim: Remove trailing Xs and *s.

-part5: CDS partial at 5' end.

-part3: CDS extends past 3' end.

-every: Translate all codons.
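
Translation with the standard genetic code (NCBI translation table 1) can be sketched in Python as follows. This is an illustration under stated assumptions, not transmute's algorithm: the function translate, its keyword names, the reading of the frame as a 0-based count of skipped bases, and the use of X for unrecognized codons are all choices made for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, frame=0, include_stop=False):
    """Translate codon by codon, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(frame, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")
        if residue == "*":
            if include_stop:
                protein.append(residue)
            break
        protein.append(residue)
    return "".join(protein)


print(translate("ATGGCCTAA"))                     # MA
print(translate("ATGGCCTAA", include_stop=True))  # MA*
```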

-molwt: Calculate molecular weight of peptide.

-met: Do not cleave leading methionine.

Variation Processing

-hgvs: Convert Human Genome Variation Society variation format to XML.

Sequence Comparison

-counts: Print summary of base or residue counts.

-diff: Compare two aligned files for point differences.

-codons: Display nucleotide codons above amino acid residues.

-nuc seq: Nucleotide sequence.

-prot seq: Protein sequence.

[-frame N]: Offset in nucleotide sequence.

[-three]: Use three-letter residue abbreviations.

Sequence Searching

-search: Search for one or more patterns in a sequence, skipping any FASTA definition line (with a leading >). Each pattern can have an optional alias, e.g., GGATCC:BamHI.

-protein: Do not expand nucleotide ambiguity characters.

-circular: Match patterns spanning the origin of a circular molecule.

-top: Do not search reverse complements of non-palindromic patterns.

pattern: Pattern to search for.

Text Searching

-find: Find one or more patterns in text, allowing digits, spaces, punctuation, and phrases, e.g., "double, double toil and trouble".

-relaxed: Match on words with letters and digits, ignoring spacing and punctuation.

-sensitive: Case-sensitive match, distinguishing upper-case and lower-case letters.

-whole: Match on whole words or multi-word phrases; implies -relaxed.

pattern: Pattern to search for.

String Transformations

XML

-encodeXML: XML-encode <, >, &, ", and ' characters.

-decodeXML: Decode XML entity references.

-plainXML: Remove embedded mixed-content tags and compress runs of spaces.

URL

-encodeURL: Compress runs of spaces, and URI-escape the result.

-decodeURL: URI-unescape the input.

Base64

-encode64: Base64-encode the input.

-decode64: Base64-decode the input.
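
For comparison, the XML, URL, and Base64 transformations have close counterparts in the Python standard library. The wrapper names below are hypothetical, and details such as exactly which characters transmute leaves unescaped in URLs are not specified above, so treat this as an approximate sketch rather than an equivalent.

```python
import base64
import re
from urllib.parse import quote, unquote
from xml.sax.saxutils import escape, unescape


def encode_xml(text):
    # escape() handles &, <, and >; the quote characters are added explicitly.
    return escape(text, {'"': "&quot;", "'": "&apos;"})


def decode_xml(text):
    return unescape(text, {"&quot;": '"', "&apos;": "'"})


def encode_url(text):
    # Compress runs of spaces, then percent-encode every reserved character.
    return quote(re.sub(" +", " ", text), safe="")


def decode_url(text):
    return unquote(text)


def encode_64(text):
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_64(text):
    return base64.b64decode(text).decode("utf-8")


print(encode_xml('a<b & "c"'))  # a&lt;b &amp; &quot;c&quot;
print(encode_url("a  b&c"))     # a%20b%26c
print(encode_64("hi"))          # aGk=
```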

Accent

-plain: Strip accents from the input.
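
One common way to strip accents is to decompose each character and drop the combining marks, as in this Python sketch. The function name strip_accents is hypothetical, and transmute may treat ligatures and other special characters differently from what is shown here.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters (NFD) and discard the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Köhler résumé"))  # Kohler resume
```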

Case

-upper: Convert the input to uppercase.

-lower: Convert the input to lowercase.

Protein

-aa1to3: Convert amino acids from 1-character to 3-character format.

-aa3to1: Convert amino acids from 3-character to 1-character format.
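
The two amino acid conversions amount to a table lookup over the 20 standard residues, sketched here in Python. The function names mirror the option names for readability only; the lack of a separator in the three-letter output and the use of Xaa and X for unrecognized residues are assumptions, not documented transmute behaviour.

```python
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
THREE_TO_ONE = {three: one for one, three in ONE_TO_THREE.items()}


def aa1to3(seq):
    # Assumption: unrecognized letters become Xaa.
    return "".join(ONE_TO_THREE.get(ch, "Xaa") for ch in seq.upper())


def aa3to1(seq):
    # Read the input in consecutive three-letter chunks, ignoring case.
    chunks = [seq[i:i + 3].capitalize() for i in range(0, len(seq), 3)]
    return "".join(THREE_TO_ONE.get(chunk, "X") for chunk in chunks)


print(aa1to3("MAK"))        # MetAlaLys
print(aa3to1("MetAlaLys"))  # MAK
```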

Letters plus Digits

-relax: Remove all punctuation and compress whitespace.

Customized XML Reformatting

-format [fmt]

compact: Compress runs of spaces.

flush: Suppress line indentation.

indent: Indent according to nesting depth.

expand: Place each attribute on a separate line.

-xml declaration: Use the given XML declaration.

-doctype declaration: Use the given document type declaration.

-comment: Preserve comments.

-cdata: Preserve CDATA blocks.

-combine: If the input contains multiple top-level documents, combine them.

-self: Keep empty self-closing tags.

-unicode style: How to handle Unicode superscript and subscript digits (first converted to ASCII form in all cases).

fuse: Run them all together, with no additional markup.

space: Add spaces between digits in different positions.

period: Add periods between digits in different positions.

brackets: Surround superscripts by square brackets and subscripts by parentheses.

markdown: Surround superscripts with carets and subscripts with tildes.

slash: Add backslashes when going up in height and forward slashes when going down.

tag: Put superscripts in XML sup elements and subscripts in sub elements.

-script style: How to handle XML sup and sub elements (denoting superscripts and subscripts, respectively).

brackets: Surround superscripts by square brackets and subscripts by parentheses.

markdown: Surround superscripts with carets and subscripts with tildes.

-mathml terse: Flatten MathML markup tersely.

XML Modification

-filter element action target: Apply the given action to the given target of each matching element. Actions:

retain: Keep matching elements (no-op).

remove: Remove matching elements.

encode: HTML-escape special characters.

decode: Decode HTML escapes.

shrink: Compress runs of spaces.

expand: Place each attribute on a separate line.

accent: Strip off Unicode accents.

Targets:

content: Plain-text content.

cdata: CDATA blocks.

comment: Comments.

object: The whole object.

attributes: Attributes.

container: Start and end tags.

EFetch XML Normalization

-normalize database: Adjust XML fields to conform to common conventions.