NAME

cleanasn - clean up irregularities in NCBI ASN.1 objects

SYNOPSIS

cleanasn [-] [-A filename] [-B str] [-C str] [-D str] [-F str] [-K str] [-L filename] [-M filename] [-N str] [-O str] [-P str] [-Q str] [-R] [-S str] [-T] [-U str] [-V str] [-X str] [-Z str] [-a str] [-b] [-c] [-d str] [-f str] [-i filename] [-j filename] [-k filename] [-m str] [-n path] [-o filename] [-p path] [-q path] [-r path] [-v path] [-x ext]

DESCRIPTION

cleanasn is a utility program to clean up irregularities in NCBI ASN.1 objects.
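
As a minimal sketch of a typical invocation, the following reads one file, applies BasicSeqEntryCleanup, and writes the result to a second file. The file names are hypothetical; the flags (-i, -o, -K b) are taken from OPTIONS. The command is printed first and only run if cleanasn is installed and the input file exists.

```shell
# Hypothetical file names (assumed to contain no spaces);
# only the flags come from the OPTIONS list.
input="input.asn"
output="cleaned.asn"

# -i: single input file, -o: single output file,
# -K b: general cleanup using BasicSeqEntryCleanup.
cmd="cleanasn -i $input -o $output -K b"

# Show the command, then run it only when it can actually work.
echo "$cmd"
if command -v cleanasn >/dev/null 2>&1 && [ -f "$input" ]; then
    $cmd
fi
```

Omitting -i and -o should make cleanasn read from stdin and write to stdout, since those are the documented defaults.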

OPTIONS

A summary of options is included below.
-
Print usage message
-A filename
Accession list file
-B str
Branch, per the flags in str:
c
Has coding regions
d
No coding regions
p
Passes validation
q
Validator errors or rejects
r
Only pop/phy/mut/eco/WGS sets
s
Exclude pop/phy/mut/eco/WGS sets
t
Only nuc-prot sets
u
Exclude nuc-prot sets
v
Only segmented sequences
w
Exclude segmented sequences
x
Only segmented proteins
y
Exclude segmented proteins
-C str
Sequence operations, per the flags in str:
c
Compress
d
Decompress
l
Recalculate segmented sequence length
v
Virtual gaps inside segmented sequence
s
Convert segmented set to delta sequence
t
Non-NucProt segmented set to delta sequence
u
Improved non-NucProt segmented set to delta sequence
g
Raw to delta by assembly gap
m
Merge assembly gap features
-D str
Clean up descriptors, per the flags in str:
t
Remove Title
c
Remove Comment
n
Remove Nuc-Prot Set title
e
Remove Pop/Phy/Mut/Eco Set title
m
Remove mRNA title
p
Remove Protein title
a
Title to name
b
AutoDef title or name
x
Prefix title with organism name
-F str
Clean up features, per the flags in str:
u
Remove User-objects
d
Remove db_xrefs
e
Remove /evidence and /inference
g
Fuse multi-interval genes
i
Fuse adjacent-interval imported features
r
Remove redundant gene xrefs
f
Fuse duplicate features
s
Package features on referenced Bioseq
k
Package coding-region or parts features
z
Delete or update EC numbers
b
Set Best coding-region reading frame
x
Retranslate coding regions
a
Adjust for missing stop codon
-K str
Perform a general cleanup, per the flags in str:
b
BasicSeqEntryCleanup
p
C++ BasicCleanup (via an external utility)
v
AdvancedSeqEntryCleanup
s
SeriousSeqEntryCleanup
x
ExtendedSeqEntryCleanup
g
GpipeSeqEntryCleanup
n
Normalize descriptor order
u
Remove NcbiCleanup User Objects
c
Synchronize genetic Codes
f
CDS partial from translation
e
Impose CDS partials
d
Resynchronize CDS partials
m
Resynchronize mRNA partials
t
Resynchronize Peptide partials
a
Adjust consensus splice
i
Promote to "worst" Seq-ID
r
Reassign local IDs
l
Remove locus
-L filename
Log file
-M filename
Macro file
-N str
Clean up links, per the flags in str:
o
Link CDS mRNA by Overlap
p
Link CDS mRNA by Product
l
Link CDS mRNA by Label and Location
r
Reassign feature IDs
m
Merge colliding feature IDs
f
Fix missing reciprocal feature IDs
c
Clear feature IDs
-O str
Missing prot-ref name
-P str
Publication options:
a
Remove All publications
s
Remove Serial number
f
Remove Figure, numbering, and name
r
Remove Remark
u
Update PMID-only publication
j
Lookup ISO Journal title abbreviation
m
Merge identical publication features
#
Replace unpublished with PMID
-Q str
Report:
c
Record count
r
ASN.1 BSEC report
s
ASN.1 SSEC report
n
NORM vs. SSEC report
e
PopPhyMutEco AutoDef report
o
Overlap report
l
Latitude-longitude country diff
d
Log SSEC differences
g
GenBank SSEC diff
f
asn2gb/asn2flat diff
h
Seg-to-delta GenBank diff
v
Validator SSEC diff
m
Modernize Gene/RNA/PCR
u
Unpublished Pub lookup
p
Published Pub lookup
j
Unindexed Journal report
t
tRNA anticodon report
w
Component offset report
x
Custom scan
-R
Remote fetching from ID (NCBI sequence databases)
-S str
Selective difference filter, per the flags in str (capital letters skip the category):
s
SSEC
b
BSEC
A
Author
p
Publication
l
Location
r
RNA
q
Qualifier sort order
g
GenBank block
k
Package CdRegion or parts features
m
Move publication
o
Leave duplicate Bioseq publication
d
Automatic definition line
e
Pop/Phy/Mut/Eco Set definition line
-T
Taxonomy Lookup
-U str
Modernize, per the flags in str:
g
Genes
r
RNA
p
PCR Primers
-V str
Remove features by validator severity:
r
Reject
e
Error
w
Warning
i
Info
-X str
Miscellaneous options, per str:
d
Automatic definition line
s
Automatic definition line with Source qualifiers
e
Pop/Phy/Mut/Eco Set definition line
n
Instantiate NC title
m
Instantiate NM titles
x
Special XM titles
p
Instantiate Protein titles
g
GPipe instantiate titles
c
Create mRNAs for coding sequences
f
Fix reciprocal protein_id/transcript_id
v
Revert preRNA or ncRNA transcript_id
t
Parse anticodon from Sequence
b
Batch cleanup of multireader output
z
Wrap SegSet with NucProt set
w
GFF/WGS genome cleanup
-Z str
Remove indicated User-object
-a str
ASN.1 type:
a
Any (default)
e
Seq-entry
b
Bioseq
s
Bioseq-set
m
Seq-submit
t
Batch Bioseq-set
u
Batch Seq-submit
-b
Input ASN.1 is Binary
-c
Input ASN.1 is Compressed
-d str
Source database:
a
Any (default)
g
GenBank
e
EMBL
d
DDBJ
b
EMBL or DDBJ
i
INSD
r
RefSeq
n
NCBI
x
Exclude EMBL/DDBJ
y
Exclude gbcon, gbest, gbgss, gbhtg, gbpat, gbsts
-f str
Substring filter
-i filename
Single input file (defaults to stdin)
-j filename
First filename
-k filename
Last filename
-m str
Flatfile mode:
r
Release
e
Entrez
s
Sequin
d
Dump
-n path
asn2flat executable (default is /netopt/ncbi_tools/bin/asn2flat)
-o filename
Single output file (defaults to stdout)
-p path
Process all matching files in path
-q path
ffdiff executable (default is /netopt/genbank/subtool/bin/ffdiff)
-r path
Path for results
-v path
asnval executable (default is /netopt/ncbi_tools/bin/asnval)
-x ext
File selection suffix for use with -p (defaults to .ent)
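
EXAMPLES

The directory-oriented options above (-p, -x, -r) can be combined for batch runs. The sketch below is an illustration under stated assumptions: the directory names and log file name are hypothetical, and how cleanasn names the files it writes under the -r path is not described on this page. The command is printed first and only run if cleanasn is installed and the input directory exists.

```shell
# Hypothetical directories and log name (assumed to contain no spaces);
# only the flags come from the OPTIONS list.
indir="./records"
outdir="./cleaned"
logfile="cleanasn.log"

# -p: process all matching files in indir
# -x: file selection suffix (.ent is the documented default, spelled out here)
# -r: path for results
# -L: log file
# -K b: general cleanup using BasicSeqEntryCleanup
cmd="cleanasn -p $indir -x .ent -r $outdir -L $logfile -K b"

# Show the command, then run it only when it can actually work.
echo "$cmd"
if command -v cleanasn >/dev/null 2>&1 && [ -d "$indir" ]; then
    mkdir -p "$outdir"
    $cmd
fi
```

Printing the command before running it makes it easy to check the flag string against OPTIONS, which matters here because many options take single-letter flags whose meaning depends on the option they follow.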

AUTHOR

The National Center for Biotechnology Information.

SEE ALSO

asndisc(1), asnval(1), sequin(1).
