apertium-deshtml-alt - HTML format processor for Apertium with alt-translation
apertium-deshtml-alt [-hino] [input_file [output_file]]
This tool is part of the Apertium open-source machine translation toolbox.
apertium-deshtml-alt is an HTML format processor.
Data should be passed through this processor before being piped to
lt-proc(1). The program takes input in the form
of an HTML document and produces output suitable for processing with
lt-proc(1). HTML tags and other format
information are enclosed in brackets so that
lt-proc(1) treats them as whitespace between
words. Unlike apertium-deshtml(1), it unwraps the alt attribute of
images so that the alt text can be translated.
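
The bracketing idea described above can be illustrated with a small Python sketch. This is only a toy approximation, not the actual implementation or exact output of apertium-deshtml-alt: the real tool also handles character escaping, entities, and sentence terminators, which are ignored here. The names `TAG`, `ALT`, and `bracket_tags` are hypothetical and exist only for this illustration.

```python
import re

# Assumption: a tag is anything between "<" and the next ">".
TAG = re.compile(r'<[^>]*>')
# Assumption: the alt value is double-quoted and contains no quotes itself.
ALT = re.compile(r'(\balt=")([^"]*)(")')


def bracket_tags(html: str) -> str:
    """Wrap each tag in [...] and leave any alt text outside the brackets."""
    def wrap(match: re.Match) -> str:
        tag = match.group(0)
        alt = ALT.search(tag)
        if alt is None:
            # Ordinary tag: hide it entirely inside brackets.
            return "[" + tag + "]"
        # Split the tag around the alt value so the value stays visible
        # as ordinary text between two bracketed pieces of markup.
        before = tag[:alt.start(2)]
        after = tag[alt.end(2):]
        return "[" + before + "]" + alt.group(2) + "[" + after + "]"

    return TAG.sub(wrap, html)


print(bracket_tags('<b>gener</b><img alt="gener"/>'))
```

In this sketch both occurrences of "gener" end up outside the brackets, which mirrors why the alt text becomes available to later stages while the markup is treated as whitespace.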
-h, --help
    Display this help.
-i
    Makes the addition of a trailing sentence terminator (‘.’)
    unconditional, often leading to duplicates.
-n
    Suppresses the addition of a trailing sentence terminator.
-o
    Inserts a "❡" (U+2761 CURVED STEM PARAGRAPH SIGN ORNAMENT) at the
    end of <h[1–6]> and <title> tags.
You could write the following to show how the word “gener” is
analysed:
    echo '<b>gener</b><img alt="gener"/>' | apertium-deshtml-alt | lt-proc ca-es.automorf.bin
apertium(1),
apertium-desrtf(1),
apertium-destxt(1),
apertium-rehtml(1),
apertium-rehtml-noent(1),
lt-proc(1)
Copyright © 2005-2019 Universitat d'Alacant / Universidad de Alicante.
This is free software. You may redistribute copies of it under the terms of
the GNU General Public License.
Many... lurking in the dark and waiting for you!