apertium-deshtml-alt - HTML format processor for Apertium with alt-translation
apertium-deshtml-alt [-hino] [input_file [output_file]]
This tool is part of the Apertium open-source machine translation toolbox.
apertium-deshtml-alt is an HTML format processor. Data should be passed through this processor before being piped to lt-proc(1).
The program takes input in the form of an HTML document and produces output suitable for processing with lt-proc(1). HTML tags and other format information are enclosed in brackets so that lt-proc(1) treats them as whitespace between words. Unlike apertium-deshtml(1), it unwraps the alt attribute of images, letting the alt text be translated.
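The difference from apertium-deshtml(1) can be checked by running both tools on the same input. The following is a minimal sketch, assuming both programs are installed and on PATH; the sample HTML and the file name x.png are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical sample input: a paragraph containing an image with alt text.
sample='<p>casa <img src="x.png" alt="gener"/></p>'

# Run the same input through both deformatters so the outputs can be compared.
for tool in apertium-deshtml apertium-deshtml-alt; do
    if command -v "$tool" >/dev/null 2>&1; then
        printf '%s:\n' "$tool"
        printf '%s\n' "$sample" | "$tool"
        printf '\n'
    else
        # Skip quietly if the tool is not available on this system.
        printf '%s: not installed, skipping\n' "$tool" >&2
    fi
done
```

In the second output the alt text should appear outside the bracketed format information, where lt-proc(1) can reach it.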
-h, --help - Display this help.
-i - Makes the addition of a trailing sentence terminator (‘.’) unconditional, often leading to duplicates.
-n - Suppresses the addition of a trailing sentence terminator.
-o - Inserts a "❡" (U+2761 CURVED STEM PARAGRAPH SIGN ORNAMENT) at the end of <h[1–6]> and <title> tags.
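The effect of these options is easiest to see side by side. The following is a minimal sketch, assuming apertium-deshtml-alt is installed and on PATH; the sample HTML is hypothetical, and the loop only uses the options listed above.

```shell
#!/bin/sh
# Hypothetical sample input: a heading and a paragraph with no final punctuation.
sample='<h1>Titol</h1><p>gener</p>'

# Run the same input with no option, then with -n, -i and -o in turn.
for opt in '' -n -i -o; do
    label=${opt:-default}
    printf 'options: %s\n' "$label"
    if command -v apertium-deshtml-alt >/dev/null 2>&1; then
        # $opt is deliberately unquoted so the empty case passes no argument.
        printf '%s\n' "$sample" | apertium-deshtml-alt $opt
        printf '\n'
    else
        echo "apertium-deshtml-alt not installed, skipping" >&2
    fi
done
```

Comparing the four outputs shows where a trailing sentence terminator or a "❡" is added.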
You could write the following to show how the word “gener” is analysed:
echo '<b>gener</b><img alt="gener"/>' | apertium-deshtml-alt | lt-proc ca-es.automorf.bin
Copyright © 2005-2019 Universitat d'Alacant / Universidad de Alicante. This is free software. You may redistribute copies of it under the terms of the GNU General Public License.
Many... lurking in the dark and waiting for you!