APERTIUM-TAGGER(1) FreeBSD General Commands Manual APERTIUM-TAGGER(1)

NAME
apertium-tagger - part-of-speech tagger and trainer for Apertium

SYNOPSIS
apertium-tagger [options] -g serialized_tagger [input [output]]

apertium-tagger [options] -r iterations corpus serialized_tagger

apertium-tagger [options] -s iterations dictionary corpus tagger_spec serialized_tagger tagged_corpus untagged_corpus

apertium-tagger [options] -s 0 dictionary tagger_spec serialized_tagger tagged_corpus untagged_corpus

apertium-tagger [options] -s 0 -u model serialized_tagger tagged_corpus

apertium-tagger [options] -t iterations dictionary corpus tagger_spec serialized_tagger

DESCRIPTION
apertium-tagger is the application responsible for part-of-speech tagger training and tagging in the Apertium pipeline, depending on the options given. The command reads from standard input only when the --tagger (-g) option is used.
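
For example, the analysed input is typically produced by lt-proc(1) and piped into the tagger; here xx.automorf.bin and xx.prob are illustrative names for a compiled morphological analyser and a trained tagger data file, and input.txt is assumed to already be plain, deformatted text:

      lt-proc xx.automorf.bin input.txt | apertium-tagger -g xx.prob > tagged.txt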

OPTIONS
-g, --tagger
Tags input text by means of the Viterbi algorithm.
-r n, --retrain n
Retrains the model with n additional Baum-Welch iterations (unsupervised). This option is incompatible with -u (--unigram).
-s n, --supervised n
Initializes parameters against a hand-tagged text (supervised) through the maximum-likelihood estimation method, then performs n iterations of the Baum-Welch training algorithm (unsupervised). The corpus argument may be omitted only when n = 0. See the example invocations below.
-t n, --train n
Initializes parameters through Kupiec's method (unsupervised), then performs n iterations of the Baum-Welch training algorithm (unsupervised).

-u MODEL, --unigram=MODEL
Use unigram algorithm MODEL from <https://coltekin.net/cagri/papers/trmorph-tools.pdf>.
-w, --sliding-window
Use the Light Sliding-Window algorithm.
-x, --perceptron
Use the averaged perceptron algorithm.

-d, --debug
Print error (if any) or debug messages while operating.
--skip-on-error
Used with -xs to ignore certain types of errors in the training corpus.
-f, --first
Used in conjunction with -g (--tagger); makes the tagger output all lexical forms of each word, with the chosen one first (after the lemma).
-m, --mark
Mark disambiguated words.
-p, --show-superficial
Prints the superficial form of each word alongside its lexical form in the output stream.
-z, --null-flush
Used in conjunction with -g (--tagger) to flush the output after each null character.
--help
Display a help message.
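
For instance, assuming illustrative file names (xx.expanded for the expanded dictionary, xx.crp for a raw training corpus, xx.tsx for the tagger specification, xx.prob for the resulting tagger data, and xx.tagged / xx.untagged for a hand-tagged corpus and its morphological analysis), training could be invoked along these lines:

      apertium-tagger -t 8 xx.expanded xx.crp xx.tsx xx.prob
      apertium-tagger -s 0 xx.expanded xx.tsx xx.prob xx.tagged xx.untagged

The first line performs unsupervised training (Kupiec initialization followed by 8 Baum-Welch iterations); the second initializes a model from the hand-tagged corpus with no Baum-Welch iterations, matching the -s 0 form of the synopsis.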

FILES
These are the kinds of files used with each option:
dictionary
Full expanded dictionary file (see the note after this list on how it is typically produced)
corpus
Training text corpus file
tagger_spec
Tagger specification file, in XML format
serialized_tagger
Tagger data file, built during training and used while tagging
tagged_corpus
Hand-tagged text corpus
untagged_corpus
Untagged text corpus: the morphological analysis of the hand-tagged corpus, used jointly with it when the -s option is given
input
Input file, stdin by default
output
Output file, stdout by default
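
The full expanded dictionary is typically generated from a monolingual dictionary with lt-expand(1); for example, with an illustrative dictionary file name:

      lt-expand apertium-xx.xx.dix xx.expanded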

SEE ALSO
apertium(1), lt-comp(1), lt-expand(1), lt-proc(1)

COPYRIGHT
Copyright © 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This is free software. You may redistribute copies of it under the terms of the GNU General Public License.

BUGS
Many... lurking in the dark and waiting for you!
February 22, 2021 Apertium
