Manual Reference Pages  -  LT-TRIM (1)

NAME

lt-trim - This application is part of the lexical processing modules and tools (lttoolbox)

This tool is part of the apertium machine translation architecture: http://www.apertium.org.

CONTENTS

Synopsis
Description
Files
See Also
Bugs
Author

SYNOPSIS

lt-trim analyser_binary bidix_binary trimmed_analyser_binary
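
The argument order above can be sketched as a small Python wrapper. This is only an illustration: the file names analyser.bin, bidix.bin and trimmed.bin are hypothetical, and the helper names build_trim_command and run_trim are not part of lttoolbox. The wrapper only runs the tool if an lt-trim executable is found on PATH.

```python
import shutil
import subprocess


def build_trim_command(analyser_binary, bidix_binary, trimmed_analyser_binary):
    """Return the argv list for lt-trim, in the order given in the synopsis."""
    return ["lt-trim", analyser_binary, bidix_binary, trimmed_analyser_binary]


def run_trim(analyser_binary, bidix_binary, trimmed_analyser_binary):
    """Run lt-trim if it is installed and return its exit status.

    Returns None when no lt-trim executable is found on PATH.
    """
    if shutil.which("lt-trim") is None:
        return None
    cmd = build_trim_command(analyser_binary, bidix_binary, trimmed_analyser_binary)
    return subprocess.run(cmd).returncode


# Hypothetical file names, for illustration only.
cmd = build_trim_command("analyser.bin", "bidix.bin", "trimmed.bin")
print(" ".join(cmd))
```

The first two arguments are inputs and the third is the output file, so the untrimmed analyser is left in place.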

DESCRIPTION

lt-trim is the application responsible for trimming compiled dictionaries. The analyses (right-side when compiling lr) of analyser_binary are trimmed to the input side of bidix_binary (left-side when compiling lr, right-side when compiling rl), such that only analyses which would pass through ‘lt-proc -b bidix_binary’ are kept.
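
The effect of trimming can be pictured with a toy model in Python. This is a sketch, not how lt-trim is implemented: the real tool works on compiled finite state transducers, whereas here the analyser is a plain dict and the bidix input side is a set of (lemma, tags) pairs. The sketch also assumes that a bidix entry matches an analysis when the lemma is identical and the bidix tags are a prefix of the analysis tags. All names and data below are hypothetical.

```python
def trim(analyser, bidix_inputs):
    """Keep only analyses that match an entry on the bidix input side.

    analyser: dict mapping surface form -> list of (lemma, tags) analyses
    bidix_inputs: set of (lemma, tags) pairs from the input side of the bidix
    """
    def passes(lemma, tags):
        # Assumption: a bidix entry matches if the lemma is identical and
        # its tags are a prefix of the analysis tags.
        return any(
            lemma == b_lemma and tags[:len(b_tags)] == b_tags
            for b_lemma, b_tags in bidix_inputs
        )

    trimmed = {}
    for surface, analyses in analyser.items():
        kept = [(lemma, tags) for lemma, tags in analyses if passes(lemma, tags)]
        if kept:  # drop surface forms that have no surviving analysis
            trimmed[surface] = kept
    return trimmed


# Hypothetical toy data.
analyser = {
    "cats": [("cat", ("n", "pl"))],
    "dogs": [("dog", ("n", "pl"))],
    "walks": [("walk", ("n", "pl")), ("walk", ("vblex", "pri", "p3", "sg"))],
}
bidix_inputs = {("cat", ("n",)), ("walk", ("vblex",))}

trimmed = trim(analyser, bidix_inputs)
print(trimmed)
```

In this toy data "dogs" disappears entirely because the bidix has no entry for "dog", and "walks" keeps only its verb reading.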

Warning: this program is experimental! It has been tested, but not deployed extensively yet.

Compound tags (‘<compound-only-L>’, ‘<compound-R>’), join elements (‘<j/>’ in XML, ‘+’ in the stream) and the group element (‘<g/>’ in XML, ‘#’ in the stream) should all be handled correctly; even combinations of ‘+’ followed by ‘#’ in the monodix are handled.

Some minor caveats:

If you have the capitalised lemma "Foo" in the monodix, but "foo" in the bidix, an analysis "^Foo<tag>$" would pass through the bidix when doing ‘lt-proc -b’, but will not make it through trimming. Make sure your lemmas have the same capitalisation in the different dictionaries.

Also, you should not have a literal ‘+’ or ‘#’ in your lemmas. Since lt-comp doesn’t escape these, lt-trim cannot know that they are different from ‘<j/>’ or ‘<g/>’, and you may get @-marked output this way. You can still analyse ‘+’ or ‘#’ by having the literal symbol in the ‘<l>’ part and some other string (e.g. "plus") in the ‘<r>’ part.
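
Both caveats can be checked before trimming. The following Python sketch assumes you have already extracted the lemmas of each dictionary as plain strings (how you extract them is up to you and is not shown); the function names case_mismatches and reserved_symbol_lemmas are hypothetical and not part of lttoolbox.

```python
def case_mismatches(monodix_lemmas, bidix_lemmas):
    """Return sorted (monodix, bidix) lemma pairs that differ only in capitalisation."""
    bidix_set = set(bidix_lemmas)
    by_folded = {}
    for lemma in bidix_set:
        by_folded.setdefault(lemma.casefold(), set()).add(lemma)

    pairs = set()
    for lemma in monodix_lemmas:
        if lemma in bidix_set:
            continue  # an exact match exists, so nothing to report
        for other in by_folded.get(lemma.casefold(), ()):
            pairs.add((lemma, other))
    return sorted(pairs)


def reserved_symbol_lemmas(lemmas):
    """Return sorted lemmas containing a literal '+' or '#'."""
    return sorted(lemma for lemma in lemmas if "+" in lemma or "#" in lemma)


# Hypothetical lemma lists.
monodix = ["Foo", "bar", "C++"]
bidix = ["foo", "bar"]

print(case_mismatches(monodix, bidix))
print(reserved_symbol_lemmas(monodix))
```

Any pair reported by the first function is a lemma that would be lost in trimming; any lemma reported by the second is a candidate for the ‘<l>’/‘<r>’ workaround described above.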

You should not trim a generator unless you have a very simple translator pipeline, since the output of bidix seldom goes unchanged through transfer.

FILES

analyser_binary
    The untrimmed analyser dictionary (a finite state transducer).

bidix_binary
    The dictionary to use as trimmer (a finite state transducer).

trimmed_analyser_binary
    The trimmed analyser dictionary (a finite state transducer).

SEE ALSO

lt-comp(1), lt-proc(1), lt-print(1), lt-expand(1), apertium-tagger(1), apertium(1).

BUGS

Lots of...lurking in the dark and waiting for you!

AUTHOR

(c) 2013-2014 Universitat d’Alacant / Universidad de Alicante.

LT-TRIM (1) 2014-02-07
