Lingua::ZH::TaBE(3) User Contributed Perl Documentation Lingua::ZH::TaBE(3)

Lingua::ZH::TaBE - Chinese processing via libtabe

This document describes version 0.07 of Lingua::ZH::TaBE, released December 31, 2005.

    use Lingua::ZH::TaBE;

    my $tabe = Lingua::ZH::TaBE->new;

    # Phrase splitter
    my @phrases = $tabe->split(
        "當我們在電腦中處理中文資訊時,相信其中最惱人的".
        "狀況之一,莫過於想打的字打不出來了。"
    );

    # Chaining various components
    print $tabe->Chu("道可道,非常道。")    # sentence
        ->chunks->[2]       # 非常道 (chunk)
        ->tsis->[0]         # 非常 (phrase)
        ->zhis->[1]         # 常 (character)
        ->yins->[0]         # ㄔㄤˊ (pronunciation)
        ->zuyins->[0],      # ㄔ (phonetic symbol)
        "\n";

This module is a Perl interface to the TaBE (Taiwan and Big5 Encoding) library, a unified interface and library for dealing with Chinese words, phrases, sentences, and phonetic symbols. It is intended to serve as the foundation for Chinese text processing.

Lingua::ZH::TaBE provides an object-oriented interface (preferred), as well as a procedural interface consisting of all C functions in "tabe.h".

new( [tsi_db => $file, tsiyin_db => $file] )
    Creates a libtabe handle and opens the databases. If the database files are not specified, they are located automatically in the usual libtabe data directory.
split( $string [, $method] )
    Splits the text in $string and returns a list of strings, one per word found. Pass "Complex" or "Backward" as $method to use an alternate segmentation algorithm.
Chu(), Chunk(), Tsi(), Zhi(), Yin(), ZuYin()
    Constructors for the various levels of objects, each taking one argument for initialization.
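
The optional $method argument to split() can be explored with a small helper that runs the same text through the default algorithm and both alternates. This is only a sketch: the helper name compare_segmentations is hypothetical, the text is taken from the first command-line argument, and it is assumed (not confirmed by this page) that the text is already in the encoding libtabe expects.

```perl
use strict;
use warnings;

# Run split() with the default method and with each alternate method.
# $tabe may be any object offering split($string [, $method]).
# (compare_segmentations is a hypothetical helper, not part of the module.)
sub compare_segmentations {
    my ($tabe, $text) = @_;
    my %result = (Default => [ $tabe->split($text) ]);
    for my $method (qw(Complex Backward)) {
        $result{$method} = [ $tabe->split($text, $method) ];
    }
    return \%result;
}

# Real run: only when a text argument is given and the module is installed.
if (@ARGV && eval { require Lingua::ZH::TaBE; 1 }) {
    my $tabe   = Lingua::ZH::TaBE->new;    # databases located automatically
    my $result = compare_segmentations($tabe, $ARGV[0]);
    for my $method (qw(Default Complex Backward)) {
        print $method, ': ', join(' | ', @{ $result->{$method} }), "\n";
    }
}
```

Keeping the comparison in a function that accepts any object with a split() method makes it easy to exercise without the libtabe databases present.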

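Chu objects (sentences):
chunks()

Chunk objects:
tsis([$method])

Tsi objects (phrases):
zhis()
yins()

Zhi objects (characters):
yins()
ToZhi()
ToZhiCode()
IsBig5Code()
ToPackedBig5Code()
LookupRefCount()

Yin objects (pronunciations):
zuyins()
zhis()
ToYin()
ToZuYinSymbolSequence()

ZuYin objects (phonetic symbols):
yin()
zhi()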
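
As a sketch of how the accessor methods nest, the helper below walks a sentence object down to its phrases. Two assumptions are made, both inferred from the synopsis rather than stated by this page: chunks() and tsis() return array references (the synopsis indexes them as ->chunks->[2] and ->tsis->[0]), and the objects stringify to their text (the synopsis prints one directly). The name phrases_by_chunk is hypothetical.

```perl
use strict;
use warnings;

# Return one array reference per chunk, each holding that chunk's phrases
# as plain strings. Assumes chunks() and tsis() return array references
# and that each phrase object stringifies to its text.
sub phrases_by_chunk {
    my ($chu) = @_;
    my @rows;
    for my $chunk (@{ $chu->chunks }) {
        my @words;
        push @words, "$_" for @{ $chunk->tsis };
        push @rows, \@words;
    }
    return \@rows;
}

# Real run: only when a sentence argument is given and the module is installed.
if (@ARGV && eval { require Lingua::ZH::TaBE; 1 }) {
    my $tabe = Lingua::ZH::TaBE->new;
    my $rows = phrases_by_chunk($tabe->Chu($ARGV[0]));
    print join(' ', @$_), "\n" for @$rows;
}
```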

All functions below belong to the Lingua::ZH::TaBE class; they are not exported by default, but may be imported explicitly, or implicitly via "use Lingua::ZH::TaBE ':all'".

    $TsiDB      = TsiDBOpen($type, $db_name, $flags);
    $num        = TsiInfoLookupPossibleTsiYin($TsiDB, $Tsi);
    $TsiYinDB   = TsiYinDBOpen($type, $db_name, $flags);
    $num        = ChuInfoToChunkInfo($Chu);
    $num        = ChunkSegmentationSimplex($TsiDB, $Chunk);
    $num        = ChunkSegmentationComplex($TsiDB, $Chunk);
    $num        = ChunkSegmentationBackward($TsiDB, $Chunk);
    $num        = TsiInfoLookupZhiYin($TsiDB, $Tsi);
    $string     = YinLookupZhiList($Yin);
    $string     = YinToZuYinSymbolSequence($Yin);
    $yin        = ZuYinSymbolSequenceToYin($string);
    $zhi        = ZuYinIndexToZuYinSymbol($ZuYin);
    $zuyin      = ZuYinSymbolToZuYinIndex($Zhi);
    $zuyin      = ZozyKeyToZuYinIndex($key);
    $num        = ZhiIsBig5Code($Zhi);
    $zhicode    = ZhiToZhiCode($Zhi);
    $zhi        = ZhiCodeToZhi($zhicode);
    $num        = ZhiCodeToPackedBig5Code($zhicode);
    $num        = ZhiCodeLookupRefCount($zhicode);

All constants below belong to the Lingua::ZH::TaBE class; they are not exported by default, but may be imported explicitly, or implicitly via "use Lingua::ZH::TaBE ':all'".

    DB_TYPE_DB                  0
    DB_TYPE_LAST                1
    DB_FLAG_OVERWRITE           0x01
    DB_FLAG_CREATEDB            0x02
    DB_FLAG_READONLY            0x04
    DB_FLAG_NOSYNC              0x08
    DB_FLAG_SHARED              0x10
    DB_FLAG_NOUNPACK_YIN        0x20
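
The DB_FLAG_* values are distinct single bits, which suggests they are meant to be combined with bitwise OR before being passed as the $flags argument. The sketch below copies the values from the table so that it runs without libtabe; the has_flag helper is hypothetical, and whether a given combination is meaningful to libtabe is not stated on this page.

```perl
use strict;
use warnings;

# Values copied from the table above so this sketch runs without libtabe;
# real code would import them via: use Lingua::ZH::TaBE ':all';
use constant {
    DB_TYPE_DB           => 0,
    DB_FLAG_OVERWRITE    => 0x01,
    DB_FLAG_CREATEDB     => 0x02,
    DB_FLAG_READONLY     => 0x04,
    DB_FLAG_NOSYNC       => 0x08,
    DB_FLAG_SHARED       => 0x10,
    DB_FLAG_NOUNPACK_YIN => 0x20,
};

# Combine flags with bitwise OR (assumption: this is how $flags is built).
my $flags = DB_FLAG_READONLY | DB_FLAG_SHARED;

# True when every bit of $flag is set in $flags (hypothetical helper).
sub has_flag {
    my ($flags, $flag) = @_;
    return ($flags & $flag) == $flag ? 1 : 0;
}

printf "flags = 0x%02x\n", $flags;    # prints "flags = 0x14"
print "read-only\n" if has_flag($flags, DB_FLAG_READONLY);

# With the module loaded, the value would be passed as the third argument:
#   my $TsiDB = TsiDBOpen(DB_TYPE_DB, $db_name, $flags);
```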

The TsiYin family of functions is not yet complete.

<ftp://xcin.linux.org.tw/pub/xcin/libtabe/devel/>

<http://libtabe.sourceforge.net/>

Audrey Tang <autrijus@autrijus.org>

Copyright 2003, 2004, 2005 by Audrey Tang <autrijus@autrijus.org>.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

See <http://www.perl.com/perl/misc/Artistic.html>

2005-12-31 perl v5.32.1
