|cdecode||Uses Encode::Guess to decode the string. It behaves like decode('cp936', $word) under ASCII editing mode and decode('utf8', $word) under Unicode editing mode.|
|Unihan_value||The first field of Unihan.txt is the Unicode scalar value in the form U+[x]xxxx; this function returns the [x]xxxx part.|
|csplit||Splits the Chinese characters into an array. English words can be mixed in.|
|csubstr(WORD, OFFSET, LENGTH)||
Treats each Chinese character as one unit and takes a substring.
(BE CAREFUL! It is NOT an lvalue, so we cannot write csubstr($word, 2, 3) = $REPLACEMENT.)
If no LENGTH is specified, the substring runs from OFFSET to the end.
|clength||Treats each Chinese character as one unit (length 1).|
A Chinese version of this document can be found at <http://www.fayland.org/journal/Lingua-Han-Utils.html>
Fayland Lam, <fayland at gmail.com>
Please report any bugs or feature requests to bug-lingua-han-utils at rt.cpan.org, or through the web interface at <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Lingua-Han-Utils>. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
You can find documentation for this module with the perldoc command.
You can also look for information at:
o AnnoCPAN: Annotated CPAN documentation
o CPAN Ratings
o RT: CPAN's request tracker
o Search CPAN
The wonderful Encode::Guess.
Copyright 2005 Fayland Lam, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
|perl v5.20.3||LINGUA::HAN::UTILS (3)||2014-09-16|