Manual Reference Pages  -  ENCODE::DETECT::CJK (3)

NAME

Encode::Detect::CJK - a charset detector optimized for East Asian charsets and website content

SYNOPSIS



        use Encode::Detect::CJK;              # just load the module
       
        use Encode::Detect::CJK qw(detect);   # load the module and export the detect function
       
        # simple usage
        my $charset = CharsetDetector::detect($octets);
       
        # usage with the advanced options
        $charset = CharsetDetector::detect($octets, $max_len, $is_consider_html_head_charset);
        # returns the charset of the binary string $octets
        # $max_len: detection is slow when $octets is large, so you may want to specify
        #       $max_len to limit the length used for detection; undef means the DEFAULT (no limit)
        # $is_consider_html_head_charset: by DEFAULT, the detector treats the charset declared in the
        #       HTML header (e.g. <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />)
        #       as one factor in detecting the charset; if you do not want the detector to consider
        #       the HTML header, set $is_consider_html_head_charset to "" or 0



Basic Function

    detect - detect the charset of a string



        $charset = CharsetDetector::detect($octets, $max_len, $is_consider_html_head_charset);
        $charset = CharsetDetector::detect($octets, $max_len);  # same as CharsetDetector::detect($octets, $max_len, 1)
        $charset = CharsetDetector::detect($octets);            # same as CharsetDetector::detect($octets, undef)



Param $octets - input binary string

The binary string (octets) whose charset is to be detected.

Param $max_len - max length for charset detector

Detection is slow when $octets is large, so it can be useful to specify $max_len to limit the length used for detection. Pass undef for the DEFAULT, which is no limit.
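
When the octets come from a large file, another way to keep detection fast is to read only a prefix of the file in the first place. The following is a minimal sketch under that assumption: read_head is a hypothetical helper, not part of this module, and the file name and the 4096-byte limit in the trailing comment are illustrative only. This page does not say whether detecting on a prefix gives the same result as passing $max_len, and a byte-level cut can split a multibyte character at the end of the prefix.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Encode::Detect::CJK): return at most
# $limit raw bytes from the start of the file at $path.
sub read_head {
    my ($path, $limit) = @_;

    # ':raw' keeps the bytes untouched, which is what a charset detector needs.
    open my $fh, '<:raw', $path or die "Cannot open $path: $!\n";

    my $octets = '';
    my $got = read $fh, $octets, $limit;
    die "Cannot read $path: $!\n" unless defined $got;
    close $fh;

    return $octets;
}

# The prefix can then be handed to the detector, for example:
#   my $charset = CharsetDetector::detect(read_head('page.html', 4096));
```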

Param $is_consider_html_head_charset

By DEFAULT, the detector treats the charset declared in the HTML header (e.g. <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />) as one factor in detecting the charset. If you do not want the detector to consider the HTML header, set $is_consider_html_head_charset to "" or 0.
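
To see what kind of declaration this option refers to, the sketch below pulls the declared charset out of an HTML header so it can be compared with the detected one. declared_charset is a hypothetical helper, not part of this module, and its regex is an assumption about typical markup rather than the detector's own logic.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Encode::Detect::CJK): extract the charset
# declared in an HTML <meta> tag, or return undef if none is found.
# This is a deliberately simple regex, not a full HTML parser.
sub declared_charset {
    my ($octets) = @_;
    return undef unless defined $octets;

    # Matches <meta http-equiv=... content="...; charset=X"> as well as
    # <meta charset="X">. It works on raw octets only when the page uses an
    # ASCII-compatible encoding, so that the tag itself is plain ASCII.
    if ($octets =~ /<meta\b[^>]*?\bcharset\s*=\s*["']?([A-Za-z0-9._:-]+)/i) {
        return lc $1;
    }
    return undef;
}
```

If the declared and detected names disagree, calling detect again with the third argument set to 0 shows what the detector concludes from the content alone.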

Return Value $charset

If $octets is undefined, returns ''. If $octets is the empty string '', returns 'iso-8859-1'. Otherwise, returns the name of the detected charset.
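
A common next step is to decode the octets with the returned name. The sketch below does this with the core Encode module. decode_detected is a hypothetical helper, not part of this module, and it assumes that the returned charset name is one that Encode::find_encoding recognises.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper (not part of Encode::Detect::CJK): decode $octets using
# the charset name that detect returned for them.
sub decode_detected {
    my ($octets, $charset) = @_;

    # An empty charset name is what detect returns when it was given no
    # input, so there is nothing to decode in that case.
    return undef unless defined $octets && defined $charset && length $charset;

    # Assumption: the returned name is known to the core Encode module.
    my $enc = find_encoding($charset)
        or die "Encode does not know charset '$charset'\n";

    # Returns a Perl character string.
    return $enc->decode($octets);
}

# Typical use, with $octets holding the raw bytes:
#   my $text = decode_detected($octets, CharsetDetector::detect($octets));
```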

Supported Charset List



        return value: alias
       
        ascii       : ascii
        iso-8859-1  : iso-8859-1
        utf8        : utf8 utf-8-strict
        utf16       : utf16
        cp936       : euc-cn(gb2312) cp936(gbk) gb18030
        big5-eten   : big5-eten
        euc-jp      : euc-jp
        shiftjis    : shiftjis
        iso-2022-jp : iso-2022-jp
        euc-kr      : euc-kr
        iso-2022-kr : iso-2022-kr
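
Because several related charsets share one return value, it can help to map a charset name obtained elsewhere (for example, from an HTML header) to the value the detector would report before comparing the two. The sketch below transcribes the list above into a lookup table. return_value_for is a hypothetical helper, not part of this module, and reading the parenthesised names (gb2312, gbk) as further aliases is an assumption.

```perl
use strict;
use warnings;

# Return values and their aliases, transcribed from the list above.
my %ALIASES = (
    'ascii'       => ['ascii'],
    'iso-8859-1'  => ['iso-8859-1'],
    'utf8'        => ['utf8', 'utf-8-strict'],
    'utf16'       => ['utf16'],
    'cp936'       => ['euc-cn', 'gb2312', 'cp936', 'gbk', 'gb18030'],
    'big5-eten'   => ['big5-eten'],
    'euc-jp'      => ['euc-jp'],
    'shiftjis'    => ['shiftjis'],
    'iso-2022-jp' => ['iso-2022-jp'],
    'euc-kr'      => ['euc-kr'],
    'iso-2022-kr' => ['iso-2022-kr'],
);

# Invert the table: alias name => return value.
my %RETURN_VALUE_FOR;
for my $name (keys %ALIASES) {
    $RETURN_VALUE_FOR{$_} = $name for @{ $ALIASES{$name} };
}

# Hypothetical helper: map a charset name to the detector's return value,
# or undef if the name is not in the list. Lookup is case-insensitive.
sub return_value_for {
    my ($name) = @_;
    return undef unless defined $name;
    return $RETURN_VALUE_FOR{ lc $name };
}
```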



COPYRIGHT

The CharsetDetector module is Copyright (c) 2003-2008 QIAN YU. All rights reserved.

You may distribute under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl README file.


perl v5.20.3 ENCODE::DETECT::CJK (3) 2008-12-05
