Manual Reference Pages  -  ENCODE::JP (3)

NAME

Encode::JP - Japanese Encodings

SYNOPSIS



    use Encode qw/encode decode/;
    $euc_jp = encode("euc-jp", $utf8);   # loads Encode::JP implicitly
    $utf8   = decode("euc-jp", $euc_jp); # ditto



ABSTRACT

This module implements Japanese charset encodings. Encodings supported are as follows.



  Canonical   Alias             Description
  --------------------------------------------------------------------
  euc-jp      /\beuc.*jp$/i     EUC (Extended Unix Code)
              /\bjp.*euc/i
              /\bujis$/i
  shiftjis    /\bshift.*jis$/i  Shift JIS (aka MS Kanji)
              /\bsjis$/i
  7bit-jis    /\bjis$/i         7bit JIS
  iso-2022-jp                   ISO-2022-JP                  [RFC1468]
                                = 7bit JIS with all Halfwidth Kana
                                  converted to Fullwidth
  iso-2022-jp-1                 ISO-2022-JP-1                [RFC2237]
                                = ISO-2022-JP with JIS X 0212-1990
                                  support.  See below
  MacJapanese                   Shift JIS + Apple vendor mappings
  cp932       /\bwindows-31j$/i Code Page 932
                                = Shift JIS + MS/IBM vendor mappings
  jis0201-raw                   JIS0201, raw format
  jis0208-raw                   JIS0208, raw format
  jis0212-raw                   JIS0212, raw format
  --------------------------------------------------------------------
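The alias patterns in the table can be checked with Encode::resolve_alias, which returns the canonical name for a given encoding name. The following is a minimal sketch; the sample names in @names are illustrative assumptions, not an exhaustive list.

```perl
use strict;
use warnings;
use Encode ();

# Sample names as they might appear in mail headers or config files
# (chosen for illustration only).
my @names = ('ujis', 'sjis', 'Shift_JIS', 'jis', 'Windows-31J');

my %canonical;
for my $name (@names) {
    # resolve_alias returns the canonical name, or a false value
    # if the name is not recognized.
    $canonical{$name} = Encode::resolve_alias($name);
    printf "%-12s => %s\n", $name, $canonical{$name} || '(unknown)';
}
```

Note that "sjis" and "Shift_JIS" do not match /\bjis$/i, because there is no word boundary immediately before "jis" in either name.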



DESCRIPTION

To find out how to use this module in detail, see Encode.

Note on ISO-2022-JP(-1)?

ISO-2022-JP-1 (RFC2237) is a superset of ISO-2022-JP (RFC1468) which adds support for JIS X 0212-1990. That means you can use either name to decode an ISO-2022-JP stream to utf8, but the two names are not interchangeable when encoding.



  $utf8 = decode('iso-2022-jp-1', $stream);



and



  $utf8 = decode('iso-2022-jp',   $stream);



yield the same result but



  $with_0212 = encode('iso-2022-jp-1', $utf8);



is now different from



  $without_0212 = encode('iso-2022-jp', $utf8);



In the latter case, characters that map to 0212 are first converted to U+3013 (0xA2AE in EUC-JP; a white square also known as 'Tofu' or 'geta mark') and then fed to the encoding engine. U+FFFD is not used, in order to preserve text layout as much as possible.
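The difference between the two names can be observed with a short script. This is a sketch under stated assumptions: the sample strings are chosen for illustration, and U+4E02 is used as an example of a character that is in JIS X 0212-1990 but not in JIS X 0208. The exact substitute emitted by the iso-2022-jp encoder may vary between Encode versions, so the script only compares results and does not print expected bytes.

```perl
use strict;
use warnings;
use Encode qw/encode decode/;

# Three kanji that are all in JIS X 0208 (illustrative sample).
my $plain  = "\x{65E5}\x{672C}\x{8A9E}";
my $stream = encode('iso-2022-jp', $plain);

# Decoding: both names give the same result for a plain ISO-2022-JP stream.
my $same_decode =
    decode('iso-2022-jp', $stream) eq decode('iso-2022-jp-1', $stream);

# Encoding: U+4E02 is assumed here to be a JIS X 0212-only character.
my $text         = "\x{4E02}";
my $with_0212    = encode('iso-2022-jp-1', $text);
my $without_0212 = encode('iso-2022-jp',   $text);
my $same_encode  = $with_0212 eq $without_0212;

print "decoders agree on plain stream: ", ($same_decode ? "yes" : "no"), "\n";
print "encoders agree on 0212 text:    ", ($same_encode ? "yes" : "no"), "\n";
```

If the output must be readable by ISO-2022-JP-only consumers, encoding with iso-2022-jp and accepting the substitution is the conservative choice; iso-2022-jp-1 preserves the character but requires a decoder that understands the JIS X 0212 escape sequence.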

BUGS

The ASCII region (0x00-0x7f) is preserved for all encodings, even though this conflicts with mappings by the Unicode Consortium.
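As an illustration of this behavior, consider the bytes 0x5C and 0x7E in Shift JIS. The sketch below assumes these are the two positions where JIS X 0201 Roman differs from ASCII (yen sign and overline), and shows that this module treats them as plain ASCII in both directions. The sample bytes are chosen for illustration only.

```perl
use strict;
use warnings;
use Encode qw/encode decode/;

# 0x5C and 0x7E: backslash and tilde in ASCII.
my $octets = "\x5C\x7E";

# Decoding keeps them as U+005C and U+007E.
my $chars = decode('shiftjis', $octets);
printf "decoded: %s\n", join ' ', map { sprintf 'U+%04X', ord } split //, $chars;

# Encoding backslash and tilde yields the same two bytes again.
my $back = encode('shiftjis', "\\~");
printf "encoded: %s\n", join ' ', map { sprintf '0x%02X', ord } split //, $back;
```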

SEE ALSO

Encode


perl v5.22.1 ENCODE::JP (3) 2015-10-17
