GSP
Quick Navigator

Search Site

Unix VPS
A - Starter
B - Basic
C - Preferred
D - Commercial
MPS - Dedicated
Previous VPSs
* Sign Up! *

Support
Contact Us
Online Help
Handbooks
Domain Status
Man Pages

FAQ
Virtual Servers
Pricing
Billing
Technical

Network
Facilities
Connectivity
Topology Map

Miscellaneous
Server Agreement
Year 2038
Credits
 

USA Flag

 

 

Man Pages
Encode::IMAPUTF7(3) User Contributed Perl Documentation Encode::IMAPUTF7(3)

Encode::IMAPUTF7 - modification of UTF-7 encoding for IMAP

  use Encode qw/encode decode/;

  print encode('IMAP-UTF-7', 'Répertoire');
  print decode('IMAP-UTF-7', R&AOk-pertoire');

IMAP mailbox names are encoded in a modified UTF7 when names contains international characters outside of the printable ASCII range. The modified UTF-7 encoding is defined in RFC2060 (section 5.1.3).

There is another CPAN module with same purpose, Unicode::IMAPUtf7. However, it works correctly only with strings, which encoded form does not contain plus sign. For example, the Cyrillic string \x{043f}\x{0440}\x{0435}\x{0434}\x{043b}\x{043e}\x{0433} is represented in UTF-7 as +BD8EQAQ1BDQEOwQ+BDM- Note the second plus sign 4 characters before the end. Unicode::IMAPUtf7 encodes the above string as +BD8EQAQ1BDQEOwQ&BDM- which is not valid modified UTF-7 (the ampersand and the plus are swapped). The problem is solved by the current module, which is slightly modified Encode::Unicode::UTF7 and has nothing common with Unicode::IMAPUtf7.

By convention, international mailbox names are specified using a modified version of the UTF-7 encoding described in [UTF-7]. The purpose of these modifications is to correct the following problems with UTF-7:

1) UTF-7 uses the "+" character for shifting; this conflicts with the common use of "+" in mailbox names, in particular USENET newsgroup names.

2) UTF-7's encoding is BASE64 which uses the "/" character; this conflicts with the use of "/" as a popular hierarchy delimiter.

3) UTF-7 prohibits the unencoded usage of "\"; this conflicts with the use of "\" as a popular hierarchy delimiter.

4) UTF-7 prohibits the unencoded usage of "~"; this conflicts with the use of "~" in some servers as a home directory indicator.

5) UTF-7 permits multiple alternate forms to represent the same string; in particular, printable US-ASCII chararacters can be represented in encoded form.

In modified UTF-7, printable US-ASCII characters except for "&" represent themselves; that is, characters with octet values 0x20-0x25 and 0x27-0x7e. The character "&" (0x26) is represented by the two- octet sequence "&-".

All other characters (octet values 0x00-0x1f, 0x7f-0xff, and all Unicode 16-bit octets) are represented in modified BASE64, with a further modification from [UTF-7] that "," is used instead of "/". Modified BASE64 MUST NOT be used to represent any printing US-ASCII character which can represent itself.

"&" is used to shift to modified BASE64 and "-" to shift back to US- ASCII. All names start in US-ASCII, and MUST end in US-ASCII (that is, a name that ends with a Unicode 16-bit octet MUST end with a "- ").

For example, here is a mailbox name which mixes English, Japanese, and Chinese text: ~peter/mail/&ZeVnLIqe-/&U,BTFw-

Please report any requests, suggestions or bugs via the RT bug-tracking system at http://rt.cpan.org/ or email to bug-Encode-IMAPUTF7@rt.cpan.org.

http://rt.cpan.org/NoAuth/Bugs.html?Dist=Encode-IMAPUTF7 is the RT queue for Encode::IMAPUTF7. Please check to see if your bug has already been reported.

Copyright 2005 Sava Chankov

Sava Chankov, sava@cpan.org

This software may be freely copied and distributed under the same terms and conditions as Perl.

Peter Makholm <peter@makholm.net>, current maintainer

Sava Chankov <sava@cpan.org>, original author

perl(1), Encode.

Hey! The above document had some coding errors, which are explained below:
Around line 90:
Non-ASCII character seen before =encoding in ''Répertoire');'. Assuming UTF-8
2010-06-18 perl v5.32.1

Search for    or go to Top of page |  Section 3 |  Main Index

Powered by GSP Visit the GSP FreeBSD Man Page Interface.
Output converted with ManDoc.