|lowercase||Map uppercase letters to lowercase.|
|width||Decompose full-width and half-width characters.|
|nfc||Unicode Normalization Form C.|
|nfkc||Unicode Normalization Form KC.|
|delimitermap||Map specific characters to . (U+002E; FULL STOP).|
|tr46-map||UTS #46 non-transitional mapping.|
|tr46-map-deviation||UTS #46 transitional mapping.|
|tr46-check||UTS #46 non-transitional validation.|
|tr46-check-deviation||UTS #46 transitional validation.|
|language-local||Language based local mapping.|
|tld-local||TLD based local mapping.|
|rfc5895||Apply lowercase, width, nfc and delimitermap in that order.|
|resman-idna2008-mappings-01||Same as rfc5895.|
|tr46-processing||Apply tr46-map, nfc and tr46-check in that order.|
|tr46-processing-deviation||Apply tr46-map-deviation, nfc and tr46-check-deviation in that order.|
The procedures are executed in the order listed in the entry, and the same procedure can be specified more than once. For example, if map nfc language-local nfc is specified, idnkit performs Unicode Normalization Form C, then language based local mapping, and then NFC again.
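The left-to-right ordering can be sketched in Python as below. This is only an illustration, not idnkit's implementation: the PROCEDURES table and the apply_map helper are hypothetical names, the callables merely approximate the procedures they are named after, and delimitermap handles only the default delimiter U+3002.

```python
import unicodedata

# Hypothetical table: rough stand-ins for a few of the named procedures.
PROCEDURES = {
    "lowercase": str.lower,
    "nfc": lambda s: unicodedata.normalize("NFC", s),
    "nfkc": lambda s: unicodedata.normalize("NFKC", s),
    # Assumption: only the default delimiter U+3002 is mapped to U+002E.
    "delimitermap": lambda s: s.replace("\u3002", "."),
}


def apply_map(entry: str, name: str) -> str:
    """Apply the procedures named in a 'map ...' entry, left to right."""
    keyword, *procedures = entry.split()
    if keyword != "map":
        raise ValueError("not a map entry: " + entry)
    for procedure in procedures:
        # A procedure listed twice simply runs twice.
        name = PROCEDURES[procedure](name)
    return name


result = apply_map("map lowercase nfc delimitermap nfc",
                   "WWW\u3002Cafe\u0301\u3002JP")
print(result)
```

Because each procedure receives the output of the previous one, listing nfc a second time re-normalizes whatever the intermediate mapping produced.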
The entry can be specified only once. If the entry is not specified, the library assumes the following:
map rfc5895 language-local nfc
The delimiters entry specifies code points that the delimitermap procedure maps to . (U+002E; FULL STOP).
The mapping is applied only when delimitermap is specified in a map entry.
syntax) delimiters code-point ...
code-point is the Unicode code point of a delimiter, written as a hexadecimal integer (e.g. 3002) and optionally preceded by U+ (e.g. U+3002).
The entry can be specified only once. If the entry is not specified, the library assumes "3002" is specified.
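As a rough sketch of how such an entry could be interpreted, the following Python turns a delimiters line into a translation table. The helper names parse_code_point and delimiter_table are hypothetical, and the strictness of the parsing (only an upper-case U+ prefix is stripped) is an assumption rather than documented idnkit behavior.

```python
def parse_code_point(token: str) -> int:
    """Parse '3002' or 'U+3002' as a hexadecimal Unicode code point."""
    # Assumption: only the exact prefix 'U+' is recognized.
    text = token[2:] if token.startswith("U+") else token
    return int(text, 16)


def delimiter_table(entry: str) -> dict:
    """Build a str.translate table from a 'delimiters ...' entry."""
    keyword, *tokens = entry.split()
    if keyword != "delimiters":
        raise ValueError("not a delimiters entry: " + entry)
    # Every listed code point maps to U+002E FULL STOP.
    return {parse_code_point(token): "." for token in tokens}


table = delimiter_table("delimiters U+3002 ff0e ff61")
print("www\u3002example\uff0ejp".translate(table))
```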
The language-local entry specifies language based local mapping.
The mapping procedure is applied only when language-local is specified in a map entry.
syntax) language-local language map-file
If the current language matches language, the mapping specified by map-file is performed. Otherwise no mapping is performed.
language must be an ISO 639 language code. Both ISO 639-1 (e.g. en for English) and ISO 639-2 (e.g. eng) codes are recognized.
A local mapping with * as language is the default mapping. When the current language matches none of the languages in the language-local entries, the default mapping is applied.
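The selection rule, including the * fallback, might look like the following Python sketch. The function select_language_map is a hypothetical name, and the comparison is a plain string match, so it does not treat an ISO 639-1 code and its ISO 639-2 equivalent as the same language the way the library is described as doing.

```python
def select_language_map(entries, current_language):
    """Pick the map file for current_language.

    entries is a list of (language, map_file) pairs taken from
    language-local entries. Returns None when nothing applies.
    """
    default = None
    for language, map_file in entries:
        if language == current_language:
            return map_file
        if language == "*":
            # Remember the default mapping in case nothing else matches.
            default = map_file
    return default


entries = [
    ("ja", "/usr/local/share/idnkit/map/ja.map"),
    ("tr", "/usr/local/share/idnkit/map/tr.map"),
    ("*", "/usr/local/share/idnkit/map/default.map"),  # hypothetical path
]
print(select_language_map(entries, "tr"))
print(select_language_map(entries, "fr"))
```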
The tld-local entry specifies TLD (top level domain) based local mapping.
The mapping is applied only when tld-local is specified in a map entry.
syntax) tld-local tld map-file
If the TLD of a domain name matches tld, the mapping specified by map-file is performed on the domain name. Otherwise no mapping is performed.
If tld is *, the mapping is applied to domain names whose TLD does not match any tld specified in the tld-local entries. If tld is -, the mapping is applied to domain names without any dots.
For backward compatibility with idnkit version 1.0, the entry name local-map can be used instead of tld-local. The entry can be defined multiple times.
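A minimal Python sketch of the lookup rules for tld-local follows. The name select_tld_map is hypothetical, and several details are assumptions not stated above: TLDs are compared case-insensitively, a dotless name with no - entry gets no mapping, and trailing dots are not given special treatment.

```python
def select_tld_map(entries, domain):
    """Pick the map file for domain from (tld, map_file) pairs."""
    table = dict(entries)
    if "." not in domain:
        # Names without any dots are covered only by the '-' entry.
        return table.get("-")
    # Assumption: TLD comparison ignores case.
    tld = domain.rsplit(".", 1)[1].lower()
    # Fall back to the '*' entry when no tld matches.
    return table.get(tld, table.get("*"))


entries = [("jp", "jp.map"), ("*", "other.map"), ("-", "nodot.map")]
print(select_tld_map(entries, "www.example.jp"))
print(select_tld_map(entries, "www.example.com"))
print(select_tld_map(entries, "localhost"))
```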
Neither idn2.conf nor ~/.idn2rc has an entry to specify the local encoding, since it is determined from the application's current locale information. That is, each application can use a different local encoding.
Though idnkit tries hard to find out the local encoding, it sometimes fails. For example, some applications use a non-ASCII encoding but run in the C locale. In this case, you can specify the application's local encoding with the environment variable IDN_LOCAL_CODESET. Set the variable to the encoding name (or one of its aliases), and idnkit will use that encoding as the local one, regardless of the locale setting.
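The precedence described here (environment variable first, locale second) can be illustrated with a short Python sketch. The function local_codeset is a hypothetical name and the locale lookup uses Python's own locale module, so this mirrors the idea rather than idnkit's actual detection logic.

```python
import locale
import os


def local_codeset(environ=os.environ):
    """Return the local encoding name, preferring IDN_LOCAL_CODESET."""
    override = environ.get("IDN_LOCAL_CODESET")
    if override:
        # The variable wins regardless of the locale setting.
        return override
    # Otherwise fall back to what the current locale reports.
    return locale.getpreferredencoding(False)


print(local_codeset({"IDN_LOCAL_CODESET": "EUC-JP"}))
print(local_codeset({}))
```

Passing the environment as a parameter keeps the sketch easy to try without changing the real process environment.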
idnkit version 2 also supports UTS (Unicode Technical Standard) #46, but the support is limited, since the goal of idnkit version 2 is to support IDNA2008.
idnkit version 2 provides four mapping procedures for UTS #46:
tr46-map tr46-map-deviation tr46-check tr46-check-deviation
The input of each mapping procedure is a whole domain name, not a list of labels, and the domain name may contain A-labels. tr46-check and tr46-check-deviation themselves do not split the domain name into labels or convert the A-labels in it to U-labels. That is, idnkit cannot apply tr46-check or tr46-check-deviation to A-labels.
The following shows a sample configuration file.
#
# a sample configuration.
#

# The current language.
language ja

# Mapping procedures.
map lowercase width nfc delimitermap language-local nfc

# Register delimiters
delimiters 3002 ff0e ff61

# Register language-specific mappings for Japanese and Turkish.
language-local ja /usr/local/share/idnkit/map/ja.map
language-local tr /usr/local/share/idnkit/map/tr.map
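To make the line structure of such a file concrete, here is a small Python sketch that splits a configuration into (keyword, arguments) pairs. The parse_config function is hypothetical, and the rules it applies (one entry per line, whitespace-separated tokens, lines starting with # treated as comments) are assumptions read off the sample, not a statement of how idnkit parses the file.

```python
SAMPLE = """\
# The current language.
language ja

# Mapping procedures.
map lowercase width nfc delimitermap language-local nfc

# Register delimiters
delimiters 3002 ff0e ff61

language-local ja /usr/local/share/idnkit/map/ja.map
language-local tr /usr/local/share/idnkit/map/tr.map
"""


def parse_config(text):
    """Return a list of (keyword, [arguments]) for each entry line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' lines carry no entries.
        if not line or line.startswith("#"):
            continue
        keyword, *args = line.split()
        entries.append((keyword, args))
    return entries


for keyword, args in parse_config(SAMPLE):
    print(keyword, args)
```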
/usr/local/etc/idn2.conf.sample - sample configuration with comments
|IDN2.CONF (5)||Sep 21, 2012|