m17nMtext(3m17n)            The m17n Library            m17nMtext(3m17n)
m17nMtext - M-text objects and API for them.
enum MTextFormat { MTEXT_FORMAT_US_ASCII, MTEXT_FORMAT_UTF_8,
    MTEXT_FORMAT_UTF_16LE, MTEXT_FORMAT_UTF_16BE, MTEXT_FORMAT_UTF_32LE,
    MTEXT_FORMAT_UTF_32BE, MTEXT_FORMAT_MAX }
    Enumeration for specifying the format of an M-text.
enum MTextLineBreakOption { MTEXT_LBO_SP_CM = 1, MTEXT_LBO_KOREAN_SP = 2,
    MTEXT_LBO_AI_AS_ID = 4, MTEXT_LBO_MAX }
    Enumeration for specifying a set of line breaking options.
int mtext_line_break (MText *mt, int pos, int option, int *after)
    Find a line break position of an M-text.
MText * mtext ()
    Allocate a new M-text.
MText * mtext_from_data (const void *data, int nitems, enum MTextFormat format)
    Allocate a new M-text with specified data.
void * mtext_data (MText *mt, enum MTextFormat *fmt, int *nunits,
    int *pos_idx, int *unit_idx)
    Get information about the text data in an M-text.
int mtext_len (MText *mt)
    Number of characters in an M-text.
int mtext_ref_char (MText *mt, int pos)
    Return the character at the specified position in an M-text.
int mtext_set_char (MText *mt, int pos, int c)
    Store a character into an M-text.
MText * mtext_cat_char (MText *mt, int c)
    Append a character to an M-text.
MText * mtext_dup (MText *mt)
    Create a copy of an M-text.
MText * mtext_cat (MText *mt1, MText *mt2)
    Append an M-text to another.
MText * mtext_ncat (MText *mt1, MText *mt2, int n)
    Append a part of an M-text to another.
MText * mtext_cpy (MText *mt1, MText *mt2)
    Copy an M-text to another.
MText * mtext_ncpy (MText *mt1, MText *mt2, int n)
    Copy the first n characters of an M-text to another.
MText * mtext_duplicate (MText *mt, int from, int to)
    Create a new M-text from a part of an existing M-text.
MText * mtext_copy (MText *mt1, int pos, MText *mt2, int from, int to)
    Copy characters in the specified range into an M-text.
int mtext_del (MText *mt, int from, int to)
    Delete characters in the specified range destructively.
int mtext_ins (MText *mt1, int pos, MText *mt2)
    Insert an M-text into another M-text.
int mtext_insert (MText *mt1, int pos, MText *mt2, int from, int to)
    Insert a sub-text of an M-text into another M-text.
int mtext_ins_char (MText *mt, int pos, int c, int n)
    Insert a character into an M-text.
int mtext_replace (MText *mt1, int from1, int to1, MText *mt2, int from2,
    int to2)
    Replace a sub-text of an M-text with another.
int mtext_character (MText *mt, int from, int to, int c)
    Search for a character in an M-text.
int mtext_chr (MText *mt, int c)
    Return the position of the first occurrence of a character in an M-text.
int mtext_rchr (MText *mt, int c)
    Return the position of the last occurrence of a character in an M-text.
int mtext_cmp (MText *mt1, MText *mt2)
    Compare two M-texts character-by-character.
int mtext_ncmp (MText *mt1, MText *mt2, int n)
    Compare initial parts of two M-texts character-by-character.
int mtext_compare (MText *mt1, int from1, int to1, MText *mt2, int from2,
    int to2)
    Compare specified regions of two M-texts.
int mtext_spn (MText *mt, MText *accept)
    Search an M-text for a set of characters.
int mtext_cspn (MText *mt, MText *reject)
    Search an M-text for the complement of a set of characters.
int mtext_pbrk (MText *mt, MText *accept)
    Search an M-text for any of a set of characters.
MText * mtext_tok (MText *mt, MText *delim, int *pos)
    Look for a token in an M-text.
int mtext_text (MText *mt1, int pos, MText *mt2)
    Locate an M-text in another.
int mtext_search (MText *mt1, int from, int to, MText *mt2)
    Locate an M-text in a specific range of another.
int mtext_casecmp (MText *mt1, MText *mt2)
    Compare two M-texts ignoring case.
int mtext_ncasecmp (MText *mt1, MText *mt2, int n)
    Compare initial parts of two M-texts ignoring case.
int mtext_case_compare (MText *mt1, int from1, int to1, MText *mt2,
    int from2, int to2)
    Compare specified regions of two M-texts ignoring case.
int mtext_lowercase (MText *mt)
    Lowercase an M-text.
int mtext_titlecase (MText *mt)
    Titlecase an M-text.
int mtext_uppercase (MText *mt)
    Uppercase an M-text.
enum MTextFormat MTEXT_FORMAT_UTF_16
    Variable of value MTEXT_FORMAT_UTF_16LE or MTEXT_FORMAT_UTF_16BE.
const int MTEXT_FORMAT_UTF_32
    Variable of value MTEXT_FORMAT_UTF_32LE or MTEXT_FORMAT_UTF_32BE.
M-text objects and API for them.
In the m17n library, text is represented as an object called an
M-text rather than as a C-string (char * or unsigned char *).
An M-text is a sequence of zero or more characters, and can be created from
various character sources, e.g. C-strings, files, character codes, etc.
M-texts are more useful than C-strings in the following
respects.
• M-texts can handle a mixture of characters of various scripts, including
  all Unicode characters and more. This is an indispensable facility when
  handling multilingual text.
• Each character in an M-text can have properties called text
  properties. Text properties store various kinds of information
  attached to parts of an M-text to provide application programs with a
  unified view of that information. Because rich information can be stored
  in M-texts in the form of text properties, functions in application
  programs can be simple.
In addition, the library provides many functions to manipulate an
M-text in much the same way as a C-string.
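
One practical consequence of treating text as a sequence of characters is that mtext_len() reports a number of characters, whereas strlen() on a C-string reports bytes. The sketch below illustrates that difference for UTF-8 data only. It does not use the m17n library at all: utf8_char_count() is a hypothetical helper written for this illustration, and it assumes its input is valid UTF-8.

```c
#include <string.h>

/* Hypothetical helper, NOT part of the m17n API: count the characters
   in a NUL-terminated UTF-8 C-string.  Every byte that is not a
   continuation byte (bit pattern 10xxxxxx) starts a new character.
   Assumes the input is valid UTF-8. */
int utf8_char_count(const char *s)
{
    int n = 0;

    for (; *s != '\0'; s++) {
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For a string that contains non-ASCII characters, strlen() returns a larger value than utf8_char_count(), because each such character occupies more than one byte.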
enum MTextLineBreakOption
Enumeration for specifying a set of line breaking options. The enum
MTextLineBreakOption controls the line breaking algorithm of the
function mtext_line_break(). Pass the logical OR of the desired members
in the argument option.
Enumerator
- MTEXT_LBO_SP_CM
- Specify the legacy support for the space character as a base for combining
marks. See section 8.3 of UAX#14.
- MTEXT_LBO_KOREAN_SP
- Specify to use space characters for line breaking Korean text.
- MTEXT_LBO_AI_AS_ID
- Specify to treat characters of ambiguous line-breaking class as of
ideographic line-breaking class.
- MTEXT_LBO_MAX
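
Because the members have the values 1, 2 and 4, they can be combined with a bitwise OR to form the option argument of mtext_line_break(). The following is a minimal sketch of that combination, not library code. So that it compiles without the m17n headers, it declares a local copy of the enumeration exactly as listed in this page; real code would use the library's own declaration. The helpers make_line_break_option() and option_has() are hypothetical names introduced for illustration.

```c
/* Local copy of the enumeration as listed in this page, declared here
   only so that the sketch is self-contained.  Real code would take the
   declaration from the m17n headers instead. */
enum MTextLineBreakOption {
    MTEXT_LBO_SP_CM = 1,
    MTEXT_LBO_KOREAN_SP = 2,
    MTEXT_LBO_AI_AS_ID = 4,
    MTEXT_LBO_MAX
};

/* Hypothetical helper: build a value suitable for the `option'
   argument by OR-ing together the members that are requested. */
int make_line_break_option(int sp_cm, int korean_sp, int ai_as_id)
{
    int option = 0;

    if (sp_cm)
        option |= MTEXT_LBO_SP_CM;
    if (korean_sp)
        option |= MTEXT_LBO_KOREAN_SP;
    if (ai_as_id)
        option |= MTEXT_LBO_AI_AS_ID;
    return option;
}

/* Hypothetical helper: test whether a member is set in `option'. */
int option_has(int option, enum MTextLineBreakOption flag)
{
    return (option & flag) != 0;
}
```

With these definitions, make_line_break_option(1, 1, 0) yields MTEXT_LBO_SP_CM | MTEXT_LBO_KOREAN_SP, and a value of 0 selects none of the options.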
enum MTextFormat MTEXT_FORMAT_UTF_16 [extern]
Variable of value MTEXT_FORMAT_UTF_16LE or MTEXT_FORMAT_UTF_16BE.
The global variable MTEXT_FORMAT_UTF_16 is initialized to
MTEXT_FORMAT_UTF_16LE on a 'Little Endian' system (storing words with
the least significant byte first), and to MTEXT_FORMAT_UTF_16BE on a
'Big Endian' system (storing words with the most significant byte first).
const int MTEXT_FORMAT_UTF_32 [extern]
Variable of value MTEXT_FORMAT_UTF_32LE or MTEXT_FORMAT_UTF_32BE.
The global variable MTEXT_FORMAT_UTF_32 is initialized to
MTEXT_FORMAT_UTF_32LE on a 'Little Endian' system (storing words with
the least significant byte first), and to MTEXT_FORMAT_UTF_32BE on a
'Big Endian' system (storing words with the most significant byte first).
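
The byte-order rule described for MTEXT_FORMAT_UTF_16 and MTEXT_FORMAT_UTF_32 can be sketched as a run-time check. This is only an illustration of the rule, not the way the library initializes these variables. The enumeration is a local copy of the one listed in this page, and is_little_endian(), native_utf16_format() and native_utf32_format() are hypothetical names introduced here.

```c
#include <stdint.h>
#include <string.h>

/* Local copy of the enumeration as listed in this page, declared here
   only so that the sketch is self-contained. */
enum MTextFormat {
    MTEXT_FORMAT_US_ASCII,
    MTEXT_FORMAT_UTF_8,
    MTEXT_FORMAT_UTF_16LE,
    MTEXT_FORMAT_UTF_16BE,
    MTEXT_FORMAT_UTF_32LE,
    MTEXT_FORMAT_UTF_32BE,
    MTEXT_FORMAT_MAX
};

/* Hypothetical helper: return 1 if the least significant byte of a
   word is stored first (little endian), 0 otherwise. */
int is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* inspect the first byte in memory */
    return first == 1;
}

/* Hypothetical helpers: choose the format that matches the byte order
   of the running system, following the rule described above. */
enum MTextFormat native_utf16_format(void)
{
    return is_little_endian() ? MTEXT_FORMAT_UTF_16LE
                              : MTEXT_FORMAT_UTF_16BE;
}

enum MTextFormat native_utf32_format(void)
{
    return is_little_endian() ? MTEXT_FORMAT_UTF_32LE
                              : MTEXT_FORMAT_UTF_32BE;
}
```

In a program that uses the library, the global variables already hold the matching values, so a check like this is needed only when the variables are not available.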
SEE ALSO
The symbol whose name is 'language'.
Generated automatically by Doxygen for The m17n Library from the
source code.
Copyright (C) 2001 Information-technology Promotion Agency (IPA)
Copyright (C) 2001-2011 National Institute of Advanced Industrial Science and
Technology (AIST)
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License
<http://www.gnu.org/licenses/fdl.html>.