groff_char - groff character names
This manual page lists the standard groff
characters. The output characters in this document will look different
depending on which output device was chosen (with option -T of the man(1)
program or the roff formatter). Only the characters that are
available for the device that is being used to print or view this manual page
will be displayed.
In the current version, groff provides only 8-bit characters for direct
input and named characters for further glyphs. On ASCII platforms, character
codes in the range 0 to 127 (decimal) represent the usual 7-bit ASCII
characters, while codes between 128 and 255 are interpreted as the
corresponding characters in the Latin-1 (ISO-8859-1) code set.
On EBCDIC platforms, only the code page cp1047 is supported (which
contains the same characters as Latin-1). It is rather straightforward (for
the experienced user) to set up other 8-bit encodings like Latin-2; since
groff will use Unicode in the next major version, no additional
encodings are provided.
All roff systems provide the concept of named characters. In traditional roff
systems, only names of length 2 were used, while groff also provides
support for longer names. It is strongly suggested that only named characters
are used for all characters outside of the 7-bit ASCII range.
Some of the predefined groff escape sequences (with names of length 1)
also produce single characters; these exist for historical reasons or are
printable versions of syntactical characters. Examples include \\ and \e.
In groff, all of these different types of characters can be tested positively
with the .if c conditional.
In this section, the characters in groff are specified in tabular form. The
meaning of the columns is as follows.
- Output
- shows how the character is printed for the current device; although this
can have quite a different shape on other devices, it always represents
the same glyph.
- Input name
- specifies how the character is input either directly by a key on the
keyboard, or by a groff escape sequence.
- Input code
- applies to characters which can be input with a single character, and
gives the ISO Latin-1 decimal code of that input character. Note that this
code is identical to the character's Unicode code point, because Latin-1
coincides with the first 256 Unicode characters (including 7-bit ASCII in
the range 0 to 127).
- PostScript name
- gives the usual PostScript name of the output character.
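The relation between the input code and Unicode can be checked with a short
Python sketch. Python is not part of groff and is used here only for
illustration; the function name latin1_to_unicode is hypothetical.

```python
# Illustrative sketch: Latin-1 input codes coincide with Unicode code points 0-255.
def latin1_to_unicode(code: int) -> str:
    """Return the Unicode character for a Latin-1 decimal input code."""
    if not 0 <= code <= 255:
        raise ValueError("Latin-1 input codes lie between 0 and 255")
    # Decoding a single byte as Latin-1 yields the code point of the same number.
    return bytes([code]).decode("latin-1")


# Every Latin-1 byte decodes to the code point with the same numeric value.
assert all(ord(latin1_to_unicode(c)) == c for c in range(256))
print(latin1_to_unicode(233))  # é
```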
These are the basic characters having 7-bit ASCII code values. They are
identical to the first 128 characters of the character standards ISO-8859-1
(Latin-1) and Unicode (range C0 Controls and Basic Latin). To save
space, not every code has an entry in the table below, because the following
code ranges are well known.
- 0-31
- Control characters (print as themselves).
- 48-57
- Decimal digits 0 to 9 (print as themselves).
- 65-90
- Upper case letters A-Z (print as themselves).
- 97-122
- Lower case letters a-z (print as themselves).
- 127
- Control character (prints as itself).
The remaining ranges constitute the printable, non-alphanumeric ASCII
characters; only these are listed below. As can be seen in the table below,
most of these characters print as themselves; the only exceptions are the
following characters:
- the ISO Latin-1 `Grave Accent' (code 96) prints as `, a left single
quotation mark;
- the ISO Latin-1 `Apostrophe' (code 39) prints as ', a right single
quotation mark; the corresponding ISO Latin-1 characters can be obtained
with \` and \(aq.
- the ISO Latin-1 `Hyphen, Minus Sign' (code 45) prints as a hyphen;
a minus sign can be obtained with \-.
- the ISO Latin-1 `Tilde' (code 126); a larger glyph can be obtained
with \(ti.
- the ISO Latin-1 `Circumflex Accent' (code 94); a larger glyph can
be obtained with \(ha.
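The partition of the 7-bit range can be sketched in Python as follows. This
is only an illustration, not groff code: the names classify_ascii and
ALTERNATIVES are hypothetical, and treating code 32 (space) as punctuation is
an assumption of the sketch.

```python
# Sketch only: partition the 7-bit ASCII codes into the well-known ranges.
def classify_ascii(code: int) -> str:
    """Return a coarse category for a 7-bit ASCII code."""
    if not 0 <= code <= 127:
        raise ValueError("not a 7-bit ASCII code")
    if code <= 31 or code == 127:
        return "control"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    return "punctuation"  # printable, non-alphanumeric (includes space here)


# groff escape sequences giving the alternative glyph for the five exceptions.
ALTERNATIVES = {
    96: r"\`",     # grave accent instead of a left single quotation mark
    39: r"\(aq",   # apostrophe instead of a right single quotation mark
    45: r"\-",     # minus sign instead of a hyphen
    126: r"\(ti",  # larger tilde
    94: r"\(ha",   # larger circumflex
}

# All five exceptional codes fall into the punctuation category.
assert all(classify_ascii(c) == "punctuation" for c in ALTERNATIVES)
```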
Output   Input name   Input code   PostScript name   Notes
These characters have character codes between 128 and 255. They are
interpreted as characters according to the Latin-1 (ISO-8859-1)
code set, being identical to the Unicode range C1 Controls and Latin-1
Supplement.
- 128-159
- the C1 Controls; they print as themselves, but the effect is mostly
undefined.
- 160
- the ISO Latin-1 no-break space is mapped to
`\ ', the escaped space character.
- 173
- the soft hyphen control character (prints as itself). groff never uses this
character for output (thus it is omitted in the table below); the input
character 173 is mapped onto \%.
The remaining ranges (161-172, 174-255), called the Latin-1 Supplement in
Unicode, are printable characters that print as themselves. Although they can
be specified directly with the keyboard on systems with a Latin-1 code page,
it is better to use their named character equivalents; see the next section.
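The treatment of the upper half of the code range can be summarized in a
small Python sketch. It only restates the rules given above; the function
name latin1_input_mapping and its return values are hypothetical and do not
correspond to any groff interface.

```python
# Sketch only: how input codes 128-255 are treated according to the text above.
def latin1_input_mapping(code: int) -> str:
    """Describe what an input code in the range 128-255 is mapped to."""
    if not 128 <= code <= 255:
        raise ValueError("expected a code between 128 and 255")
    if code <= 159:
        return "C1 control"  # prints as itself
    if code == 160:
        return r"\ "         # no-break space becomes the escaped space
    if code == 173:
        return r"\%"         # soft hyphen becomes the hyphenation escape
    # Codes 161-172 and 174-255 are printable and print as themselves.
    return bytes([code]).decode("latin-1")


printable = [c for c in range(128, 256) if len(latin1_input_mapping(c)) == 1]
print(len(printable))  # 94
```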
Output   Input name   Input code   PostScript name   Notes
The named character idiom is the standard way to specify special characters in
roff systems. Named characters can be embedded into the document text by using
escape sequences; groff(7) describes how these escape sequences look. The
character names can consist of quite arbitrary characters from the ASCII or
Latin-1 code set, not only alphanumeric characters. Here are some examples:
- named character having the name c, which consists of a single
character (length 1).
- named character having the 2-character name ch.
- named character having the name char_name (having length 1, 2, 3,
...).
In groff, each 8-bit input character can also be referred to by the construct
\[charn], where n is the decimal code of the
character, a number between 0 and 255 without leading zeros. They are
mapped onto glyph entities using the .trin
request. Moreover, new
character names can be created by the .char request.
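As an illustration of the input syntax, the following Python sketch builds
the escape sequences for named characters. The helper names named_char and
char_code are hypothetical. The sketch assumes the traditional \(xx form for
2-character names and the groff bracket form \[name] for every other length;
that split is a choice made here, not a groff requirement.

```python
# Sketch only: build groff escape sequences for named characters.
def named_char(name: str) -> str:
    """Return an escape sequence referring to the named character `name`."""
    if not name:
        raise ValueError("a character name must not be empty")
    if len(name) == 2:
        return "\\(" + name        # traditional 2-character form
    return "\\[" + name + "]"      # groff form, usable for any length


def char_code(n: int) -> str:
    """Return the \\[charN] construct for an 8-bit input character code."""
    if not 0 <= n <= 255:
        raise ValueError("the code must lie between 0 and 255")
    return "\\[char%d]" % n        # decimal, without leading zeros


print(named_char("aq"))         # \(aq
print(named_char("char_name"))  # \[char_name]
print(char_code(233))           # \[char233]
```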
Output   Input name   PostScript name   Notes
Copyright © 1989-2000, 2001, 2002 Free Software Foundation, Inc.
This document is distributed under the terms of the FDL (GNU Free Documentation
License) version 1.1 or later. You should have received a copy of the FDL on
your system; it is also available on-line.
This document is part of groff, the GNU roff distribution.
- groff(1)
- the GNU roff formatter.
- groff(7)
- a short reference of the groff formatting language.
- An extension to the troff character set for Europe
- E.G. Keizer, K.J. Simonsen, J. Akkerhuis; EUUG Newsletter, Volume 9, No. 2,
Summer 1989