LOWDOWN(3)          FreeBSD Library Functions Manual          LOWDOWN(3)
lowdown — simple
markdown translator library
All lowdown functions use one or more of
the following structures.
The main structure for configuring parsing and output is
struct lowdown_opts. It has the following fields:
- enum lowdown_type type
- The output medium:
LOWDOWN_HTML
- HTML5
LOWDOWN_LATEX
- LaTeX
LOWDOWN_MAN
- roff
-m an macros
LOWDOWN_FODT
- “flat” OpenDocument
LOWDOWN_TERM
- ANSI-compatible UTF-8 terminal output
LOWDOWN_GEMINI
- Gemini “gemtext” format
LOWDOWN_NROFF
- roff
-m s macros
LOWDOWN_TREE
- syntax tree (debugging)
- unsigned int feat
- Parse-time features. This bit-field may have the following bits OR'd:
LOWDOWN_ATTRS
- Parse PHP extra link, header, and image attributes.
LOWDOWN_AUTOLINK
- Parse
http , https ,
ftp , mailto , and
relative links or link fragments.
LOWDOWN_CALLOUTS
- Parse MDN/GFM callouts (“admonitions”).
LOWDOWN_COMMONMARK
- Tighten input parsing to the CommonMark specification. This also uses
the first ordered list value instead of starting all lists at one.
This feature is
experimental
and
incomplete.
LOWDOWN_DEFLIST
- Parse PHP extra definition lists. This is currently constrained to
single-key lists.
LOWDOWN_FENCED
- Parse GFM fenced (language-specific) code blocks.
LOWDOWN_FOOTNOTES
- Parse MMD style footnotes. This only supports the referenced footnote
style, not the “inline” style.
LOWDOWN_HILITE
- Parse highlighted sequences. This is disabled by default because such
sequences may be erroneously interpreted as section headers.
LOWDOWN_IMG_EXT
- Deprecated. Use
LOWDOWN_ATTRS instead.
LOWDOWN_MANTITLE
- Recognise manpage titles in Pandoc metadata title lines. Only
applicable if
LOWDOWN_METADATA is also
provided. Manpage titles must begin with a non-empty title followed
by an open parenthesis, a digit or “n”, optional letters
after, then a closing parenthesis. This may be optionally followed by
a source and, if a vertical bar is detected, the content after it as the
volume. These are passed to the renderers as the
title, section, and
optionally source and
volume metadata key-value pairs. The original
title is not recoverable.
LOWDOWN_MATH
- Parse mathematics equations.
LOWDOWN_METADATA
- Parse in-document metadata.
LOWDOWN_NOCODEIND
- Do not parse indented content as code blocks.
LOWDOWN_NOINTEM
- Do not parse emphasis within words.
LOWDOWN_STRIKE
- Parse strikethrough sequences.
LOWDOWN_SUPER
- Parse super-scripts. This accepts foo^bar^ GFM super-scripts.
LOWDOWN_SUPER_SHORT
- If
LOWDOWN_SUPER is enabled, instead of the
GFM style, accept the “short” form of superscript. This
accepts foo^bar, which puts the parts following the caret until
whitespace in superscripts; or foo^(bar), which puts only the parts in
parentheses.
LOWDOWN_TABLES
- Parse GFM tables.
LOWDOWN_TASKLIST
- Parse GFM task list items.
- unsigned int oflags
- Output-time features. Bit values are specific to the
type and are not guaranteed to be globally unique.
For all types:
LOWDOWN_SMARTY
- Don't use smart typography formatting.
LOWDOWN_STANDALONE
- Emit a full document instead of a document fragment. This envelope is
largely populated from metadata if
LOWDOWN_METADATA was provided as an option or
as given in meta or
metaovr.
For LOWDOWN_HTML :
LOWDOWN_HTML_CALLOUT_MDN ,
LOWDOWN_HTML_CALLOUT_GFM
- Output MDN and/or GFM-style callout syntax.
LOWDOWN_HTML_ESCAPE
- If
LOWDOWN_HTML_SKIP_HTML has not been set,
escapes in-document HTML so that it is rendered as opaque text.
LOWDOWN_HTML_HARD_WRAP
- Retain line-breaks within paragraphs.
LOWDOWN_HTML_HEAD_IDS
- Have an identifier written with each header element consisting of an
HTML-escaped version of the header contents.
LOWDOWN_HTML_NUM_ENT
- Convert, when possible, HTML entities to their numeric form. If not
set, the entities are used as given in the input.
LOWDOWN_HTML_OWASP
- When escaping text, be extra paranoid in following the OWASP
suggestions for which characters to escape.
LOWDOWN_HTML_SKIP_HTML
- Do not render in-document HTML at all.
LOWDOWN_HTML_TITLEBLOCK
- Output a Pandoc-style title block. This is a
<header
id="title-block-header"> element right after the
opening <body> containing elements for
specified title, author(s), and date. These are
<h1> and
<p> elements, respectively, with classes
set to what's being output (title, etc.). At least one of these must
be specified for the title block to be output.
For LOWDOWN_GEMINI , there are several
flags for controlling link placement. By default, links (images,
autolinks, and links) are queued when specified in-line then emitted in
a block sequence after the nearest block node. (See
ABSTRACT SYNTAX
TREE.)
LOWDOWN_GEMINI_LINK_END
- Emit the queue of links at the end of the document instead of after
the nearest block node.
LOWDOWN_GEMINI_LINK_IN
- Render all links within the flow of text. This will cause breakage
with nested links, such as images within links, links in blockquotes,
etc. It should only be used in carefully crafted documents.
LOWDOWN_GEMINI_LINK_NOREF
- Do not format link labels. Takes precedence over
LOWDOWN_GEMINI_LINK_ROMAN .
LOWDOWN_GEMINI_LINK_ROMAN
- When formatting link labels, use lower-case Roman numerals instead of
the default lowercase hexavigesimal (i.e., “a”,
“b”, ..., “aa”, “ab”,
...).
LOWDOWN_GEMINI_METADATA
- Print metadata as the canonicalised key followed by a colon then the
value, each on one line (newlines replaced by spaces). The metadata
block is terminated by a double newline. If there is no metadata, this
does nothing.
There may only be one of
LOWDOWN_GEMINI_LINK_END or
LOWDOWN_GEMINI_LINK_IN . If both are specified,
the latter is unset.
For LOWDOWN_FODT :
LOWDOWN_ODT_SKIP_HTML
- Do not render in-document HTML at all. Text within HTML elements
remains.
For LOWDOWN_LATEX :
LOWDOWN_LATEX_NUMBERED
- Use the default numbering scheme for sections, subsections, etc. If
not specified, these are inhibited.
LOWDOWN_LATEX_SKIP_HTML
- Do not render in-document HTML at all. Text within HTML elements
remains.
For LOWDOWN_MAN and
LOWDOWN_NROFF :
LOWDOWN_NROFF_GROFF
- Use GNU extensions (i.e., for
groff(1))
when rendering output. The groff arguments must include
-m pdfmark for formatting
links with LOWDOWN_MAN or
-m spdf instead of
-m s for
LOWDOWN_NROFF . Applies to the
LOWDOWN_MAN and
LOWDOWN_NROFF output types.
LOWDOWN_NROFF_NOLINK
- Don't show links at all if they have embedded text. Applies to images
and regular links. Only in
LOWDOWN_MAN or when
LOWDOWN_NROFF_GROFF is not specified.
LOWDOWN_NROFF_NUMBERED
- Use numbered sections if
LOWDOWN_NROFF_GROFF
is not specified. Only applies to the
LOWDOWN_NROFF output type.
LOWDOWN_NROFF_SHORTLINK
- Render link URLs in short form. Applies to images, autolinks, and
regular links. Only in
LOWDOWN_MAN or when
LOWDOWN_NROFF_GROFF is not specified.
LOWDOWN_NROFF_SKIP_HTML
- Do not render in-document HTML at all. Text within HTML elements
remains.
For LOWDOWN_TERM :
LOWDOWN_TERM_ALL_META
- If
LOWDOWN_STANDALONE is specified, output all
metadata instead of just the title, author, and date.
LOWDOWN_TERM_NOANSI
- Don't apply ANSI style codes at all. This implies
LOWDOWN_TERM_NOCOLOUR .
LOWDOWN_TERM_NOCOLOUR
- Don't apply ANSI colour codes. This will still show underline, bold,
etc. This should not be used in difference mode, as the output will
make no sense.
LOWDOWN_TERM_NOLINK
- Don't show links at all. Applies to images and regular links:
autolinks are still shown. This may be combined with
LOWDOWN_TERM_SHORTLINK to also shorten
autolinks.
LOWDOWN_TERM_NORELLINK
- Like
LOWDOWN_TERM_NOLINK , but only for
relative links.
LOWDOWN_TERM_SHORTLINK
- Render link URLs in short form. Applies to images, autolinks, and
regular links. This may be combined with
LOWDOWN_TERM_NOLINK to only show shortened
autolinks.
- size_t maxdepth
- The maximum parse depth before the parser exits. Most documents will have
a parse depth in the single digits.
- struct lowdown_opts_nroff nroff
- If type is
LOWDOWN_MAN or
LOWDOWN_NROFF , this contains constant-width font
variants: const char *cr for roman constant-width,
const char *cb for bold, const char
*ci for italic, and const char *cbi for
bold-italic. If any of these are NULL , they
default to their constant-width variants.
- struct lowdown_opts_odt odt
- If type is
LOWDOWN_FODT ,
this contains const char *sty, which is either
NULL or the OpenDocument styles used when creating
standalone documents. If NULL , the default styles
are used.
- struct lowdown_opts_term term
- If type is
LOWDOWN_TERM ,
this contains size_t cols, the non-zero number of
columns in the terminal; size_t width, the requested
content width or zero for auto; size_t hmargin,
left-margin width; size_t hpadding, left-padding
width eating into width; size_t
vmargin, the vertical margin in lines; and int
centre if the content should be centred
(hmargin is ignored).
- char **meta
- An array of metadata key-value pairs or
NULL . Each
pair must appear as if provided on one line (or multiple lines) of the
input, including the terminating newline character. If not consisting of a
valid pair (e.g., no newline, no colon), then it is ignored. When
processed, these values are overridden by those in the document (if
LOWDOWN_METADATA is specified) or by those in
metaovr.
- size_t metasz
- Number of pairs in meta.
- char **metaovr
- See meta. The difference is that
metaovr is applied after meta
and in-document metadata, so it overrides prior values.
- size_t metaovrsz
- Number of pairs in metaovr.
- const char *templ
- If
LOWDOWN_STANDALONE is specified, this is set to
the external template file or NULL to use internal
templating. This is only valid for output media supporting external
templates; otherwise, it may be ignored.
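The shape required of meta and metaovr entries can be checked before they are handed to the library. The following is a minimal sketch, not part of lowdown: meta_entry_looks_valid() and meta_count_valid() are hypothetical helpers that test only the two conditions named above (a colon and a terminating newline). The library's own validation may be stricter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a lowdown function): return non-zero if the
 * entry contains a colon and ends with a newline, the two conditions
 * the manual names for a metadata pair that is not ignored.
 */
int
meta_entry_looks_valid(const char *entry)
{
	size_t len;

	assert(entry != NULL);
	len = strlen(entry);
	if (len == 0 || entry[len - 1] != '\n')
		return 0;	/* no terminating newline */
	if (strchr(entry, ':') == NULL)
		return 0;	/* no colon between key and value */
	return 1;
}

/* Count the entries of a meta-style array that pass the check above. */
size_t
meta_count_valid(char *const *meta, size_t metasz)
{
	size_t i, n = 0;

	for (i = 0; i < metasz; i++)
		if (meta_entry_looks_valid(meta[i]))
			n++;
	return n;
}
```

A caller might run meta_count_valid() over the array intended for meta and compare the result with metasz to spot entries that would be silently dropped.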
Parsed metadata is held in key-value struct
lowdown_meta pairs, or collectively as struct
lowdown_metaq, if LOWDOWN_METADATA is set in
feat. The former structure consists of the following
fields:
- char *key
- The metadata key in its canonical form: lowercase alphanumerics, hyphen,
and underscore. Whitespace is removed and other characters replaced by a
question mark.
- char *value
- The metadata value. This may be an empty string.
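The canonical key form described above can be sketched as follows. This is an illustration, not lowdown's implementation: canonical_key() is a hypothetical name, and folding upper-case letters to lower case is an assumption drawn from the statement that canonical keys are lowercase.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a lowdown function): write the canonical
 * form of "in" to "out", which holds "outsz" bytes including the NUL.
 * Returns the number of characters written, excluding the NUL.
 */
size_t
canonical_key(const char *in, char *out, size_t outsz)
{
	size_t n = 0;
	unsigned char c;

	assert(in != NULL && out != NULL && outsz > 0);
	for (; *in != '\0' && n + 1 < outsz; in++) {
		c = (unsigned char)*in;
		if (isspace(c))
			continue;	/* whitespace is removed */
		if (isalnum(c))
			out[n++] = (char)tolower(c);	/* assumed case fold */
		else if (c == '-' || c == '_')
			out[n++] = (char)c;	/* kept as-is */
		else
			out[n++] = '?';	/* any other character */
	}
	out[n] = '\0';
	return n;
}
```

Under these assumptions, the input “Author Name” yields “authorname”. Input that does not fit in the output buffer is truncated.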
The abstract syntax tree is encoded in struct
lowdown_node, which consists of the following.
- enum lowdown_rndrt type
- The node type, using HTML5 output as an illustration:
LOWDOWN_BLOCKCODE
- A block-level snippet of code described by
<pre><code> .
LOWDOWN_BLOCKHTML
- A block-level snippet of HTML. This is simply opaque HTML
content.
LOWDOWN_BLOCKQUOTE
- A block-level quotation described by
<blockquote> .
LOWDOWN_CODESPAN
- An inline-level snippet of code described by
<code> .
LOWDOWN_DEFINITION
- A definition list described by
<dl> .
LOWDOWN_DEFINITION_DATA
- Definition data described by
<dd> .
LOWDOWN_DEFINITION_TITLE
- Definition title described by
<dt> .
LOWDOWN_DOC_HEADER
- Container for metadata described by
<head> .
LOWDOWN_DOUBLE_EMPHASIS
- Bold (or otherwise notable) content described by
<strong> .
LOWDOWN_EMPHASIS
- Italic (or otherwise notable) content described by
<em> .
LOWDOWN_ENTITY
- Named or numeric HTML entity.
LOWDOWN_FOOTNOTE
- Footnote content.
LOWDOWN_HEADER
- A block-level header described by one of
<h1> through
<h6> .
LOWDOWN_HIGHLIGHT
- Marked text described by
<mark> .
LOWDOWN_HRULE
- A horizontal line described by
<hr> .
LOWDOWN_IMAGE
- An image described by
<img> .
LOWDOWN_LINEBREAK
- A hard line-break within a block context described by
<br> .
LOWDOWN_LINK
- A link to external media described by
<a> . Links may contain limited child
markup, but not nested links.
LOWDOWN_LINK_AUTO
- Like
LOWDOWN_LINK , except inferred from text
content.
LOWDOWN_LIST
- A list enclosure described by
<ul> or
<ol> .
LOWDOWN_LISTITEM
- A list item described by
<li> .
LOWDOWN_MATH_BLOCK
- A snippet of mathematical text in LaTeX format described within
\[xx\] or \(xx\) . This
is usually (in HTML) externally handled by a JavaScript renderer.
LOWDOWN_META
- Meta-data keys and values. These are described by elements in
<head> .
LOWDOWN_NORMAL_TEXT
- Normal text content.
LOWDOWN_PARAGRAPH
- A block-level paragraph described by
<p> .
LOWDOWN_RAW_HTML
- An inline of raw HTML. (Only if configured during parse.)
LOWDOWN_ROOT
- The root of the document. This is always the topmost node, and the
only node where the parent field is
NULL .
LOWDOWN_STRIKETHROUGH
- Content struck through. Described by
<del> .
LOWDOWN_SUBSCRIPT ,
LOWDOWN_SUPERSCRIPT
- A subscript or superscript described by
<sub> or
<sup> , respectively.
LOWDOWN_TABLE_BLOCK
- A table block described by
<table> .
LOWDOWN_TABLE_BODY
- A table body section described by
<tbody> .
LOWDOWN_TABLE_CELL
- A table cell described by
<td> , or
<th> if in the header.
LOWDOWN_TABLE_HEADER
- A table header section described by
<thead> .
LOWDOWN_TABLE_ROW
- A table row described by
<tr> .
LOWDOWN_TRIPLE_EMPHASIS
- Combination of
LOWDOWN_EMPHASIS and
LOWDOWN_DOUBLE_EMPHASIS .
- size_t id
- An identifier unique within the document. This can be used as a table
index since the number is assigned from a monotonically increasing point
during the parse.
- struct lowdown_node *parent
- The parent of the node, or
NULL at the root.
- enum lowdown_chng chng
- Change tracking: whether this node was inserted
(LOWDOWN_CHNG_INSERT), deleted
(LOWDOWN_CHNG_DELETE), or neither
(LOWDOWN_CHNG_NONE).
- struct lowdown_nodeq children
- A possibly-empty list of child nodes.
- <anon union>
- An anonymous union of type-specific structures.
- rndr_autolink
- For
LOWDOWN_LINK_AUTO , the link address as
link and the link type
type, which may be one of
HALINK_EMAIL for e-mail links and
HALINK_NORMAL otherwise. Any buffer may be
empty-sized.
- rndr_blockcode
- For
LOWDOWN_BLOCKCODE , the opaque
text of the block and the optional
lang of the code language.
- rndr_blockhtml
- For
LOWDOWN_BLOCKHTML , the opaque HTML
text.
- rndr_codespan
- The opaque text of the contents.
- rndr_definition
- For
LOWDOWN_DEFINITION , containing
flags that may be
HLIST_FL_BLOCK if the definition list should
be interpreted as containing block nodes.
- rndr_entity
- For
LOWDOWN_ENTITY , the entity
text.
- rndr_header
- For
LOWDOWN_HEADER , the
level of the header starting at zero (this value
is relative to the metadata base header level, defaulting to one),
optional space-separated class list attr_cls,
and optional single identifier attr_id.
- rndr_image
- For
LOWDOWN_IMAGE , the image address
link, the image title
title, dimensions NxN (width by height) in
dims, and alternate text
alt. CSS in-line style for width and height may
be given in attr_width and/or
attr_height, and a space-separated list of
classes may be in attr_cls and a single
identifier may be in attr_id.
- rndr_link
- Like rndr_autolink, but without a type and
further defining an optional link title title,
optional space-separated class list attr_cls,
and optional single identifier attr_id.
- rndr_list
- For
LOWDOWN_LIST , consists of a bitfield
flags that may be set to
HLIST_FL_ORDERED for an ordered list and
HLIST_FL_UNORDERED for an unordered one. If
HLIST_FL_BLOCK is set, the list should be
output as if items were separate blocks. The
start value for
HLIST_FL_ORDERED is the starting list item
position, which is one by default and never zero. The
items is the number of list items.
- rndr_listitem
- For
LOWDOWN_LISTITEM , consists of a bitfield
flags that may be set to
HLIST_FL_ORDERED for an ordered list,
HLIST_FL_UNORDERED for an unordered list,
HLIST_FL_DEF for definition list data,
HLIST_FL_CHECKED or
HLIST_FL_UNCHECKED for an unordered
“task” list, and/or
HLIST_FL_BLOCK for list item output as if
containing block nodes. The HLIST_FL_BLOCK
should not be used: use the parent list (or definition list) flags for
this. The num is the index in a
HLIST_FL_ORDERED list. It is monotonically
increasing with each item in the list, starting at the
start variable given in struct
rndr_list.
- rndr_math
- For
LOWDOWN_MATH_BLOCK , the mode of display in
blockmode: if 1, in-line math; if 2, multi-line.
The opaque equation, which is assumed to be in LaTeX format, is in the
opaque text.
- rndr_meta
- Each
LOWDOWN_META key-value pair is
represented. The keys are lower-case without spaces or non-ASCII
characters. If provided, enclosed nodes may consist only of
LOWDOWN_NORMAL_TEXT and
LOWDOWN_ENTITY .
- rndr_normal_text
- The basic text content for
LOWDOWN_NORMAL_TEXT . If
flags is set to
HTEXT_ESCAPED , the text may be escaped for
output, but may not be altered by any smart typography or similar (it
should be passed as-is).
- rndr_paragraph
- For
LOWDOWN_PARAGRAPH , specifies how many
lines the paragraph has in the input file and
beoln, set to non-zero if the paragraph ends
with an empty line instead of a breaking block node.
- rndr_raw_html
- For
LOWDOWN_RAW_HTML , the opaque HTML
text.
- rndr_table
- For
LOWDOWN_TABLE_BLOCK , the number of
columns in each row or header row. The number of
columns in rndr_table,
rndr_table_header, and
rndr_table_cell are the same.
- rndr_table_cell
- For
LOWDOWN_TABLE_CELL , the current
col column number out of
columns. See
rndr_table_header for a description of the bits
in flags. The number of columns in
rndr_table,
rndr_table_header, and
rndr_table_cell are the same.
- rndr_table_header
- For
LOWDOWN_TABLE_HEADER , the number of
columns in each row and the per-column
flags, which may be tested for equality against
HTBL_FL_ALIGN_LEFT ,
HTBL_FL_ALIGN_RIGHT , or
HTBL_FL_ALIGN_CENTER after being masked with
HTBL_FL_ALIGNMASK ; or
HTBL_FL_HEADER . If no alignment is specified
after the mask, the default should be left-aligned. The number of
columns in rndr_table,
rndr_table_header, and
rndr_table_cell are the same.
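The mask-then-compare test described for the per-column table flags can be sketched as below. The DEMO_* constants are stand-ins so that the fragment compiles without the library header, and their numeric values are illustrative assumptions. Real code would use the HTBL_FL_* constants. Treating the header flag as a single bit is also an assumption.

```c
/*
 * Stand-in constants (assumed values, not taken from the library).
 * Substitute HTBL_FL_ALIGN_LEFT and friends in real code.
 */
enum {
	DEMO_ALIGN_LEFT = 1,
	DEMO_ALIGN_RIGHT = 2,
	DEMO_ALIGN_CENTER = 3,
	DEMO_ALIGNMASK = 3,
	DEMO_HEADER = 4
};

enum demo_align { ALIGN_LEFT, ALIGN_RIGHT, ALIGN_CENTER };

/* Mask off non-alignment bits, then compare for equality. */
enum demo_align
demo_cell_align(unsigned int flags)
{
	switch (flags & DEMO_ALIGNMASK) {
	case DEMO_ALIGN_RIGHT:
		return ALIGN_RIGHT;
	case DEMO_ALIGN_CENTER:
		return ALIGN_CENTER;
	case DEMO_ALIGN_LEFT:
	default:
		return ALIGN_LEFT;	/* no alignment given: left */
	}
}

/* Non-zero if the (assumed single-bit) header flag is set. */
int
demo_cell_is_header(unsigned int flags)
{
	return (flags & DEMO_HEADER) != 0;
}
```

Comparing after masking, rather than testing individual bits, matters because the alignment values need not be disjoint bits.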
A parsed document is a tree of struct
lowdown_node nodes. If a node is “block”, it may contain
other block or inline nodes. If “inline,” it may only contain
other inline nodes. “Special” nodes are documented below. An
additional mark of “void” means that the node will never
contain children.
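The containment rule for block, inline, and void nodes can be expressed as a small predicate. This is a sketch using hypothetical types (struct demo_kind, enum demo_scope) that are not part of the library. It does not cover “special” nodes, whose placement is constrained separately.

```c
/* Hypothetical scope classes mirroring the terms used in the text. */
enum demo_scope { SCOPE_BLOCK, SCOPE_INLINE };

struct demo_kind {
	enum demo_scope scope;
	int is_void;	/* non-zero: the node never has children */
};

/* Return non-zero if a node of kind "parent" may hold "child". */
int
demo_may_contain(struct demo_kind parent, struct demo_kind child)
{
	if (parent.is_void)
		return 0;	/* void nodes hold nothing */
	if (parent.scope == SCOPE_BLOCK)
		return 1;	/* block holds block or inline */
	return child.scope == SCOPE_INLINE;	/* inline holds inline */
}
```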
Node                        Scope
LOWDOWN_BLOCKCODE           block, void
LOWDOWN_BLOCKHTML           block, void
LOWDOWN_BLOCKQUOTE          block
LOWDOWN_CODESPAN            inline, void
LOWDOWN_DEFINITION          block
LOWDOWN_DEFINITION_DATA     special
LOWDOWN_DEFINITION_TITLE    special
LOWDOWN_DOUBLE_EMPHASIS     inline
LOWDOWN_EMPHASIS            inline
LOWDOWN_ENTITY              inline, void
LOWDOWN_HRULE               inline, void
LOWDOWN_IMAGE               inline, void
LOWDOWN_LINEBREAK           inline, void
LOWDOWN_LINK                inline
LOWDOWN_LINK_AUTO           inline, void
LOWDOWN_LIST                block
LOWDOWN_LISTITEM            special
LOWDOWN_MATH_BLOCK          inline, void
LOWDOWN_META                special
LOWDOWN_NORMAL_TEXT         inline, void
LOWDOWN_PARAGRAPH           block
LOWDOWN_RAW_HTML            inline, void
LOWDOWN_ROOT                special
LOWDOWN_STRIKETHROUGH       inline
LOWDOWN_SUBSCRIPT           inline
LOWDOWN_SUPERSCRIPT         inline
LOWDOWN_TABLE_BLOCK         block
LOWDOWN_TABLE_BODY          special
LOWDOWN_TABLE_CELL          special
LOWDOWN_TABLE_ROW           special
LOWDOWN_TRIPLE_EMPHASIS     inline
The general structure of the AST is as follows. Nodes have no
order imposed on them unless as noted:
Special nodes have specific placement within their parents as
follows:
Lastly, LOWDOWN_FOOTNOTE may appear
anywhere in the document and contains block nodes.
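Because footnotes may appear at any depth, code that collects them has to visit the whole tree. The sketch below shows a depth-first count over a simplified stand-in node type. struct demo_node and its first_child and next fields are hypothetical and do not match the layout of struct lowdown_node, whose children are held in a struct lowdown_nodeq list.

```c
#include <stddef.h>

/* Hypothetical, simplified node types for illustration only. */
enum demo_type { DEMO_ROOT, DEMO_PARAGRAPH, DEMO_FOOTNOTE, DEMO_TEXT };

struct demo_node {
	enum demo_type type;
	struct demo_node *first_child;	/* NULL if no children */
	struct demo_node *next;	/* next sibling or NULL */
};

/* Count nodes of the given type in the subtree rooted at "n". */
size_t
demo_count(const struct demo_node *n, enum demo_type type)
{
	size_t total;
	const struct demo_node *c;

	if (n == NULL)
		return 0;
	total = (n->type == type) ? 1 : 0;
	for (c = n->first_child; c != NULL; c = c->next)
		total += demo_count(c, type);
	return total;
}
```

The recursion depth equals the tree depth, which the parser bounds with maxdepth.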
lowdown(1),
lowdown_buf(3),
lowdown_buf_diff(3),
lowdown_diff(3),
lowdown_doc_free(3),
lowdown_doc_new(3),
lowdown_doc_parse(3),
lowdown_file(3),
lowdown_file_diff(3),
lowdown_gemini_free(3),
lowdown_gemini_new(3),
lowdown_gemini_rndr(3),
lowdown_html_free(3),
lowdown_html_new(3),
lowdown_html_rndr(3),
lowdown_latex_free(3),
lowdown_latex_new(3),
lowdown_latex_rndr(3),
lowdown_metaq_free(3),
lowdown_nroff_free(3),
lowdown_nroff_new(3),
lowdown_nroff_rndr(3),
lowdown_odt_free(3),
lowdown_odt_new(3),
lowdown_odt_rndr(3),
lowdown_term_free(3),
lowdown_term_new(3),
lowdown_term_rndr(3),
lowdown_tree_rndr(3),
lowdown(5)
lowdown was forked from
hoedown by
Kristaps Dzonsons,
kristaps@bsd.lv. It has been
considerably modified since.