NAME
OSSP cfg - Configuration Parsing

VERSION
OSSP cfg 0.9.11 (10-Aug-2006)

SYNOPSIS

DESCRIPTION
OSSP cfg is an ISO C library for parsing arbitrary C/C++-style configuration
files. A configuration is a sequence of directives. Each directive consists
of zero or more tokens. Each token is either a string or, recursively, a
complete sequence. The configuration syntax is therefore recursive, which
allows configurations with arbitrarily nested sections. Additionally, the
configuration syntax provides complex
single/double/balanced quoting of tokens, hexadecimal/octal/decimal
character encodings, character escaping, C/C++ and Shell-style comments,
etc. The library API allows importing a configuration text into an Abstract
Syntax Tree (AST), traversing the AST and optionally exporting the AST again
as a configuration text.
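
The recursive structure described above can be illustrated with a small sketch. It is written in Python for brevity and is not the library's C implementation or API. It assumes that `;` separates directives and that `{` and `}` delimit a nested sequence, and it splits tokens on whitespace only, so quoting, escapes and comments are ignored. The names `parse` and `parse_sequence` are hypothetical.

```python
def parse_sequence(tokens, pos=0, nested=False):
    """Parse directives until end of input or a closing '}'.

    Returns (sequence, next_pos). A sequence is a list of directives,
    a directive is a list of tokens, and a token is either a str or a
    nested sequence.
    """
    sequence, directive = [], []
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "}":
            if not nested:
                raise ValueError("unbalanced '}'")
            break  # the caller consumes the '}'
        pos += 1
        if tok == ";":
            # End of directive. Empty directives are dropped here,
            # which is a choice made for this sketch only.
            if directive:
                sequence.append(directive)
            directive = []
        elif tok == "{":
            sub, pos = parse_sequence(tokens, pos, nested=True)
            if pos >= len(tokens):
                raise ValueError("missing '}'")
            pos += 1  # consume the matching '}'
            directive.append(sub)
        else:
            directive.append(tok)
    if directive:
        sequence.append(directive)
    return sequence, pos


def parse(text):
    # Assumption: tokens are separated by whitespace; the structural
    # characters are padded so that str.split() isolates them.
    for ch in ";{}":
        text = text.replace(ch, f" {ch} ")
    tree, _ = parse_sequence(text.split())
    return tree
```

For instance, `parse("foo { bar; baz } quux;")` yields one directive with three tokens, the second of which is itself a sequence of two directives.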

CONFIGURATION SYNTAX
The configuration syntax is described by the following context-free
(Chomsky-2) grammar:

    sequence    ::= empty
    directive   ::= token
    token       ::= OPEN sequence CLOSE
    string      ::= DQ_STRING   # double quoted string

The remaining terminal symbols are themselves defined by the following set
of grammar productions (regular sub-grammars for character sequences, given
as Perl-style regular expressions "/regex/"):

    SEP         ::= /;/
    OPEN        ::= /{/
    CLOSE       ::= /}/

    DQ_STRING   ::= /"/ DQ_CHARS /"/
    DQ_CHARS    ::= empty
    DQ_CHAR     ::= /\\"/       # escaped quote

    SQ_STRING   ::= /'/ SQ_CHARS /'/
    SQ_CHARS    ::= empty
    SQ_CHAR     ::= /\\'/       # escaped quote

    FQ_STRING   ::= /q/ FQ_OPEN FQ_CHARS FQ_CLOSE
    FQ_CHARS    ::= empty
    FQ_CHAR     ::= /\\/ FQ_OPEN    # escaped open
    FQ_OPEN     ::= /[!"#$%&'()*+,-./:;<=>?@\[\\\]^_`{|}~]/
    FQ_CLOSE    ::= << FQ_OPEN or corresponding closing char >>

    PT_STRING   ::= PT_CHAR PT_CHARS
    PT_CHARS    ::= empty
    PT_CHAR     ::= /[^ \t\n;{}"']/    # none of specials

Additionally, white-space WS and comment CO tokens are allowed at any
position in the productions of the grammar above.

    WS          ::= /[ \t\n]+/
    CO          ::= CO_C        # style of C
    CO_C        ::= /\/\*([^*]|\*(?!\/))*\*\//
    CO_CXX      ::= /\/\/[^\n]*/
    CO_SH       ::= /#[^\n]*/

Finally, any configuration line can have a trailing backslash character (\)
just before the newline character for simple line continuation. The
backslash, the newline and (optionally) the leading whitespace on the
following line are silently absorbed, so the first line continues with the
contents of the second line.
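
The comment patterns and the line continuation rule can be tried out with a short sketch. It is written in Python and is only an illustration, not the library's scanner: the three patterns are the CO_C, CO_CXX and CO_SH expressions with the Perl `/` delimiters removed, and `join_continued_lines` is a hypothetical helper that assumes continuation is resolved on the raw text before any tokenizing, which the text above does not state.

```python
import re

# Comment patterns taken from the CO_C, CO_CXX and CO_SH productions.
CO_C = re.compile(r"/\*([^*]|\*(?!/))*\*/")
CO_CXX = re.compile(r"//[^\n]*")
CO_SH = re.compile(r"#[^\n]*")


def join_continued_lines(text):
    """Remove each backslash-newline pair together with any blanks or
    tabs that lead the following line."""
    return re.sub(r"\\\n[ \t]*", "", text)
```

With this helper, the two physical lines `foo \` and `    bar;` become the single logical line `foo bar;`. Note that applying the comment patterns blindly to a whole file would also hit `#` or `//` inside quoted strings, so a real scanner has to track quoting state.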

CONFIGURATION EXAMPLE
A more intuitive description of the configuration syntax is perhaps given by
the following example, which shows all features at once:

    /* single word */
    foo;

    /* multi word */
    foo bar quux;

    /* nested structure */
    foo { bar; baz } quux;

    /* quoted strings */
    'foo bar' "foo\x0a\t\n\ bar"

APPLICATION PROGRAMMING INTERFACE (API)
...

NODE SELECTION SPECIFICATION
The cfg_node_select function takes a node selection specification string
select for locating the intended nodes. This specification is defined as:

    select              ::= empty
    select-step         ::= select-direction
    select-direction    ::= "./"            # current node
    select-pattern      ::= /</ regex />/
    select-filter       ::= empty
    filter-range        ::= num             # short for: num,num
    num                 ::= /^[+-]?[0-9]+/
    regex               ::= << Regular Expression (PCRE-based) >>
    token               ::= << Plain-Text Token String >>

IMPLEMENTATION ISSUES
Goal: non-hardcoded syntax tokens, only hard-coded syntax structure
Goal: time-efficient parsing
Goal: space-efficient storage
Goal: representation of configuration as AST
Goal: manipulation (annotation, etc.) of AST via API
Goal: dynamic syntax verification

HISTORY
OSSP cfg was implemented in lots of small steps over a very long time. The
first ideas date back to 1995, when Ralf S. Engelschall attended his first
compiler construction lessons at university. He first completed it in summer
2002 for use in the OSSP project.

AUTHOR
Ralf S. Engelschall
rse@engelschall.com
www.engelschall.com