void bt_postprocess_string (char * s, ushort options);
Post-processes an individual string, s, which is modified in place. The only post-processing option that makes sense on individual strings is whether to collapse whitespace according to the BibTeX rules; thus, if options & BTO_COLLAPSE is false, this function has no effect. (It still makes a complete pass over the string, though; this is to allow for future expansion.)
The exact rules for collapsing whitespace are simple: non-space whitespace characters (mainly tabs and newlines) are converted to spaces, any run of more than one space within the string is collapsed to a single space, and any leading or trailing spaces are deleted. (Ensuring that all whitespace is spaces is actually done by btparse's lexical scanner, so strings in btparse ASTs will never contain whitespace other than spaces. Likewise, any strings passed to bt_postprocess_string() should not contain non-space whitespace characters.)
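The collapsing rules above can be sketched as a small standalone function. This is only an illustration of the rules as described, not btparse's actual implementation; the name collapse_whitespace is hypothetical, and unlike bt_postprocess_string() it also accepts tabs and newlines in its input.

```c
/* Hypothetical helper, not part of btparse: collapses whitespace in s,
   in place, following the rules described above. */
void collapse_whitespace (char * s)
{
   char * src = s;              /* next character to read */
   char * dst = s;              /* next position to write */
   int    pending_space = 0;    /* a whitespace run has been seen but not written */

   while (*src != '\0')
   {
      char c = *src++;

      if (c == ' ' || c == '\t' || c == '\n' || c == '\r')
      {
         /* Remember the run; whether it is written depends on what follows. */
         pending_space = 1;
      }
      else
      {
         /* Write one space for an interior run; leading runs are dropped
            because nothing has been written yet (dst == s). */
         if (pending_space && dst != s)
            *dst++ = ' ';
         pending_space = 0;
         *dst++ = c;
      }
   }

   /* A run still pending at the end is trailing whitespace: drop it. */
   *dst = '\0';
}
```

Because the output is never longer than the input, the string can safely be rewritten in place, which matches the in-place behaviour described for bt_postprocess_string().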
char * bt_postprocess_value (AST * value, ushort options, boolean replace);
Post-processes a single field value, which is the head of a list of simple values as returned by bt_next_value(). All of the relevant string-processing options come into play here: conversion of numbers to strings (BTO_CONVERT), macro expansion (BTO_EXPAND), collapsing of whitespace (BTO_COLLAPSE), and string pasting (BTO_PASTE). Since pasting substrings together without first expanding macros and converting numbers would be nonsensical, attempting to do so is a fatal error.
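The constraint on option combinations can be sketched as a simple bit test. The DEMO_BTO_* names and their numeric values below are stand-ins chosen for illustration; real programs should use the BTO_* macros from btparse.h, whose values may differ. The function demo_options_ok is likewise hypothetical, and where it returns 0 the library itself treats the combination as a fatal error.

```c
/* Stand-in flag values for illustration only; use the BTO_* macros
   from btparse.h in real code. */
#define DEMO_BTO_CONVERT   1    /* convert numbers to strings */
#define DEMO_BTO_EXPAND    2    /* expand macros */
#define DEMO_BTO_PASTE     4    /* paste substrings together */
#define DEMO_BTO_COLLAPSE  8    /* collapse whitespace */

/* Hypothetical check: returns 1 if the option combination is usable,
   0 if it asks for pasting without both expansion and conversion. */
int demo_options_ok (unsigned short options)
{
   unsigned short needed = DEMO_BTO_CONVERT | DEMO_BTO_EXPAND;

   if ((options & DEMO_BTO_PASTE) && (options & needed) != needed)
      return 0;
   return 1;
}
```

Checking the options before calling bt_postprocess_value() is one way for a caller to avoid triggering the fatal error when the options come from user input.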
If replace is true, then the list headed by value will be replaced by a list representing the processed value. That is, if string pasting is turned on (options & BTO_PASTE is true), then this list will be collapsed to a single node containing the single string that results from pasting together all the substrings. If string pasting is not on, then each node in the list will be left intact, but will have its text replaced by processed text.
If replace is false, then a new string will be built on the fly and returned by the function. Note that if pasting is not on in this case, you will only get the last string in the list. (It doesn't really make a lot of sense to post-process a value without pasting unless you're replacing it with the new value, though.)
Returns the string that resulted from processing the whole value, which only makes sense if pasting was on or there was only one value in the list. If a multiple-value list was processed without pasting, the last string in the list is returned (after processing).
Consider what might be done to the value of the author field in the above example, which is the concatenation of a string, a macro, and another string. Assume that the macro and expands to " and ", and that the variable value points to the sub-AST for this value. The original sub-AST corresponding to this value is a list of three nodes: a string node, a macro node for and, and another string node.
To fully process this value in place, you would call bt_postprocess_value (value, BTO_FULL, 1).
This would convert the value to a single-element list holding one string node, and return the fully processed string "Bob Jones and Jim Smith". Note that the and macro has been expanded, interpolated between the two literal strings, everything pasted together, and finally whitespace collapsed. (Collapsing whitespace before concatenating the strings would be a bad idea.)
(Incidentally, BTO_FULL is just a macro for the combination of all possible string-processing options, currently BTO_CONVERT | BTO_EXPAND | BTO_PASTE | BTO_COLLAPSE. There are two other similar shortcut macros: BTO_MACRO, which expresses the special string-processing done on macro values and is the same as BTO_FULL except for the absence of BTO_COLLAPSE; and BTO_MINIMAL, which means no string-processing is to be done.)
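Let's say you'd rather preserve the list nature of the value, while expanding macros and converting any numbers to strings. (This conversion is trivial: it just changes the type of the node from BTAST_NUMBER to BTAST_STRING. Number values are always stored as a string of digits, just as they appear in the file.) This would be done with the call bt_postprocess_value (value, BTO_CONVERT | BTO_EXPAND | BTO_COLLAPSE, 1), which would change the list to three string nodes: each node keeps its place in the list, the macro node is replaced by its expansion, and each node's text has its whitespace collapsed separately.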
Note that whitespace is collapsed here before any concatenation can be done; this is probably a bad idea. But you can do it if you wish. (If you get any ideas about cooking up your own value post-processing scheme by doing it in little steps like this, take a look at the source to bt_postprocess_value(); it should dissuade you from such a venture.)
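Why the order matters can be seen in a small standalone sketch that works on plain C strings rather than btparse ASTs. Everything here is hypothetical: the demo_* functions are not part of btparse, and the three input pieces in the usage note (with their irregular spacing) are assumed for illustration, with " and " standing for the expanded macro.

```c
#include <string.h>

/* Hypothetical helper: collapse whitespace in s, in place. */
static void demo_collapse (char * s)
{
   char * src = s;
   char * dst = s;
   int    pending = 0;

   while (*src != '\0')
   {
      char c = *src++;
      if (c == ' ' || c == '\t' || c == '\n')
         pending = 1;
      else
      {
         if (pending && dst != s)
            *dst++ = ' ';
         pending = 0;
         *dst++ = c;
      }
   }
   *dst = '\0';
}

/* Paste all pieces into out, then collapse once (the order used for a
   fully processed value).  size must be at least 1. */
void demo_paste_then_collapse (const char ** parts, int n, char * out, size_t size)
{
   int i;

   out[0] = '\0';
   for (i = 0; i < n; i++)
      strncat (out, parts[i], size - strlen (out) - 1);
   demo_collapse (out);
}

/* Collapse each piece on its own, then paste: spaces at the edges of
   each piece are lost before the pieces are joined.  Pieces longer
   than 255 characters are truncated in this sketch. */
void demo_collapse_then_paste (const char ** parts, int n, char * out, size_t size)
{
   char piece[256];
   int  i;

   out[0] = '\0';
   for (i = 0; i < n; i++)
   {
      strncpy (piece, parts[i], sizeof piece - 1);
      piece[sizeof piece - 1] = '\0';
      demo_collapse (piece);
      strncat (out, piece, size - strlen (out) - 1);
   }
}
```

With the assumed pieces "Bob   Jones", " and ", and "Jim Smith ", pasting first and collapsing afterwards gives "Bob Jones and Jim Smith", whereas collapsing each piece first and then pasting gives "Bob JonesandJim Smith", because the spaces around the expanded macro are stripped as leading and trailing whitespace.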
char * bt_postprocess_field (AST * field, ushort options, boolean replace);
This is little more than a front-end to bt_postprocess_value(); the only differences are that you pass it a field AST node (e.g., the (field,"AuThOr") in the above example), and that it transforms the field name in addition to its value. In particular, the field name is forced to lowercase; this behaviour is (currently) not optional.
Returns the string returned by bt_postprocess_value().
void bt_postprocess_entry (AST * entry, ushort options);
Post-processes all values in an entry. If entry points to the AST for a regular or macro definition entry, then the values are just what you'd expect: everything on the right-hand side of a field or macro assignment. You can also post-process comment and preamble entries, though. Comment entries are essentially one big string, so only whitespace collapsing makes sense on them. Preambles may have multiple strings pasted together, so all the string-processing options apply to them. (And there's nothing to prevent you from using macros in a preamble.)
btparse, bt_input, bt_traversal
Greg Ward <email@example.com>
|btparse, version 0.34||BT_POSTPROCESS (3)||2003-10-25|