Manual Reference Pages  -  KHTTP_PARSE (3)


khttp_parse, khttp_parsex - parse a CGI instance for kcgi




library "libkcgi"

#include <stdint.h>
#include <kcgi.h>

enum kcgi_err
khttp_parse(struct kreq *req, const struct kvalid *keys, size_t keysz,
  const char *const *pages, size_t pagesz, size_t defpage);

enum kcgi_err
khttp_parsex(struct kreq *req, const struct kmimemap *suffixes,
  const char *const *mimes, size_t mimemax, const struct kvalid *keys,
  size_t keysz, const char *const *pages, size_t pagesz, size_t defmime,
  size_t defpage, void *arg, void (*argfree)(void *arg),
  unsigned int debugging, const struct kopts *opts);

extern const char *const kmimetypes[KMIME__MAX];
extern const char *const khttps[KHTTP__MAX];
extern const char *const kschemes[KSCHEME__MAX];
extern const char *const kresps[KRESP__MAX];
extern const char *const kmethods[KMETHOD__MAX];
extern const struct kmimemap ksuffixmap[];
extern const char *const ksuffixes[KMIME__MAX];


The khttp_parse and khttp_parsex functions parse and validate input and the HTTP environment (compression, paths, MIME types, and so on). They are the central functions in the kcgi(3) library, parsing and validating key-value form (query string, message body, cookie) data and opaque message bodies. They must always be matched by khttp_free(3) regardless of return value.

The collective arguments are as follows:
arg A pointer to private application data. It is not touched unless argfree is provided.
argfree Function invoked with arg by the child process as it starts to parse untrusted network data. This makes sure that no unnecessary data is leaked into the child.
debugging This bit-field sets debugging of the underlying parse and/or write routines. Debugging messages are sent to stderr and consist of the process ID, a colon, then the logged data. Logged data consists of printable ASCII characters and spaces. A newline will flush the existing line. There are at most BUFSIZ characters per line. Other characters are either escaped (\v, \r, \b) or replaced with a question mark. If the KREQ_DEBUG_WRITE bit is set, write operations made directly or indirectly via khttp_write(3) will be logged. When the request is torn down with khttp_free(3), the process ID and total logged bytes are printed on their own line. If the KREQ_DEBUG_READ_BODY bit is set, the entire input body is logged. The total byte count is printed on its own line afterward.
defmime If no MIME type is specified (that is, there’s no suffix to the page request), use this index in the mimes array.
defpage If no page was specified (e.g., the default landing page), this is provided as the requested page index.
keys An array of input and validation fields.
keysz The number of elements in keys.
mimemax The MIME index used if no MIME type was matched.
mimes An array of MIME types (e.g., "text/html"), mapped into a MIME index during MIME body parsing. This relates both to pages and input fields with a body type.
opts Tunable options regarding socket buffer sizes and so on. If set to NULL, meaningful defaults are used.
pages An array of recognised pathnames. When pathnames are parsed, they’re matched to indices in this array.
pagesz The number of pages in pages. Also used if the requested page was not in pages.
req Filled with input fields and HTTP context parsed from the CGI environment. This is the main structure carried around in a kcgi(3) application.
suffixes Define the MIME type (suffix) mapping.
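
The line format described for the debugging argument can be pictured with a small stand-alone sketch. The helper debug_escape below is hypothetical and is not part of kcgi(3). It assumes the caller has already split the data at newlines, and it treats every byte other than printable ASCII, \v, \r and \b as one to be replaced with a question mark.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kcgi(3) function.
 * Copies one logged line from "in" to "out", escaping \v, \r and \b
 * as two-character sequences and replacing any other non-printable
 * byte with '?'.  Output is always NUL-terminated when outsz > 0.
 * Returns the number of bytes written, not counting the NUL.
 */
size_t
debug_escape(const char *in, size_t insz, char *out, size_t outsz)
{
	size_t i, o = 0;

	if (outsz == 0)
		return 0;

	for (i = 0; i < insz; i++) {
		unsigned char c = (unsigned char)in[i];
		const char *esc = NULL;

		if (c == '\v')
			esc = "\\v";
		else if (c == '\r')
			esc = "\\r";
		else if (c == '\b')
			esc = "\\b";

		if (esc != NULL) {
			if (o + 2 >= outsz)
				break;	/* no room for escape plus NUL */
			out[o++] = esc[0];
			out[o++] = esc[1];
		} else {
			if (o + 1 >= outsz)
				break;	/* no room for byte plus NUL */
			out[o++] = (c < 128 && isprint(c)) ? (char)c : '?';
		}
	}
	out[o] = '\0';
	return o;
}
```

A real logger would additionally prefix each line with the process ID and a colon and cap lines at BUFSIZ characters, as described above.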

The first form, khttp_parse, is for applications using the system-recognised MIME types. This should work well enough for most applications. It is equivalent to invoking the second form, khttp_parsex, as follows:

khttp_parsex(req, ksuffixmap,
  kmimetypes, KMIME__MAX, keys, keysz,
  pages, pagesz, KMIME_TEXT_HTML,
  defpage, NULL, NULL, 0, NULL);


A struct kreq object is filled in by khttp_parse and khttp_parsex. It consists of the following fields:
arg Private application data. This is set during khttp_parse.
auth Type of "managed" HTTP authorisation, if any. This is digest (KAUTH_DIGEST) or basic (KAUTH_BASIC) authorisation performed by the web server. See the rawauth field for raw authorisation requests. If a managed authorisation is specified but with unknown type (i.e., not digest or basic authentication), this is set to KAUTH_UNKNOWN.
cookies All key-value pairs read from user cookies.
cookiemap Entries in successfully-parsed (or un-parsed) cookies mapped into field indices as defined by the keys argument to khttp_parse.
cookienmap Entries in unsuccessfully-parsed (but still attempted) cookies mapped into field indices as defined by the keys argument to khttp_parse.
cookiesz The size of the cookies array.
fields All key-value pairs read from the request (query string, cookies, message body).
fieldmap Entries in successfully-parsed (or un-parsed) fields mapped into field indices as defined by the keys argument to khttp_parse.
fieldnmap Entries in unsuccessfully-parsed (but still attempted) fields mapped into field indices as defined by the keys argument to khttp_parse.
fieldsz The number of elements in the fields array.
fullpath The full path following the server name or NULL if there is no path following the server. For example, if foo.cgi/bar/baz is the PATH_INFO, this would be /bar/baz.
host The host-name (i.e., the host of the web application) request passed to the application. This shouldn’t be confused with the application host’s canonical name.

method The HTTP request method, as enumerated in enum kmethod. Note: applications will usually accept only KMETHOD_GET and KMETHOD_POST, so be sure to emit a KHTTP_405 status for non-conforming methods.

kdata Internal data. Should not be touched.
keys Value passed to khttp_parse.
keysz Value passed to khttp_parse.
mime The MIME type of the requested file as determined by its suffix matched to the suffixes map passed to khttp_parsex or the default ksuffixmap if using khttp_parse. This defaults to the mimemax value passed to khttp_parsex or the default KMIME__MAX if using khttp_parse when no suffix is specified or when the suffix is specified but not known.
page The page index as defined by the pages array passed to khttp_parse and parsed from the requested file. This is the first path component! The default page provided to khttp_parse is used if no path was specified, or pagesz if the path failed lookup.
pagename The string corresponding to page.
port The server’s receiving TCP port.
path The path (or empty string) following the parsed component regardless of whether it was located in the pages array provided to khttp_parse. For example, if the PATH_INFO is foo.cgi/bar/baz.html, the path component would be baz (with the leading slash stripped).
pname The script name (which may be an empty string in degenerate cases) passed to the server. This may not reflect a file-system entity if re-written by the web server.
rawauth If the web server passes the "Authorization" header (which, for example, Apache doesn’t by default), then the header is parsed into this field, which is of type struct khttpauth.
remote The string form of the client’s IPv4 or IPv6 address.
reqmap Mapping of enum krequ enumeration values to reqs parsed from the input stream.
reqs List of all HTTP request headers, known via enum krequ and not known, parsed from the input stream.
reqsz Number of request headers in reqs.
scheme The access scheme, which is either KSCHEME_HTTP or KSCHEME_HTTPS. The scheme defaults to KSCHEME_HTTP if not specified by the request.
suffix The suffix part of the PATH_INFO or NULL if none exists. For example, if the PATH_INFO is foo.cgi/bar/baz.html, the suffix would be html. See the mime field for the MIME type parsed from the suffix.
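
The relationship between the page, path and suffix fields can be illustrated with a stand-alone sketch. The function split_path and the struct split_sketch below are hypothetical and are not how kcgi(3) is implemented; they only reproduce the worked example above (/bar/baz.html) under the assumption that the first path component selects the page, the remainder is the path, and the text after the final dot is the suffix.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result holder; these are not kcgi(3) types. */
struct split_sketch {
	size_t	page;		/* index in pages, defpage, or pagesz */
	char	pagename[64];	/* first path component */
	char	path[128];	/* remainder, leading slash stripped */
	char	suffix[32];	/* text after the final dot, if any */
};

/*
 * Split a path such as "/bar/baz.html" (the part following the
 * script name) into page, path and suffix.
 */
void
split_path(const char *full, const char *const *pages, size_t pagesz,
    size_t defpage, struct split_sketch *out)
{
	char	 buf[256];
	char	*dot, *slash;
	size_t	 i;

	memset(out, 0, sizeof(*out));
	out->page = defpage;	/* no path at all: default page */
	if (full == NULL || full[0] == '\0' || strcmp(full, "/") == 0)
		return;

	snprintf(buf, sizeof(buf), "%s", full[0] == '/' ? full + 1 : full);

	/* A suffix only counts if the dot is in the last component. */
	dot = strrchr(buf, '.');
	slash = strrchr(buf, '/');
	if (dot != NULL && (slash == NULL || dot > slash)) {
		*dot = '\0';
		snprintf(out->suffix, sizeof(out->suffix), "%s", dot + 1);
	}

	/* First component names the page; the rest is the path. */
	slash = strchr(buf, '/');
	if (slash != NULL) {
		*slash = '\0';
		snprintf(out->path, sizeof(out->path), "%s", slash + 1);
	}
	snprintf(out->pagename, sizeof(out->pagename), "%s", buf);

	out->page = pagesz;	/* failed lookup unless matched below */
	for (i = 0; i < pagesz; i++)
		if (strcmp(pages[i], out->pagename) == 0) {
			out->page = i;
			break;
		}
}
```

An application would typically compare the resulting page index against pagesz before dispatching, since pagesz signals that the requested page was not found.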

The application may optionally define keys provided to khttp_parse and khttp_parsex as an array of struct kvalid. This structure is central to the validation of input data. It consists of the following fields:
name The field name, i.e., how it appears in the HTML form input name. This cannot be NULL. If the field name is an empty string and the HTTP message consists of an opaque body (and not key-value pairs), then that field will be used to validate the HTTP message body. This is useful for KMETHOD_PUT style requests.
valid Validating function. This function accepts a single struct kpair * argument and returns an int. If the function is NULL, then no validation is performed and the data is considered as always valid. If you provide your own valid function, it must set the type and parsed variables in the key-value pair. You can also allocate new memory for val, and thus change valsz: if the value of val changes during your validation, the new value will be freed with free(3) after being passed out of the sandbox. Note: these functions are invoked from within a system-specific sandbox. You should assume that you cannot invoke any "invasive" system calls such as opening files, sockets, etc. In other words, these must be pure computation.
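
A validating function of the kind described above might be sketched as follows. To keep the example self-contained it uses a stand-in struct pair_sketch and enum field_sketch rather than the real struct kpair from kcgi.h; the member names val, valsz, type and parsed follow the field descriptions in this page, and everything else (the names, the 0 to 100 range, and the convention that non-zero means valid) is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for illustration only; not the kcgi(3) definitions. */
enum field_sketch {
	FIELD_INTEGER,
	FIELD_STRING,
	FIELD_DOUBLE,
	FIELD__MAX		/* "not parsed" */
};

struct pair_sketch {
	char			*val;	/* NUL-terminated input value */
	size_t			 valsz;	/* true length of val */
	enum field_sketch	 type;	/* type of data in parsed */
	union {
		int64_t		 i;
		const char	*s;
		double		 d;
	} parsed;
};

/*
 * Accept decimal integers in [0, 100].  Pure computation only:
 * no files, sockets or other "invasive" system calls.
 * Assumed convention: returns non-zero if valid, zero otherwise.
 */
int
valid_percent(struct pair_sketch *p)
{
	char		*ep;
	long long	 v;

	if (p->val == NULL || p->valsz == 0)
		return 0;

	errno = 0;
	v = strtoll(p->val, &ep, 10);
	if (errno != 0 || ep == p->val)
		return 0;
	/* Reject trailing junk and embedded NUL bytes in binary data. */
	if ((size_t)(ep - p->val) != p->valsz)
		return 0;
	if (v < 0 || v > 100)
		return 0;

	p->type = FIELD_INTEGER;
	p->parsed.i = (int64_t)v;
	return 1;
}
```

Comparing the consumed length against valsz, rather than only checking for a terminating NUL, guards against binary input that contains an early NUL byte.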

The struct kpair structure presents the user with fields parsed from input and (possibly) matched to the keys variable passed to khttp_parse and khttp_parsex. It is also passed to the validation function to be filled in. In this case, the MIME-related fields are already filled in and may be examined to determine the method of validation. This is useful when validating opaque message bodies.
ctype The value’s MIME content type (e.g., image/jpeg), or NULL if not defined.
ctypepos If ctype is not NULL, it is looked up in the mimes parameter passed to khttp_parsex or kmimetypes if using khttp_parse. If found, it is set to the appropriate index. Otherwise, it’s mimemax.
file The value’s MIME source filename or NULL if not defined.
key The nil-terminated key (input) name. If the HTTP message body is opaque (e.g., KMETHOD_PUT), then an empty-string key is cooked up.
keypos If looked up in the keys variable passed to khttp_parse, the index of the looked-up key. Otherwise keysz.
next In a cookie or field map, next points to the next parsed key-value pair with the same key name. This occurs most often in HTML checkbox forms, where many fields may have the same name.
parsed The parsed, validated value. This may be integer, for a 64-bit signed integer; string, for a nil-terminated character string; or double, for a double-precision floating-point number. This is intentionally basic because the resulting data must be reliably passed from the parsing context back into the web application.
state The validation state: whether validated by a parse, invalidated by a parse, or non-validated (unparsed).
type If parsed, the type of data in parsed, otherwise KFIELD__MAX.
val The (input) value, which is always nil-terminated, but if the data is binary, nil terminators may occur before the true data length of valsz.
valsz The true length of val.
xcode The value’s MIME content transfer encoding (e.g., base64), or NULL if not defined.
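
How a map indexed by key position relates to the next pointer can be shown with a stand-alone sketch. The struct pair_link and the function build_map below are hypothetical and do not reproduce the kcgi(3) implementation; in particular, the order in which same-named pairs are chained (here, order of appearance) is an assumption, and the split between validated and invalidated maps is left out.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for illustration only; not the kcgi(3) struct kpair. */
struct pair_link {
	const char		*key;
	const char		*val;
	size_t			 keypos;	/* index in keys, else keysz */
	struct pair_link	*next;		/* next pair with same key */
};

/*
 * Fill map[0..keysz-1] so that map[k] points at the first pair whose
 * key equals keys[k], with further matches chained through next.
 * Pairs with unknown keys get keypos == keysz and stay unmapped.
 */
void
build_map(struct pair_link *pairs, size_t pairsz,
    const char *const *keys, size_t keysz, struct pair_link **map)
{
	struct pair_link	**tail;
	size_t			  i, k;

	for (k = 0; k < keysz; k++)
		map[k] = NULL;

	for (i = 0; i < pairsz; i++) {
		pairs[i].next = NULL;
		pairs[i].keypos = keysz;

		for (k = 0; k < keysz; k++)
			if (strcmp(keys[k], pairs[i].key) == 0)
				break;
		if (k == keysz)
			continue;	/* key not in keys */

		pairs[i].keypos = k;

		/* Append to the end of this key's chain. */
		tail = &map[k];
		while (*tail != NULL)
			tail = &(*tail)->next;
		*tail = &pairs[i];
	}
}
```

With such a map, all values submitted for a multi-valued checkbox named "tag" could be visited by starting at the map entry for that key and following next until it is NULL.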

The struct khttpauth structure holds authorisation data if passed by the server. If no data was passed by the server, the type value is KAUTH_NONE. Otherwise it’s KAUTH_BASIC or KAUTH_DIGEST, with KAUTH_UNKNOWN if the authorisation type was not recognised. The specific fields are as follows.
authorised For KAUTH_BASIC or KAUTH_DIGEST authorisation, this field indicates whether all required values were specified.
d A union containing parsed fields per type: basic for KAUTH_BASIC or digest for KAUTH_DIGEST.

If the field for an HTTP authorisation request is KAUTH_BASIC, it will consist of the following for its parsed entities in its struct khttpbasic structure:
response The hashed and encoded response string.

If the field for an HTTP authorisation request is KAUTH_DIGEST, it will consist of the following in its struct khttpdigest structure:
alg The encoding algorithm, parsed from the possible MD5 or MD5-Sess values.
qop The quality of protection algorithm, which may be unspecified, Auth or Auth-Int.
user The user coordinating the request.
uri The URI for which the request is designated. (This must match the request URI.)
realm The request realm.
nonce The server-generated nonce value.
cnonce The (optional) client-generated nonce value.
response The hashed and encoded response string, which entangles fields depending on algorithm and quality of protection.
count The (optional) cnonce counter.
opaque The (optional) opaque string requested by the server.

The struct kopts structure consists of tunables for network performance. You probably don’t want to use these unless you really know what you’re doing!
sndbufsz The size of the output buffer. The output buffer is a heap-allocated region into which writes (via khttp_write(3)) are buffered instead of being flushed directly to the wire. The buffer is flushed when it is full or when khttp_free(3) is invoked. If the buffer size is zero, writes are flushed immediately to the wire. If the buffer size is less than zero, it is filled with a meaningful default.
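
The buffering behaviour described above can be modelled with a short stand-alone sketch. The struct outbuf_sketch and its two functions are hypothetical and unrelated to the kcgi(3) internals; instead of writing to a socket they only count flushes and bytes, and the negative-size case (substituting a default) is not modelled.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in model for illustration only; not a kcgi(3) type. */
struct outbuf_sketch {
	char	*buf;		/* buffer storage, unused when sz == 0 */
	size_t	 sz;		/* buffer capacity; 0 means unbuffered */
	size_t	 used;		/* bytes currently buffered */
	size_t	 flushes;	/* number of times data hit the wire */
	size_t	 wired;		/* total bytes sent to the wire */
};

/* Push any buffered bytes to the (simulated) wire. */
void
out_flush(struct outbuf_sketch *o)
{
	if (o->used == 0)
		return;
	o->wired += o->used;
	o->flushes++;
	o->used = 0;
}

/* Buffer a write, flushing whenever the buffer becomes full. */
void
out_write(struct outbuf_sketch *o, const char *data, size_t len)
{
	if (o->sz == 0) {
		/* Zero-sized buffer: flush immediately. */
		o->wired += len;
		o->flushes++;
		return;
	}
	while (len > 0) {
		size_t room = o->sz - o->used;
		size_t n = len < room ? len : room;

		memcpy(o->buf + o->used, data, n);
		o->used += n;
		data += n;
		len -= n;
		if (o->used == o->sz)
			out_flush(o);
	}
}
```

In this model a final out_flush plays the role that khttp_free(3) plays in the description: it pushes out whatever is still buffered when the request is torn down.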

Lastly, the struct khead structure holds parsed HTTP headers.
key Holds the HTTP header name. This is not the CGI header name (e.g., HTTP_COOKIE), but the reconstituted HTTP name (e.g., Cookie).
val The opaque header value, which may be an empty string.
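
The reconstitution of an HTTP header name from its CGI form can be sketched as follows. The function cgi_to_http_name is hypothetical and is not a kcgi(3) interface; it assumes the usual CGI convention that a request header is passed as HTTP_ followed by the upper-cased name with hyphens turned into underscores.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not a kcgi(3) function.
 * Convert a CGI variable name such as "HTTP_ACCEPT_ENCODING" into
 * the HTTP header form "Accept-Encoding".
 * Returns 1 on success, 0 if the name lacks the "HTTP_" prefix,
 * is empty after it, or does not fit into out.
 */
int
cgi_to_http_name(const char *cgi, char *out, size_t outsz)
{
	size_t	i, o = 0;
	int	upper = 1;	/* capitalise the next letter */

	if (strncmp(cgi, "HTTP_", 5) != 0)
		return 0;
	cgi += 5;
	if (*cgi == '\0')
		return 0;

	for (i = 0; cgi[i] != '\0'; i++) {
		unsigned char c = (unsigned char)cgi[i];

		if (o + 1 >= outsz)
			return 0;	/* no room for byte plus NUL */
		if (c == '_') {
			out[o++] = '-';
			upper = 1;
		} else {
			out[o++] = (char)(upper ? toupper(c) : tolower(c));
			upper = 0;
		}
	}
	out[o] = '\0';
	return 1;
}
```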


A number of variables are defined in <kcgi.h> to simplify invocations of the khttp_parse family. Applications are strongly encouraged to use these variables (and associated enumerations) in khttp_parse instead of overriding them with hand-rolled sets in khttp_parsex.
kmimetypes Indexed list of common MIME types, for example, "text/html" and "application/json". Corresponds to enum kmime.
khttps Indexed list of HTTP status code and identifier, for example, "200 OK". Corresponds to enum khttp.
kschemes Indexed list of URL schemes, for example, "https" or "ftp". Corresponds to enum kscheme.
kresps Indexed list of header response names, for example, "Cache-Control" or "Content-Length". Corresponds to enum kresp.
kmethods Indexed list of HTTP methods, for example, "GET" and "POST". Corresponds to enum kmethod.
ksuffixmap Map of MIME types defined in enum kmime to possible suffixes. This array is terminated with a MIME type of KMIME__MAX and name NULL.
ksuffixes Indexed list of canonical suffixes for MIME types corresponding to enum kmime. Note: this may be a NULL pointer for types that have no canonical suffix, for example, "application/octet-stream".
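
A lookup over a suffix map terminated in the way described for ksuffixmap might look like the following sketch. The enum, the struct mimemap_sketch with its name and mime members, the table contents and the function mime_lookup are all stand-ins invented for this example, not the definitions from kcgi.h. The handling of a missing suffix (defmime) and an unknown suffix (mimemax) follows the argument descriptions for khttp_parsex given earlier.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for illustration only; not the kcgi(3) definitions. */
enum mime_sketch {
	MIME_HTML,
	MIME_JSON,
	MIME_JPEG,
	MIME__MAX
};

struct mimemap_sketch {
	const char	*name;	/* suffix without the dot */
	size_t		 mime;	/* corresponding MIME index */
};

/* Several suffixes may map to one type; the NULL name ends the map. */
const struct mimemap_sketch suffixmap_sketch[] = {
	{ "html", MIME_HTML },
	{ "htm",  MIME_HTML },
	{ "json", MIME_JSON },
	{ "jpg",  MIME_JPEG },
	{ "jpeg", MIME_JPEG },
	{ NULL,   MIME__MAX }
};

/*
 * Return the MIME index for a suffix: defmime if there is no
 * suffix, mimemax if the suffix is not in the map.
 */
size_t
mime_lookup(const struct mimemap_sketch *map, const char *suffix,
    size_t defmime, size_t mimemax)
{
	if (suffix == NULL || *suffix == '\0')
		return defmime;
	for (; map->name != NULL; map++)
		if (strcmp(map->name, suffix) == 0)
			return map->mime;
	return mimemax;
}
```

Terminating the map with a sentinel entry means no separate length argument is needed, which is why only the list of MIME type strings is accompanied by a count.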


khttp_parse and khttp_parsex return an error code:
KCGI_OK Success (not an error).
KCGI_ENOMEM Memory failure. This can occur in many places: spawning a child, allocating memory, creating sockets, etc.
KCGI_ENFILE Could not allocate file descriptors.
KCGI_EAGAIN Could not spawn a child.
KCGI_FORM Malformed data between parent and child whilst parsing an HTTP request. (Internal system error.)
KCGI_SYSTEM Opaque operating system error.

On failure, the calling application should terminate as soon as possible. Applications should not try to write an HTTP 505 error or similar, but allow the web server to handle the empty CGI response on its own.


kcgi(3), khttp_free(3)


The khttp_parse and khttp_parsex functions were written by Kristaps Dzonsons.