dtsrhanfile(special file) dtsrhanfile(special file)

dtsrhanfile — Describes the format and syntax of DtSearch han files

filename.han

Han files are the user-generated profile files for dtsrhan. They identify fields in incoming text from which output fzk file fields can be constructed. dtsrhan loads the data from han files into memory at initialization time. dtsrhan and han files have not been internationalized; han files may only contain ASCII characters.

All identifiers must begin with a letter, and must be composed entirely of alphanumerics and/or the underscore.
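
The identifier rule can be checked mechanically. The following Python sketch is illustrative only and is not part of dtsrhan; the helper name is_valid_identifier is hypothetical, and the check is restricted to ASCII because han files may only contain ASCII characters.

```python
import re

# Hypothetical helper: one ASCII letter, then any number of ASCII
# letters, digits, or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if name satisfies the identifier rule stated above."""
    return _IDENTIFIER.fullmatch(name) is not None


print(is_valid_identifier("command1"))  # True
print(is_valid_identifier("1line"))     # False (starts with a digit)
print(is_valid_identifier("_etx"))      # False (starts with an underscore)
```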

Observe the following points when using "strings":

If an identifying string contains quotes, use a backslash to create the quote. Example:

"this string \"contains\" quotes"
would find the string this string "contains" quotes.
The above point makes it necessary to use double backslashes to create a single backslash. Example:

"this string has a \\ backslash"
would find the string this string has a \ backslash.
In fact, a backslash in any string causes the next character to be included literally, without exception. Thus, the string "this is \a test" ends up as this is a test: the backslash is dropped and the next character is embedded in the string. Escaping is only required in the two cases described above, but it may be used anywhere.
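
The backslash rule can be sketched in a few lines of Python. This is an illustration of the rule as described here, not dtsrhan's actual parser; the function name unescape_han_string is hypothetical, and dropping a trailing lone backslash is an assumption, since the text does not cover that case.

```python
def unescape_han_string(body: str) -> str:
    """Apply the han string rule: a backslash is dropped and the
    character that follows it is kept literally."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, None)
            if nxt is not None:
                out.append(nxt)
            # Assumption: a backslash at the very end is simply dropped.
        else:
            out.append(ch)
    return "".join(out)


# Raw strings keep the backslashes exactly as they would appear
# between the double quotes in a han file.
print(unescape_han_string(r'this string \"contains\" quotes'))
print(unescape_han_string(r'this string has a \\ backslash'))
print(unescape_han_string(r'this is \a test'))
```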

# ... | blank line
Han file comment. Any line beginning with a pound sign in the first column, or any blank line, is discarded.
line line_identifier = physical_line_number
Defines a line with a physical line number in the record. physical_line_number must be a number.
line line_identifier = column_number, "string"
Defines a line using a column number and a 'signature' string that should appear at that column. column_number can be a number, or * for 'any column'. "string" should be a string that occurs on the line in question. It is possible to define complex signatures using multiple clauses.
field field_identifier = line_identifier, "string", offset, length
Defines a field based on a declared line, a string found on that line, the offset from the first letter of the string, and the length of the field.
line_identifier is an identifier declared with the line directive (see above).
"string" is a string for relative positioning, for cases where a field follows a string that may not always occur in the same position on a line. If the field is known to always be in the same position, an empty string ("") may be used. string must be enclosed in double quotes.
offset must be a number, identifying the offset from the first character in the string. It starts at position 1, not 0, and may be negative.
length represents the length of the field. It may be a number, or it may be one of two special tokens:
eow
End of word. The field will begin at offset and continue until the next white-space character.
eol
End of line. The field will begin at offset and continue to the end of the line.
An identifier string beginning with 3 uppercase M's ("MMM...") will be considered an English month name string. At run time, if the first three characters of the field's value equal the first three characters of an English month name, the value string will be translated to a two-character string of digits in the range "01" to "12". For example, if field MMMmymonth had an original value of "April ", it will be translated to "04" before use.
In the case where a line identifier is associated with multiple lines in a single document, the field value will be determined from the last occurrence of the line within the record.
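
The field rules above (relative positioning, 1-based offsets, end-of-word and end-of-line lengths, and month-name translation) can be sketched in Python as follows. This is an illustration under stated assumptions, not dtsrhan's implementation: the function names are hypothetical, the length tokens are passed as the strings "eow" and "eol", a missing positioning string yields an empty field, a start position before the line is clamped to the first column, month matching ignores case, and a value that matches no month is returned unchanged.

```python
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def extract_field(line, anchor, offset, length):
    """Extract a field from line; length is an int, "eow", or "eol"."""
    pos = line.find(anchor)          # "" matches at index 0
    if pos < 0:
        return ""                    # assumption: no anchor, no field
    # Offset 1 is the first character of the anchor string.
    start = max(pos + offset - 1, 0)  # assumption: clamp to column 1
    rest = line[start:]
    if length == "eol":
        return rest
    if length == "eow":
        end = 0
        while end < len(rest) and not rest[end].isspace():
            end += 1
        return rest[:end]
    return rest[:length]


def translate_month(identifier, value):
    """Translate month names for identifiers beginning with MMM."""
    if identifier.startswith("MMM"):
        prefix = value[:3].lower()   # assumption: case-insensitive match
        if prefix in MONTHS:
            return "%02d" % (MONTHS.index(prefix) + 1)
    return value


print(extract_field("  ls(1)   User Commands", "", 3, "eow"))  # ls(1)
print(translate_month("MMMmymonth", "April "))                 # 04
```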
constant identifier = "value"
Defines a constant field that can be used in abstracts and keys. The identifier is defined exactly the same as a field identifier. The value must be enclosed in double quotes.
date = null | field_id
Defines the document date for each document. It will be converted into a correctly formatted fzk file date line.
null specifies undated documents. Undated documents always qualify for searches irrespective of date qualifiers in DtSearchQuery.
field_id is an identifier declared using the field or constant directives (see above). "MMM" fields are often useful for date assemblies.
Multiple fields may be concatenated into a date.
After concatenation, the assembled date must be of the following format: YYYYMMDDhhmm (exactly 12 digits). For example, 199404171701 is April 17, 1994, at 5:01 pm. 200405031000 is May 3, 2004, at 10:00 am (10 o'clock).
Dates before 1900 or after 5995 are invalid.
If date is not specified or is invalid, a generated date based on the current date and time will be used, but an invalid date will also generate an error message.
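
The date rules can be sketched in Python as follows. This is a minimal illustration, not dtsrhan's code: the function names are hypothetical, and the calendar check performed by datetime.strptime (rejecting, for example, month 13) goes beyond what the text states, so treat it as an assumption. The error message that dtsrhan issues for an invalid date is represented here only by a boolean flag.

```python
import re
from datetime import datetime


def parse_han_date(assembled):
    """Return a datetime for a valid YYYYMMDDhhmm string, else None."""
    if not re.fullmatch(r"[0-9]{12}", assembled):
        return None                      # must be exactly 12 digits
    year = int(assembled[:4])
    if year < 1900 or year > 5995:
        return None                      # outside the documented range
    try:
        # Assumption: month, day, hour, and minute must also be valid.
        return datetime.strptime(assembled, "%Y%m%d%H%M")
    except ValueError:
        return None


def document_date(field_values, now=None):
    """Concatenate field values into a date; fall back to the current
    date and time when the result is invalid.
    Returns (date, was_valid)."""
    parsed = parse_han_date("".join(field_values))
    if parsed is not None:
        return parsed, True
    return (now or datetime.now()), False


print(parse_han_date("199404171701"))  # 1994-04-17 17:01:00
print(parse_han_date("189912312359"))  # None
```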
key = field_id | time | count
Defines the unique database key for each record in a fzk file.
field_id is a field identifier declared using the field or constant directives.
Multiple fields may be concatenated into a key.
time is a special keyword used to generate keys based on the current run date and time, plus a sequential count suffix.
count is a special keyword used to generate keys based on a sequential count of records.
Specifies that keys written by handel are to be entirely converted to upper case. Without using this directive, mixed-case keys are allowed.
keychar = character
Defines the character used to categorize keys for DtSearch. It must be an uppercase ASCII alphabetic character.
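
How the key-related settings might combine can be sketched in Python. This is a hedged illustration, not dtsrhan's implementation: the function build_key and its parameters are hypothetical, and prepending the key character directly to the concatenated fields is an assumption based on the sample han file's comment that keys are "prefixed with the same identifying character".

```python
def build_key(field_values, keychar, upper=False):
    """Concatenate field values into a key, optionally upper-case it,
    and prefix it with the key character."""
    # The key character must be one uppercase ASCII letter.
    if len(keychar) != 1 or not ("A" <= keychar <= "Z"):
        raise ValueError("keychar must be an uppercase ASCII letter")
    key = "".join(field_values)
    if upper:
        key = key.upper()   # models the upper-case key option
    return keychar + key    # assumption: keychar is a plain prefix


print(build_key(["ls"], "M"))              # Mls
print(build_key(["ls"], "M", upper=True))  # MLS
```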
delimiter = line_identifier, bottom
Defines the end of text (ETX) delimiter that will separate records.
line_identifier is an identifier declared with the line directive.
bottom is required. It specifies that the ETX will occur at the bottom of each record. Top of record delimiters are not supported.
Defines whether the document image retrieved by DtSearchRetrieve is to contain all or none of the record, prior to application of imageinclude or imageexclude directives later in the han file. It defaults to all.
The imageinclude directive defines a line (or range of lines) to be included in the image. line_identifier is an identifier declared with the line directive.
The imageexclude directive defines a line (or range of lines) to be excluded from the image. line_identifier is an identifier declared with the line directive.
abstract = fields field_identifier [+ field_identifier ...]
Defines the abstract to be placed into the fzk file. It is created from the concatenation of fields. field_identifier is an identifier declared with the field or constant directives.
Determines if blank lines are to be removed from the record image or not. It defaults to false.

The sample han file shown here describes a text file containing a concatenated set of man page documents.

# All records in the incoming text file are delimited by the same
# end of text convention as the default for an fzk file, namely
# a formfeed (control-L) on a line by itself.
# Define a line named "etx" with that description,
# and declare it to be the delimiter.
# Note that there must be a real ASCII control-L character between
# the quotes in the line below.
line etx = *,"^L"
delimiter = etx, bottom
# The command name that the man page is describing is on the first line.
# To access it we need to define a line directive for line number 1.
line line1 = 1
# The name of the man page command begins in column 3 of line 1,
# and the length is variable.  So we define a field identifier
# named "command1" from column 3 to the end of the word.
field command1 = line1,"",3,eow
# We want each document abstract to have a constant prefix
# followed by the name of the command.
constant preabs = "Man Pages for "
abstract = fields preabs + command1
# We want all keys to be the name of the command, prefixed with
# the same identifying character, an uppercase M.
keychar = M
key = command1
# We want each document's date to be equivalent to the release
# date of the original man pages, which we choose here to hard code
# as November 1, 1994, at 1 o'clock in the afternoon.
constant datecons = "199411011300"
date = datecons
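
To make the effect of the sample han file concrete, the following Python sketch applies the same rules to a small in-memory text. It is an approximation for illustration only and does not reproduce dtsrhan or the fzk file format: the function names are hypothetical, the delimiter line is assumed to be dropped from each record, trailing text without a delimiter is assumed to form a final record, and the key character is assumed to be a plain prefix of the key.

```python
import re

FORMFEED = "\f"  # ASCII control-L


def split_records(text):
    """Split text into records; a line containing a formfeed ends one."""
    records, current = [], []
    # str.splitlines() would also split on "\f", so split on "\n" only.
    for line in text.split("\n"):
        if FORMFEED in line:
            records.append(current)
            current = []
        else:
            current.append(line)
    if any(l.strip() for l in current):
        records.append(current)  # assumption: keep an undelimited tail
    return records


def describe(record):
    """Apply the sample profile's field, abstract, key, and date rules."""
    line1 = record[0] if record else ""
    # field command1 = line1,"",3,eow : column 3 up to the next whitespace
    command1 = re.match(r"\S*", line1[2:]).group(0)
    return {
        "key": "M" + command1,                   # keychar = M, key = command1
        "abstract": "Man Pages for " + command1,  # preabs + command1
        "date": "199411011300",                   # constant datecons
    }


sample = ("  ls(1)   User Commands\nNAME\n  ls - list\n\f\n"
          "  cat(1)  User Commands\nNAME\n\f\n")
for rec in split_records(sample):
    print(describe(rec))
```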

dtsrhan(1), dtsrindex(1), dtsrfzkfiles(4), dtsrlangfiles(4), DtSearch(5)

