ftimes-dig2ctx.pl - Extract context around matched dig strings
ftimes-dig2ctx.pl [-hLRv] [-d dir] [-e
{file|hex|url}] [-c length] [-p length] [-l regex]
[-r regex] [-i count] [-M pattern] [-m
pattern-file] [-T drop-file] [-t keep-file] -f
{file|-}
This utility extracts a variable amount of context around matched dig strings
using data collected with ftimes(1) or hipdig(1). Data collected
by either of these tools has the following format:
name|type|offset|string
or for FTimes releases < 3.5.0
name|offset|string
Output from this utility is written to stdout and has the
following format:
dig_name|dig_offset|dig_string|ctx_offset|lh_length|mh_length|rh_length|ctx_string
- -c length
- Specifies the desired context length in bytes. You may get less than this
amount depending on where the match occurrs and the size of the input
file.
- -d dir
- Specifies the name of the output directory. The default name is digtree.
This option is ignored unless the encoding scheme, -e, is set to
file. Note: The program will abort if the specified or default directory
exists.
- -e {file|hex|url}
- Specifies the type of encoding to use when printing the context (i.e.,
ctx_string).
If file is specified, then a new file containing the requested
context in raw form will be created under the directory specified by the
-d option. The name and location of this file will be listed in
the ctx_string field. The name format used for these files is as
follows:
<relative_dig_name>.<ctx_offset>_<relative_dig_offset>_<mh_length>
where <relative_dig_name> is the same as
<dig_name> except that leading path information has been removed,
and <relative_dig_offset> is the offset of the dig string in the
newly created file.
- -f {file|-}
- Specifies the name of the input file. A value of '-' will cause the
program to read from stdin.
- -h
- Print a header line.
- -i count
- Specifies the number of input lines to ignore.
- -L
- Preserve the contents of the left-hand boundary. This option is disabled
by default.
- -l regex
- Specifies the left-hand boundary. This is a Perl regular expression that
can be used to limit the amount of context returned.
- -M pattern
- Specifies a pattern that is to be applied to the raw context. The output
records for any context not matched by the pattern will be discarded. Use
the -v option to invert the sense of the match.
Note: The -T and -t options may be used to tee
the input to corresponding drop and keep files -- similar to
tee(1). Matched input records are copied to the keep file, and
unmatched records are copied to the drop file. This is useful for
building a context filter chain where the drop/keep results can be
supplied as input to subsequent stages.
- -m pattern-file
- Specifies a file containing zero or more patterns, one per line, that are
to be applied to the raw context. The output records for any context not
matched by the patterns will be discarded. Use the -v option to
invert the sense of the match.
Note: The -T and -t options may be used to tee
the input to corresponding drop and keep files -- similar to
tee(1). Matched input records are copied to the keep file, and
unmatched records are copied to the drop file. This is useful for
building a context filter chain where the drop/keep results can be
supplied as input to subsequent stages.
- -p length
- Specifies the desired prefix length in bytes. You may get less than this
amount depending on where the match occurrs in the input file.
- -R
- Preserve the contents of the right-hand boundary. This option is disabled
by default.
- -r regex
- Specifies the right-hand boundary. This is a Perl regular expression that
can be used to limit the amount of context returned.
- -T
- Specifies the name of a drop tee file that can be used to capture negative
pattern matches.
- -t
- Specifies the name of a keep tee file that can be used to capture positive
pattern matches.
- -v
- Invert the sense of pattern matching -- similar to the way that
egrep(1) works.
All documentation and code are distributed under same terms and conditions as
FTimes.