rwfglob(1)    SiLK Tool Suite    rwfglob(1)
NAME
rwfglob - Print files that rwfilter's File Selection switches will
access
SYNOPSIS
rwfglob { [--class=CLASS] [--type={all | TYPE[,TYPE ...]}]
| [--flowtypes=CLASS/TYPE[,CLASS/TYPE ...]] }
[--sensors=SENSOR[,SENSOR ...]]
[--start-date=YYYY/MM/DD[:HH] [--end-date=YYYY/MM/DD[:HH]]]
[--data-rootdir=ROOT_DIRECTORY] [--site-config-file=FILENAME]
[--print-missing-files] [--no-block-check] [--no-file-names]
[--no-summary]
rwfglob [--data-rootdir=ROOT_DIRECTORY]
[--site-config-file=FILENAME] --help
rwfglob --version
DESCRIPTION
rwfglob accepts the same File Selection Switches as rwfilter(1) and
prints, to the standard output, the
pathnames of the files that rwfilter would process, one file name per
line. At the end, a summary is printed to the standard output of the number
of files that rwfglob found. To suppress the printing of the file
names and/or the summary, specify the --no-file-names and/or
--no-summary switches, respectively.
By default, rwfglob only prints the names of files that
exist. When the --print-missing-files switch is provided,
rwfglob prints, to the standard error, the names of files that it did
not find, one file name per line, preceded by the text 'Missing '. To
redirect the output of --print-missing-files to the standard output,
use the following in a Bourne-compatible shell:
$ rwfglob --print-missing-files ... 2>&1
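When the two streams need to be examined separately rather than merged, they can be written to different files. The following is a minimal sketch of a generic helper; it is not part of SiLK, and the function names capture_streams and count_missing are illustrative assumptions.

```shell
# capture_streams OUT_FILE ERR_FILE COMMAND [ARG ...]
# Run COMMAND, writing its standard output to OUT_FILE and its
# standard error to ERR_FILE.  Returns COMMAND's exit status.
capture_streams() {
    cs_out=$1
    cs_err=$2
    shift 2
    "$@" >"$cs_out" 2>"$cs_err"
}

# count_missing ERR_FILE
# Count the lines that begin with the 'Missing ' prefix described above.
# Note that grep exits non-zero when the count is 0, but still prints it.
count_missing() {
    grep -c '^Missing ' "$1"
}
```

A possible use is to run capture_streams found.txt missing.txt rwfglob --print-missing-files --start-date=2003/10/11 and then count_missing missing.txt; the file names here are placeholders.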
As of SiLK 3.20.0, the File Selection Switches --class, --type,
--flowtypes, and --sensors accept a value in the form "@PATH", where
"@" is the "at" character (ASCII 0x40) and PATH names a file or a path
to a file. For example, the following reads the names of types from the
file t.txt and uses the sensors "S3", "S7", and the names and/or IDs
read from /tmp/sensor.txt:
rwfglob --type=@t.txt --sensors=S3,@/tmp/sensor.txt,S7
Multiple @PATH values are allowed within a single argument.
If the name of the file is "-", the names
are read from the standard input.
The file must be a text file. Blank lines are ignored as are
comments, which begin with the "#"
character and continue to the end of the line. Whitespace at the beginning
and end of a line is ignored as is whitespace that surrounds commas; all
other whitespace within a line is significant.
A file may contain a value on each line and/or multiple values on
a line separated by commas and optional whitespace. For example:
# Sensor 4
S4
# The first sensors
S0, S1,S2
S3 # Sensor 3
An attempt to use an @PATH directive in a file is an
error.
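To preview which values such a file contributes, the documented rules (comments, blank lines, comma separation, trimming) can be approximated with standard text tools. This is only a sketch that mimics the description above; it is not SiLK's parser, the function name list_values is an assumption, and it silently drops empty values that rwfglob may treat differently.

```shell
# list_values FILE
# Print one value per line from a file written in the format described
# above.
list_values() {
    # 1. Drop comments.  2. Put each comma-separated value on its own
    # line.  3. Trim surrounding whitespace and discard empty lines.
    sed -e 's/#.*$//' "$1" |
        tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}
```

Applied to the example file above, list_values prints S4, S0, S1, S2, and S3 on separate lines.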
When rwfglob is parsing the name of a file, it converts the
sequences "@," and
"@@" to
"," and
"@", respectively. For example,
--class=@cl@@ss.txt@,v reads the class from the file
cl@ss.txt,v. It is an error if any other character follows an
embedded "@" (--flowtypes=@f@il
contains @i) or if a single
"@" occurs at the end of the name
(--sensor=@errat@).
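The escaping rules can be illustrated with a small decoder for a single "@PATH" directive. This is a sketch of the documented behavior under the assumption that the directive has already been separated from its neighbors; the function name decode_at_path is hypothetical and the code is not taken from SiLK.

```shell
# decode_at_path DIRECTIVE
# Print the file name encoded in one "@PATH" directive, converting
# "@@" to "@" and "@," to ",".  Returns 1 for the two error cases
# named above.
decode_at_path() {
    dp_rest=${1#@}      # drop the leading "@" that marks the directive
    dp_out=
    while [ -n "$dp_rest" ]; do
        case $dp_rest in
            @@*) dp_out="$dp_out@"; dp_rest=${dp_rest#@@} ;;
            @,*) dp_out="$dp_out,"; dp_rest=${dp_rest#@,} ;;
            @*)  return 1 ;;    # "@" before any other character, or at the end
            *)   dp_char=${dp_rest%"${dp_rest#?}"}    # first character
                 dp_out="$dp_out$dp_char"
                 dp_rest=${dp_rest#?} ;;
        esac
    done
    printf '%s\n' "$dp_out"
}
```

With this sketch, decode_at_path '@cl@@ss.txt@,v' prints cl@ss.txt,v, while '@f@il' and '@errat@' both produce a non-zero exit status.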
For each file it finds, rwfglob will check the size of the
file and the number of blocks allocated to the file. If the block count is
zero but the file size is non-zero, rwfglob treats the file as
existing but as residing on tape. The names of these files are
printed to the standard output, but each name is preceded by the text
' \t*** ON_TAPE ***' where '\t' represents a
tab character. The summary line will include the number of files that
rwfglob believes are on tape. To suppress this check and to remove
the count from the summary line, use the --no-block-check switch.
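The size and block-count rule can be reproduced by hand when investigating why a file is reported as on tape. The sketch below assumes a stat(1) that accepts either the GNU -c or the BSD -f format option; the function names are illustrative, and the logic only restates the rule above rather than rwfglob's actual code.

```shell
# tape_status SIZE_BYTES BLOCK_COUNT
# A non-empty file with no allocated blocks is assumed to be on tape.
tape_status() {
    if [ "$2" -eq 0 ] && [ "$1" -gt 0 ]; then
        echo on_tape
    else
        echo on_disk
    fi
}

# file_tape_status FILE
# Look up the size and block count with stat(1).  The GNU form is
# tried first, then the BSD form; both are assumptions about the
# local platform.
file_tape_status() {
    fts_info=$(stat -c '%s %b' "$1" 2>/dev/null) ||
        fts_info=$(stat -f '%z %b' "$1" 2>/dev/null) ||
        return 1
    # Intentionally unquoted so the two numbers become two arguments.
    tape_status $fts_info
}
```

For example, tape_status 1024 0 prints on_tape, whereas tape_status 1024 8 and tape_status 0 0 print on_disk.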
OPTIONS
Option names may be abbreviated if the abbreviation is unique or
is an exact match for an option. A parameter to an option may be specified
as --arg=param or --arg param, though the first
form is required for options that take optional parameters.
These switches are the same as those used by rwfilter
to select the files to process. At least one of these switches must be
provided.
- --class={CLASS | @PATH}
- The --class switch is used to specify a group of files to print.
Only a single class may be selected with the --class switch; for
multiple classes, use the --flowtypes switch. The argument may be
"@PATH" which causes rwfglob to open the file
PATH and read the class name from it; see "Read Selection
Argument Values from a File" for details. Classes are defined in the
silk.conf(5) site configuration file. If neither the --class
nor --flowtypes option is given, the default-class as specified in
silk.conf is used. To see the available classes and the default
class, either examine the output from rwfglob --help or invoke
rwsiteinfo(1) with the switch
--fields=class,default-class.
- --type={all | TYPE[,TYPE,@PATH ...]}
- The --type predicate further specifies data within the selected
CLASS by listing the TYPEs of traffic to process. The switch
takes either the keyword "all" to select
all types for CLASS or a comma-separated list of type names and
"@PATH" directives, where @PATH tells
rwfglob to read type names from the file PATH; see
"Read Selection Argument Values from a File" for details. Types
are defined in silk.conf; they typically refer to the direction of
the flow, and they may vary by class. When neither the --type nor
--flowtypes switch is given, a list of default types is used. The
default-type list is determined by the value of CLASS, and the
default types often include only incoming traffic. To see the available
types and the default types for each class, examine the --help
output of rwfglob or run rwsiteinfo with
--fields=class,type,default-type.
- --flowtypes=CLASS/TYPE[,CLASS/TYPE,@PATH ...]
- The --flowtypes predicate provides an alternate way to specify
class/type pairs. The --flowtypes switch allows a single
rwfglob invocation to print filenames from multiple classes. The
keyword "all" may be used for the
CLASS and/or TYPE to select all classes and/or types. As of
SiLK 3.20.0, the arguments may also include "@PATH" which
causes rwfglob to open the file PATH and read the class/type
pairs from it; see "Read Selection Argument Values from a
File".
- --sensors=SENSOR[,SENSOR,SENSOR-GROUP,@PATH ...]
- The --sensors switch is used to select data from specific sensors.
The parameter is a comma-separated list of sensor names, sensor IDs
(integers), ranges of sensor IDs, sensor group names, and/or
"@PATH" directives. As described in "Read Selection
Argument Values from a File", @PATH tells rwfglob to
read the names of the sensors from the file PATH. Sensors and
sensor groups are defined in the silk.conf(5) site configuration
file, and the rwsiteinfo(1) command can be used to print a mapping
of sensor names to IDs and classes
(--fields=sensor,id-sensor,class:list). When the --sensors
switch is not specified, the default is to use all sensors which are valid
for the specified class(es). Support for using sensor group names was
added in SiLK 3.21.0.
- --start-date=YYYY/MM/DD[:HH]
- --end-date=YYYY/MM/DD[:HH]
- The date predicates indicate which days and hours to consider when
creating the list of files. The dates may be expressed as seconds since
the UNIX epoch or in "YYYY/MM/DD[:HH]"
format, where the hour is optional. A
"T" may be used in place of the
":" to separate the day and hour.
Whether the "YYYY/MM/DD[:HH]" strings
represent times in UTC or the local timezone depends on how SiLK was
compiled. To determine how your version of SiLK was compiled, see the
"Timezone support" setting in the output from
rwfglob --version.
When times are expressed in
"YYYY/MM/DD[:HH]" format:
- When both --start-date and --end-date are specified to hour
precision, all hours within that time range are processed.
- When --start-date is specified to day precision, the hour specified
in --end-date (if any) is ignored, and files for all dates between
midnight on start-date and 23:59 on end-date are
processed.
- When --start-date is specified to hour precision and
--end-date is specified to day precision, the hour of the
start-date is used as the hour for the end-date.
- When --end-date is not specified and --start-date is
specified to day precision, files for that complete day are
processed.
- When --end-date is not specified and --start-date is
specified to hour precision, files for that single hour are
processed.
When at least one time is expressed as seconds since the UNIX
epoch:
- When --end-date is specified in epoch seconds, the given
--start-date and --end-date are considered to be in hour
precision.
- When --start-date is specified in epoch seconds and
--end-date is specified in "YYYY/MM/DD[:HH]" format, the start-date
is considered to be in day precision if it is divisible by 86400, and
hour precision otherwise.
- When --start-date is specified in epoch seconds and
--end-date is not given, the start-date is considered to be in
hour precision.
When neither --start-date nor --end-date is given,
rwfglob prints all files for the current day.
It is an error to specify --end-date without specifying
--start-date.
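The precision rules for the --start-date and --end-date switches can be summarized as a small function that reports the first and last hour a pair of values would cover. This is a sketch of the rules as stated above, not code from SiLK; the function names are assumptions, and the "T" separator is not handled.

```shell
# date_window START [END]
# Print the first and last hour (as YYYY/MM/DD:HH) covered by values
# given in YYYY/MM/DD[:HH] format.
date_window() {
    dw_start=$1
    dw_end=${2:-}
    dw_sday=${dw_start%%:*}
    dw_eday=${dw_end%%:*}
    case $dw_start in
        *:*)    # start has hour precision
            dw_shour=${dw_start#*:}
            case $dw_end in
                '')  dw_eday=$dw_sday; dw_ehour=$dw_shour ;;  # single hour
                *:*) dw_ehour=${dw_end#*:} ;;                 # both to the hour
                *)   dw_ehour=$dw_shour ;;                    # end day uses start's hour
            esac ;;
        *)      # start has day precision: any end hour is ignored
            dw_shour=00
            dw_ehour=23
            [ -n "$dw_end" ] || dw_eday=$dw_sday ;;
    esac
    echo "$dw_sday:$dw_shour $dw_eday:$dw_ehour"
}

# epoch_start_precision SECONDS
# With an epoch start and a YYYY/MM/DD[:HH] end, the start has day
# precision only when it is a multiple of 86400 (seconds per day).
epoch_start_precision() {
    if [ $(( $1 % 86400 )) -eq 0 ]; then echo day; else echo hour; fi
}
```

For instance, date_window 2003/10/11:05 2003/10/12 prints "2003/10/11:05 2003/10/12:05", and epoch_start_precision 1065830400 (midnight UTC on 2003/10/11) prints day.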
- --data-rootdir=ROOT_DIRECTORY
- Tell rwfglob to use ROOT_DIRECTORY as the root of the data
repository, which overrides the location given in the SILK_DATA_ROOTDIR
environment variable, which in turn overrides the location that was
compiled into rwfglob (/data).
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When
this switch is not provided, rwfglob searches for the site
configuration file in the locations specified in the "FILES"
section.
- --print-missing-files
- This option prints to the standard error the names of the files that
rwfglob expected to find but did not. The file names are preceded
by the text 'Missing '; each file name appears on a separate line.
This switch is useful for debugging, but the list of files it produces can
be misleading. For example, suppose there is a decommissioned sensor that
still appears in the silk.conf file; rwfglob considers these
data files as missing even though their absence is expected. Use
the output from this switch judiciously.
- --no-block-check
- This option disables the check for files that reside on tape. By
default, rwfglob checks the number of blocks allocated to each file it
finds and precedes the name of a file that has a non-zero size but a
block count of 0 with the text
' \t*** ON_TAPE ***'.
- --no-file-names
- This option instructs rwfglob not to print the names of the files
that it successfully finds. By default, rwfglob prints the names of
the files it finds and a summary line showing the number of files it
found. When both this switch and --print-missing-files are
specified, rwfglob prints only the names of missing files (and the
summary).
- --no-summary
- This option instructs rwfglob not to print the summary line (that
is, the line that shows the number of files found). By default,
rwfglob prints the names of the files it finds and a summary line
showing the number of files it found.
- --help
- Print the available options and exit. The available classes and types will
be included in output; you may specify a different root directory or site
configuration file before --help to see the classes and types
available for that site.
- --version
- Print the version number and information about how SiLK was configured,
then exit the application.
EXAMPLES
In the following examples, the dollar sign ("$") represents the shell
prompt. The text after the dollar sign represents the command line.
Looking at a day on a single sensor:
$ rwfglob --start=2003/10/11 --sensor=2
/data/in/2003/10/11/in-GAMMA_20031011.23
/data/in/2003/10/11/in-GAMMA_20031011.22
/data/in/2003/10/11/in-GAMMA_20031011.21
/data/in/2003/10/11/in-GAMMA_20031011.20
/data/in/2003/10/11/in-GAMMA_20031011.19
/data/in/2003/10/11/in-GAMMA_20031011.18
/data/in/2003/10/11/in-GAMMA_20031011.17
/data/in/2003/10/11/in-GAMMA_20031011.16
/data/in/2003/10/11/in-GAMMA_20031011.15
/data/in/2003/10/11/in-GAMMA_20031011.14
/data/in/2003/10/11/in-GAMMA_20031011.13
/data/in/2003/10/11/in-GAMMA_20031011.12
/data/in/2003/10/11/in-GAMMA_20031011.11
/data/in/2003/10/11/in-GAMMA_20031011.10
/data/in/2003/10/11/in-GAMMA_20031011.09
/data/in/2003/10/11/in-GAMMA_20031011.08
/data/in/2003/10/11/in-GAMMA_20031011.07
/data/in/2003/10/11/in-GAMMA_20031011.06
/data/in/2003/10/11/in-GAMMA_20031011.05
/data/in/2003/10/11/in-GAMMA_20031011.04
/data/in/2003/10/11/in-GAMMA_20031011.03
/data/in/2003/10/11/in-GAMMA_20031011.02
/data/in/2003/10/11/in-GAMMA_20031011.01
/data/in/2003/10/11/in-GAMMA_20031011.00
globbed 24 files; 0 on tape
If you only want the summary, specify --no-file-names:
$ rwfglob --start-date=2003/10/11 --sensor=2 --no-file-names
globbed 24 files; 0 on tape
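In scripts it can be convenient to extract the two counts from the summary line. The sketch below assumes the summary has exactly the form shown in the examples above; the function name summary_counts is hypothetical, and the pattern will not match the summary printed when --no-block-check removes the tape count.

```shell
# summary_counts
# Read rwfglob output on standard input and print the file count and
# the on-tape count from the summary line, separated by a space.
# Lines that are not the summary (such as file names) are ignored.
summary_counts() {
    sed -n 's/^globbed \([0-9][0-9]*\) files; \([0-9][0-9]*\) on tape$/\1 \2/p'
}
```

One possible use is rwfglob --start-date=2003/10/11 --sensor=2 --no-file-names | summary_counts, which for the output shown above would print "24 0".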
ENVIRONMENT
- SILK_CONFIG_FILE
- This environment variable is used as the value for the
--site-config-file switch when that switch is not provided.
- SILK_DATA_ROOTDIR
- This environment variable specifies the root directory of the data repository.
This value overrides the compiled-in value, and rwfglob uses it
unless the --data-rootdir switch is specified. In addition,
rwfglob may use this value when searching for the SiLK site
configuration file. See the "FILES" section for details.
- SILK_PATH
- This environment variable gives the root of the install tree. When
searching for configuration files, rwfglob may use this environment
variable. See the "FILES" section for details.
- TZ
- When a SiLK installation is built to use the local timezone (to determine
if this is the case, check the "Timezone support" value in the output from
rwfglob --version), the value of the TZ environment variable determines
the timezone in which rwfglob parses timestamps. (The dates in the file
names that rwfglob prints are always in UTC.) If the TZ environment
variable is not set, the default timezone is used. Setting TZ to 0 or the
empty string causes timestamps to be parsed as UTC. The value of the TZ
environment variable is ignored when the SiLK installation uses UTC. For
system information on the TZ variable, see tzset(3) or environ(7).
FILES
- ${SILK_CONFIG_FILE}
- ROOT_DIRECTORY/silk.conf
- ${SILK_PATH}/share/silk/silk.conf
- ${SILK_PATH}/share/silk.conf
- /usr/local/share/silk/silk.conf
- /usr/local/share/silk.conf
- Possible locations for the SiLK site configuration file which are checked
when the --site-config-file switch is not provided, where
ROOT_DIRECTORY/ is the directory rwfglob is using as the
root of the data repository.
- ${SILK_DATA_ROOTDIR}/
- /data/
- Locations for the root directory of the data repository when the
--data-rootdir switch is not specified.
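The lookup described in this section can be sketched as two small functions. These are illustrative only: the function names are assumptions, an empty environment variable is treated as unset, /data is the compiled-in default stated on this page (other builds may differ), and whether rwfglob checks the configuration locations in exactly the listed order is also an assumption.

```shell
# data_rootdir [CLI_VALUE]
# Pick the repository root: the --data-rootdir value if given, else
# SILK_DATA_ROOTDIR, else the compiled-in default.
data_rootdir() {
    echo "${1:-${SILK_DATA_ROOTDIR:-/data}}"
}

# site_config_candidates ROOT_DIRECTORY
# Print the candidate silk.conf locations, one per line, in the order
# they are listed above.  Entries that depend on an unset environment
# variable are skipped.
site_config_candidates() {
    [ -z "${SILK_CONFIG_FILE:-}" ] || echo "$SILK_CONFIG_FILE"
    echo "${1%/}/silk.conf"
    if [ -n "${SILK_PATH:-}" ]; then
        echo "$SILK_PATH/share/silk/silk.conf"
        echo "$SILK_PATH/share/silk.conf"
    fi
    echo /usr/local/share/silk/silk.conf
    echo /usr/local/share/silk.conf
}
```

A quick way to see which candidate exists is to loop over the output of site_config_candidates "$(data_rootdir)" and test each path with [ -f ].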
SEE ALSO
rwfilter(1), rwsiteinfo(1), silk.conf(5),
silk(7), tzset(3), environ(7)
NOTES
The ability to use @PATH in --class, --type,
--flowtypes, and --sensors was added in SiLK 3.20.0.
As of SiLK 3.20.0, --types is an alias for
--type.
The --sensors switch also accepts the names of groups
defined in the silk.conf(5) file as of SiLK 3.21.0.
The output of --print-missing-files goes to the standard
error, while all other output goes to the standard output. To redirect the
output of --print-missing-files to the standard output, use the
following in a Bourne-compatible shell:
$ rwfglob --print-missing-files ... 2>&1
BUGS
The --print-missing-files option needs to be smarter about
what files are really missing.
The block count check is of unknown portability across different
tape-farm systems.