Manual Reference Pages  -  RWCOUNT (1)

NAME

rwcount - Print traffic summary across time

SYNOPSIS



  rwcount [--bin-size=SIZE] [--load-scheme=LOADSCHEME]
        [--start-time=START_TIME] [--end-time=END_TIME]
        [--skip-zeroes] [--bin-slots] [--epoch-slots]
        [--timestamp-format=FORMAT] [--no-titles]
        [--no-columns] [--column-separator=CHAR]
        [--no-final-delimiter] [{--delimited | --delimited=CHAR}]
        [--print-filenames] [--copy-input=PATH] [--output-path=PATH]
        [--pager=PAGER_PROG] [--site-config-file=FILENAME]
        [{--legacy-timestamps | --legacy-timestamps={1,0}}]
        {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

  rwcount --help

  rwcount --version



DESCRIPTION

rwcount summarizes SiLK flow records across time. It counts the records in the input stream, and groups their byte and packet totals into time bins. rwcount produces textual output with one row for each bin.

rwcount reads SiLK Flow records from the files named on the command line or from the standard input when no file names are specified and --xargs is not present. To read the standard input in addition to the named files, use - or stdin as a file name. If an input file name ends in .gz, the file will be uncompressed as it is read. When the --xargs switch is provided, rwcount will read the names of the files to process from the named text file, or from the standard input if no file name argument is provided to the switch. The input to --xargs must contain one file name per line.

rwcount splits each flow record into bins whose size is determined by the argument to the --bin-size switch. When that switch is not provided, rwcount uses 30-second bins by default.

By default, the first row of data rwcount prints is the bin containing the starting time of the earliest record that appears in the input. rwcount then prints a row for every bin until it reaches the bin containing the most recent ending time. Rows whose counts are zero are printed unless the --skip-zeroes switch is specified.
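The default row range can be pictured with a small sketch. It assumes times are whole seconds and that bins are aligned to multiples of the bin size; the function name and the (start, end) tuple used for a record are illustrative and are not part of SiLK.

```python
def default_bin_starts(records, bin_size=30):
    """Return the start time of every row printed by default.

    records  -- iterable of (start, end) pairs, in whole seconds
    bin_size -- bin width in whole seconds (assumed aligned to multiples)
    """
    records = list(records)
    earliest_start = min(start for start, _ in records)
    latest_end = max(end for _, end in records)
    # Round both times down to the start of the bin that contains them.
    first_bin = earliest_start // bin_size * bin_size
    last_bin = latest_end // bin_size * bin_size
    # Every bin in between gets a row, even when its counts are zero.
    return list(range(first_bin, last_bin + bin_size, bin_size))


# Two records: rows run from the bin holding 95 to the bin holding 200.
print(default_bin_starts([(95, 130), (100, 200)], bin_size=60))  # [60, 120, 180]
```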

The --start-time and --end-time switches tell rwcount to use a specific time for the first row and the final row. The --start-time switch always sets the time stamp on the first bin to the specified time. With the --end-time switch, rwcount computes a maximum end-time by setting any unspecified hour, minute, second, and millisecond field to its maximum value, and the final bin is that which contains the maximum end-time.
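The way unspecified fields are filled in can be sketched as follows. This is an illustration of the rule described here and under --start-time and --end-time in OPTIONS, not rwcount's own parser: the function name is hypothetical, the result is always treated as UTC, and accepting fractional epoch seconds is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

# yyyy/mm/dd[:HH[:MM[:SS[.sss]]]], with T allowed between day and hour.
_DATE_RE = re.compile(
    r"(\d{4})/(\d{1,2})/(\d{1,2})"
    r"(?:[:T](\d{1,2})(?::(\d{1,2})(?::(\d{1,2})(?:\.(\d{1,3}))?)?)?)?"
)
_EPOCH_RE = re.compile(r"(\d+)(?:\.(\d{1,3}))?")


def parse_time(text, is_end=False):
    """Parse a START_TIME or END_TIME string into a UTC datetime.

    Unspecified fields become 0 for a start time and their maximum
    (23, 59, 59, 999) for an end time.
    """
    fill = (23, 59, 59, 999) if is_end else (0, 0, 0, 0)

    match = _EPOCH_RE.fullmatch(text)
    if match:
        seconds, frac = match.groups()
        millis = int(frac.ljust(3, "0")) if frac else fill[3]
        base = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
        return base + timedelta(milliseconds=millis)

    match = _DATE_RE.fullmatch(text)
    if not match:
        raise ValueError(f"unrecognized time: {text!r}")
    year, month, day = (int(g) for g in match.groups()[:3])
    hour, minute, second, frac = match.groups()[3:]
    hour = int(hour) if hour is not None else fill[0]
    minute = int(minute) if minute is not None else fill[1]
    second = int(second) if second is not None else fill[2]
    millis = int(frac.ljust(3, "0")) if frac else fill[3]
    return datetime(year, month, day, hour, minute, second,
                    millis * 1000, tzinfo=timezone.utc)


print(parse_time("2009/02/12"))                  # 2009-02-12 00:00:00+00:00
print(parse_time("2009/02/12:14", is_end=True))  # 2009-02-12 14:59:59.999000+00:00
```

The sketch does not model the adjustment rwcount makes to END_TIME when both switches are given.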

When --start-time and --end-time are both specified, rwcount reserves the memory for the bins before it begins processing the records. If the memory cannot be allocated, rwcount exits. If this happens, try reducing the time span or increasing the bin-size.
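To judge whether a given span and bin size are likely to be a problem, it can help to estimate the number of bins involved. The sketch below assumes the count is simply the span divided by the bin size, rounded up; how rwcount actually sizes its allocation is not described here, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def estimated_bin_count(start, end, bin_size=30.0):
    """Estimate how many bins of bin_size seconds cover start..end."""
    # Work in whole milliseconds to avoid floating-point rounding.
    bin_ms = round(bin_size * 1000)
    if bin_ms < 1:
        raise ValueError("bin size must be at least 0.001 seconds")
    span_ms = (end - start) // timedelta(milliseconds=1)
    if span_ms < 0:
        raise ValueError("end time precedes start time")
    # Ceiling division; a zero-length span still needs one bin.
    return max(1, -(-span_ms // bin_ms))


day_start = datetime(2009, 2, 12)
day_end = datetime(2009, 2, 13)
print(estimated_bin_count(day_start, day_end))         # 2880
print(estimated_bin_count(day_start, day_end, 0.001))  # 86400000
```

One day of 30-second bins needs a few thousand bins, while the same day at millisecond resolution needs tens of millions, which is why a larger bin size or a shorter span is the suggested remedy.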

    Load Scheme

A router or other flow generator summarizes the traffic it sees into records. In addition to the five-tuple (source port and address, destination port and address, and protocol), the record has its start time, end time, total byte count, and total packet count. There is no way to know how the bytes and packets were distributed during the duration of the record: their distribution could be front-loaded, back-loaded, uniform, et cetera.

When the start and end times of an individual flow record put that record into a single bin, rwcount can simply add that record’s volume (byte and packet counts) to the bin.

When the duration of a flow record causes it to span multiple bins, rwcount must be told how to allocate the volume among the bins. The --load-scheme switch determines this, and it supports the following allocation schemes:
time-proportional Divides the total volume of the flow by the duration of the flow, and multiplies the quotient by the time spent in the bin. Thus, the volume the flow contributes to a bin is proportional to the time the flow spent in the bin. This models a flow where the volume/second ratio is uniform.
bin-uniform Divides the volume of the flow by the number of bins the flow spans, and adds the quotient to each of the bins. In this scheme, the volume/bin ratio is uniform.
start-spike Adds the total volume for the flow into the bin containing the start time of the flow. This models a flow that is front-loaded to the point where the entire volume is a single spike occurring in the initial millisecond of the flow.
middle-spike Determines the time at the midpoint of the flow, and adds the entire volume for the flow into the bin containing that time.
end-spike Adds the total volume for the flow into the bin containing the end time of the flow. This models a flow that is back-loaded to the point where the entire volume is a single spike occurring in the final millisecond of the flow.
maximum-volume Adds the entire volume for the flow into every bin that contains any part of the flow. In theory, the distribution of the bytes in the record could be a spike that occurs at any point during the flow’s duration. This scheme allows one to determine, in aggregate, the maximum possible volume that could have occurred during this bin. In this scheme, the Records column gives the number of records that were active during the bin.
minimum-volume Acts as though the volume for the flow occurred in some other bin. It is possible that a record that spans multiple bins did not contribute any volume to the current bin. This scheme allows one to determine, in aggregate, the minimum possible volume that may have occurred during this bin. The Records column in this scheme, as in the maximum-volume scheme, gives the number of flow records that were active during the bin.
Be aware that the spike load-schemes allocate the entire flow to a single bin. This can create the impression that there is more traffic occurring during a particular time window than the physical network supports.

The maximum-volume and minimum-volume schemes are used to compute the maximum and minimum volumes that could have been transferred during any one bin. maximum-volume intentionally over-counts the flow volume and minimum-volume intentionally under-counts.

To see the effect of the various load-schemes, suppose rwcount is using 60-second bins and the input contains two records. The first record begins at 12:03:50, ends at 12:06:20, and contains 9,000 bytes (60 bytes/second for 150 seconds). This record may contribute to bins at 12:03, 12:04, 12:05, and 12:06. The second record begins at 12:04:05 and lasts 15 seconds; this record’s volume always contributes its 200 bytes to the 12:04 bin. The --load-scheme option splits the byte-counts of the records as follows:



 BIN                 12:03:00    12:04:00    12:05:00    12:06:00

 time-proportional        600        3800        3600        1200
 bin-uniform             2250        2450        2250        2250
 start-spike             9000         200           0           0
 middle-spike               0         200        9000           0
 end-spike                  0         200           0        9000
 maximum-volume          9000        9200        9000        9000
 minimum-volume             0         200           0           0



For the record that spans multiple bins: the time-proportional scheme assumes 60 bytes/second, the bin-uniform scheme divides the volume evenly by the four bins, the middle-spike scheme assumes all the volume occurs at 12:05:05, the maximum-volume scheme adds the volume to every bin, and the minimum-volume scheme ignores the record.
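The allocation rules can be sketched in a few lines. This is an illustrative model of the schemes as described in this section, not rwcount's implementation: the function names are hypothetical, times are plain seconds, and a flow ending exactly on a bin boundary is treated as touching that bin. The usage assumes the first record carries 60 bytes/second for the 150 seconds between 12:03:50 and 12:06:20.

```python
def allocate(start, end, volume, bin_size, scheme):
    """Split one flow's volume among bins under the named load scheme.

    start, end and bin_size are in seconds. Returns {bin_start: share}.
    """
    first = int(start // bin_size)
    last = int(end // bin_size)
    starts = [b * bin_size for b in range(first, last + 1)]
    if len(starts) == 1:
        # The whole record sits in one bin; every scheme agrees.
        return {starts[0]: volume}

    if scheme == "time-proportional":
        duration = end - start
        return {
            s: volume * (min(end, s + bin_size) - max(start, s)) / duration
            for s in starts
        }
    if scheme == "bin-uniform":
        return {s: volume / len(starts) for s in starts}
    if scheme == "maximum-volume":
        return {s: volume for s in starts}
    if scheme == "minimum-volume":
        return {s: 0 for s in starts}

    # The three spike schemes put everything into a single bin.
    spike_time = {
        "start-spike": start,
        "middle-spike": (start + end) / 2,
        "end-spike": end,
    }[scheme]
    spike_bin = int(spike_time // bin_size) * bin_size
    return {s: (volume if s == spike_bin else 0) for s in starts}


def totals(records, bin_size, scheme):
    """Sum the per-bin shares of several (start, end, volume) records."""
    summed = {}
    for start, end, volume in records:
        shares = allocate(start, end, volume, bin_size, scheme)
        for s, share in shares.items():
            summed[s] = summed.get(s, 0) + share
    return [summed[s] for s in sorted(summed)]


def hms(hours, minutes, seconds):
    """Seconds since midnight."""
    return hours * 3600 + minutes * 60 + seconds


records = [
    (hms(12, 3, 50), hms(12, 6, 20), 60 * 150),  # 60 bytes/second for 150 s
    (hms(12, 4, 5), hms(12, 4, 20), 200),        # fits entirely in 12:04
]
print(totals(records, 60, "time-proportional"))  # [600.0, 3800.0, 3600.0, 1200.0]
```

The printed list corresponds to the 12:03, 12:04, 12:05, and 12:06 bins of the time-proportional row.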

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
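The abbreviation rule can be sketched as a unique-prefix lookup. The switch list below is copied from the SYNOPSIS; the function name and error messages are hypothetical, and the real option parser may differ in its details.

```python
SWITCHES = [
    "bin-size", "load-scheme", "start-time", "end-time", "skip-zeroes",
    "bin-slots", "epoch-slots", "timestamp-format", "no-titles",
    "no-columns", "column-separator", "no-final-delimiter", "delimited",
    "print-filenames", "copy-input", "output-path", "pager",
    "site-config-file", "legacy-timestamps", "xargs", "help", "version",
]


def resolve_switch(name, switches=SWITCHES):
    """Expand an abbreviated switch name to its full form."""
    name = name.lstrip("-")
    # An exact match always wins, even if it is a prefix of another name.
    if name in switches:
        return name
    matches = [s for s in switches if s.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown switch: --{name}")
    raise ValueError(f"--{name} is ambiguous: {', '.join(matches)}")


print(resolve_switch("--skip"))        # skip-zeroes
print(resolve_switch("--column-sep"))  # column-separator
try:
    resolve_switch("--bin-s")
except ValueError as err:
    print(err)  # --bin-s is ambiguous: bin-size, bin-slots
```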
--bin-size=SIZE Set the size of each time bin, in seconds; defaults to 30 seconds. rwcount supports millisecond-sized bins; SIZE may be a floating point value equal to or greater than 0.001.
--load-scheme=LOADSCHEME Specify how a flow record that spans multiple bins allocates its bytes and packets among the bins. The default scheme is time-proportional, which assumes the volume/second ratio of the flow record is constant. See the Load Scheme section for additional information on the load-scheme choices. The LOADSCHEME may be one of the following names or numbers; names may be abbreviated to the shortest prefix that is unique.
time-proportional,4 Allocate the volume in proportion to the amount of time the flow spent in the bin.
bin-uniform,0 Allocate the volume evenly across the bins that contain any part of the flow’s duration.
start-spike,1 Allocate the entire volume to the bin containing the start time of the flow.
middle-spike,3 Allocate the entire volume to the bin containing the time at the midpoint of the flow.
end-spike,2 Allocate the entire volume to the bin containing the end time of the flow.
maximum-volume,5 Allocate the entire volume to all of the bins containing any part of the flow.
minimum-volume,6 Allocate the flow’s volume to a bin only if the flow is completely contained within the bin; otherwise ignore the flow.
--start-time=START_TIME Set the time of the first bin to START_TIME. When this switch is not given, the first bin is one that holds the starting time of the earliest record. The START_TIME may be specified in the format yyyy/mm/dd[:HH[:MM[:SS[.sss]]]] (or T may be used in place of : to separate the day and hour). The time must be specified to at least day precision, and unspecified hour, minute, second, and millisecond values are set to zero. Whether the date strings represent times in UTC or the local timezone depends on how SiLK was compiled, which can be determined from the Timezone support setting in the output from rwcount --version. Alternatively, the time may be specified as seconds since the UNIX epoch, and an unspecified milliseconds value is set to 0.
--end-time=END_TIME Set the time of the final bin to END_TIME. When this switch is not given, the final bin is one that holds the ending time of the latest record. The format of END_TIME is the same as that for START_TIME. Unspecified hour, minute, second, and millisecond values are set to 23, 59, 59, and 999 respectively. When END_TIME is specified as seconds since the UNIX epoch, an unspecified milliseconds value is set to 999. When both --start-time and --end-time are used, the END_TIME is adjusted so that the final bin represents a complete interval.
--skip-zeroes Disable printing of bins with no traffic. By default, all bins are printed.
--bin-slots Use the internal bin index as the label for each bin in the output; the default is to label each bin with the time in a human-readable format.
--epoch-slots Use the UNIX epoch time (number of seconds since midnight UTC on 1970-01-01) as the label for each bin in the output; the default is to label each bin with the time in a human-readable format. This switch is equivalent to --timestamp-format=epoch. This switch is deprecated as of SiLK 3.11.0, and it will be removed in the SiLK 4.0 release.
--timestamp-format=FORMAT Specify the format and/or timezone to use when printing timestamps. When this switch is not specified, the SILK_TIMESTAMP_FORMAT environment variable is checked for a default format and/or timezone. If it is empty or contains invalid values, timestamps are printed in the default format, and the timezone is UTC unless SiLK was compiled with local timezone support. FORMAT is a comma-separated list of a format and/or a timezone. The format is one of:
default Print the timestamps as YYYY/MM/DDThh:mm:ss.
iso Print the timestamps as YYYY-MM-DD hh:mm:ss.
m/d/y Print the timestamps as MM/DD/YYYY hh:mm:ss.
epoch Print the timestamps as the number of seconds since 00:00:00 UTC on 1970-01-01.

When a timezone is specified, it is used regardless of the default timezone support compiled into SiLK. The timezone is one of:
utc Use Coordinated Universal Time to print timestamps.
local Use the TZ environment variable or the local timezone.

--no-titles Turn off column titles. By default, titles are printed.
--no-columns Disable fixed-width columnar output.
--column-separator=C Use the specified character between columns and after the final column. When this switch is not specified, the default of ’|’ is used.
--no-final-delimiter Do not print the column separator after the final column. Normally a delimiter is printed.
--delimited
--delimited=C Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default ’|’.
--print-filenames Print to the standard error the names of input files as they are opened.
--copy-input=PATH Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the --output-path switch has been used to redirect rwcount’s ASCII output.
--output-path=PATH Determine where the output of rwcount (ASCII text) is written. If this option is not given, output is written to the standard output.
--pager=PAGER_PROG When output is to a terminal, invoke the program PAGER_PROG to view the output one screenful at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the value of the pager is determined to be the empty string, no paging will be performed and all output will be printed to the terminal.
--site-config-file=FILENAME Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwcount searches for the site configuration file in the locations specified in the FILES section.
--legacy-timestamps
--legacy-timestamps=NUM When NUM is not specified or is 1, this switch is equivalent to --timestamp-format=m/d/y. Otherwise, the switch has no effect. This switch is deprecated as of SiLK 3.0.0, and it will be removed in the SiLK 4.0 release.
--xargs
--xargs=FILENAME Cause rwcount to read file names from FILENAME or from the standard input if FILENAME is not provided. The input should have one file name per line. rwcount will open each file in turn and read records from it, as if the files had been listed on the command line.
--help Print the available options and exit.
--version Print the version number and information about how SiLK was configured, then exit the application.
--start-epoch=START_TIME Alias for the --start-time switch. This switch is deprecated as of SiLK 3.8.0.
--end-epoch=END_TIME Alias for the --end-time switch. This switch is deprecated as of SiLK 3.8.0.
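The four format names accepted by --timestamp-format can be illustrated with the corresponding strftime patterns. This sketch covers only the format half of FORMAT; timezone selection is left out, the sample timestamp is taken as UTC, and the function and dictionary names are hypothetical.

```python
from datetime import datetime, timezone

# strftime patterns matching the layouts listed under --timestamp-format.
STRFTIME_PATTERNS = {
    "default": "%Y/%m/%dT%H:%M:%S",
    "iso": "%Y-%m-%d %H:%M:%S",
    "m/d/y": "%m/%d/%Y %H:%M:%S",
}


def format_bin_label(when, fmt="default"):
    """Render a bin's timestamp in one of the named formats."""
    if fmt == "epoch":
        # Whole seconds since 00:00:00 UTC on 1970-01-01.
        return str(int(when.timestamp()))
    return when.strftime(STRFTIME_PATTERNS[fmt])


first_bin = datetime(2009, 2, 12, tzinfo=timezone.utc)
for name in ("default", "iso", "m/d/y", "epoch"):
    print(f"{name:8} {format_bin_label(first_bin, name)}")
```

For midnight UTC on 2009/02/12 the default label is 2009/02/12T00:00:00, the same layout used in the Date column of the EXAMPLES section.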

EXAMPLES

In the following examples, the dollar sign ($) represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash (\) is used to indicate a wrapped line.

To count all web traffic on Feb 12, 2009, into 1 hour bins:



 $ rwfilter --pass=stdout --start-date=2009/02/12:00        \
        --end-date=2009/02/12:23 --proto=6 --aport=80       \
   | rwcount --bin-size=3600
                Date|      Records|          Bytes|      Packets|
 2009/02/12T00:00:00|      1490.49|   578270918.16|    463951.55|
 2009/02/12T01:00:00|      1459.33|   596455716.52|    457487.80|
 2009/02/12T02:00:00|      1529.06|   562602842.44|    451456.41|
 2009/02/12T03:00:00|      1503.89|   562683116.38|    455554.81|
 2009/02/12T04:00:00|      1561.89|   590554569.78|    489273.81|
 ....
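
Output in this layout is straightforward to post-process. The sketch below is one possible way to read it, assuming titles are printed and the columns are Date, Records, Bytes, and Packets as shown above; the function name is hypothetical.

```python
def parse_rwcount_output(text, separator="|"):
    """Turn rwcount's textual rows into a list of dictionaries."""
    # Ignore lines without a separator, such as blank or elided lines.
    lines = [ln for ln in text.splitlines() if separator in ln]
    header = [col.strip() for col in lines[0].split(separator) if col.strip()]
    rows = []
    for line in lines[1:]:
        fields = [field.strip() for field in line.split(separator)]
        # zip() drops the empty field left by the final delimiter.
        row = dict(zip(header, fields))
        for key in header[1:]:
            row[key] = float(row[key])
        rows.append(row)
    return rows


sample = """\
                Date|      Records|          Bytes|      Packets|
 2009/02/12T00:00:00|      1490.49|   578270918.16|    463951.55|
 2009/02/12T01:00:00|      1459.33|   596455716.52|    457487.80|
"""
rows = parse_rwcount_output(sample)
print(rows[0]["Date"], rows[0]["Bytes"])  # 2009/02/12T00:00:00 578270918.16
```

Because surrounding whitespace is stripped from every field, the same function also handles output produced with --no-columns or --no-final-delimiter.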



To bin the records according to their start times, use the --load-scheme switch:



 $ rwfilter ... --pass=stdout       \
   | rwcount --bin-size=3600 --load-scheme=1
                Date|      Records|          Bytes|      Packets|
 2009/02/12T00:00:00|      1494.00|   580350969.00|    464952.00|
 2009/02/12T01:00:00|      1462.00|   596145212.00|    457871.00|
 2009/02/12T02:00:00|      1526.00|   561629416.00|    451088.00|
 2009/02/12T03:00:00|      1502.00|   563500618.00|    455262.00|
 2009/02/12T04:00:00|      1562.00|   589265818.00|    489279.00|
 ...



To bin the records by their end times:



 $ rwfilter ... --pass=stdout       \
   | rwcount --bin-size=3600 --load-scheme=2
                 Date|      Records|          Bytes|      Packets|
 2009/02/12T00:00:00|      1488.00|   577132372.00|    463393.00|
 2009/02/12T01:00:00|      1458.00|   596956697.00|    457376.00|
 2009/02/12T02:00:00|      1530.00|   562806395.00|    451551.00|
 2009/02/12T03:00:00|      1506.00|   562101791.00|    455671.00|
 2009/02/12T04:00:00|      1562.00|   591408602.00|    489371.00|
 ...



To force the hourly bins to run from 30 minutes past the hour, use the --start-time switch:



 $ rwfilter ... --pass=stdout       \
   | rwcount --bin-size=3600 --start-time=2009/02/12:00:30
                Date|      Records|          Bytes|      Packets|
 2009/02/12T00:30:00|      1483.26|   581251364.04|    456554.40|
 2009/02/12T01:30:00|      1494.00|   575037453.00|    449280.00|
 2009/02/12T02:30:00|      1486.36|   559700466.61|    447700.15|
 2009/02/12T03:30:00|      1555.23|   588882400.58|    480724.48|
 2009/02/12T04:30:00|      1537.79|   564756248.52|    472003.45|
 ...



ENVIRONMENT

SILK_TIMESTAMP_FORMAT This environment variable is used as the value for --timestamp-format when that switch is not provided. Available since SiLK 3.11.0.
SILK_PAGER When set to a non-empty string, rwcount automatically invokes this program to display its output a screen at a time. If set to an empty string, rwcount does not automatically page its output.
PAGER When set and SILK_PAGER is not set, rwcount automatically invokes this program to display its output a screen at a time.
SILK_CLOBBER The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.
SILK_CONFIG_FILE This environment variable is used as the value for the --site-config-file when that switch is not provided.
SILK_DATA_ROOTDIR This environment variable specifies the root directory of the data repository. As described in the FILES section, rwcount may use this environment variable when searching for the SiLK site configuration file.
SILK_PATH This environment variable gives the root of the install tree. When searching for configuration files, rwcount may use this environment variable. See the FILES section for details.
TZ When the argument to the --timestamp-format switch includes local or when a SiLK installation is built to use the local timezone, the value of the TZ environment variable determines the timezone in which rwcount displays timestamps. (If both of those are false, the TZ environment variable is ignored.) If the TZ environment variable is not set, the machine’s default timezone is used. Setting TZ to the empty string or 0 causes timestamps to be displayed in UTC. For system information on the TZ variable, see tzset(3) or environ(7). (To determine if SiLK was built with support for the local timezone, check the Timezone support value in the output of rwcount --version.) The TZ environment variable is also used when rwcount parses the timestamp specified in the --start-time or --end-time switches if SiLK is built with local timezone support.
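
The precedence among the --pager switch, SILK_PAGER, and PAGER can be summarized in a short sketch. The function name is hypothetical, and returning None to mean that no paging is performed is a convention of this example only.

```python
import os


def choose_pager(cli_pager=None, env=None):
    """Return the pager program to run, or None for no paging.

    cli_pager -- value given to --pager, or None if the switch is absent
    env       -- mapping of environment variables (defaults to os.environ)
    """
    env = os.environ if env is None else env
    # The switch overrides both environment variables.
    if cli_pager is not None:
        return cli_pager or None
    # SILK_PAGER overrides PAGER; an empty value disables paging.
    if "SILK_PAGER" in env:
        return env["SILK_PAGER"] or None
    # PAGER is consulted only when SILK_PAGER is not set.
    return env.get("PAGER") or None


print(choose_pager(None, {"SILK_PAGER": "", "PAGER": "more"}))  # None
print(choose_pager(None, {"PAGER": "more"}))                    # more
print(choose_pager("less", {"SILK_PAGER": "more"}))             # less
```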

FILES

${SILK_CONFIG_FILE}
${SILK_DATA_ROOTDIR}/silk.conf
/data/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/local/share/silk/silk.conf
/usr/local/share/silk.conf
Possible locations for the SiLK site configuration file that are checked when the --site-config-file switch is not provided.
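
The search can be pictured as building a candidate list and taking the first file that exists. The sketch below assumes the locations are tried in the order listed and that entries depending on an unset environment variable are skipped; both points, and the function names, are assumptions of this example.

```python
import os
import posixpath


def site_config_candidates(env=None):
    """List the paths checked for silk.conf, in the order given in FILES."""
    env = os.environ if env is None else env
    candidates = []
    if env.get("SILK_CONFIG_FILE"):
        candidates.append(env["SILK_CONFIG_FILE"])
    if env.get("SILK_DATA_ROOTDIR"):
        candidates.append(posixpath.join(env["SILK_DATA_ROOTDIR"], "silk.conf"))
    candidates.append("/data/silk.conf")
    if env.get("SILK_PATH"):
        root = env["SILK_PATH"]
        candidates.append(posixpath.join(root, "share", "silk", "silk.conf"))
        candidates.append(posixpath.join(root, "share", "silk.conf"))
    candidates.append("/usr/local/share/silk/silk.conf")
    candidates.append("/usr/local/share/silk.conf")
    return candidates


def find_site_config(env=None, exists=os.path.isfile):
    """Return the first candidate that exists, or None."""
    for path in site_config_candidates(env):
        if exists(path):
            return path
    return None


for path in site_config_candidates({"SILK_PATH": "/opt/silk"}):
    print(path)
```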

SEE ALSO

rwfilter(1), rwuniq(1), silk(7), tzset(3), environ(7)

BUGS

Unlike rwuniq(1), rwcount does not support counting the number of distinct IPs in a bin. However, using the --bin-time switch on rwuniq can provide time-based binning similar to what rwcount supports. Note that rwuniq always bins by each record’s start time (similar to rwcount --load-scheme=1), and there is no support in rwuniq for dividing a SiLK record among multiple time bins.


SiLK 3.11.0.1 RWCOUNT (1) 2016-04-05
