NAME
rwstats - Print top-N or bottom-N lists or summarize data by protocol

SYNOPSIS
rwstats --fields=KEY [--values=VALUES]
{--count=N | --threshold=N | --percentage=N}
[{--top | --bottom}] [--presorted-input] [--no-percents]
[--ipv6-policy={ignore,asv4,mix,force,only}]
[{--bin-time=SECONDS | --bin-time}]
[--timestamp-format=FORMAT] [--epoch-time]
[--ip-format=FORMAT] [--integer-ips] [--zero-pad-ips]
[--integer-sensors] [--integer-tcp-flags]
[--no-titles] [--no-columns] [--column-separator=CHAR]
[--no-final-delimiter] [{--delimited | --delimited=CHAR}]
[--print-filenames] [--copy-input=PATH] [--output-path=PATH]
[--pager=PAGER_PROG] [--temp-directory=DIR_PATH]
[{--legacy-timestamps | --legacy-timestamps={1,0}}]
[--site-config-file=FILENAME]
[--plugin=PLUGIN [--plugin=PLUGIN ...]]
[--python-file=PATH [--python-file=PATH ...]]
[--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
[--pmap-column-width=NUM]
{[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}
rwstats {--overall-stats | --detail-proto-stats=PROTO[,PROTO]}
[--no-titles] [--no-columns] [--column-separator=CHAR]
[--no-final-delimiter] [{--delimited | --delimited=CHAR}]
[--print-filenames] [--copy-input=PATH] [--output-path=PATH]
[--pager=PAGER_PROG]
{[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}
rwstats [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
[--plugin=PLUGIN ...] [--python-file=PATH ...] --help
rwstats [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
[--plugin=PLUGIN ...] [--python-file=PATH ...] --help-fields
rwstats --legacy-help
rwstats --version

DESCRIPTION
rwstats has two modes of operation: it can compute a Top-N or Bottom-N list, or it can summarize data for a list of protocols.

In either mode, rwstats reads SiLK Flow records from the files named on the command line or from the standard input when no file names are specified and --xargs is not present. To read the standard input in addition to the named files, use "-" or "stdin" as a file name. If an input file name ends in ".gz", the file is uncompressed as it is read.

When the --xargs switch is provided, rwstats reads the names of the files to process from the named text file or from the standard input if no file name argument is provided to the switch. The input to --xargs must contain one file name per line.

Top-N Description
rwstats reads SiLK Flow records and groups them by a key composed of user-specified attributes of the flows. For each group (or bin), a collection of aggregate values is computed; these values are typically related to the volume of the bin, such as the sum of the bytes fields for all records that match the key. The first aggregate value is called the primary aggregate value.

Once all the SiLK Flow records are read, rwstats sorts the bins by the primary aggregate value in either decreasing order (for a top-N list) or increasing order (for a bottom-N list). The ordering of bins that have the same primary aggregate value is arbitrary.

The bins are printed as text, and the number of bins to print may be specified as a fixed value (e.g., print 10 bins), as a threshold (print bins whose byte count is greater than 400), or as a percentage of the total volume across all bins (print bins that contain at least 10% of all the packets).

The user must provide the --fields switch to select the flow attribute(s) (or field(s)) that comprise the key for each bin.
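
The grouping and ranking steps described above can be sketched in a few lines of Python. This is only an illustrative model, not how rwstats is implemented: the dictionary records, the `top_n` function, and its parameters are hypothetical names, and real SiLK Flow records are binary.

```python
from collections import defaultdict

def top_n(records, key_fields, value_field, count, bottom=False):
    """Group records by key, sum one value per bin, return the N extreme bins."""
    bins = defaultdict(int)
    for rec in records:
        key = tuple(rec[f] for f in key_fields)   # the bin key
        bins[key] += rec[value_field]             # primary aggregate value
    # Decreasing order for a top-N list, increasing order for a bottom-N list.
    ranked = sorted(bins.items(), key=lambda kv: kv[1], reverse=not bottom)
    return ranked[:count]

# Made-up records standing in for SiLK Flow records.
flows = [
    {"sip": "10.1.1.1", "dip": "10.1.1.2", "bytes": 400},
    {"sip": "10.1.1.1", "dip": "10.1.1.3", "bytes": 100},
    {"sip": "10.1.1.2", "dip": "10.1.1.1", "bytes": 300},
    {"sip": "10.1.1.3", "dip": "10.1.1.1", "bytes": 50},
]
print(top_n(flows, ["sip"], "bytes", count=2))
print(top_n(flows, ["sip"], "bytes", count=1, bottom=True))
```

The sketch keeps every bin in memory until the input is exhausted, which mirrors why a larger key or more bins costs more memory.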
The available fields are similar to those supported by rwcut(1); see the description of the --fields switch in the "OPTIONS" section below for the details or run rwstats with the --help-fields switch. The list of fields may be extended by loading PySiLK files (see silkpython(3)) or plug-ins (silk-plugin(3)). The fields are printed in the order in which they occur in the --fields switch. The size of the key is limited to 256 octets. A larger key uses the available memory more quickly and results in slower performance.

The aggregate value(s) to compute for each bin are also chosen by the user. As with the key fields, the user may extend the list of aggregate fields by using PySiLK or plug-ins. The preferred way to specify the aggregate fields is to use the --values switch; the aggregate fields are printed in the order they occur in the --values switch. If the user does not select any aggregate value(s), rwstats defaults to computing the number of flow records for each bin. As with the key fields, requesting more aggregate values slows performance.

In addition to computing the primary aggregate value for the flows in each bin, rwstats computes that aggregate value across all flow records. When printing the results, the output for each bin includes the ratio of the bin's aggregate value to the total aggregate value (displayed as a percentage). In addition, a cumulative percentage column is printed. When the primary aggregate value is a distinct count, the cumulative percentage may be greater than 100. The percentage columns contain a question mark when the primary aggregate value comes from a plug-in since rwstats does not know whether summing the aggregate values is reasonable. The display of the percentage columns may be suppressed by specifying --no-percents.

rwstats attempts to keep all key and aggregate value data in the computer's memory. If rwstats runs out of memory, the current key and aggregate value data is written to a temporary file.
Once all input has been processed, the data from the temporary files is merged to produce the final output. By default, these temporary files are stored in the /tmp directory. Because these files can be large, it is strongly recommended that /tmp not be used as the temporary directory. To modify the temporary directory used by rwstats, provide the --temp-directory switch, set the SILK_TMPDIR environment variable, or set the TMPDIR environment variable. rwstats may also run out of memory if the requested Top-N is too large.

The --presorted-input switch may allow rwstats to process data more efficiently by causing rwstats to assume the input has been previously sorted with the rwsort(1) command. With this switch, rwstats does not need large amounts of memory during the binning stage because it does not bin each flow; instead, it keeps a running summation for the bin. When the key changes, the bin's primary aggregate value is compared with those of the current Top-N (or Bottom-N) to see if the new bin is closer to the top (or bottom). For the output to be meaningful, rwsort and rwstats must be invoked with the same --fields value. When multiple input files are specified and --presorted-input is given, rwstats merge-sorts the flow records from the input files.

rwstats usually runs faster if you do not include the --presorted-input switch when counting distinct IP addresses, even when reading sorted input. Finally, you may get unusual results with --presorted-input when the --fields switch contains multiple time-related key fields ("sTime", "duration", "eTime"), or when the time-related key is not the final key listed in --fields; see the "NOTES" section for details.

Protocol Statistics Description
Alternatively, rwstats can provide statistics for each of bytes, packets, and bytes-per-packet, giving minima, maxima, quartiles, and interval flow-counts across all flows or across a list of protocols specified by the user.
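
The running-summation idea behind --presorted-input, described above, can be modeled as follows. This is a sketch under stated assumptions rather than the rwstats implementation: `top_n_presorted` and the dictionary records are hypothetical, and each input list is assumed to be already sorted by the key fields, as rwsort would leave it.

```python
import heapq
from itertools import groupby

def top_n_presorted(sorted_inputs, key_fields, value_field, count):
    """Top-N over inputs that are each already sorted by the key fields."""
    def keyfunc(rec):
        return tuple(rec[f] for f in key_fields)

    merged = heapq.merge(*sorted_inputs, key=keyfunc)    # merge-sort the inputs
    best = []   # min-heap holding at most `count` (total, key) pairs
    for key, group in groupby(merged, key=keyfunc):
        total = sum(rec[value_field] for rec in group)   # running sum for one bin
        if len(best) < count:
            heapq.heappush(best, (total, key))
        elif total > best[0][0]:
            heapq.heapreplace(best, (total, key))        # evict the smallest bin
    return sorted(best, reverse=True)

# Two made-up inputs, each sorted by source port.
file_a = [{"sport": 53, "packets": 2}, {"sport": 80, "packets": 5}]
file_b = [{"sport": 22, "packets": 1}, {"sport": 80, "packets": 4}]
print(top_n_presorted([file_a, file_b], ["sport"], "packets", count=2))
```

Only `count` bins are held at any time, which is the memory saving the switch aims for; the price is that a key which reappears later in the stream is treated as a new bin.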

OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

Top-N Invocation
To compute a Top-N or Bottom-N list, the key field(s) must be specified. Normally the --fields switch is used to specify the key field(s), but for backward compatibility older switches may be specified (see the "Legacy Switches" section below).
Many SiLK file formats do not store the following fields and their values are always 0; they are listed here for completeness:
SiLK can store flows generated by enhanced collection software that provides more information than NetFlow v5. These flows may support some or all of these additional fields; for flows without this additional information, the field's value is always 0.
Consider a long-running ssh session that exceeds the flow generator's active timeout. (This is the active timeout since the flow generator creates a flow for a connection that still has activity). The flow generator will create multiple flow records for this ssh session, each spanning some portion of the total session. The first flow record will be marked with a "T" indicating that it hit the timeout. The second through next-to-last records will be marked with "TC" indicating that this flow both timed out and is a continuation of a flow that timed out. The final flow will be marked with a "C", indicating that it was created as a continuation of an active flow.
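
The marking scheme for a session split by the active timeout can be expressed as a small function. This is only a restatement of the paragraph above in code; `label_segments` is a hypothetical helper, and the empty mark for a session that never hits the timeout is an assumption of the sketch.

```python
def label_segments(n_segments):
    """Return the timeout/continuation marks for a session split into n flow records."""
    if n_segments < 1:
        raise ValueError("a session produces at least one flow record")
    if n_segments == 1:
        return [""]                        # assumed: no mark when the timeout is never hit
    marks = ["T"]                          # first record: hit the active timeout
    marks += ["TC"] * (n_segments - 2)     # middle records: timed out and a continuation
    marks.append("C")                      # final record: continuation only
    return marks

print(label_segments(4))
```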
The following fields provide a way to label the IPs or ports on a record. These fields require external files to provide the mapping from the IP or port to the label:
Finally, the list of built-in fields may be augmented by the run-time loading of PySiLK code or plug-ins written in C (also called shared object files or dynamic libraries), as described by the --python-file and --plugin switches.
To determine the value of N for a Top-N (or Bottom-N) list, one of the following switches must be specified. The primary value may limit which switch may be specified.
To determine whether to compute the Top-N or the Bottom-N, specify one of the following switches. If neither switch is given, --top is assumed:

Protocol Statistics Invocation
The following switches compute and print, for each of bytes, packets, and bytes per packet, the minimum value, the maximum value, quartiles, and a count of the number of flows that fall into each of ten intervals. These switches may not be combined with the switches that produce Top-N or Bottom-N lists.

Miscellaneous Switches
The following switches are available when rwstats is running in either mode, though many are applicable only to the Top-N mode.
When a timezone is specified, it is used regardless of the default timezone support compiled into SiLK. The timezone is one of:
The following arguments modify certain IP addresses prior to printing. These arguments may be combined with the above formats.
The following argument is also available:

Legacy Switches
Use of the following switches has been discouraged since SiLK 2.0.0. As of SiLK 3.8.1, the switches are deprecated and they will be removed in SiLK 4.0. For each switch, use the replacement indicated.

EXAMPLES
In the following examples, the dollar sign ("$") represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash ("\") is used to indicate a wrapped line.

Top-N Examples
Print the top talkers (based on number of flow records, limit to the top four):
$ rwstats --fields=sip --count=4 data.rw
INPUT: 549092 Records for 12990 Bins and 549092 Total Records
OUTPUT: Top 4 Bins by Records
sIP| Records| %Records| cumul_%|
10.1.1.1| 36604| 6.666278| 6.666278|
10.1.1.2| 13897| 2.530906| 9.197184|
10.1.1.3| 12739| 2.320012| 11.517196|
10.1.1.4| 11807| 2.150277| 13.667473|
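
The two percentage columns in the output above can be reproduced from the record counts and the total in the INPUT line. The following is a minimal sketch of that arithmetic; `percent_columns` is a hypothetical helper, not part of SiLK.

```python
def percent_columns(counts, total):
    """Yield (count, percent, cumulative percent) for bins listed in output order."""
    running = 0
    for count in counts:
        running += count
        yield count, 100.0 * count / total, 100.0 * running / total

# Record counts and total taken from the example output above.
rows = list(percent_columns([36604, 13897, 12739, 11807], 549092))
for count, pct, cumul in rows:
    print(f"{count:>8}|{pct:>10.6f}|{cumul:>10.6f}|")
```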
Print the seven hosts that received the most packets:
$ rwstats --fields=dip --values=packets --count=7 data.rw
INPUT: 549092 Records for 44654 Bins and 6620587 Total Packets
OUTPUT: Top 7 Bins by Packets
dIP| Packets| %Packets| cumul_%|
10.1.1.1| 217574| 3.286325| 3.286325|
10.1.1.2| 138177| 2.087081| 5.373407|
10.1.1.3| 121892| 1.841106| 7.214512|
10.1.1.4| 97073| 1.466230| 8.680742|
10.1.1.5| 82284| 1.242851| 9.923593|
10.1.1.6| 80051| 1.209123| 11.132715|
10.1.1.7| 73602| 1.111714| 12.244430|
Print the IP pairs that shared 100,000,000 bytes or more:
$ rwstats --fields=sip,dip --values=byte --threshold=100000000 data.rw
INPUT: 549092 Records for 107136 Bins and 3410300252 Total Bytes
OUTPUT: Top 5 Bins by Bytes (threshold 100000000)
sIP| dIP| Bytes| %Bytes| cumul_%|
10.1.1.1| 10.1.1.2| 307478707| 9.016177| 9.016177|
10.1.1.3| 10.1.1.4| 172164463| 5.048367| 14.064544|
10.1.1.5| 10.1.1.6| 142059589| 4.165604| 18.230147|
10.1.1.7| 10.1.1.8| 119388394| 3.500818| 21.730965|
10.1.1.9| 10.1.1.10| 108268824| 3.174759| 24.905725|
Print the ports that were the source of at least 5% of all records:
$ rwstats --fields=sport --percentage=5 data.rw
INPUT: 549092 Records for 56799 Bins and 549092 Total Records
OUTPUT: Top 3 Bins by Records (5% == 27454)
sPort| Records| %Records| cumul_%|
80| 86677| 15.785515| 15.785515|
53| 64681| 11.779629| 27.565144|
0| 47760| 8.697996| 36.263140|
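
The OUTPUT line above reports "5% == 27454", while 5% of 549092 records is 27454.6, so the displayed cutoff appears to be rounded down. The sketch below models a percentage limit on that assumption; `percentage_cutoff` and `bins_meeting` are hypothetical helpers, the exact comparison rwstats applies is not confirmed here, and the count for port 443 is made up.

```python
def percentage_cutoff(total, percent):
    """Convert a percentage of the total into an absolute cutoff, rounded down (assumed)."""
    return int(total * percent / 100)

def bins_meeting(bins, total, percent):
    """Return the bins at or above the cutoff, largest first."""
    cutoff = percentage_cutoff(total, percent)
    keep = [(key, value) for key, value in bins.items() if value >= cutoff]
    return sorted(keep, key=lambda kv: kv[1], reverse=True)

# Counts for ports 80, 53, and 0 come from the example; 443 is a made-up small bin.
by_sport = {80: 86677, 53: 64681, 0: 47760, 443: 20000}
print(percentage_cutoff(549092, 5))
print(bins_meeting(by_sport, 549092, 5))
```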
Print the destination ports that saw the fewest records (limit to the bottom eight):
$ rwstats --fields=dport --bottom --count=8 data.rw
INPUT: 549092 Records for 44772 Bins and 549092 Total Records
OUTPUT: Bottom 8 Bins by Records
dPort| Records| %Records| cumul_%|
19417| 1| 0.000182| 0.000182|
12110| 1| 0.000182| 0.000364|
34777| 1| 0.000182| 0.000546|
8999| 1| 0.000182| 0.000728|
36404| 1| 0.000182| 0.000911|
16682| 1| 0.000182| 0.001093|
27420| 1| 0.000182| 0.001275|
14162| 1| 0.000182| 0.001457|
Print the source-destination port pairs that shared more than 500,000 packets (there were none):
$ rwstats --fields=sport,dport --values=packets \
--top --threshold=500000 data.rw
INPUT: 366309 Records for 130307 Bins and 5597540 Total Packets
OUTPUT: No bins above threshold of 500000
Print the source-destination port pairs that shared more than 50,000 packets:
$ rwstats --fields=sport,dport --values=packets \
--top --threshold=50000 data.rw
INPUT: 366309 Records for 130307 Bins and 5597540 Total Packets
OUTPUT: Top 3 Bins by Packets (threshold 50000)
sPort|dPort| Packets| %Packets| cumul_%|
6699| 3607| 138177| 2.468531| 2.468531|
80| 1179| 59774| 1.067862| 3.536393|
80| 9659| 50319| 0.898949| 4.435342|
Print the protocols from least to most active (based on number of records):
$ rwstats --fields=protocol --bottom --count=10 data.rw
INPUT: 545262 Records for 3 Bins and 545262 Total Records
OUTPUT: Bottom 10 Bins by Records
protocol| Records| %Records| cumul_%|
1| 46319| 8.494815| 8.494815|
17| 132634| 24.324820| 32.819635|
6| 366309| 67.180365|100.000000|
Print the packet and byte counts for the pairs of /16s that shared the most packets (use rwnetmask(1) on the input to rwstats; limit the result to the top ten):
$ rwnetmask --4sip-prefix=16 --4dip-prefix=16 data.rw \
| rwstats --fields=sip,dip --values=packets,bytes \
--count=10 --no-percent
INPUT: 250928 Records for 230 Bins and 72279154 Total Packets
OUTPUT: Top 10 Bins by Packets
sIP| dIP| Packets| Bytes|
10.255.0.0| 192.168.0.0| 2711524| 2207297227|
10.253.0.0| 192.168.0.0| 2690120| 2288595669|
10.254.0.0| 192.168.0.0| 2593074| 2141263178|
10.252.0.0| 192.168.0.0| 2553388| 2117294828|
10.250.0.0| 192.168.0.0| 2312661| 1982654956|
10.251.0.0| 192.168.0.0| 2218194| 1785263601|
10.249.0.0| 192.168.0.0| 2196041| 1934938137|
10.248.0.0| 192.168.0.0| 2160037| 1804446929|
10.247.0.0| 192.168.0.0| 2000379| 1579214987|
10.246.0.0| 192.168.0.0| 1878143| 1578321728|
Print the number of distinct destination hosts seen for every destination port, limiting the result to the ports that saw at least 3% of the hosts. The percentage for each bin is relative to the number of distinct destination IP addresses seen in the input.
$ rwstats --fields=dport --values=distinct:dip --percent=3 data.rw
INPUT: 243127 Records for 4738 Bins and 122064 Total dIP-Distinct
OUTPUT: Top 5 bins by dIP-Distinct (3.0000% == 3661)
dPort|dIP-Distin|%dIP-Disti| cumul_%|
80| 26940| 22.070389| 22.0704|
25| 15538| 12.729388| 34.7998|
443| 7907| 6.477749| 41.2775|
22| 7733| 6.335201| 47.6127|
8080| 3942| 3.229453| 50.8422|
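
When the primary aggregate value is a distinct count, the same address can be counted in several bins while it counts only once in the input-wide total, so the per-bin percentages need not sum to 100. A minimal sketch of that effect, using made-up records and a hypothetical `distinct_by_key` helper:

```python
from collections import defaultdict

def distinct_by_key(records, key_field, distinct_field):
    """Return per-bin distinct counts and the distinct count over the whole input."""
    per_bin = defaultdict(set)
    overall = set()
    for rec in records:
        per_bin[rec[key_field]].add(rec[distinct_field])
        overall.add(rec[distinct_field])
    counts = {key: len(values) for key, values in per_bin.items()}
    return counts, len(overall)

# Both hosts appear under both ports, so each bin sees 100% of the hosts.
flows = [
    {"dport": 80, "dip": "10.1.1.1"},
    {"dport": 80, "dip": "10.1.1.2"},
    {"dport": 443, "dip": "10.1.1.1"},
    {"dport": 443, "dip": "10.1.1.2"},
    {"dport": 443, "dip": "10.1.1.2"},
]
counts, total = distinct_by_key(flows, "dport", "dip")
cumulative = sum(100.0 * c / total for c in counts.values())
print(counts, total, cumulative)
```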
Print the number of distinct destination ports seen for each protocol. When the primary aggregate value is counting the number of distinct values, the cumulative percentage may be larger than 100%.
$ rwstats --fields=proto --values=distinct:dport --count=0 data.rw
INPUT: 243127 Records for 2 Bins and 5335 Total dPort-Distinct
OUTPUT: Top 2 Bins by dPort-Distinct
pro|dPort-Dist|%dPort-Dis| cumul_%|
6| 4672| 87.572634| 87.5726|
17| 4669| 87.516401| 175.0890|
The following example uses PySiLK to create an aggregate value field that computes the average byte count for each bin. The code for this field is shown in the silkpython(3) manual page. Note that the percentage columns contain only question marks.
$ rwstats --python-file=avg-bytes.py --fields=sport \
--values=avg-bytes,bytes,flow --count=6 data.rw
INPUT: 243127 Records for 4738 Bins
OUTPUT: Top 6 Bins by avg-bytes
sPort| avg-bytes| Bytes| Records|%avg-bytes| cumul_%|
22| 1010658.57| 28292376134| 27994| ?| ?|
8080| 739703.65| 2918870591| 3946| ?| ?|
80| 732930.03| 19821359790| 27044| ?| ?|
443| 731919.66| 5794607921| 7917| ?| ?|
25605| 86376.00| 86376| 1| ?| ?|
25349| 83556.00| 167112| 2| ?| ?|
The --threshold switch is not supported when the primary aggregate value is from PySiLK or a plug-in.
$ rwstats --python-file=avg-bytes.py --fields=sport \
--values=avg-bytes,bytes,flow --threshold=90000 data.rw
rwstats: Only the --count limit is supported when the primary \
values field is from a plug-in
rwstats: Cannot add value field 'avg-bytes' from plugin
When using rwstats on input that contains both incoming and outgoing flow records, consider using the int-ext-fields(3) plug-in, which defines four additional fields representing the external IP address, the external port, the internal IP address, and the internal port. The plug-in requires the user to specify which class/type pairs are incoming and which are outgoing. See its manual page for additional information.
As an example, here we run rwstats on a file containing incoming and outgoing web traffic.
$ rwstats --fields=sip,sport,dip,dport --values=bytes \
--count=6 data.rw
INPUT: 155140 Records for 155140 Bins and 59036553615 Total Bytes
OUTPUT: Top 6 Bins by Bytes
sIP|sPort| dIP|dPort| Bytes| %Bytes| cumul_%|
10.242.96.200| 80|192.168.234.203|29868| 2681287| 0.004542| 0.004542|
192.168.211.200| 80| 10.253.27.160|25453| 2675740| 0.004532| 0.009074|
192.168.233.168| 80| 10.247.60.163|29777| 2672196| 0.004526| 0.013600|
192.168.229.229| 443| 10.250.19.210|27512| 2666647| 0.004517| 0.018117|
192.168.255.24| 8080| 10.240.75.236|29826| 2659828| 0.004505| 0.022623|
192.168.241.247| 80| 10.216.173.77|26654| 2658141| 0.004503| 0.027125|
Here the int-ext-fields plug-in is used:
$ export INCOMING_FLOWTYPES=all/in,all/inweb
$ export OUTGOING_FLOWTYPES=all/out,all/outweb
$ rwstats --plugin=int-ext-fields.so \
--fields=ext-ip,ext-port,int-ip,int-port --value=bytes \
--count=6 data.rw
INPUT: 155140 Records for 77570 Bins and 59036553615 Total Bytes
OUTPUT: Top 6 Bins by Bytes
ext-ip|ext-p| int-ip|int-p| Bytes| %Bytes| cumul_%|
10.253.27.160|25453|192.168.211.200| 80| 2736332| 0.004635| 0.004635|
10.242.96.200| 80|192.168.234.203|29868| 2722619| 0.004612| 0.009247|
10.247.60.163|29777|192.168.233.168| 80| 2716749| 0.004602| 0.013849|
10.250.19.210|27512|192.168.229.229| 443| 2714974| 0.004599| 0.018447|
10.254.241.55|24206| 192.168.207.45| 80| 2713597| 0.004596| 0.023044|
10.226.206.118|29557|192.168.247.227| 8080| 2707265| 0.004586| 0.027630|
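
The drop from 155140 bins to 77570 bins suggests that the two directions of each conversation collapse into one key. The sketch below models that idea in plain Python; it is not the plug-in's code. The `flowtype` record field and `int_ext_key` are hypothetical, raising on an unlisted flowtype is a choice of the sketch, and the split of bytes between the two directions is inferred from the two tables above.

```python
from collections import defaultdict

# Flowtype lists copied from the example; the record layout is hypothetical.
INCOMING = {"all/in", "all/inweb"}
OUTGOING = {"all/out", "all/outweb"}

def int_ext_key(rec):
    """Return (ext-ip, ext-port, int-ip, int-port) for one flow record."""
    if rec["flowtype"] in INCOMING:      # the external host is the source
        return (rec["sip"], rec["sport"], rec["dip"], rec["dport"])
    if rec["flowtype"] in OUTGOING:      # the external host is the destination
        return (rec["dip"], rec["dport"], rec["sip"], rec["sport"])
    raise ValueError("flowtype is neither incoming nor outgoing")

flows = [
    {"flowtype": "all/inweb", "sip": "10.253.27.160", "sport": 25453,
     "dip": "192.168.211.200", "dport": 80, "bytes": 60592},
    {"flowtype": "all/outweb", "sip": "192.168.211.200", "sport": 80,
     "dip": "10.253.27.160", "dport": 25453, "bytes": 2675740},
]
bins = defaultdict(int)
for rec in flows:
    bins[int_ext_key(rec)] += rec["bytes"]   # both directions land in one bin
print(dict(bins))
```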

Protocol Statistics Example
Print the interval breakdowns for flow records, packets, and bytes across all protocols, and for protocols 6 (TCP) and 17 (UDP):
$ rwstats --detail-proto-stats=6,17 data.rw
FLOW STATISTICS--ALL PROTOCOLS: 549092 records
*BYTES min 28; max 88906238
quartiles LQ 122.06478 Med 420.30930 UQ 876.21920 UQ-LQ 754.15442
interval_max|count<=max|%_of_input| cumul_%|
40| 35107| 6.393646| 6.393646|
60| 35008| 6.375616| 12.769263|
100| 49500| 9.014883| 21.784145|
150| 40014| 7.287303| 29.071449|
256| 65444| 11.918586| 40.990034|
1000| 224016| 40.797535| 81.787569|
10000| 75708| 13.787853| 95.575423|
100000| 21981| 4.003154| 99.578577|
1000000| 1901| 0.346208| 99.924785|
4294967295| 413| 0.075215|100.000000|
*PACKETS min 1; max 70023
quartiles LQ 1.76962 Med 3.68119 UQ 7.61567 UQ-LQ 5.84605
interval_max|count<=max|%_of_input| cumul_%|
3| 232716| 42.381969| 42.381969|
4| 61407| 11.183372| 53.565341|
10| 195310| 35.569631| 89.134972|
20| 33310| 6.066379| 95.201351|
50| 17686| 3.220954| 98.422304|
100| 4854| 0.884005| 99.306309|
500| 2760| 0.502648| 99.808957|
1000| 373| 0.067930| 99.876888|
10000| 637| 0.116010| 99.992897|
4294967295| 39| 0.007103|100.000000|
*BYTES/PACKET min 28; max 1500
quartiles LQ 57.98319 Med 90.71150 UQ 164.77250 UQ-LQ 106.78932
interval_max|count<=max|%_of_input| cumul_%|
40| 42568| 7.752435| 7.752435|
44| 15173| 2.763289| 10.515724|
60| 91003| 16.573361| 27.089085|
100| 163850| 29.840173| 56.929258|
200| 153190| 27.898786| 84.828043|
400| 39761| 7.241227| 92.069271|
600| 12810| 2.332942| 94.402213|
800| 7954| 1.448573| 95.850786|
1500| 22783| 4.149214|100.000000|
4294967295| 0| 0.000000|100.000000|
FLOW STATISTICS--PROTOCOL 6: 366309/549092 records
*BYTES min 40; max 88906238
quartiles LQ 310.47331 Med 656.53661 UQ 1089.75344 UQ-LQ 779.28013
interval_max|count<=max|%_of_proto| cumul_%|
40| 29774| 8.128110| 8.128110|
60| 11453| 3.126595| 11.254706|
100| 6915| 1.887751| 13.142456|
150| 16369| 4.468632| 17.611088|
256| 12651| 3.453642| 21.064730|
1000| 196881| 53.747246| 74.811976|
10000| 68989| 18.833553| 93.645529|
100000| 21099| 5.759891| 99.405420|
1000000| 1784| 0.487021| 99.892441|
4294967295| 394| 0.107559|100.000000|
*PACKETS min 1; max 70023
quartiles LQ 3.39682 Med 5.85903 UQ 8.80427 UQ-LQ 5.40745
interval_max|count<=max|%_of_proto| cumul_%|
3| 69358| 18.934288| 18.934288|
4| 55993| 15.285729| 34.220016|
10| 186559| 50.929407| 85.149423|
20| 30947| 8.448332| 93.597755|
50| 16186| 4.418674| 98.016429|
100| 4204| 1.147665| 99.164094|
500| 2178| 0.594580| 99.758674|
1000| 315| 0.085993| 99.844667|
10000| 537| 0.146598| 99.991264|
4294967295| 32| 0.008736|100.000000|
*BYTES/PACKET min 40; max 1500
quartiles LQ 60.19817 Med 96.78616 UQ 175.08044 UQ-LQ 114.88228
interval_max|count<=max|%_of_proto| cumul_%|
40| 36559| 9.980372| 9.980372|
44| 14929| 4.075521| 14.055893|
60| 39593| 10.808634| 24.864527|
100| 100117| 27.331297| 52.195824|
200| 111258| 30.372718| 82.568542|
400| 26020| 7.103293| 89.671834|
600| 8600| 2.347745| 92.019579|
800| 7726| 2.109148| 94.128727|
1500| 21507| 5.871273|100.000000|
4294967295| 0| 0.000000|100.000000|
FLOW STATISTICS--PROTOCOL 17: 132634/549092 records
*BYTES min 32; max 2115559
quartiles LQ 66.53665 Med 150.61551 UQ 242.44095 UQ-LQ 175.90430
interval_max|count<=max|%_of_proto| cumul_%|
20| 0| 0.000000| 0.000000|
40| 5195| 3.916794| 3.916794|
80| 42150| 31.779182| 35.695975|
130| 11528| 8.691587| 44.387563|
256| 45497| 34.302667| 78.690230|
1000| 23401| 17.643289| 96.333519|
10000| 4447| 3.352836| 99.686355|
100000| 389| 0.293288| 99.979643|
1000000| 23| 0.017341| 99.996984|
4294967295| 4| 0.003016|100.000000|
*PACKETS min 1; max 8839
quartiles LQ 0.84383 Med 1.68768 UQ 2.53149 UQ-LQ 1.68766
interval_max|count<=max|%_of_proto| cumul_%|
3| 117884| 88.879171| 88.879171|
4| 4452| 3.356605| 92.235777|
10| 6678| 5.034908| 97.270685|
20| 1766| 1.331484| 98.602168|
50| 1055| 0.795422| 99.397590|
100| 368| 0.277455| 99.675046|
500| 353| 0.266146| 99.941192|
1000| 33| 0.024880| 99.966072|
10000| 45| 0.033928|100.000000|
4294967295| 0| 0.000000|100.000000|
*BYTES/PACKET min 32; max 1415
quartiles LQ 63.23827 Med 91.27180 UQ 158.10219 UQ-LQ 94.86392
interval_max|count<=max|%_of_proto| cumul_%|
20| 0| 0.000000| 0.000000|
24| 0| 0.000000| 0.000000|
40| 5671| 4.275676| 4.275676|
100| 70970| 53.508150| 57.783826|
200| 39298| 29.628904| 87.412730|
400| 12175| 9.179396| 96.592126|
600| 4130| 3.113832| 99.705958|
800| 160| 0.120633| 99.826590|
1500| 230| 0.173410|100.000000|
4294967295| 0| 0.000000|100.000000|
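
Each interval table above counts the flows whose value is at most interval_max and greater than the previous interval_max. A minimal sketch of that tabulation follows; `interval_table` is a hypothetical helper, the sample values are made up, and every value is assumed to be no larger than the final interval_max. The quartile computation is not modeled.

```python
from bisect import bisect_left

def interval_table(values, interval_maxes):
    """Count values per interval (value <= interval_max) with percentages."""
    counts = [0] * len(interval_maxes)
    for v in values:
        # Index of the first interval whose maximum is >= v.
        counts[bisect_left(interval_maxes, v)] += 1
    rows, running = [], 0
    for limit, count in zip(interval_maxes, counts):
        running += count
        rows.append((limit, count,
                     100.0 * count / len(values),
                     100.0 * running / len(values)))
    return rows

# Interval maxima copied from the all-protocols BYTES table above.
BYTE_MAXES = [40, 60, 100, 150, 256, 1000, 10000, 100000, 1000000, 4294967295]
sample = [28, 40, 41, 60, 150, 2000]     # made-up byte counts
table = interval_table(sample, BYTE_MAXES)
for limit, count, pct, cumul in table:
    print(f"{limit:>12}|{count:>10}|{pct:>10.6f}|{cumul:>10.6f}|")
```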
The silkpython(3) manual page provides examples that use PySiLK to create arbitrary fields to use as part of the key for rwstats.

ENVIRONMENT
FILES

NOTES
rwstats functionally replaces the combination of the following, where N is one more than the number of fields passed to rwuniq(1):
rwuniq --fields=... | sort -r -t '|' -k N | head -10

When the --bin-time switch is given and the three time fields (starting-time ("sTime"), ending-time ("eTime"), and duration ("duration")) are present in the key, the duration field's value is modified to be the difference between the ending and starting times.

When the three time-related key fields ("sTime","duration","eTime") are all in use, rwstats ignores the final time field when binning the records, but the field does appear in the output. Due to truncation of the milliseconds values, rwstats generates different numbers of bins depending on the order in which those three values appear in the --fields switch.

When computing distinct counts over a field, the field may not be part of the key; that is, you may not have "--fields=sip --values=sip-distinct". The distinct count in that case is always 1.

Using the --presorted-input switch sometimes introduces more issues than it solves, and --presorted-input is less necessary now that rwstats can use temporary files while processing input. When computing distinct IP counts, rwstats typically runs faster if you do not use the --presorted-input switch, even if the data was previously sorted. When using the --presorted-input switch, it is highly recommended that you use no more than one time-related key field ("sTime", "duration", "eTime") in the --fields switch and that the time-related key appear last in --fields. The issue is caused by rwsort considering the millisecond values on the times when sorting, while rwstats truncates the millisecond value.

rwstats's strength is its ability to build arbitrary keys and aggregate fields. For maps of a single key to a single value, see also rwbag(1). To create a binary file that contains multiple keys and values, use rwaggbag(1).
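
The interaction between millisecond times and --presorted-input described in these notes can be illustrated with a small model. It is a sketch under stated assumptions, not SiLK code: sorting uses the full-precision time (standing in for rwsort), binning truncates "sTime" to whole seconds and starts a new bin whenever the truncated key changes, and `presorted_bins` and the records are hypothetical.

```python
from itertools import groupby

def presorted_bins(records, fields):
    """Sort on full-precision keys, then bin on keys with sTime truncated to seconds."""
    def full(rec):
        return tuple(rec[f] for f in fields)

    def truncated(rec):
        return tuple(int(rec[f]) if f == "sTime" else rec[f] for f in fields)

    ordered = sorted(records, key=full)     # stands in for rwsort
    # A new bin starts whenever the truncated key changes.
    return [(key, len(list(group))) for key, group in groupby(ordered, key=truncated)]

flows = [
    {"sTime": 10.100, "sport": 80},
    {"sTime": 10.200, "sport": 53},
    {"sTime": 10.900, "sport": 80},
]
print(presorted_bins(flows, ["sTime", "sport"]))   # time first: key (10, 80) is split
print(presorted_bins(flows, ["sport", "sTime"]))   # time last: one bin per key
```

With the time field first, the two port-80 records are separated by the port-53 record in sort order, so the same truncated key is reported as two bins; with the time field last, equal keys stay adjacent.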

SEE ALSO
rwcut(1), rwnetmask(1), rwsort(1), rwuniq(1), rwbag(1), rwaggbag(1), rwpmapbuild(1), addrtype(3), ccfilter(3), int-ext-fields(3), pmapfilter(3), pysilk(3), silkpython(3), silk-plugin(3), sensor.conf(5), rwflowpack(8), silk(7), yaf(1), dlopen(3), tzset(3), environ(7)