GSP
Quick Navigator

Search Site

Unix VPS
A - Starter
B - Basic
C - Preferred
D - Commercial
MPS - Dedicated
Previous VPSs
* Sign Up! *

Support
Contact Us
Online Help
Handbooks
Domain Status
Man Pages

FAQ
Virtual Servers
Pricing
Billing
Technical

Network
Facilities
Connectivity
Topology Map

Miscellaneous
Server Agreement
Year 2038
Credits
 

USA Flag

 

 

Man Pages
rwbagtool(1) SiLK Tool Suite rwbagtool(1)

rwbagtool - Perform high-level operations on binary Bag files

  rwbagtool { --add | --subtract | --minimize | --maximize
              | --divide | --scalar-multiply=VALUE
              | --compare={lt | le | eq | ge | gt} }
        [--intersect=SETFILE | --complement-intersect=SETFILE]
        [--mincounter=VALUE] [--maxcounter=VALUE]
        [--minkey=VALUE] [--maxkey=VALUE]
        [--invert] [--coverset] [--ipset-record-version=VERSION]
        [--output-path=PATH]
        [--note-strip] [--note-add=TEXT] [--note-file-add=FILE]
        [--compression-method=COMP_METHOD]
        [BAGFILE[ BAGFILE...]]

  rwbagtool --help

  rwbagtool --version

rwbagtool performs various operations on binary Bag files and creates a new Bag file. A Bag is a set where each key is associated with a counter. rwbag(1) and rwbagbuild(1) are the primary tools used to create a Bag file. rwbagcat(1) prints a binary Bag file as text.

rwbagtool can add Bags together, subtract a subset of data from a Bag, divide a Bag by another, compare the counters of two Bag files, perform key intersection of a Bag with an IPset, extract the keys of a Bag as an IPset, or filter Bag entries based on their key or counter values.

In the command synopsis above, BAGFILE is a the name of a file or a named pipe, or the names "stdin" or "-" to have rwbagtool read from the standard input. If no Bag file names are given on the command line, rwbagtool attempts to read a Bag from the standard input. If BAGFILE does not contain a Bag, rwbagtool prints an error to stderr and exits abnormally.

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

The first set of options are mutually exclusive; only one may be specified. If none are specified, the counters in the Bag files are summed.
--add
Sum the counters for each key for all Bag files given on the command line. At least one Bag file must be specified, and any number of additional Bag files may be given. If a key is not present in an input file, a counter of zero is used. When no operation switch is specified on the command line, the add operation is the default.
--subtract
Subtract from the first Bag file all subsequent Bag files. At least one Bag file must be specified, and any number of additional Bag files may be given. If a key does not appear in the first Bag file, rwbagtool assumes it has a value of 0. If subtracting a key's counters results in a non-positive number, the key does appear in the resulting Bag file.
--minimize
Cause the output to contain the minimum counter seen for each key. Keys that do not appear in all input Bags do not appear in the output. At least one Bag file must be specified, and any number of additional Bag files may be given.
--maximize
Cause the output to contain the maximum counter seen for each key. The output contains each key that appears in any input Bag. At least one Bag file must be specified, and any number of additional Bag files may be given.
--divide
Divide the first Bag file by the second Bag file. It is an error if only one Bag file or more than two Bag files are given. Every key in the first Bag file must appear in the second file; the second Bag may have keys that do not appear in the first, and those keys do not appear in the output. Since Bags do not support floating point numbers, the result of the division is rounded to the nearest integer (values ending in .5 are rounded up). If the result of the division is less than 0.5, the key does not appear in the output.
--scalar-multiply=VALUE
Multiply each counter in the Bag file by the scalar VALUE, where VALUE is an integer in the range 1 to 18446744073709551615. This switch requires a single Bag as input.
--compare=OPERATION
Compare the key/counter pairs in exactly two Bag files. It is an error if only one Bag file or more than two Bag files are specified. The keys in the output Bag are only those for which the comparison denoted by OPERATION is true when comparing the key's counter in the first Bag with the key's counter in the second Bag. The counters for all keys in the output have the value 1. Any key that does not appear in both input Bag files does not appear in the result. The possible OPERATION values are the strings:
"lt"
GetCounter(Bag1, key) < GetCounter(Bag2, key)
"le"
GetCounter(Bag1, key) <= GetCounter(Bag2, key)
"eq"
GetCounter(Bag1, key) == GetCounter(Bag2, key)
"ge"
GetCounter(Bag1, key) >= GetCounter(Bag2, key)
"gt"
GetCounter(Bag1, key) > GetCounter(Bag2, key)

The result of the above operation is an intermediate Bag file. The following switches are applied next to remove entries from the intermediate Bag:
--intersect=SETFILE
Mask the keys in the intermediate Bag using the set in SETFILE. SETFILE is the name of a file or a named pipe containing an IPset, or the name "stdin" or "-" to have rwbagtool read the IPset from the standard input. If SETFILE does not contain an IPset, rwbagtool prints an error to stderr and exits abnormally. Only key/counter pairs where the key matches an entry in SETFILE are written to the output. (IPsets are typically created by rwset(1) or rwsetbuild(1).)
--complement-intersect=SETFILE
As --intersect, but only writes key/counter pairs for keys which do not match an entry in SETFILE.
--mincounter=VALUE
Cause the output to contain only those entries whose counter value is VALUE or higher. The allowable range is 1 to the maximum counter value; the default is 1.
--maxcounter=VALUE
Cause the output to contain only those entries whose counter value is VALUE or lower. The allowable range is 1 to the maximum counter value; the default is the maximum counter value.
--minkey=VALUE
Cause the output to contain only those entries whose key value is VALUE or higher. Default is 0 (or 0.0.0.0). Accepts input as an integer or as an IP address in dotted decimal notation.
--maxkey=VALUE
Cause the output to contain only those entries whose key value is VALUE or higher. Default is 4294967295 (or 255.255.255.255). Accepts input as an integer or as an IP address in dotted decimal notation.

The following switches control the output.
--invert
Generate a new Bag whose keys are the counters in the intermediate Bag and whose counter is the number of times the counter was seen. For example, this turns the Bag {sip:flow} into the Bag {flow:count(sip)}. Any counter in the intermediate Bag that is larger than the maximum possible key is attributed to the counter for the maximum key; to prevent this, specify "--maxcounter=4294967295" which removes all key-counter pairs whose counters do not fit into a key. (The --bin-ips switch on rwbagcat(1) allows one to invert a Bag file as it is being printed.)
--coverset
Instead of creating a Bag file as the output, write an IPset which contains the keys contained in the intermediate Bag.
--ipset-record-version=VERSION
Specify the format of the IPset records that are written to the output when the --coverset switch is used. VERSION may be 2, 3, 4, 5 or the special value 0. When the switch is not provided, the SILK_IPSET_RECORD_VERSION environment variable is checked for a version. The default version is 0. Since SiLK 3.11.0.
 0
Use the default version for an IPv4 IPset and an IPv6 IPset. Use the --help switch to see the versions used for your SiLK installation.
 2
Create a file that may hold only IPv4 addresses and is readable by all versions of SiLK.
 3
Create a file that may hold IPv4 or IPv6 addresses and is readable by SiLK 3.0 and later.
 4
Create a file that may hold IPv4 or IPv6 addresses and is readable by SiLK 3.7 and later. These files are more compact that version 3 and often more compact than version 2.
 5
Create a file that may hold only IPv6 addresses and is readable by SiLK 3.14 and later. When this version is specified, IPsets containing only IPv4 addresses are written in version 4. These files are usually more compact that version 4.
--output-path=PATH
Write the resulting Bag to PATH, where PATH is a filename, a named pipe, the keyword "stderr" to write the output to the standard error, or the keyword "stdout" or "-" to write the output to the standard output. If PATH names an existing file, rwbagtool exits with an error unless the SILK_CLOBBER environment variable is set, in which case PATH is overwritten. If this switch is not given, the output is written to the standard output. Attempting to write the binary output to a terminal causes rwbagtool to exit with an error.
--note-strip
Do not copy the notes (annotations) from the input files to the output file. Normally notes from the input files are copied to the output.
--note-add=TEXT
Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
--note-file-add=FILENAME
Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.
--compression-method=COMP_METHOD
Specify the compression library to use when writing output files. If this switch is not given, the value in the SILK_COMPRESSION_METHOD environment variable is used if the value names an available compression method. When no compression method is specified, output to the standard output or to named pipes is not compressed, and output to files is compressed using the default chosen when SiLK was compiled. The valid values for COMP_METHOD are determined by which external libraries were found when SiLK was compiled. To see the available compression methods and the default method, use the --help or --version switch. SiLK can support the following COMP_METHOD values when the required libraries are available.
none
Do not compress the output using an external library.
zlib
Use the zlib(3) library for compressing the output, and always compress the output regardless of the destination. Using zlib produces the smallest output files at the cost of speed.
lzo1x
Use the lzo1x algorithm from the LZO real time compression library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead.
snappy
Use the snappy library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead. Since SiLK 3.13.0.
best
Use lzo1x if available, otherwise use snappy if available, otherwise use zlib if available. Only compress the output when writing to a file.
--help
Print the available options and exit.
--version
Print the version number and information about how SiLK was configured, then exit the application.

In the following examples, the dollar sign ("$") represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the back slash ("\") is used to indicate a wrapped line.

The examples assume the following contents for the files:

 Bag1.bag    Bag2.bag    Bag3.bag    Bag4.bag    Mask.set
  3|  10|     1|   1|     2|   8|     1|   1|          2
  4|   7|     4|   2|     4|  10|     4|   3|          4
  6|  14|     7|  32|     6|  14|     6|   4|          6
  7|  23|     8|   2|     7|  12|     7|   4|          8
  8|   2|                 9|   8|     8|   6|

The examples use rwbagcat(1) to print the contents of the Bag files.

Adding Bag files produces a Bag whose keys are the set union of the keys in the input Bags. The counter for each key is the sum of the key's counters in each input Bag.

 $ rwbagtool --add Bag1.bag Bag2.bag > Bag-sum.bag
 $ rwbagcat --key-format=decimal Bag-sum.bag
  1|   1|
  3|  10|
  4|   9|
  6|  14|
  7|  55|
  8|   4|

 $ rwbagtool --add Bag1.bag Bag2.bag Bag3.bag > Bag-sum2.bag
 $ rwbagcat --key-format=decimal Bag-sum2.bag
  1|   1|
  2|   8|
  3|  10|
  4|  19|
  6|  28|
  7|  67|
  8|   4|
  9|   8|

The --subtract switch subtracts from the key/counter pairs in the first Bag file the key/counter pairs in all other Bag file arguments. Keys that are not present in the first argument are ignored. If subtraction results in a counter value of zero or less, the key is removed from the result.

 $ rwbagtool --subtract Bag1.bag Bag2.bag > Bag-diff.bag
 $ rwbagcat --key-format=decimal Bag-diff.bag
  3|  10|
  4|   5|
  6|  14|

 $ rwbagtool --subtract Bag2.bag Bag1.bag > Bag-diff2.bag
 $ rwbagcat --key-format=decimal Bag-diff2.bag
  1|   1|
  7|   9|

The output produced by the --minimize switch contains only the keys that appear in all of input Bags. For each key, the counter is the minimum value for that key in any input Bag.

 $ rwbagtool --minimize Bag1.bag Bag2.bag Bag3.bag > Bag-min.bag
 $ rwbagcat --key-format=decimal Bag-min.bag
  4|   2|
  7|  12|

The keys of the Bag file produced by --maximize are the same as the keys produced by --add; that is, the union of all keys in the input files. For each key, its counter is the maximum value seen for that key in any single input Bag file.

 $ rwbagtool --maximize Bag1.bag Bag2.bag Bag3.bag > Bag-max.bag
 $ rwbagcat --key-format=decimal Bag-max.bag
  1|   1|
  2|   8|
  3|  10|
  4|  10|
  6|  14|
  7|  32|
  8|   2|
  9|   8|

The --divide switch requires exactly two Bag files as input. The keys in the first Bag argument must be either the same as or a subset of those in the second argument. The counter for each key in the first Bag file is divided by that key's counter in the second file. If the result of the division is less than 0.5, the key is not included in the output.

 $ rwbagtool --divide Bag2.bag Bag4.bag > Bag-div1.bag
 $ rwbagcat --key-format=decimal Bag-div1.bag
   1|   1|
   4|   1|
   7|   8|

When the order of the Bag file arguments is reversed an error is reported.

 $ rwbagtool --divide Bag4.bag Bag2.bag > Bag-div2.bag
 rwbagtool: Error dividing bags; key 6 not in divisor bag

To work around this issue, use the --coverset switch to create a copy of Bag4.bag that contains only the keys in Bag2.bag.

 $ rwbagtool --coverset Bag2.bag > Bag2-keys.set
 $ rwbagtool --intersect=Bag2-keys.set  Bag4.bag > Bag4-small.bag
 $ rwbagtool --divide Bag4-small.bag Bag2.bag > Bag-div2.bag
 $ rwbagcat --key-format=decimal Bag-div2.bag
   1|   1|
   4|   2|
   8|   3|

The following command is the same as the above except the IPset and Bag files are piped between the tools instead of being written to disk:

 $ rwbagtool --coverset Bag2.bag                \
   | rwbagtool --intersect=-  Bag4.bag          \
   | rwbagtool --divide -  Bag2.bag             \
   | rwbagcat --key-format=decimal
   1|   1|
   4|   2|
   8|   3|

The --scalar-multiply switch multiplies each counter in the input Bag by the specified value. Exactly one Bag file argument is required.

 $ rwbagtool --scalar-multiply=7 Bag1.bag > Bag-multiply.bag
 $ rwbagcat --key-format=decimal Bag-multiply.bag
  3|  70|
  4|  49|
  6|  98|
  7| 161|
  8|  14|

Use two rwbagtool commands if multiple operations are desired.

 $ rwbagtool --add Bag1.bag Bag2.bag   \
   | rwbagtool --scalar-multiply=3 --output-path=Bag12-multi.bag
 $ rwbagcat --key-format=decimal Bag12-multi.bag
  1|   3|
  3|  30|
  4|  27|
  6|  42|
  7| 165|
  8|  12|

The --compare switch takes an argument that specifies how to compare the counters in two Bag files, and it requires exactly two Bag files as input. For each key that appears in both Bag files, the counter value in the first file is compared to counter value in the second file. If the comparison is true, the key appears in the resulting Bag file with a counter of 1. If the comparison is false, the key is not present in the output file. Keys that appear in only one of the input files are ignored.

The following comparisons operate on Bag1.bag and Bag2.bag which have as common keys 4, 7, and 8.

Find counters in Bag1.bag that are less than those in Bag2.bag:

 $ rwbagtool --compare=lt Bag1.bag Bag2.bag > Bag-lt.bag
 $ rwbagcat --key-format=decimal Bag-lt.bag
  7|   1|

Find counters in Bag1.bag that are less than or equal to those in Bag2.bag:

 $ rwbagtool --compare=le Bag1.bag Bag2.bag > Bag-le.bag
 $ rwbagcat --key-format=decimal Bag-le.bag
  7|   1|
  8|   1|

Find counters in Bag1.bag that are equal to those in Bag2.bag:

 $ rwbagtool --compare=eq Bag1.bag Bag2.bag > Bag-eq.bag
 $ rwbagcat --key-format=decimal Bag-eq.bag
  8|   1|

Find counters in Bag1.bag that are greater than or equal to those in Bag2.bag:

 $ rwbagtool --compare=ge Bag1.bag Bag2.bag > Bag-ge.bag
 $ rwbagcat --key-format=decimal Bag-ge.bag
  4|   1|
  8|   1|

Find counters in Bag1.bag that are greater than those in Bag2.bag:

 $ rwbagtool --compare=gt Bag1.bag Bag2.bag > Bag-gt.bag
 $ rwbagcat --key-format=decimal Bag-gt.bag
  4|   1|

A cover set is an IPset file that contains the keys that are present in any of the input Bag files. In other words, it is the union of the keys converted to an IPset. Since an operation switch is not provided in this command, an implicit --add operation is performed on the Bag files prior to creating the cover set. (rwsetcat(1) prints the contents of an IPset file as text.)

 $ rwbagtool --coverset Bag1.bag Bag2.bag Bag3.bag > Cover.set
 $ rwsetcat --key-format=decimal Cover.set
  1
  2
  3
  4
  6
  7
  8
  9

One use of a cover set is to limit the contents of a Bag file to keys that are present in a second Bag file:

 $ rwbagtool --coverset --output-path=Cover.set Bag1.bag
 $ rwbagtool --intersect=Cover.set Bag2.bag > Bag1-mask-Bag2.bag
 $ rwbagcat --key-format=decimal Bag1-mask-Bag2.bag
  4|   2|
  7|  32|
  8|   2|

To mask the contents of Bag2.bag by the keys that are not present in Bag1.bag:

 $ rwbagtool --complement-intersect=Cover.set Bag2.bag \
        > Bag1-notmask-Bag2.bag
 $ rwbagcat --key-format=decimal Bag1-notmask-Bag2.bag
  1|   1|

The output of the --invert switch is a Bag file that counts the number of times each counter is present in the input Bag file.

 $ rwbagtool --invert Bag1.bag > Bag-inv1.bag
 $ rwbagcat --key-format=decimal Bag-inv1.bag
  2|   1|
  7|   1|
 10|   1|
 14|   1|
 23|   1|

 $ rwbagtool --invert Bag2.bag > Bag-inv2.bag
 $ rwbagcat --key-format=decimal Bag-inv2.bag
  1|   1|
  2|   2|
 32|   1|

 $ rwbagtool --invert Bag3.bag > Bag-inv3.bag
 $ rwbagcat --key-format=decimal Bag-inv3.bag
  8|   2|
 10|   1|
 12|   1|
 14|   1|

When multiple Bag files are specified on the command line, the files are added prior to creating the inverted Bag. Even though the counter 2 appears three times in the files Bag1.bag and Bag2.bag, the key 2 is not present in the following since the add operation is performed first.

 $ rwbagtool --invert Bag1.bag Bag2.bag   \
   | rwbagcat --key-format=decimal
  1|   1|
  4|   1|
  9|   1|
 10|   1|
 14|   1|
 55|   1|

The --intersect switch takes an IPset file as an argument and limits the keys of the Bag produced by rwbagtool to only those keys that appear in the IPset file.

 $ rwbagtool --intersect=Mask.set Bag1.bag > Bag-mask.bag
 $ rwbagcat --key-format=decimal Bag-mask.bag
  4|   7|
  6|  14|
  8|   2|

The --complement-intersect switch limits the output to only those keys that do not appear in the IPset file.

 $ rwbagtool --complement-intersect=Mask.set Bag1.bag > Bag-mask2.bag
 $ rwbagcat --key-format=decimal Bag-mask2.bag
  3|  10|
  7|  23|

See also the next section.

In addition to limiting the result of rwbagtool to keys that appear or do not appear in an IPset file (cf. previous section), numeric limits may be used to restrict the keys or counters that in the resulting Bag file with use of the --minkey, --maxkey, --mincounter, and --maxcounter switches.

 $ rwbagtool --add --maxkey=5 Bag1.bag Bag2.bag > Bag-res1.bag
 $ rwbagcat --key-format=decimal Bag-res1.bag
  1|   1|
  3|  10|
  4|   9|

 $ rwbagtool --minkey=3 --maxkey=6 Bag1.bag > Bag-res2.bag
 $ rwbagcat --key-format=decimal Bag-res2.bag
  3|  10|
  4|   9|
  6|  14|

 $ rwbagtool --mincounter=20 Bag1.bag Bag2.bag > Bag-res3.bag
 $ rwbagcat --key-format=decimal Bag-res3.bag
  7|  55|

 $ rwbagtool --subtract --maxcounter=9 Bag1.bag Bag2.bag  \
        > Bag-res4.bag
 $ rwbagcat --key-format=decimal Bag-res4.bag
  4|   5|

To share a Bag file with a user who has a version of SiLK that includes different compression libraries, it may be necessary to change the the compression-method of the Bag.

It is not possible to change the compression-method directly. A new file must be created first, and then you may then replace the old file with the new file.

To create a new file that uses a different compression-method of the Bag file A.bag, use rwbagtool with the --add switch and specify the desired argument:

 $ rwbagtool --add --compression=none --output-path=A1.bag A.bag

Unfortunately, the Bag tools do not allow changing the key type or counter type of a Bag file. To change the types, use rwbagcat(1) to write the Bag as text and rwbagbuild(1) to convert the text back to a Bag file.

 $ rwbagcat Bag1.bag    \
   | rwbagbuild --bag-input=- --output-path=Bag1-typed.bag  \
        --key-type=sport --counter-type=sum-bytes

Use rwfileinfo(1) to see the type of the key and counter.

 $ rwfileinfo --field=bag Bag1-typed.bag
 Bag1-typed.bag:
   bag          key: sPort @ 4 octets; counter: sum-bytes @ 8 octets

Alternatively, one may use PySiLK (see pysilk(3)) to modify the key type and counter type.

 $ cat bag-type.py
 import sys
 from silk import *

 key_type = sys.argv[1]
 counter_type = sys.argv[2]
 old_file = sys.argv[3]
 new_file = sys.argv[4]

 old = Bag.load(old_file, key_type=IPv4Addr)
 new = Bag(old, key_type=key_type, counter_type=counter_type)
 new.save(new_file)
 $
 $ python bag-type.py sipv4 sum-packets Bag1.bag Bag1-type2.bag
 $ rwfileinfo --field=bag Bag1-type2.bag
 Bag1-type2.bag:
   bag          key: sIPv4 @ 4 octets; counter: sum-packets @ 8 octets

SILK_IPSET_RECORD_VERSION
This environment variable is used as the value for the --ipset-record-version when that switch is not provided. Since SiLK 3.7.0.
SILK_CLOBBER
The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.
SILK_COMPRESSION_METHOD
This environment variable is used as the value for --compression-method when that switch is not provided. Since SiLK 3.13.0.

rwbag(1), rwbagbuild(1), rwbagcat(1), rwfileinfo(1), rwset(1), rwsetbuild(1), rwsetcat(1), silk(7), zlib (3)
2022-04-12 SiLK 3.19.1

Search for    or go to Top of page |  Section 1 |  Main Index

Powered by GSP Visit the GSP FreeBSD Man Page Interface.
Output converted with ManDoc.