viz - Makes invisible characters visible; binary file lister
viz [ -f fmt | -F fmtfile ] [ -h ] [ -l n ] [ -p | -o | -P len | -O len ] [ file ... ]
Viz copies its input to its output, converting invisible characters to a
visible form. If no files are given, stdin is read. If the -t option is used,
the output is written in a form that can be completely inverted (see inviz),
which allows a binary file to be converted to a text form, edited, and then
converted back. It is much more flexible than either cat -v or od (either old
or POSIX od), and it is also 2-4 times faster.
By default, the input is treated as a sequence of characters. However, a file
format may be specified, in which case viz can handle files containing a
mixture of data of arbitrary types.
The format can include repeat counts, and comments to be embedded in the
output stream.
Uninteresting data can be skipped over and not printed on stdout.
Additional flexibility is provided through user-settable variables, which can be
used as repeat counts. Simple math can be done on the variables, and chars,
shorts, or integers from the input stream can be stored in them.
-f fmt: specifies the type of input data, and how to format the output.
-F fmtfile: gives the name of a file containing a fmt string.
-h: print a help message on stderr, then exit.
-l n: limit output to no more than n characters per line (default 80).
-p: print offsets (each line begins with "offset:").
-P len: print offsets using the format "len*n+offset:". This is useful if the
input file is organized into fixed-length logical records of data.
-o, -O len: like -p and -P, but the offset is printed in octal.
-t: print the type of input data (e.g. int or float) in the output stream;
this allows non-text data to be retranslated properly, so that the output can
be completely inverted by inviz(1) to reproduce the original input.
-L: display a readable version (on stderr) of the internal representation of
the format string; not useful for ordinary use.
-D: turn on debug mode; not useful for ordinary use.
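The len*n+offset: labelling used by -P can be illustrated with a short Python sketch. The function name is hypothetical, and the exact rendering (decimal digits, no padding, record length printed literally) is an assumption based only on the option description above, not on the viz source.

```python
def record_offset_label(position: int, record_len: int) -> str:
    """Label a byte position as 'len*n+offset:' for fixed-length records.

    Assumption: n is the zero-based record index and offset is the
    position within that record, both printed in decimal.
    """
    record_index, offset = divmod(position, record_len)
    return f"{record_len}*{record_index}+{offset}:"
```

For example, with 32-byte records, byte 70 lies 6 bytes into the third record (index 2), so the sketch yields "32*2+6:".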
By default, the input is treated as a stream of ASCII characters. Part or all of
the input can optionally be interpreted as shorts, ints, longs (all signed or
unsigned), or floats or doubles. Since all characters are made visible,
newlines and empty lines in the output are not significant.
When viz is translating ASCII characters,
- printing characters are passed through untouched;
- the usual C escape sequences are used for backspace (\b), formfeed (\f),
newline (\n), return (\r), tab (\t), backslash (\\), and null (\0);
- if compiled with an ANSI C compiler, viz also uses these additional escape
sequences: audible bell → \a and vertical tab → \v;
- the caret (^) is printed as \^, so that a plain caret can be used as a
prefix for control characters (see next bullet);
- other characters in the range \0..\037 are displayed as control characters
(e.g. \01 is displayed as ^A);
- characters above \0177 are displayed as octal C escape sequences (\nnn).
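The character rules above can be approximated in a few lines of Python. This is only a sketch of the documented behaviour, not the viz implementation: the names are hypothetical, and the treatment of DEL (\0177), which the rules do not mention, is an assumption.

```python
# Escapes listed in the rules above, keyed by byte value.
C_ESCAPES = {
    0x08: "\\b", 0x0C: "\\f", 0x0A: "\\n", 0x0D: "\\r", 0x09: "\\t",
    0x5C: "\\\\", 0x00: "\\0",
    0x07: "\\a", 0x0B: "\\v",   # the ANSI C additions
    0x5E: "\\^",                # a literal caret
}

def visualize(data: bytes) -> str:
    """Render bytes in a visible form following the rules listed above."""
    out = []
    for byte in data:
        if byte in C_ESCAPES:
            out.append(C_ESCAPES[byte])
        elif byte < 0o40:
            out.append("^" + chr(byte + 0o100))   # e.g. \01 becomes ^A
        elif byte < 0o177:
            out.append(chr(byte))                 # printing character
        else:
            # Bytes above \0177 become octal escapes; DEL (\0177) is
            # treated the same way here, which is an assumption.
            out.append("\\%03o" % byte)
    return "".join(out)
```

Because the caret itself is escaped as \^, an unescaped ^ in the result always introduces a control character, which is what makes the text unambiguous.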
Viz can print integer data (char, short, int, or long) in decimal, octal,
hex, or binary (default decimal), or use a user-specified printf() format. If
a multi-byte number is printed in binary, the binary representations of the
bytes are separated by commas to improve readability.
Floats and doubles can be printed using a printf() ``f'' or ``g'' format
(default g), or use a user-specified printf() format.
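The comma-separated binary form can be pictured with the following Python sketch. The function name is hypothetical; printing eight digits per byte with the most significant byte first is an assumption, since the text above only says that the bytes are separated by commas.

```python
def binary_with_commas(value: int, size: int) -> str:
    """Format a non-negative integer of `size` bytes as comma-separated binary.

    Assumptions: eight binary digits per byte, most significant byte first.
    """
    raw = value.to_bytes(size, byteorder="big")
    return ",".join(f"{byte:08b}" for byte in raw)
```

Under these assumptions, the 2-byte value 258 would be shown as 00000001,00000010.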
If the -t (type of data) option is used, output lines containing numbers
begin with \#C, \#S, \#I, \#L, \#F, or \#D, indicating that the line contains
char, short, int, long, float, or double-size numeric data, respectively. This
allows inviz(1) to exactly invert the output.
The output format is controlled with the -f fmt or -F fmtfile options. A
format specifies the types of data contained in the file and how to print
them. A fmtfile is the name of a file containing a format; this is very useful
for holding complex or frequently-repeated formats.
A simple format is
[ repeat_count ] datasize [ output_format ]
or
[ repeat_count ] output_format
For instance, '20 short u' says to read 20 objects whose size is that of
a short, and interpret them as unsigned decimal values. Whitespace is optional
around any of the elements of a format. The members of a simple format are:
repeat_count tells how many items of size datasize to process from the input
stream. If the format doesn't begin with a repeat count, an infinite loop is
assumed and the format is applied to the entire input stream. A repeat count
is one of
- a decimal number (e.g. 120);
- a double-dollar sign ($$), meaning repeat until EOF is encountered on the
input; or
- a viz register $c, where c can be any character. Registers are discussed
below.
datasize indicates the size of each input datum in bytes.
Valid datasizes are
Long form   Short form
char        C
short       S
int         I
long        L
float       F
double      D
            Z
The single-character uppercase names [CSILFD] are shorthand for char, short,
etc. If the long form is used, it has to be followed by a non-alphabetic
character; use whitespace if necessary. There is no explicit "unsigned" size,
as each unsigned type is always the same size as the corresponding signed
form. The special datasize Z means a zero-terminated string (i.e., a
null-terminated string); see output format a, below. The Z type is the only
datasize that doesn't correspond to a fundamental C type.
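Since each datasize is the size of a C type, the actual byte counts depend on the platform. As a rough illustration, Python's struct module reports the native sizes used by the C compiler that built the interpreter; the mapping and function name below are hypothetical, and the values need not match those of a particular viz binary.

```python
import struct

# struct's native-mode codes for the C types behind each viz shorthand.
NATIVE_CODES = {"C": "c", "S": "h", "I": "i", "L": "l", "F": "f", "D": "d"}

def datasize_table() -> dict:
    """Return the native size in bytes of each C type on this platform."""
    return {name: struct.calcsize("@" + code)
            for name, code in NATIVE_CODES.items()}
```

Calling datasize_table() on a typical 64-bit Linux system gives 8 for L, while other platforms may report 4, which is why formats that use long are not portable between machines.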
output_format specifies how to interpret and print the
data. Valid output formats are:
- ~
- Discard (don't print) the input. The default datasize is
char.
- a
- Interpret the input as a character and print using the rules given above
for text. Similar to od(1)'s -a option. The only allowed
datasizes are char (or C) and Z. The Z
datasize is special. It corresponds to all characters in the input stream
from the current position up through a null. This is useful for printing
null-terminated strings that are embedded in fixed-length fields that
otherwise contain garbage. The number of characters matched by a Z
datasize can be collected into a viz register, so that you can then
skip the remaining junk. See also output format c.
- b
- Interpret the input as non-negative and integral-valued, and print as a
binary number. The default datasize is char. The float and double datasizes
cannot be used with the b output format.
- c
- Interpret the input as a character and print its ASCII representation,
except for non-printing characters and blanks, which are printed as 3-digit
octal numbers \nnn. The only allowed datasize is char (or C). Similar to
od(1)'s -c option. See also output format a.
- d
- Interpret the input as a signed integral-valued number and print it in
decimal. The default datasize is int. The float and double datasizes cannot
be used with the d output format.
- f
- Interpret the input as a float or double, and print it using
printf(3s) %f format. The allowed datasizes are float (the default) and
double.
- g
- Interpret the input as a float or double, and print it using
printf(3s) %g format. The allowed datasizes are float (the default) and
double.
- h
- Interpret the input as an unsigned number and print it in hexadecimal. The
default datasize is int. The float and double datasizes cannot be used with
the h output format.
- o
- Interpret the input as an unsigned number and print it in octal. The
default datasize is int. The float and double datasizes cannot be used with
the o output format.
- u
- Interpret the input as an unsigned number and print it in decimal. The
default datasize is int. The float and double datasizes cannot be used with
the u output format.
- x
- A synonym for h (hex) format.
- "printfstring"
- If you enter a quoted string, then the printfstring is used as a
printf format. It is up to you to give a format that is appropriate to the
datatype; no checking is done. For instance, you might use
-f 'I "%+d "'
to have integers always printed with signs.
If you specify the datasize, but not the output format, the defaults are:
Datasize                    Default Output Format
C, char, Z                  a
S, short, I, int, L, long   d
F, float, D, double         g
If you specify an output format, but not the datasize, the defaults are:
Output Format               Default Datasize
a, b, c, ~                  char
d, u, o, h, x               int
f, g                        float
Formats can be concatenated, and are then processed one after the other:
-f '24Ca 5Fg'
processes 24 characters and then 5 floats.
Formats can be grouped and nested with parentheses:
-f 'int 3(24a 5g)'
means 1 int, then 3 groups of 24 chars + 5 floats. Since the format doesn't
begin with a number, the entire format is repeated until EOF is reached; it's
equivalent to
-f '$$(int 3(24a 5g))'
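To make the byte layout consumed by one pass of 'int 3(24a 5g)' concrete, here is a Python sketch that unpacks the same record with the struct module. It assumes 4-byte ints and floats with no padding between fields; the names RECORD and read_record are hypothetical.

```python
import struct

# "=" selects standard sizes without padding: int = 4 bytes, float = 4 bytes.
# One pass of 'int 3(24a 5g)' is 1 int, then 3 x (24 chars + 5 floats).
RECORD = struct.Struct("=i" + "24s5f" * 3)

def read_record(buf: bytes):
    """Unpack one record into (int_value, [(chars, floats), ...])."""
    fields = RECORD.unpack(buf[:RECORD.size])
    value, rest = fields[0], fields[1:]
    # Each group contributes 6 fields: one 24-byte string and five floats.
    groups = [(rest[i], rest[i + 1:i + 6]) for i in range(0, len(rest), 6)]
    return value, groups
```

Under these assumptions one record is 4 + 3 * (24 + 20) = 136 bytes, so a file processed with the $$ form is treated as a sequence of 136-byte records.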
A format can contain [comment], in which case the comment is embedded in the
output stream. For instance,
-f 'I [Power: ] F'
prints one integer, then the string "Power: ", and then a float. Within a
comment, the usual C escape sequences are recognized, as are the viz registers
`$c' (the contents of $c are substituted wherever the string is encountered).
The special meaning of $ may be escaped by preceding it with a backslash.
All text from a sharp (#) through a newline is considered a comment about the
format, and is ignored (it's not put into the output stream, either). This
allows you to comment long format files. For instance, a format file could
contain
I~ >n # Read the number of bytes into register n
$n/F # Convert to a number of floats
(See the section on Registers.)
A newline can be placed in the output stream either by using a comment
containing a newline, such as ``[\n]'', or by using any of the special format
members ``n'', ``;'', and ``\n''. (These three are completely equivalent; use
whichever suits best.) The advantage of using the special formats instead of
a comment is that it gives better-looking output: viz knows about the newline
inserted by the format, and takes this into account when formatting output.
If the input stream allows seeking, you can invoke fseek() by entering
offset seek whence
or
offset ! whence
Here, whence is one of `0', `1', or `2', and indicates seeking from the
beginning of the file, current location in the file, or end of the file. For
example, 25 ! 1 or 25 seek 1 seeks 25 characters forward from the current
location in the file; 0 ! 0 or 0 seek 0 rewinds the file; and -25 ! 2 or
-25 seek 2 goes to 25 characters before the end of the file.
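The three whence values are the same ones fseek() uses, and Python's seek() follows the same numbering, so the examples above can be reproduced on an in-memory stream. This sketch only illustrates the seek semantics; the function name and the 100-byte stand-in file are hypothetical.

```python
import io

def seek_demo() -> list:
    """Replay the three seek examples on a 100-byte stream."""
    stream = io.BytesIO(bytes(100))   # 100 zero bytes standing in for a file
    positions = []
    stream.read(10)                   # pretend 10 bytes were already read
    stream.seek(25, io.SEEK_CUR)      # like "25 ! 1": forward from here
    positions.append(stream.tell())
    stream.seek(0, io.SEEK_SET)       # like "0 ! 0": rewind
    positions.append(stream.tell())
    stream.seek(-25, io.SEEK_END)     # like "-25 ! 2": 25 before the end
    positions.append(stream.tell())
    return positions
```

The resulting positions are 35, 0, and 75, matching a forward skip, a rewind, and a seek relative to the end of the file.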
You can put the value of any integer data type read from the input stream into
a register, by following the format with ``>registername.'' For instance,
-f 'I~ >n'
collects an integer (but doesn't print its value, since the output format is
"~"), and stores the value in the register named n.
If you used a repeat count in the format, the value stored in the register is
the last read value. For the special case of the Z datasize, the value stored
in a register is the number of characters that were matched by the Z datasize,
including the trailing null. This allows you to know how many characters were
processed by the Z format.
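The count stored for a Z datum is what lets a format skip the unused remainder of a fixed-length field. A minimal Python sketch of that bookkeeping follows; read_z is a hypothetical helper, not part of viz.

```python
def read_z(field: bytes):
    """Return (text, consumed) for a zero-terminated string at the start
    of `field`; `consumed` includes the trailing null, like the register
    value described above."""
    end = field.index(b"\0")
    return field[:end], end + 1
```

For a 16-byte field holding b"abc\0" followed by garbage, read_z reports 4 bytes consumed, leaving 16 - 4 = 12 bytes of junk to skip before the next field.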
There are 255 registers you can use, named with any single character other than
null. The register can be used as a count by prefixing it with "$". For
instance, if a data stream contains an integer count that indicates the number
of following floats, you could print just the float values by using
-f 'I~ >n $nF'
This reads but doesn't print the count (I~), stores it in register $n (>n),
and then reads and prints that number of floats ($nF).
If you do not store the value of an integer in a register, it is automatically
stored in register $#. Thus the above example could also have been written as
-f 'I~ $#F'
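The count-then-data pattern handled by 'I~ >n $nF' corresponds to the following Python sketch, which reads an integer count and then that many floats. It assumes 4-byte ints and floats in native byte order; read_counted_floats is a hypothetical name.

```python
import struct

def read_counted_floats(buf: bytes) -> tuple:
    """Read an int count, then that many floats that follow it."""
    (count,) = struct.unpack_from("=i", buf, 0)    # like I~ >n (not printed)
    return struct.unpack_from(f"={count}f", buf, 4)  # like $nF
```

Only the floats are returned, mirroring the way the ~ output format suppresses the count itself.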
You can do arithmetic on registers. This is typically used for converting a byte
count into a count of, say, integers. The syntax is
$registername op value [ op value ...]
The operators op and their effects are
Operator   Effect
=          assign value to registername
+          add value to registername
-          subtract value from registername
*          multiply registername by value
/          divide registername by value
%          registername modulo value
In addition, the special unary operator P prints the register value as a
decimal number in the output stream, but has no other effect. For example,
-f '$I = 3P+4P $xP'
sets $I to 3, prints $I, adds 4 to $I, prints the new value of $I, and
finally prints the value of $x.
The value is either an unsigned number, or $registername, or one of [CSILFD],
which stand for, respectively, sizeof(char), sizeof(short), sizeof(int),
sizeof(long), sizeof(float), and sizeof(double). Note that these letters are
the same as used for specifying datasizes. For instance, if a data stream
contains an integer count that indicates the number of following bytes, but
the data are a group of floats, you could convert the byte count to a number
of floats, and then print the float values by using the format
-f 'I~ >n $n/F $nF'
This reads but doesn't print the number of bytes (I~), stores the count in
register $n (>n), divides the number of bytes by the size of a float ($n/F),
and then reads and prints that number of floats ($nF).
Operators are evaluated left-to-right, and there is no precedence. For instance,
``$n+3*5'' adds 3 to $n and then multiplies $n by 5; it doesn't add 15 to $n.
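The left-to-right rule can be modelled in Python by applying each operator to the running register value in turn. This is a sketch of the evaluation order only: apply_ops and OPS are hypothetical names, and integer division on non-negative values is an assumption.

```python
import operator

# Each operator updates the register using the value that follows it.
OPS = {
    "=": lambda _old, value: value,
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,   # assumption: integer division
    "%": operator.mod,
}

def apply_ops(register: int, steps) -> int:
    """Apply (op, value) pairs strictly left to right, with no precedence."""
    for op, value in steps:
        register = OPS[op](register, value)
    return register
```

Starting from $n = 2, the steps for ``$n+3*5'' give (2 + 3) * 5 = 25, not 2 + 15 = 17.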
There are several special registers. You have already read about $#; the
complete list of special registers is:
- $# holds the last integer value scanned if you didn't use ">c" to name
another register.
- $( is a counter: it counts the number of iterations at the current
parenthesis depth (a separate copy is kept for each depth).
- $@ contains the total number of bytes processed so far.
The $( counter can be useful in labelling successive values read, or
successive records in a file. For example,
-f '3( [$(: ] I F \n)'
prints
1: intvalue floatvalue
2: intvalue floatvalue
3: intvalue floatvalue
This program used to be called vis, but then HP came up with its own program
of the same name. Too bad.