NAME

viz - Makes invisible characters visible; binary file lister

SYNOPSIS

viz [ -f fmt | -F fmtfile ] [ -h ] [ -l n ] [ -t ]
[ -p | -o | -P len | -O len ] [ file ... ]

DESCRIPTION

Viz copies its input to its output, converting invisible characters to a visible form. If no files are given, stdin is read. With the -t option, the output is written in a form that inviz(1) can invert completely, so a binary file can be converted to text, edited, and then converted back. It is much more flexible than either cat -v or od (either old or POSIX od), and it is also 2-4 times faster.

By default, the input is treated as a sequence of characters. However, a file format may be specified, in which case viz can handle files containing a mixture of data of arbitrary types.

The format can include repeat counts and comments to be embedded in the output stream.

Uninteresting data can be skipped over and not printed on stdout.

Additional flexibility is provided through user-settable variables, which can be used as repeat counts. Simple math can be done on the variables, and chars, shorts, or integers from the input stream can be stored in them.

OPTIONS

-f fmt: specifies the type of input data, and how to format the output.

-F fmtfile: gives the name of a file containing a fmt string.

-h: print a help message on stderr, then exit.

-l n: limit output to no more than n characters per line (def 80).

-p: print offsets (each line begins offset: ).

-P len: print offsets using the format len*n+offset:. This is useful if the input file is organized into fixed-length logical records of data.

-o, -O len: like -p and -P, but the offset is printed in octal.

-t: prints the type of input data (e.g. int or float) in the output stream; this allows non-text data to be retranslated properly, so that the output can be completely inverted by inviz(1) to reproduce the original input.

-L: display a readable version (on stderr) of the internal representation of the format string; not useful for ordinary use.

-D: turn on debug mode; not useful for ordinary use.
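
The record-relative offsets selected by -P and -O come down to a little arithmetic. The Python sketch below is not taken from viz; record_offset is a hypothetical helper, and both the zero-based record number and the exact text printed are assumptions. It only shows how a byte position splits into a record number n and an offset within the record.

```python
def record_offset(position: int, record_len: int) -> tuple[int, int]:
    """Split an absolute byte position into (record number, offset in record)."""
    if record_len <= 0:
        raise ValueError("record_len must be positive")
    return divmod(position, record_len)

# Byte 70 of a file made of 32-byte records lies in record 2, at offset 6,
# because 32*2 + 6 == 70.  The printed layout here is an assumption.
n, offset = record_offset(70, 32)
print(f"32*{n}+{offset}:")
```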

OUTPUT

By default, the input is treated as a stream of ASCII characters. Part or all of the input can optionally be interpreted as shorts, ints, longs (all signed or unsigned), or floats or doubles. Since all characters are made visible, newlines and empty lines in the output are not significant.

When viz is translating ASCII characters,

printing characters are passed through untouched;
the usual C escape sequences are used for backspace (\b), formfeed (\f), newline (\n), return (\r), tab (\t), backslash (\\), and null (\0);
if compiled with an ANSI C compiler, viz also uses these additional escape sequences: audible bell → \a and vertical tab → \v;
the caret (^) is printed as \^, so that a plain caret can be used as a prefix for control characters (see next bullet);
other characters in the range \0..\037 are displayed as control characters (e.g. \01 is displayed as ^A);
characters above \0177 are displayed as octal C escape sequences (\nnn).
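
The translation rules above can be sketched as a small Python function. This is an illustration, not viz's implementation: the names C_ESCAPES, visible, and visible_text are hypothetical, the ANSI C escapes are assumed to be enabled, and DEL (\0177) is assumed to be printed in octal like the bytes above it, which the list does not spell out.

```python
C_ESCAPES = {
    0x08: r"\b", 0x0C: r"\f", 0x0A: r"\n", 0x0D: r"\r",
    0x09: r"\t", 0x5C: r"\\", 0x00: r"\0",
    0x07: r"\a", 0x0B: r"\v",          # the two ANSI C additions
}

def visible(byte: int) -> str:
    """Return a visible form of one byte, following the rules listed above."""
    if byte in C_ESCAPES:
        return C_ESCAPES[byte]
    if byte == ord("^"):
        return r"\^"                    # keeps a bare ^ free for control characters
    if byte <= 0o37:
        return "^" + chr(byte + 0o100)  # \01 -> ^A, \032 -> ^Z
    if byte >= 0o177:
        return "\\%03o" % byte          # assumption: DEL is handled like bytes above it
    return chr(byte)                    # printing character: passed through

def visible_text(data: bytes) -> str:
    """Apply visible() to every byte of the input."""
    return "".join(visible(b) for b in data)

print(visible_text(b"a\tb\x01^\xff\n"))
```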

Viz can print integer data (char, short, int, or long) in decimal, octal, hex, or binary (default decimal), or use a user-specified printf() format. If a multi-byte number is printed in binary, the binary representations of the bytes are separated by commas to improve readability.
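
As a rough picture of the comma-separated binary form, the Python sketch below renders each byte of a multi-byte value separately. The helper name binary_form is hypothetical, and the zero-padding to eight digits and the order in which the bytes appear are assumptions, not documented viz behavior.

```python
def binary_form(raw: bytes) -> str:
    """Render each byte as 8 binary digits, with commas between bytes."""
    return ",".join(format(b, "08b") for b in raw)

# A 2-byte value (258 == 0x0102), written here with the high byte first.
print(binary_form((258).to_bytes(2, "big")))
```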

Floats and doubles can be printed using a printf() ``f'' or ``g'' format (default g), or use a user-specified printf() format.

If the -t (type of data) option is used, output lines containing numbers are begun with \#C, \#S, \#I, \#L, \#F, or \#D, indicating that the line contains char, short, int, long, float, or double-size numeric data, respectively. This allows inviz(1) to exactly invert the output.

FORMAT SPECIFICATION

The output format is controlled with the -f fmt or -F fmtfile options. A format specifies the types of data contained in the file and how to print them. A fmtfile is the name of a file containing a format; this is very useful for holding complex or frequently-repeated formats.

Simple Formats

A simple format is [ repeat_count ] datasize [ output_format ]
or
[ repeat_count ] output_format
For instance, '20 short u' says to read 20 objects whose size is that of a short, and interpret them as unsigned decimal values. Whitespace is optional around any of the elements of a format. The members of a simple format are:

repeat_count tells how many items of size datasize to process from the input stream. If the format doesn't begin with a repeat count, an infinite loop is assumed and the format is applied to the entire input stream. A repeat count is one of

a decimal number (e.g. 120);
a double-dollar sign ($$), meaning repeat until EOF is encountered on the input; or
a viz register $c, where c can be any character. Registers are discussed below.

datasize indicates the size of each input datum in bytes. Valid datasizes are

char, C
short, S
int, I
long, L
float, F
double, D
Z

The single-character uppercase names [CSILFD] are shorthand for char, short, etc. If the long form is used, it has to be followed by a non-alphabetic character; use whitespace if necessary. There is no explicit "unsigned" size, as each unsigned type is always the same size as the corresponding signed form. The special datasize Z means a zero-terminated string (i.e., a null-terminated string); see output format a, below. The Z type is the only datasize that doesn't correspond to a fundamental C type.

output_format specifies how to interpret and print the data. Valid output formats are:

~: Discard (don't print) the input. The default datasize is char.
a: Interpret the input as a character and print using the rules given above for text. Similar to od(1)'s -a option. The only allowed datasizes are char (or C) and Z. The Z datasize is special. It corresponds to all characters in the input stream from the current position up through a null. This is useful for printing null-terminated strings that are embedded in fixed-length fields that otherwise contain garbage. The number of characters matched by a Z datasize can be collected into a viz register, so that you can then skip the remaining junk. See also output format c.
b: Interpret the input as non-negative and integral-valued, and print as a binary number. The default datasize is char. Datasizes float and double cannot be used with the b output format.
c: Interpret the input as a character and print its ASCII representation, except for non-printing characters and blanks, which are printed as 3-digit octal numbers \nnn. Similar to od(1)'s -c option. The only allowed datasize is char (or C). See also output format a.
d: Interpret the input as a signed integral-valued number and print it in decimal. The default datasize is int. Datasizes float and double cannot be used with the d output format.
f: Interpret the input as a float or double, and print it using printf(3s) %f format. The allowed datasizes are float (the default) and double.
g: Interpret the input as a float or double, and print it using printf(3s) %g format. The allowed datasizes are float (the default) and double.
h: Interpret the input as an unsigned number and print it in hexadecimal. The default datasize is int. Datasizes float and double cannot be used with the h output format.
o: Interpret the input as an unsigned number and print it in octal. The default datasize is int. Datasizes float and double cannot be used with the o output format.
u: Interpret the input as an unsigned number and print it in decimal. The default datasize is int. Datasizes float and double cannot be used with the u output format.
x: A synonym for h (hex) format.
"printfstring": If you enter a quoted string, then the printfstring is used as a printf format. It is up to you to give a format that is appropriate to the datatype; no checking is done. For instance, you might use -f 'I " %+d "'
to have integers always printed with signs.
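
To preview what a printf string such as " %+d " produces, Python's % operator can stand in for printf(), since it follows the same conventions for simple conversions like %d. The helper apply_printf below is hypothetical and only illustrates the effect of the format string; it says nothing about how viz itself spaces or wraps its output.

```python
def apply_printf(fmt: str, values) -> str:
    """Format each value with a printf-style string and join the results."""
    return "".join(fmt % v for v in values)

# With %+d every integer carries an explicit sign.
print(apply_printf(" %+d ", [5, -3, 0]))
```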

Default Datasizes and Output Formats

If you specify the datasize, but not the output format, the defaults are:

Datasize                     Default Output Format

C, char, Z                   a
S, short, I, int, L, long    d
F, float, D, double          g

If you specify an output format, but not the datasize, the defaults are:

Output Format     Default Datasize

a, b, c, ~        char
d, u, o, h, x     int
f, g              float
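
The two default tables amount to a pair of lookups. The Python sketch below uses the single-letter datasize names; DEFAULT_FORMAT, DEFAULT_SIZE, and complete are hypothetical names chosen for illustration and are not part of viz.

```python
DEFAULT_FORMAT = {"C": "a", "Z": "a", "S": "d", "I": "d", "L": "d", "F": "g", "D": "g"}
DEFAULT_SIZE = {"a": "C", "b": "C", "c": "C", "~": "C",
                "d": "I", "u": "I", "o": "I", "h": "I", "x": "I",
                "f": "F", "g": "F"}

def complete(datasize=None, output_format=None):
    """Fill in whichever half of a simple format was left out."""
    if datasize is None and output_format is None:
        raise ValueError("need a datasize or an output format")
    if output_format is None:
        output_format = DEFAULT_FORMAT[datasize]
    if datasize is None:
        datasize = DEFAULT_SIZE[output_format]
    return datasize, output_format

print(complete(datasize="S"))        # short defaults to decimal
print(complete(output_format="u"))   # unsigned decimal defaults to int
```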

Concatenation

Formats can be concatenated, and are then processed one after the other: -f '24Ca 5Fg'
processes 24 characters and then 5 floats.
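
For readers who know Python's struct module, the layout that '24Ca 5Fg' describes resembles the struct format "24s5f". The sketch below assumes 4-byte floats in native byte order with no padding between fields; read_record and the sample data are made up for illustration.

```python
import struct

RECORD = "=24s5f"   # 24 chars followed by 5 floats, no padding (assumed layout)

def read_record(data: bytes):
    """Decode one record: a 24-byte text field and five floats."""
    text, *floats = struct.unpack_from(RECORD, data)
    return text, floats

sample = struct.pack(RECORD, b"label".ljust(24, b"\0"), 1.0, 2.0, 3.0, 4.0, 5.0)
text, floats = read_record(sample)
print(text, floats)
```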

Grouping

Formats can be grouped and nested with parentheses: -f 'int 3(24a 5g)'
means 1 int, then 3 groups of 24 chars + 5 floats. Since the format doesn't begin with a number, the entire format is repeated until EOF is reached; it's equivalent to -f '$$(int 3(24a 5g))'
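
The grouped format, repeated until EOF, can be pictured as a loop over fixed-size records. The Python sketch below assumes 4-byte ints and floats in native byte order with no padding; the names GROUP, RECORD, and records are hypothetical.

```python
import struct

GROUP = "24s5f"              # one (24a 5g) group: 6 fields
RECORD = "=i" + GROUP * 3    # int 3(24a 5g)

def records(data: bytes):
    """Yield (int, [(text, floats), ...]) for each record until the data runs out.

    struct.iter_unpack requires len(data) to be a multiple of the record size.
    """
    for fields in struct.iter_unpack(RECORD, data):
        header, rest = fields[0], fields[1:]
        groups = [(rest[i], list(rest[i + 1:i + 6])) for i in range(0, len(rest), 6)]
        yield header, groups

group = (b"x" * 24, 1.0, 2.0, 3.0, 4.0, 5.0)
data = struct.pack(RECORD, 7, *(group * 3)) + struct.pack(RECORD, 8, *(group * 3))
for header, groups in records(data):
    print(header, len(groups))
```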

Comments

A format can contain [comment], in which case the comment is embedded in the output stream. For instance, -f 'I [Power: ] F'
prints one integer, then the string "Power: ", and then a float. Within a comment, the usual C escape sequences are recognized, as are the viz registers `$c' (the contents of $c are substituted wherever the string is encountered). The special meaning of $ may be escaped by preceding it with a backslash.
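
Register substitution inside a comment can be sketched in Python as follows. This is only an illustration: expand_comment is a hypothetical helper, the C escape sequences are left out, and an unset register raises KeyError here, which is not necessarily what viz does.

```python
import re

def expand_comment(text: str, registers: dict) -> str:
    """Replace $c with the value of register c; a backslash-escaped $ stays literal."""
    def repl(match):
        if match.group(0) == r"\$":
            return "$"
        return str(registers[match.group(1)])
    return re.sub(r"\\\$|\$(.)", repl, text, flags=re.S)

print(expand_comment("Record $n: ", {"n": 5}))
print(expand_comment(r"Cost in \$: ", {}))
```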

Input Comments

All text from a sharp (#) through a newline is considered a comment about the format, and is ignored (it's not put into the output stream, either). This allows you to comment long format files. For instance, a format file could contain

I~ >n    # Read the number of bytes into register n
$n/F     # Convert to a number of floats

(See the section on Registers.)

Newlines

A newline can be placed in the output stream either by using a comment containing a newline such as ``[\n]'', or any of the special format members ``n'', ``;'', and ``\n''. (These three are completely equivalent; use whichever suits best.) The advantage of using the special formats instead of a comment is that it gives better-looking output: viz knows about the newline inserted by the format, and takes this into account when formatting output.

Seeking

If the input stream allows seeking, you can invoke fseek() by entering
offset seek whence   or   offset ! whence
Here, whence is one of `0', `1', or `2', and indicates seeking from the beginning of the file, the current location in the file, or the end of the file, respectively. For example,
25 ! 1   or   25 seek 1
seeks 25 characters forward from the current location in the file;
0 ! 0   or   0 seek 0
rewinds the file; and
-25 ! 2   or   -25 seek 2
goes to 25 characters before the end of the file.
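
The whence values are those of fseek(), which Python exposes as os.SEEK_SET (0), os.SEEK_CUR (1), and os.SEEK_END (2). The sketch below replays the three examples on an in-memory stream; the 100-byte buffer is a stand-in for a real seekable input file.

```python
import io
import os

stream = io.BytesIO(bytes(range(100)))   # seekable 100-byte stand-in for the input

stream.seek(25, os.SEEK_CUR)     # 25 ! 1  : 25 bytes forward from the current position
forward = stream.tell()
stream.seek(0, os.SEEK_SET)      # 0 ! 0   : rewind
rewound = stream.tell()
stream.seek(-25, os.SEEK_END)    # -25 ! 2 : 25 bytes before the end
near_end = stream.tell()

print(forward, rewound, near_end)
```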

Variables: Viz Registers

You can put the value of any integer data type read from the input stream into a register, by following the format with ``>registername''. For instance, -f 'I~ >n'
collects an integer (but doesn't print its value, since the output format is "~"), and stores the value in the register named n. If you used a repeat count in the format, the value stored in the register is the last read value. For the special case of the Z datasize, the value stored in a register is the number of characters that were matched by the Z datasize, including the trailing null. This allows you to know how many characters were processed by the Z format.
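
The count kept for a Z datasize is what lets a format skip the rest of a fixed-length field. The Python sketch below illustrates the bookkeeping; read_z and the 16-byte field are hypothetical, not viz internals.

```python
def read_z(field: bytes) -> tuple[bytes, int]:
    """Return the string before the first null and the bytes consumed (null included)."""
    end = field.index(b"\0")     # raises ValueError if no null is present
    return field[:end], end + 1

field = b"viz\0" + b"\xaa" * 12  # 16-byte field: string, null, then garbage
text, consumed = read_z(field)
junk = len(field) - consumed     # what a later discard step would have to skip
print(text, consumed, junk)
```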

There are 255 registers you can use, named with any single character other than null. The register can be used as a count by prefixing it with "$". For instance, if a data stream contains an integer count that indicates the number of following floats, you could print just the float values by using -f 'I~ >n $nF'
This reads but doesn't print the count (I~), stores it in register $n (>n), and then reads and prints that number of floats ($nF).

If you do not store the value of an integer in a register, it is automatically stored in register $#. Thus the above example could also have been written as -f 'I~ $#F'

You can do arithmetic on registers. This is typically used for converting a byte count into a count of, say, integers. The syntax is $registername op value [ op value ...]
The operators op and their effects are

Operator Effect


	=	assign value to registername
	+	add value to registername
	-	subtract value from registername
	*	multiply registername by value
	/	divide registername by value
	%	registername modulo value

In addition, the special unary operator P prints the register value as a decimal number in the output stream, but has no other effect. For example -f '$I = 3P+4P $xP'
sets $I to 3, prints $I, adds 4 to $I, and then prints the new value of $I, and finally prints the value of $x.

The value is either an unsigned number, or $registername, or one of [CSILFD], which stand for, respectively, sizeof(char), sizeof(short), sizeof(int), sizeof(long), sizeof(float), and sizeof(double). Note that these letters are the same as used for specifying datasizes. For instance, if a data stream contains an integer count that indicates the number of following bytes, but the data are a group of floats, you could convert the byte count to a number of floats, and then print the float values by using the format -f 'I~ >n $n/F $nF'
This reads but doesn't print the number of bytes (I~), stores the count in register $n (>n), divides the number of bytes by the size of a float ($n/F), and then reads and prints that number of floats ($nF).
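
The same three steps can be written out in Python with the struct module. This is a sketch under assumptions (4-byte ints and floats, native byte order, no padding), and read_floats is a hypothetical helper.

```python
import struct

def read_floats(data: bytes) -> list[float]:
    """Read a byte count, convert it to a float count, then read that many floats."""
    (nbytes,) = struct.unpack_from("=i", data, 0)           # I~ >n
    count = nbytes // struct.calcsize("=f")                 # $n/F
    return list(struct.unpack_from(f"={count}f", data, 4))  # $nF

payload = struct.pack("=3f", 1.5, 2.5, 3.5)
data = struct.pack("=i", len(payload)) + payload
print(read_floats(data))
```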

Operators are evaluated left-to-right, and there is no precedence. For instance, ``$n+3*5'' adds 3 to $n and then multiplies $n by 5; it doesn't add 15 to $n.
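
Left-to-right evaluation without precedence is easy to mimic. The Python sketch below handles only unsigned numeric operands (not $registername or the [CSILFD] sizes) and assumes that / is integer division; apply_ops is a hypothetical helper.

```python
import re

def apply_ops(value: int, expr: str) -> int:
    """Apply 'op value' pairs to a register strictly left to right."""
    for op, operand in re.findall(r"([=+\-*/%])\s*(\d+)", expr):
        n = int(operand)
        if op == "=":
            value = n
        elif op == "+":
            value += n
        elif op == "-":
            value -= n
        elif op == "*":
            value *= n
        elif op == "/":
            value //= n      # assumption: integer division
        else:
            value %= n
    return value

# Starting from 2, "+3*5" gives (2 + 3) * 5, not 2 + 15.
print(apply_ops(2, "+3*5"))
```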

There are several special registers. You have already read about $#; the complete list of special registers is:

$# holds the last integer value scanned if you didn't use ">c" to name another register.
$( is a counter: it counts the number of iterations at the current parenthesis depth (a separate copy is kept for each depth).
$@ contains the total number of bytes processed so far.

The $( counter can be useful in labelling successive values read, or successive records in a file. For example: -f '3( [$(: ] I F \n)'
prints

1:   intvalue   floatvalue
2:   intvalue   floatvalue
3:   intvalue   floatvalue
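
In Python terms, the $( counter behaves much like enumerate() starting at 1, as the sample output suggests. The sketch below labels three int/float records; the record layout (4-byte int and float, native byte order), the spacing of the output, and the helper labelled are assumptions made for illustration.

```python
import struct

def labelled(data: bytes, count: int = 3) -> list[str]:
    """Label each (int, float) record with its 1-based iteration number."""
    size = struct.calcsize("=if")
    records = struct.iter_unpack("=if", data[: count * size])
    return [f"{i}: {n} {x:g}" for i, (n, x) in enumerate(records, start=1)]

data = b"".join(struct.pack("=if", n, n / 2) for n in (10, 20, 30))
print("\n".join(labelled(data)))
```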

NOTES

This program used to be called vis, but then HP came up with its own program of the same name. Too bad.