To avoid storing the same short data blocks padded with
differing numbers of zeros, Venti clients working with fixed-size
blocks conventionally `zero truncate' the blocks before writing them to the server.
For example, if a 1024-byte data block contains the
11-byte string `hello world' followed by 1013 zero bytes,
a client would store only the 11-byte block.
When the client later read the block from the server,
it would append zero bytes to the end as necessary to
reach the expected size.
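The convention can be sketched in a few lines of Python.
This is only an illustration of the rule described above, not
the Venti library's implementation; the function names
zero_truncate and zero_extend are invented for this example.

```python
def zero_truncate(block: bytes) -> bytes:
    """Drop trailing zero bytes from a data block before writing it."""
    return block.rstrip(b"\x00")


def zero_extend(block: bytes, size: int) -> bytes:
    """Pad a block read from the server back up to the expected size."""
    if len(block) > size:
        raise ValueError("block is larger than the expected size")
    return block + b"\x00" * (size - len(block))


# The 1024-byte block from the example in the text.
original = b"hello world" + b"\x00" * 1013
stored = zero_truncate(original)       # what the client writes
restored = zero_extend(stored, 1024)   # what the client rebuilds after a read
```

Only stored crosses the network; restored is byte-for-byte
identical to original.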
When truncating pointer blocks,
trailing zero scores are removed
instead of trailing zero bytes.
Because of the truncation convention,
any file consisting entirely of zero bytes,
no matter what its length, will be represented by the zero score:
the data blocks contain all zeros and are thus truncated
to the empty block, and the pointer blocks contain all zero scores
and are thus also truncated to the empty block,
and so on up the hash tree.
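The collapse of an all-zero file can be checked with a toy hash
tree. The sketch below assumes that a score is the 20-byte SHA1
hash of a block and that the zero score is the score of the empty
block; the block size, the fanout, and all function names are
illustrative choices, and real Venti trees are laid out
differently.

```python
import hashlib

SCORE_SIZE = 20                           # a SHA1 digest is 20 bytes
ZERO_SCORE = hashlib.sha1(b"").digest()   # assumed: score of the empty block


def score(block: bytes) -> bytes:
    return hashlib.sha1(block).digest()


def truncate_data(block: bytes) -> bytes:
    """Remove trailing zero bytes."""
    return block.rstrip(b"\x00")


def truncate_pointers(block: bytes) -> bytes:
    """Remove trailing zero scores, one whole score at a time."""
    end = len(block)
    while end >= SCORE_SIZE and block[end - SCORE_SIZE:end] == ZERO_SCORE:
        end -= SCORE_SIZE
    return block[:end]


def file_score(data: bytes, block_size: int = 1024, fanout: int = 4) -> bytes:
    """Score of a toy hash tree over data (block_size and fanout are arbitrary)."""
    level = [score(truncate_data(data[i:i + block_size]))
             for i in range(0, len(data), block_size)] or [ZERO_SCORE]
    while len(level) > 1:
        # Each pointer block holds up to `fanout` scores from the level below.
        level = [score(truncate_pointers(b"".join(level[i:i + fanout])))
                 for i in range(0, len(level), fanout)]
    return level[0]
```

With these definitions, file_score(b"\x00" * 1024) and
file_score(b"\x00" * (37 * 1024)) both equal ZERO_SCORE, while any
file containing a non-zero byte gets a different score.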
A Venti session begins when a
client connects to the network address served by a Venti
server; the conventional address is
tcp!server!venti (the venti port is 17034).
Both client and server begin by sending a version
string of the form venti-versions-comment\n, where \n is a newline.
The versions field is a list of acceptable versions separated by
colons. The protocol described here is version
02. The client is responsible for choosing a common
version and sending it in the
VtThello message, described below.
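A client-side version choice might look like the sketch below.
It assumes the version string has the shape
venti-<versions>-<comment> followed by a newline, with the
versions separated by colons; the function names and the comment
text in the usage example are hypothetical.

```python
def parse_version_string(line: str) -> list[str]:
    """Return the versions listed in an assumed 'venti-<versions>-<comment>' line."""
    if not line.endswith("\n"):
        raise ValueError("version string must end with a newline")
    parts = line[:-1].split("-", 2)
    if len(parts) != 3 or parts[0] != "venti":
        raise ValueError("not a Venti version string")
    return parts[1].split(":")


def choose_version(supported: list[str], server_line: str) -> str:
    """Pick the first client-preferred version that the server also lists."""
    offered = parse_version_string(server_line)
    for version in supported:
        if version in offered:
            return version
    raise ValueError("no common protocol version")


# A client that only speaks version 02, talking to a server that lists 04 and 02.
chosen = choose_version(["02"], "venti-04:02-example\n")
```

The chosen value is what the client would then send in the
VtThello message.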
After the initial version exchange, the client transmits requests
(T-messages) to the server, which subsequently returns replies
(R-messages) to the client.
The combined act of transmitting (receiving) a request
of a particular type, and receiving (transmitting) its reply
is called a
transaction of that type.
Each message consists of a sequence of bytes.
Two-byte fields hold unsigned integers represented
in big-endian order (most significant byte first).
Data items of variable lengths are represented by
a one-byte field specifying a count,
n, followed by
n bytes of data.
Text strings are represented similarly,
using a two-byte count with
the text itself stored as a UTF-encoded sequence
of Unicode characters.
Text strings are not NUL-terminated:
n counts the bytes of UTF data, which include no final
zero byte. The NUL character is illegal in text strings in the Venti protocol.
The maximum string length in Venti is 1024 bytes.
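The two encodings can be sketched with Python's struct module.
The helper names are invented for this example, and the checks
simply mirror the limits stated above.

```python
import struct

MAX_STRING = 1024  # maximum string length, in bytes, stated in the text


def pack_var(data: bytes) -> bytes:
    """Variable-length data: a one-byte count n followed by n bytes."""
    if len(data) > 255:
        raise ValueError("data does not fit a one-byte count")
    return bytes([len(data)]) + data


def pack_string(text: str) -> bytes:
    """Text string: a two-byte big-endian count, then UTF-8 bytes, no terminator."""
    if "\x00" in text:
        raise ValueError("NUL is illegal in Venti strings")
    raw = text.encode("utf-8")
    if len(raw) > MAX_STRING:
        raise ValueError("string longer than 1024 bytes")
    return struct.pack(">H", len(raw)) + raw


def unpack_string(buf: bytes, offset: int = 0) -> tuple[str, int]:
    """Decode a string at offset; return (text, offset just past the string)."""
    (n,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    raw = buf[start:start + n]
    if len(raw) != n:
        raise ValueError("truncated string")
    if n > MAX_STRING or b"\x00" in raw:
        raise ValueError("illegal string")
    return raw.decode("utf-8"), start + n
```

Note that the count is a byte count, not a character count:
pack_string("héllo") carries a count of 6 because é occupies two
bytes in UTF-8.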
Each Venti message begins with a two-byte size field
specifying the length in bytes of the message,
not including the length field itself.
The next byte is the message type, one of the constants
in the enumeration in the include file
<venti.h>. The next byte is an identifying
tag, used to match responses to requests.
The remaining bytes are parameters of different sizes.
In the message descriptions, the number of bytes in a field
is given in brackets after the field name.
The notation parameter[n] where
n is not a constant represents a variable-length parameter:
n[1] followed by
n bytes of data forming the
parameter. The notation
string[s] (using a literal
s character)
is shorthand for
s[2] followed by
s bytes of UTF-8 text.
The notation parameter[] where
parameter is the last field in the message represents a
variable-length field that comprises all remaining
bytes in the message.
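The overall framing (size, type, tag, then parameters) can be
sketched as follows. The functions frame and unframe are
hypothetical helpers, and the type value used in the usage line
is an arbitrary placeholder rather than a constant taken from
<venti.h>.

```python
import struct


def frame(msg_type: int, tag: int, params: bytes = b"") -> bytes:
    """Build size[2] type[1] tag[1] params; size does not count itself."""
    body = bytes([msg_type, tag]) + params   # raises ValueError if not 0..255
    if len(body) > 0xFFFF:
        raise ValueError("message does not fit a two-byte size field")
    return struct.pack(">H", len(body)) + body


def unframe(packet: bytes) -> tuple[int, int, bytes]:
    """Split one framed message into (type, tag, remaining parameter bytes)."""
    (size,) = struct.unpack_from(">H", packet, 0)
    body = packet[2:2 + size]
    if size < 2 or len(body) != size:
        raise ValueError("short or truncated message")
    return body[0], body[1], body[2:]


# A message with placeholder type 2, tag 7, and no parameters.
packet = frame(2, 7)
```

Here packet is the four bytes 00 02 02 07: a size of 2, then the
type and the tag.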
All Venti RPC messages are prefixed with a field
size[2] giving the length of the message that follows
(not including the
size field itself).
The message bodies are:
Each T-message has a one-byte
tag field, chosen and used by the client to identify the message.
The server will echo the request's
tag field in the reply.
Clients should arrange that no two outstanding
messages have the same tag field so that responses
can be distinguished.
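One way for a client to keep tags unique is a small pool of free
tags. The class below is a sketch with invented names; it assumes
that all 256 one-byte values are usable as tags, which the text
does not state.

```python
class TagPool:
    """Hands out one-byte tags so that no two outstanding requests share one."""

    def __init__(self) -> None:
        self._free = list(range(256))           # assumed: every byte value is usable
        self._pending: dict[int, object] = {}   # tag -> request awaiting a reply

    def acquire(self, request: object) -> int:
        """Reserve a tag for a request that is about to be sent."""
        if not self._free:
            raise RuntimeError("all 256 tags are in use")
        tag = self._free.pop(0)
        self._pending[tag] = request
        return tag

    def release(self, tag: int) -> object:
        """Match a reply's tag to its request and free the tag for reuse."""
        request = self._pending.pop(tag)   # KeyError if the tag was not outstanding
        self._free.append(tag)
        return request


pool = TagPool()
a = pool.acquire("read A")
b = pool.acquire("read B")
```

Replies may arrive in any order: pool.release(b) returns
"read B" even though "read A" was sent first.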
The type of an R-message will either be one greater than
the type of the corresponding T-message or
Rerror, indicating that the request failed.
In the latter case, the
error field contains a string describing the reason for failure.
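The reply-type rule translates into a short check on the client
side. In this sketch the numeric type values are passed in by the
caller and the ones in the usage note are placeholders, not
values from <venti.h>; it also assumes that the error string is
the first parameter after the tag, encoded with a two-byte count
like any other string.

```python
import struct


class VentiError(Exception):
    """Raised when the server answers a request with an error reply."""


def check_reply(t_type: int, r_type: int, params: bytes, rerror_type: int) -> bytes:
    """Return the reply parameters, or raise if the reply is an error or mismatched."""
    if r_type == rerror_type:
        # Assumed layout: a two-byte count followed by the UTF-8 error text.
        (n,) = struct.unpack_from(">H", params, 0)
        raise VentiError(params[2:2 + n].decode("utf-8"))
    if r_type != t_type + 1:
        raise ValueError(f"unexpected reply type {r_type} for request type {t_type}")
    return params
```

For example, with a placeholder request type of 12 and a
placeholder error type of 1, a reply of type 13 is accepted, a
reply of type 1 raises VentiError with the server's message, and
any other type is rejected as a protocol violation.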
Venti connections must begin with a hello transaction. The
VtThello message contains the protocol
version that the client has chosen to use.
The strength, crypto, and codec fields could be used to add authentication, encryption,
and compression to the Venti session
but are currently ignored.
The rcrypto and rcodec fields in the
VtRhello response are similarly ignored.
The uid and sid fields are intended to be the identity
of the client and server but, given the lack of
authentication, should be treated only as advisory.
The initial hello should be the only
hello transaction during the session.
The ping message has no effect and
is used mainly for debugging.
Servers should respond immediately to pings.
The read message requests a block with the given
score and type. Use vttodisktype and vtfromdisktype
to convert a block type enumeration value to the
type used on disk and in the protocol.
The count field specifies the maximum expected size
of the block.
The data in the reply is the block's contents.
The write message writes a new block of the given
type with contents
data to the server.
The response includes the
score to use to read the block,
which should be the SHA1 hash of data.
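Because the score is meant to be the SHA1 hash of the block's
contents, a client can check the server's answer locally. The
helper names below are invented for this sketch.

```python
import hashlib


def score_of(data: bytes) -> bytes:
    """The 20-byte SHA1 hash that should identify a block with these contents."""
    return hashlib.sha1(data).digest()


def verify_write(data: bytes, returned_score: bytes) -> None:
    """Raise if the score returned for a write does not match the data sent."""
    if returned_score != score_of(data):
        raise ValueError("server returned a score that does not match the data")


# The block stored in the zero-truncation example.
block = b"hello world"
verify_write(block, score_of(block))
```

The same comparison works in the other direction: after a read,
hashing the returned data and comparing it with the requested
score detects a corrupted or substituted block.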
The Venti server may buffer written blocks in memory,
waiting until after responding to the
write message before writing them to
permanent storage.
The server will delay the response to a
sync message until after all blocks in earlier
write messages have been written to permanent storage.
The goodbye message ends a session. There is no
VtRgoodbye: upon receiving the
VtTgoodbye message, the server terminates the connection.
Version 04 of the Venti protocol is similar to version
02 (described above)
but has two changes to accommodate larger payloads.
First, it replaces the leading 2-byte packet size with
a 4-byte size.
Second, the count in the
VtTread packet may be either 2 or 4 bytes;
the total packet length distinguishes the two cases.
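The first change affects only the size prefix, so framing code
can be parameterized by version. The sketch below assumes that
the 4-byte size is big-endian and excludes itself, as the 2-byte
size does in version 02; the names are invented for this example.

```python
import struct

# Size-prefix format per protocol version: 2-byte or 4-byte, big-endian.
SIZE_FORMAT = {"02": ">H", "04": ">I"}


def frame(version: str, body: bytes) -> bytes:
    """Prefix a message body with the size field used by this version."""
    fmt = SIZE_FORMAT[version]
    limit = (1 << (8 * struct.calcsize(fmt))) - 1
    if len(body) > limit:
        raise ValueError(f"body too large for version {version}")
    return struct.pack(fmt, len(body)) + body


def unframe(version: str, packet: bytes) -> bytes:
    """Strip the size field and return the message body."""
    fmt = SIZE_FORMAT[version]
    width = struct.calcsize(fmt)
    (size,) = struct.unpack_from(fmt, packet, 0)
    body = packet[width:width + size]
    if len(body) != size:
        raise ValueError("truncated message")
    return body
```

A 70000-byte body is rejected by frame("02", ...) but accepted by
frame("04", ...), which is the larger-payload case the new
version exists to allow.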