|Rather than reading the data from the stdin input stream, use the string string as the source of the data. This is useful when you would like to calculate the entropy of a specific word or sentence.|
|-s||Sort output by symbol frequency.|
|-v||Print the entropy version.|
A data set's entropy is the sum of the entropies of each distinct symbol. A symbol is a distinct set of contiguous bits. For the sake of simplicity, I use 8 (the number of bits making up a byte). It is important to note that a symbol does NOT have to consist of 8 bits. The entropy Sz of a distinct symbol z can be defined as:
Sz = -log2(Pz)
where Pz is the probability of symbol z being found in the set. By storing each distinct symbol as a node in a list, we can count exactly how many times the symbol occurs in the set. This is also known as solving for the symbol's frequency.
To calculate the overall entropy of the data, we sum the total entropies contributed by each distinct symbol and divide that sum by a coefficient known as the data size. This coefficient is calculated by multiplying the number of bits per symbol by the total quantity of symbols.
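The calculation described above can be sketched as follows. This is not the utility's actual source (which is written in C); it is a minimal illustration, assuming 8-bit symbols by default and using a frequency table in place of the linked list of symbol nodes:

```python
import math
from collections import Counter

def entropy(data: bytes, bits_per_symbol: int = 8) -> float:
    """Entropy per bit of data, per the scheme described above."""
    counts = Counter(data)          # frequency of each distinct symbol
    total = len(data)               # total quantity of symbols
    # Each occurrence of symbol z contributes Sz = -log2(Pz),
    # where Pz = count(z) / total.
    summed = sum(-c * math.log2(c / total) for c in counts.values())
    # Divide by the data size coefficient: bits per symbol
    # multiplied by the total quantity of symbols.
    return summed / (bits_per_symbol * total)

print(entropy(b"aabb"))  # two equiprobable symbols -> 0.125 bits per bit
```

For b"aabb", every occurrence contributes -log2(0.5) = 1, so the sum is 4; dividing by the data size (8 bits x 4 symbols = 32) gives 0.125.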
The entropy utility exits 0 on success, and non-zero if an error occurs.
Chris S.J. Peron
None known. This does not mean they do not exist, though. Please send bug reports and source code patches to (firstname.lastname@example.org).
|entropy version 1.0.0||ENTROPY (1)||9 Aug 2002|