chkascii - a small C program that checks an ASCII (plain-text) file for any
invalid characters
chkascii treats valid ASCII values in a text file as the set :
{ 32-126 / 9 (horizontal tab) / 10 (line feed a.k.a. Unix newline) }
Anything else is considered junk, although the user can pass in additional ASCII
values to treat as good.
chkascii additionally checks whether the file is EOL-terminated. If it is not
EOL-terminated, chkascii treats the condition as an error too.
The command below asks myfile to be checked as per the built-in scheme, exiting
upon the first bad ASCII character :
chkascii myfile
--summary keeps the application going till EOF, after which a summary is printed
:
chkascii --summary myfile
--accept forces an ASCII value to be accepted as good. The command below asks
ASCII values 13 (CR) and 11 (VT) to be treated as good :
chkascii --accept=13,11 myfile
-1 is returned for operational failure. Because the operating system might map
return values to unsigned char, the shell could pick up 255.
If the file (contains all good ASCII values and is EOL-terminated) or else (is
empty), 0 is returned.
All other cases are treated as errors, and a byte adjusted at the ends is
returned.
If the file is not EOL-terminated, the left-most bit of the byte (equivalent to
128) is turned on.
If the file has junk, the rightmost bit of the byte (equivalent to 1) is turned
on.
vim(1) has the Normal command ga which displays the ASCII value of the character
under the cursor.
The Ctrl+Q subcommand of hexedit(1) can be used to insert (hexadecimal) ASCII
half-chars (0 through f).
Besides many editors, unix2dos(1) and flip(1) can convert text files between
Unix (LF) and DOS (CRLF) line-endings.
chkascii does not read standard input, nor are multiple files accepted. If you
pass in multiple files, the first file is taken as the argument.
Manish Jain (bourne.identity@hotmail.com)