Use of the :utf8 I/O layer (as opposed to :encoding(UTF8) or
:encoding(UTF-8)) was suggested in the Perl documentation up to
version 5.8.8. This may be acceptable for output, but on input :utf8
does not validate the data, so malformed byte sequences can pass
through and lead to unexpected results.
An exploit based on this behavior of :utf8 is exhibited on PerlMonks
at <http://www.perlmonks.org/?node_id=644786>. The exploit involves a
string read from an external file and sanitized with m/^(\w+)$/,
where $1 nonetheless ends up containing shell meta-characters.

 open $fh, '<:utf8', 'foo.txt';             # BAD
 open $fh, '<:encoding(UTF8)', 'foo.txt';   # GOOD
 open $fh, '<:encoding(UTF-8)', 'foo.txt';  # BETTER

See the Encode documentation for the difference between
UTF8 and UTF-8. The short version is that UTF-8 implements the
Unicode standard strictly, whereas UTF8 is Perl's more permissive
variant.
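
As a rough illustration of that distinction, the following sketch asks
Encode which encoding object each label resolves to. The helper name
canonical_encoding_name() is hypothetical and exists only for this
example; find_encoding() and the name() method come from Encode itself.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: return the canonical name Encode reports for an
# encoding label, or undef if Encode does not know the label.
sub canonical_encoding_name {
    my ($label) = @_;
    my $enc = find_encoding($label);
    return defined $enc ? $enc->name : undef;
}

# Print what each of the two labels used in this policy resolves to.
for my $label ('UTF-8', 'UTF8') {
    my $name = canonical_encoding_name($label);
    printf "%-5s resolves to %s\n", $label, defined $name ? $name : '(unknown)';
}
```

According to the Encode documentation, the first label should resolve
to the strict encoding (utf-8-strict) and the second to the lax one
(utf8), which is why the policy treats :encoding(UTF-8) as the better
choice.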
For consistency's sake, this policy checks files opened for output as
well as input. For complete coverage it also checks binmode() calls,
where the direction of the operation cannot be determined.
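
A minimal sketch of a binmode() call that uses the preferred layer
follows. The helper read_first_line_strict() and the in-memory handle
are assumptions made for illustration and are not part of this policy;
real code would typically apply binmode() to a file or socket handle.

```perl
use strict;
use warnings;

# Hypothetical helper: read the first line of a byte string through a
# handle whose decoding layer is applied with binmode().
sub read_first_line_strict {
    my ($bytes) = @_;

    # Open an in-memory handle on the raw bytes (illustration only).
    open my $fh, '<', \$bytes
        or die "Cannot open in-memory handle: $!";

    # Apply the strict layer rather than the bare :utf8 layer.
    binmode $fh, ':encoding(UTF-8)'
        or die "binmode failed: $!";

    my $line = <$fh>;
    close $fh;
    chomp $line if defined $line;
    return $line;
}

# "caf\xC3\xA9" is the UTF-8 byte sequence for "cafe" with an acute
# accent on the final letter; after decoding it is four characters.
my $line = read_first_line_strict("caf\xC3\xA9\n");
print length($line), "\n";
```

Writing binmode $fh, ':utf8' in the same place is the form this policy
would flag.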