Perl Unicode Cookbook: Decode @ARGV as UTF-8

℞ 13: Decode program arguments as utf8

While the standard Perl Unicode preamble makes Perl’s filehandles use UTF-8 encoding by default, filehandles aren’t the only sources and sinks of data. The command-line arguments to your programs, available through @ARGV, may also need decoding.

Perl can handle this decoding for you automatically in two ways, or you can do it yourself. As documented in perldoc perlrun, the -C flag controls Unicode features. Use the A modifier to have Perl treat the elements of @ARGV as UTF-8 encoded strings:

     $ perl -CA ...
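To see what the A modifier changes, it can help to look at the length of each argument. The following is a minimal sketch, not part of the original recipe; the helper name arg_lengths and the script name used in the note after it are hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return the length of each string it is given.
# For undecoded arguments this counts bytes; for decoded arguments
# it counts characters.
sub arg_lengths {
    return map { length $_ } @_;
}

# Print one length per command-line argument.
printf "%d\n", $_ for arg_lengths(@ARGV);
```

Assuming a UTF-8 terminal that sends precomposed characters, saving this as argv-length.pl and running it with the argument café should report 5 without -CA (five bytes) and 4 with -CA (four characters).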

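You may also put -C on the shebang line of your programs. Be aware that, as perlrun notes, since Perl 5.10.1 a -C option on the #! line must also be given on the command line, because the standard streams are already set up by the time perl reads the #! line.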

The second approach is to use the PERL_UNICODE environment variable. It takes the same values as the -C flag; to get the same effect as -CA, write:

     $ export PERL_UNICODE=A

You may temporarily disable this automatic Unicode treatment with PERL_UNICODE=0.
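A program can check at run time whether either mechanism is in effect. The sketch below relies on the ${^UNICODE} variable described in perlvar, which reflects the -C and PERL_UNICODE settings as a number, and on the value 32 that perlrun lists for the A modifier. The helper name argv_flag_set is hypothetical.

```perl
use strict;
use warnings;

# Bit value of the A modifier, as listed under -C in perlrun.
use constant UNICODE_ARGV => 32;

# Hypothetical helper: true when the given -C bitmask includes the A bit.
sub argv_flag_set {
    my ($flags) = @_;
    return ($flags & UNICODE_ARGV) ? 1 : 0;
}

# ${^UNICODE} holds the current -C / PERL_UNICODE settings as a number.
if (argv_flag_set(${^UNICODE})) {
    print "\@ARGV was decoded as UTF-8 at startup\n";
}
else {
    print "\@ARGV holds raw bytes\n";
}
```

A check like this is one way to avoid decoding @ARGV a second time when the program may be started either with or without -CA.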

Finally, you may decode the contents of @ARGV yourself manually with the Encode module:

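    use Encode qw(decode_utf8);
    # The second argument (1, i.e. Encode::FB_CROAK) makes malformed
    # input fatal instead of silently substituting U+FFFD.
    @ARGV = map { decode_utf8($_, 1) } @ARGV;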
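Because a strict decode dies on malformed input, it may be useful to report which argument was at fault. The following is a sketch under that assumption, not the recipe's own method: the helper name decode_args_strict is hypothetical, and it uses Encode's decode with the strict 'UTF-8' encoding name in place of decode_utf8.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: decode a list of byte strings as strict UTF-8,
# dying with the index of the first malformed argument.
sub decode_args_strict {
    my @bytes = @_;    # work on copies; decode may modify its input
    my @chars;
    for my $i (0 .. $#bytes) {
        # eval traps the exception that FB_CROAK raises on bad input.
        my $decoded = eval { decode('UTF-8', $bytes[$i], FB_CROAK) };
        die "Argument $i is not valid UTF-8\n" unless defined $decoded;
        push @chars, $decoded;
    }
    return @chars;
}

@ARGV = decode_args_strict(@ARGV);
```

This variant assumes perl was started without -CA or PERL_UNICODE=A; arguments that have already been decoded should not be decoded again.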

Previous: ℞ 12: Explicit encode/decode

Series Index: The Standard Preamble

Next: ℞ 14: Decode @ARGV as Local Encoding
