
Perl Unicode Cookbook: Specify a File's Encoding

℞ 19: Open file with specific encoding

Setting a default Unicode encoding for I/O is sensible, but sometimes that default is wrong for a particular file. In that case, specify the encoding for the filehandle explicitly, either in the mode argument to open or afterwards with binmode. Perl’s I/O layers then handle encoding and decoding for you. This is the normal way to deal with encoded text, rather than calling low-level encoding functions yourself.

To specify the encoding of a filehandle opened for input:

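     # Set the encoding in the mode argument to open
     open(my $in_file, "< :encoding(UTF-16)", "wintext")
         or die "Cannot open wintext: $!";
     # OR
     # Open first, then add the encoding layer with binmode
     open(my $in_file, "<", "wintext")
         or die "Cannot open wintext: $!";
     binmode($in_file, ":encoding(UTF-16)");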

     # ...
     my $line = <$in_file>;
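
As a rough illustration of what the input layer does, the sketch below first writes a small UTF-16 sample to a temporary file and then reads it back through `:encoding(UTF-16)`. The sample text, the temporary file, and the variable names are assumptions made for the demonstration, not part of the recipe, and the sketch assumes a Unix-like platform where no CRLF translation is applied by default.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Hypothetical sample data: write "naïve ☺\n" as raw UTF-16 bytes.
# encode("UTF-16", ...) emits a byte order mark followed by big-endian data.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
binmode($tmp_fh, ":raw");
print {$tmp_fh} encode("UTF-16", "na\x{EF}ve \x{263A}\n");
close($tmp_fh) or die "Cannot close $path: $!";

# Read the file back; the layer decodes the bytes into characters
# and uses the byte order mark to pick the byte order.
open(my $in_file, "< :encoding(UTF-16)", $path)
    or die "Cannot open $path: $!";
my $line = <$in_file>;
close($in_file);
chomp($line);

# Show code points instead of printing wide characters to STDOUT.
my $code_points = join " ", map { sprintf "U+%04X", ord } split //, $line;
print "$code_points\n";
# prints: U+006E U+0061 U+00EF U+0076 U+0065 U+0020 U+263A
```

The string that comes out of the filehandle contains characters, not bytes: the byte order mark is gone and each code point is a single element of the string.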

To specify the encoding of a filehandle opened for output:

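     # Set the encoding in the mode argument to open
     open(my $out_file, "> :encoding(cp1252)", "wintext")
         or die "Cannot open wintext: $!";
     # OR
     # Open first, then add the encoding layer with binmode
     open(my $out_file, ">", "wintext")
         or die "Cannot open wintext: $!";
     binmode($out_file, ":encoding(cp1252)");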

     # ...
     print $out_file "some text\n";
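
To see the effect of the output layer, one option is to write a short string through `:encoding(cp1252)` and then inspect the bytes that landed on disk. This is a minimal sketch: the sample string, the temporary file, and the helper variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file for the demonstration.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh) or die "Cannot close $path: $!";

# Write characters; the layer encodes them as cp1252 bytes.
# No trailing newline, so platform line-ending handling does not matter here.
open(my $out_file, "> :encoding(cp1252)", $path)
    or die "Cannot open $path: $!";
print $out_file "caf\x{E9} \x{20AC}5";    # "café €5"
close($out_file) or die "Cannot close $path: $!";

# Read the file back as raw bytes to see what was actually stored.
open(my $raw, "< :raw", $path) or die "Cannot open $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

my $hex = join " ", map { sprintf "%02X", ord } split //, $bytes;
print "$hex\n";
# prints: 63 61 66 E9 20 80 35
```

Each character became exactly one byte: U+00E9 is stored as 0xE9 and U+20AC (the euro sign) as 0x80, which is where cp1252 places it.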

You can specify more layers than just the encoding. For example, the layer string ":raw :encoding(UTF-16LE) :crlf" adds implicit CRLF line-ending handling on top of the UTF-16LE encoding. See the PerlIO documentation for more details.
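
The layer stack mentioned above can be sketched as follows, assuming a temporary file and ASCII sample text chosen only for the demonstration. The example writes two lines through ":raw :encoding(UTF-16LE) :crlf", dumps the resulting bytes, and then reads the lines back through the same layers.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file for the demonstration.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh) or die "Cannot close $path: $!";

# On output, :crlf turns "\n" into "\r\n" before the text is encoded
# as UTF-16LE.
open(my $out_file, "> :raw :encoding(UTF-16LE) :crlf", $path)
    or die "Cannot open $path: $!";
print $out_file "one\ntwo\n";
close($out_file) or die "Cannot close $path: $!";

# Inspect the raw bytes on disk.
open(my $raw, "< :raw", $path) or die "Cannot open $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);
my $hex = join " ", map { sprintf "%02X", ord } split //, $bytes;
print "$hex\n";
# prints: 6F 00 6E 00 65 00 0D 00 0A 00 74 00 77 00 6F 00 0D 00 0A 00

# On input, the same layers decode UTF-16LE and turn "\r\n" back into "\n".
open(my $in_file, "< :raw :encoding(UTF-16LE) :crlf", $path)
    or die "Cannot open $path: $!";
my @lines = <$in_file>;
close($in_file);
print scalar(@lines), " lines read\n";
```

Putting :crlf after the encoding layer means the line-ending translation works on characters rather than on encoded bytes, so each CR and LF is stored as a full two-byte UTF-16LE code unit.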



