℞ 22: Match Unicode linebreak sequence in regex
Unicode defines several characters as providing vertical whitespace, like the carriage return or newline characters. Unicode also gathers several characters under the banner of a linebreak sequence. A Unicode linebreak sequence is either the two-character CRLF grapheme or any one of the seven vertical whitespace characters.
As documented in perldoc perlrebackslash, the \R regex backslash sequence matches any Unicode linebreak sequence. (Similarly, the \v sequence matches any single character of vertical whitespace.)
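To make the difference between the two sequences concrete, here is a small sketch that counts the matches each one finds in the same string. The sample text and variable names are illustrative assumptions, and the code assumes Perl 5.10 or later, where both sequences are available.

```perl
use strict;
use warnings;
use v5.10;

# Illustrative sample: one CRLF pair, one LINE SEPARATOR (U+2028), one LF
my $text = "one\r\ntwo\x{2028}three\n";

# \R consumes CRLF as a single linebreak sequence
my $linebreaks = () = $text =~ /\R/g;

# \v matches one vertical whitespace character at a time,
# so the CR and the LF of the CRLF pair are counted separately
my $vertical = () = $text =~ /\v/g;

say "linebreak sequences: $linebreaks";
say "vertical whitespace characters: $vertical";
```

Because \R treats CRLF as one unit, it is usually the better choice when counting or splitting lines, while \v suits cases where each character matters individually.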
This is useful for dealing with text files coming from different operating systems:
s/\R/\n/g; # normalize all linebreaks to \n
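As a slightly fuller sketch of the same substitution, the following wraps it in a helper and applies it to a string with mixed line endings. The name normalize_linebreaks and the sample data are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use v5.10;

# normalize_linebreaks is a hypothetical helper, not a built-in function
sub normalize_linebreaks {
    my ($text) = @_;
    $text =~ s/\R/\n/g;    # every linebreak sequence becomes a single \n
    return $text;
}

# Illustrative input mixing LF, CRLF, bare CR, and LINE SEPARATOR (U+2028)
my $mixed = "unix\nwindows\r\nold mac\rseparator\x{2028}end";

my $clean = normalize_linebreaks($mixed);
my @lines = split /\n/, $clean;

say scalar(@lines), " lines after normalization";
```

Working on a copy inside the helper leaves the caller's original string untouched, which can be convenient when the raw input still needs to be inspected or logged.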

