author | David E. Wheeler <david@justatheory.com> | 2015-02-26 22:35:52 -0500 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2015-03-09 10:32:09 -0600 |
commit | 8f226aeeda55a51eee04feb4b605d30997d9b592 (patch) | |
tree | 995e300c55c3b601fdca44aa397d0ab4843732af /pod/perlpodspec.pod | |
parent | 9d1ee7270f0fadaef492639aab74023e69bd9fad (diff) | |
download | perl-8f226aeeda55a51eee04feb4b605d30997d9b592.tar.gz |
Reinstate reverted "perlpod and spec: s/Latin-1/CP-1252/"
This reverts 1a3afb4f8c551b292b5b34f7244ed71f9ac01cfd which reverted
e2bb786192adfa315ea974b5f630d7040aa6f6ac, thus reinstating the latter.
In thinking about this and discussing it with rjbs, I (khw) realized
that this pod text really should go into v5.22. I made minor
clarifications and fixed the author name of the original commit.
Diffstat (limited to 'pod/perlpodspec.pod')
-rw-r--r-- | pod/perlpodspec.pod | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/pod/perlpodspec.pod b/pod/perlpodspec.pod
index f2af63e2c6..251a55cafd 100644
--- a/pod/perlpodspec.pod
+++ b/pod/perlpodspec.pod
@@ -607,7 +607,8 @@ as signaling that the file is Unicode encoded as in UTF-16 (whether
 big-endian or little-endian) or UTF-8, Pod parsers should do the
 same. Otherwise, the character encoding should be understood as
 being UTF-8 if the first highbit byte sequence in the file seems
-valid as a UTF-8 sequence, or otherwise as Latin-1.
+valid as a UTF-8 sequence, or otherwise as CP-1252 (earlier versions of
+this specification used Latin-1 instead of CP-1252).
 
 Future versions of this specification may specify
 how Pod can accept other encodings. Presumably treatment of other
@@ -641,7 +642,7 @@ I<and> whether the next byte is in the range
 0x80 - 0xBF. If so, the parser may conclude that this file is in
 UTF-8, and all highbit sequences in the file should be assumed to
 be UTF-8. Otherwise the parser should treat the file as being
-in Latin-1. (A better check is to pass a copy of the sequence to
+in CP-1252. (A better check is to pass a copy of the sequence to
 L<utf8::decode()|utf8> which performs a full validity check on the
 sequence and returns TRUE if it is valid UTF-8, FALSE otherwise. This
 function is always pre-loaded, is fast because it is written in C, and
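
For illustration only, the fallback rule in the patched text (treat the file as UTF-8 if the first highbit byte sequence is valid UTF-8, otherwise as CP-1252) might be sketched as below. This is Python rather than Perl and is not part of the commit. The helper name `decode_pod_bytes` is hypothetical. The sketch assumes the input has no BOM and no `=encoding` directive, and it reads "first highbit byte sequence" as the first run of consecutive bytes >= 0x80, which is one possible interpretation. Python's strict UTF-8 decoder stands in for the full validity check that the text attributes to `utf8::decode()`.

```python
def decode_pod_bytes(raw: bytes) -> str:
    """Guess the encoding of Pod source per the fallback rule (hypothetical helper).

    Assumes no BOM and no =encoding directive are present.
    """
    # Locate the first byte with the high bit set.
    start = next((i for i, b in enumerate(raw) if b >= 0x80), None)
    if start is None:
        # Pure ASCII: the choice of encoding makes no difference.
        return raw.decode("ascii")

    # Assumption: the "first highbit byte sequence" is the first run of
    # consecutive bytes >= 0x80.
    end = start
    while end < len(raw) and raw[end] >= 0x80:
        end += 1

    try:
        # Full validity check on a copy of that sequence only.
        raw[start:end].decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: fall back to CP-1252 (Latin-1 before this change).
        # A few bytes (e.g. 0x81) are undefined in Python's cp1252 codec,
        # so they are replaced rather than raising.
        return raw.decode("cp1252", errors="replace")

    # Valid: every highbit sequence in the file is assumed to be UTF-8.
    return raw.decode("utf-8", errors="replace")
```

The practical difference from the earlier Latin-1 fallback shows up in the 0x80 - 0x9F range: under this sketch the bytes 0x93 and 0x94 decode to curly quotation marks, where Latin-1 would give C1 control characters.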