field | value | date |
---|---|---|
author | Jarkko Hietaniemi <jhi@iki.fi> | 2001-12-27 23:56:20 +0000 |
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2001-12-27 23:56:20 +0000 |
commit | aaef10c5550c567d82a2f114831f7a5c9e62a4e7 (patch) | |
tree | 6e96e49c0f43b8660ff537b87fe018f0b99f8908 /pod/perluniintro.pod | |
parent | b682381a96c9a55a544f9537d92a562937057c0c (diff) | |
download | perl-aaef10c5550c567d82a2f114831f7a5c9e62a4e7.tar.gz | |
Fast Latin1<->UTF-8 conversion for older Perls.
p4raw-id: //depot/perl@13912
Diffstat (limited to 'pod/perluniintro.pod')
-rw-r--r-- | pod/perluniintro.pod | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 9b447caab9..68f8a01534 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -790,6 +790,15 @@ C<Unicode::Map8>, and C<Unicode::Map>, available from CPAN.
 If you have the GNU recode installed, you can also use the
 Perl frontend C<Convert::Recode> for character conversions.
 
+The following are fast conversions from ISO 8859-1 (Latin-1) bytes
+to UTF-8 bytes, the code works even with older Perl 5 versions.
+
+    # ISO 8859-1 to UTF-8
+    s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
+
+    # UTF-8 to ISO 8859-1
+    s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
+
 =head1 SEE ALSO
 
 L<perlunicode>, L<Encode>, L<encoding>, L<open>, L<utf8>, L<bytes>,
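
The two substitutions added by this patch can be tried out with a small script. What follows is only a sketch: the sub names `latin1_to_utf8` and `utf8_to_latin1` are hypothetical wrappers invented for illustration and are not part of the commit. It assumes the input is a plain byte string (no UTF-8 flag set), which is the case the patch text describes.

```perl
use strict;
use warnings;

# Hypothetical wrappers (not part of the patch) around the two substitutions.
# Both are meant for byte strings, one byte per ISO 8859-1 character.

sub latin1_to_utf8 {
    my ($bytes) = @_;
    # A byte 0x80..0xFF becomes two bytes: a lead byte 0xC2 or 0xC3
    # (0xC0 plus the top two bits) and a continuation byte
    # (0x80 plus the low six bits).
    $bytes =~ s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
    return $bytes;
}

sub utf8_to_latin1 {
    my ($bytes) = @_;
    # Only the lead bytes 0xC2 and 0xC3 encode U+0080..U+00FF, so only
    # those two-byte sequences are folded back; anything else is left as is.
    $bytes =~ s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
    return $bytes;
}

my $latin1 = "caf\xE9";                    # "cafe" with e-acute, ISO 8859-1
my $utf8   = latin1_to_utf8($latin1);

# Show the converted bytes in hex.
print join(" ", map { sprintf("%02X", ord($_)) } split(//, $utf8)), "\n";

print utf8_to_latin1($utf8) eq $latin1
    ? "round trip ok\n"
    : "round trip failed\n";
```

The expressions rely on Perl's operator precedence: the shifts bind tighter than `&`, which binds tighter than `|`, so no extra parentheses are needed. Note that the reverse substitution silently leaves UTF-8 sequences outside U+0080..U+00FF untouched, so it is only a full conversion when the input is known to fit in Latin-1.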