# Revision history for Perl extension Encode.
#
# $Id: Changes,v 2.44 2011/08/09 07:49:44 dankogai Exp dankogai $
# $Revision: 2.44 $ $Date: 2011/08/09 07:49:44 $
! Unicode/Unicode.xs
  Addressed the following:
    Date: Fri, 22 Jul 2011 13:58:43 +0200
    From: Robert Zacek
    To: perl5-security-report@perl.org
    Subject: Unicode.xs!decode_xs n-byte heap-overflow
! Encode.pm encoding.pm
! lib/Encode/Alias.pm lib/Encode/Encoder.pm lib/Encode/Guess.pm
  Applied: RT#69735: patch for use constant DEBUG =>
  https://rt.cpan.org/Ticket/Update.html?id=69735

2.43 2011/05/21 23:14:43
! lib/Encode/Alias.pm
  Addressed RT#68361: Encode::Bytes x-mac-... aliases missing
  https://rt.cpan.org/Ticket/Display.html?id=68361
! Encode.pm
  Applied the 0001-Fix-typo-in-pod.patch
  https://rt.cpan.org/Ticket/Update.html?id=64381
  Addressed RT#65796 Deep recursion error finding invalid charset
  https://rt.cpan.org/Ticket/Update.html?id=65796
  Applied a jumbo doc patch by Tom Christiansen
  Message-Id: <14795.1304618434@chthon>

2.42 2010/12/31 22:48:48
! Encode.xs
! Unicode/Unicode.xs
  Applied: RT#64371: Update for 5.14 API changes
  http://rt.cpan.org/Ticket/Display.html?id=64371

2.41 2010/12/23 11:05:58
! lib/Encode/MIME/Header.pm
  Applied: RT#63387 encode of MIME-Header inserts too much whitespace
  http://rt.cpan.org/Ticket/Display.html?id=63387
! t/Aliases.t lib/Encode/Alias.pm
  Applied: RT#63286: Various Encode::Alias improvements
  http://rt.cpan.org/Ticket/Display.html?id=63286

2.40 2010/09/18 18:39:51
! Encode.pm Encode.xs
+ t/utf8ref.t
  Addressed: RT#59981: find_encoding("UTF-8")->encode crashes
  decode_utf8() is now a little faster, too.
  http://rt.cpan.org/Ticket/Display.html?id=59981
  http://rt.cpan.org/Ticket/Display.html?id=58541
! lib/Encode/Unicode/UTF7.pm
  Addressed: RT#56443 utf-8 flag is not turned off after calling
  Encode::encode('UTF-7', $string) to encode an ascii string
  http://rt.cpan.org/Ticket/Display.html?id=56443
! t/utf8strict.t
  Addressed: RT#57799
  http://rt.cpan.org/Ticket/Display.html?id=57799
!
lib/Encode/Guess.pm Addressed: RT#46080: guess_encoding documentation http://rt.cpan.org/Ticket/Display.html?id=46080 ! ucm/nextstep.ucm Addressed: RT#59668: nextstep encoding is broken - missing ASCII characters http://rt.cpan.org/Ticket/Display.html?id=59668 ! lib/Encode/MIME/Header.pm t/mime-header.t Addressed: RT#52103: Encode::MIME::Header encoded words not separated by white space http://rt.cpan.org/Ticket/Display.html?id=52103 ! t/guess.t lib/Encode/Guess.pm Addressed: Encode: silenced a warning by from_to(..., 'Guess', ...) http://coderepos.org/share/changeset/37731 2.39 2009/11/26 09:23:59 ! Encode.xs t/fallback.t $utf8 = decode('utf8', $malformed, sub{ ... }) # now works! http://rt.cpan.org/Ticket/Display.html?id=51204 ! t/CJKT.t t/guess.t t/perlio.t $ENV{'PERL_CORE'} tricks removed since they are no longer necessary. Message-Id: <20091116161513.GA25556@bestpractical.com> 2.38 2009/11/16 14:08:13 ! Encode.xs Addressed: Encode memory corruption [perl #70528] Message-Id: ! t/Unicode.t Unicode/Unicode.xs Patched: #51263: set magic is not applied when modifying encode arguments http://rt.cpan.org/Ticket/Display.html?id=51263 ! Encode.xs Patched: #51204: Callback CHECK not supported for UTF-8 decoder/encoder http://rt.cpan.org/Ticket/Display.html?id=51204 ! Byte/Byte.pm CN/CN.pm Changes JP/JP.pm KR/KR.pm TW/TW.pm Unicode/Unicode.pm bin/enc2xs lib/Encode/Supported.pod Fix URLs http://rt.cpan.org/Ticket/Display.html?id=49776 ! t/CJKT.t t/guess.t t/perlio.t t/piconv.t $PERL_CORE trick is now off for perl 5.11 or better. Message-Id: Message-Id: Message-Id: <20090907154908.GS60303@plum.flirble.org> Message-Id: <20090907161509.GN8057@iabyn.com> 2.37 2009/09/06 14:32:21 ! Encode.xs fixed: compilation failure on compilers not supporting C99 http://rt.cpan.org/Ticket/Display.html?id=49466 2.36 2009/09/06 09:03:07 ! Encode.xs fixed: 'find_encoding("utf8")->decode(undef)' causes segmentation fault http://rt.cpan.org/Ticket/Display.html?id=49462 2.35 2009/07/13 02:06:30 ! 
lib/Encode/MIME/Header.pm Addressed RT #40027: decode of MIME-Header removes too much whitespace http://rt.cpan.org/Ticket/Display.html?id=40027 http://rt.cpan.org/Ticket/Display.html?id=42902 ! t/piconv.t Addressed by CSJEWELL: t/piconv.t loops infinitely on Win32 http://rt.cpan.org/Ticket/Display.html?id=47760 2.34 2009/07/08 13:34:15 ! bin/piconv duplicate-BOM problem now fixed. Message-Id: <10ECB9B7-006E-4570-9EB6-51C49F04ADCF@dan.co.jp> ! bin/piconv + t/piconv.t patches and tests by SREZIC Message-Id: <4A5366DA.8050801@iconmobile.com> ! Makefile.PL man* removed on behalf of blead Message-Id: <20090326135219.GU18164@plum.flirble.org> 2.33 2009/03/25 07:55:57 ! lib/Encode/MIME/Header.pm Decontaminated $& which sneaked in on 2.31. Message-Id: <67FC9F3A39C746DA95AAB6BB01539099@robmhp> Message-Id: <693254b90903242352x2dc26ba6p5e68deb871fa88ae@mail.gmail.com> http://coderepos.org/share/changeset/31542 2.32 2009/03/07 07:32:37 ! lib/Encode/Alias.pm t/Alias.t Encode now resolves 'en_US.UTF-8' to utf-8-strict like 'ja_JP.euc' Those who set locale on their shells should be happier now. ! AUTHORS added tokuhirom ! Encode.pm "encode(undef, 'str') should die earlier" http://coderepos.org/share/changeset/30790 2.31 2009/02/16 06:18:09 ! lib/Encode/MIME/Header.pm "Revert [29767] and [29771] since it breaks perl 5.8" by miyagawa http://coderepos.org/share/changeset/30111 2.30 2009/02/15 17:44:13 ! encoding.pm fixed regexes, et cetera. by drry http://coderepos.org/share/changeset/29767 ! lib/Encode/MIME/Header.pm Addressed: Encode::MIME::Header::decode should respect CHECK http://rt.cpan.org/Ticket/Display.html?id=43204 http://coderepos.org/share/changeset/29767 2.29 2009/02/01 13:14:37 ! Encode.pm VERSION++ just to make PAUSE happy Message-Id: <877i4anwwt.fsf@k75.linux.bogus> 2.28 Date: 2009/02/01 12:30:18 ! 
Unicode/Unicode.xs Latest refactoring broke the backward compatibility w/ Perl 5.8.6 and before now restored Message-Id: <1233185156.DABa130.74940@basic2.hostingcompartido.com> Message-Id: <693254b90902010027x277a5d0fm4f5700ba2f276239@mail.gmail.com> ! lib/Encode/MIME/Header.pm Addressed: Split header lines are joined incorrectly http://rt.cpan.org/Ticket/Display.html?id=42902 2.27 2009/01/21 22:55:07 ! lib/Encode/MIME/Header.pm t/mime-header.t Addressed: Encode::MIME::Header MIME-Q encoding truncates trailing zeros in some circumstances http://rt.cpan.org/Ticket/Display.html?id=42627 ! lib/Encode/Alias.pm Added alias: unicode-1-1-utf-7 http://rt.cpan.org/Ticket/Display.html?id=38558 ! Encode.pm Documented: _utf8_on() does not work for tainted values http://rt.cpan.org/Ticket/Display.html?id=41163 ! bin/enc2xs s[oss.software.ibm.com/icu][www.icu-project.org]g http://rt.cpan.org/Ticket/Display.html?id=40245 ! lib/Encode/Guess.pm t/guess.t Addressed:Empty file should produce an error message http://rt.cpan.org/Ticket/Display.html?id=38652 ! Unicode/Unicode.xs AUTHORS Refactored by Alex Davies http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-10/msg00745.html Message-Id: <7637669B2E3D46B187591747DA27F4C8@Amelie> 2.26 2008/07/01 20:56:17 ! Encode.pm Absense of Encode::ConfigLocal no longer carps no matter what. http://bugzilla.redhat.com/show_bug.cgi?id=435505#c2 http://rt.cpan.org/Ticket/Display.html?id=28638 http://rt.cpan.org/Ticket/Display.html?id=11511 ! lib/Encode/JIS7.pm use encoding 'utf8' and 'iso-2022-jp' glitches on perl 5.10 Thanks, MIYAGAWA Message-Id: <693254b90807011224h3ab50d76v50c6fea87baf223c@mail.gmail.com> ! lib/Encode/Alias.pm t/Aliases.t macintosh' not recognize as MacRoman http://rt.cpan.org/Ticket/Display.html?id=36326 ! Makefile.PL s{INC => "-I./Encode"} {INC => '-I' . File::Spec->catfile( '.', 'Encode' )} To prevent some platforms from forgetting to include Encode/encode.h. 
http://rt.cpan.org/Ticket/Display.html?id=36348 2.25 2008/05/07 20:56:05 ! Encode.pm added ':default' to Exporter option. ! lib/Encode/GSM0338.pm GSM0338 now handles coderef in CHECK http://rt.cpan.org/Ticket/Display.html?id=31335 ! Makefile.PL Perl 5.10/Encode 2.24: Tiny typo in Encode's Makefile.PL arg processing Message-Id: <961C2A4F-92B3-416D-A9F9-E7B0ADA9F134@fsck.com> ! lib/Encode/Alias.pm "This fix for Encode::Alias should make Solaris happy:" Message-ID: <47D886D9.6060001@iki.fi> 2.24 2008/03/12 09:51:11 ! lib/Encode/Config.pm adds and fixes also adds cp858 support. ! Encode.pm encoding.pm lib/Encode/Alias.pm ucm/cp858.ucm Merged perl@33486. > Change 33486 by rgs@scipion on 2008/03/12 08:50:11 An unfortunate side-effect of Encode and Encode::Alias use'ing each other, and Encode::Alias exporting functions into Encode for it to use as methods, broke the loading of the find_alias() Encode method in some cases since 5.10. Breaking the recursive inheritance fixes it. Message-Id: ! Encode.pm POD fix by tels Message-Id: <200711281835.36125@bloodgate.com> ! bin/ucmlint Fix by MIYAGAWA via CodeRepos http://coderepos.org/share/changeset/1791 ! encoding.pm t/mime_header_iso2022jp.t ported back from Perl 5.10-RC1 2.23 2007/05/29 18:15:32 ! Encode.xs got rid of global fallback_cb; encode_method() now takes one more argument which is a coderef to fallback. This should make encode_method() thread-safe. ! Encode.pm Added perluniintro, perlunifaq, and perlunitut to POD ! Encode.xs Plug a memory leak in Encode -- by rgs Message-Id: ! Unicode/Unicode.pm POD fixes on UTF-16LE http://aspn.activestate.com/ASPN/Mail/Message/perl5-porters/3486118 ! Makefile.PL man page generation is now conditional; yes by default but no if $PERL_CORE Message-Id: 2.22 2007/05/29 07:35:27 ! Encode.pm from_to() does not honor the check while decoding. That's a feature. To make sure it is a feature it is mentioned in the POD. http://rt.cpan.org/NoAuth/Bug.html?id=27277 ! 
Makefile.pl Encode used to suppress man page generation. Now it does. http://rt.cpan.org/NoAuth/Bug.html?id=27200 ! Encode.pm Encode.xs t/fallback.t Addressed: (de|en)code("ascii", "\x{3000}", sub{ $_[0] }) segfaults Reported by MIYAGAWA 2.21 2007/05/12 06:42:19 + lib/Encode/MIME/Name.pm t/mime-name.t ! Encode.pm Encode.xs lib/Encode/Encoding.pm new method: mime_name() inspired by: MIYAGAWA ! t/encoding.t Subject: Re: Compress::Zlib, pack "C" and utf-8 [PATCH] From: Marc Lehmann Date: Thu, 12 Apr 2007 08:41:53 +0200 Message-ID: <20070412064153.GA22475@schmorp.de> http://public.activestate.com/cgi-bin/perlbrowse/p/31194 ! Unicode/Unicode.pm POD fix. Message-Id: <20070417220547.GA11999@zetta.zet> 2.20 2007/04/22 14:56:12 ! Encode.pm Pod fixes. Now find_encoding() is explained more in details. + lib/Encode/GSM0338.pm - ucm/gsm0338.ucm ! lib/Encode/Supported.pod lib/Encode/Config.pm Bytes/Makefile.PL t/gsm0338.t ESTI GSM 03.38 support is relocated from Encode::Byte to Encode::GSM0338. This encoding is so kaputt it is unfit for Encode::XS! Though it was okay for general cases and escape sequences, '\0' => '@' IFF '\0\0' => '\0' had gliches. So kaputt even t/gsm0338 wrongly interpreted that. ref. http://www.csoft.co.uk/sms/character_sets/gsm.htm ! encoding.pm t/Aliases.t Imported from bleedperl #31015 2.19 2007/04/06 12:53:41 ! lib/Encode/JP/JIS7.pm + t/jis7-fallback.t encode('iso-2022-jp') fallback support added by MIYAGAWA++ decode()'s fallback remains unchanged (FB_PERLQQ) since UTF-8 contains all characters in iso-2022-jp so there's no need for fancy stuff. Message-Id: <693254b90704060526s6d850320h71cdda50dfbf7eba@mail.gmail.com> ! Encode.pm #25216 ([PATCH] Encode.pm: postpone the load of Encode::Encoding) http://rt.cpan.org/NoAuth/Bug.html?id=25216 ! lib/Encode/MIME/Header.pm t/mime-header.t #24418 (Encode::MIME::Header: wrong encoding with latin1 characters) http://rt.cpan.org/NoAuth/Bug.html?id=24418 ! 
Encode.pm #23876 (Add documentation for LEAVE_SRC) http://rt.cpan.org/NoAuth/Bug.html?id=23876 ! lib/Encode/Alias.pm t/Aliases.t #20781: Thai encoding needs alias for tis-620 http://rt.cpan.org/NoAuth/Bug.html?id=20781 ! bin/piconv AUTHORS #20344: piconv: wrong conversion of utf-16le encoded files (with PATCH) http://rt.cpan.org/NoAuth/Bug.html?id=20344 ! Encode.pm Encode.xs bin/enc2xs encoding.pm t/Aliases.t t/utf8strict.t Imported from bleedperl's 2.18_01 2.18 2006/06/03 20:28:48 ! bin/enc2xs overhauled the -C option - added ascii-ctrl', 'null', 'utf-8-strict' to core - auto-generated Encode::ConfigLocal no longer use v-string for version - now searches modules via File::Find so Encode/JP/Mobile is happy ! Byte/Byte.pm CN/CN.pm EBCDIC/EBCDIC.pm JP/JP.pm KR/KR.pm Symbol/Symbol.pm use strict added; though all they do is load XS, it's still better a practice ! *.pm use warnings added to all of them for better practices' sake. 2.17 2006/05/09 17:10:09 ! encode.pm 'chin' =~ /^zh_CN|chin(?:a|ese)?$/i is true but chin is not china or chinese. http://d.hatena.ne.jp/jankogai/20060508/1147090316 ! Encode.xs Integrated maintperl change (27824|27824) which I overlooked -- sorry, Nicholas and Coverity Scan. Message-Id: <200604152115.k3FLF1Ar014538@smtp3.ActiveState.com> Message-Id: <200605091615.k49GF1gJ016777@smtp3.ActiveState.com> 2.16 2006/05/03 18:24:10 ! bin/piconv --xmlcref and --htmlcref added. ! Encode.pm Copyright Notice Added. http://rt.cpan.org/NoAuth/Bug.html?id=19056 ! * Replaced remaining ^\t with q( ) x 4. -- Perl Best Practice pp. 20 And all .pm's are now perltidy-ed. 2.15 2006/04/06 15:44:11 ! Unicode/Unicode.xs Addressed: UTF-16, UTF-32, UCS, UTF-7 decoders mishandle illegal characters http://rt.cpan.org/NoAuth/Bug.html?id=18556 ! Encode.pm added str2bytes() as an alias to encode() and bytes2str() as an alias to decode() http://rt.cpan.org/NoAuth/Bug.html?id=17103 ! Encode.xs Change 26922: Avoid warning with MS Visual C compiler. 
Message-Id: <200601231245.k0NCj2dw009484@smtp3.ActiveState.com> ! t/perlio.t Change 26067: As using -C to turn on utf8 IO is equivalent to the open pragma Message-Id: <200511092227.jA9MRcYD009025@smtp3.ActiveState.com> 2.14 2006/01/15 15:43:36 ! Makefile.PL Change 26295: Don't build manpages for Encode and Unicode::Normalize Message-Id: <200512071540.jB7Fe4Gt017960@smtp3.ActiveState.com> ! Encode.pm Change 26081: Pod nit in Encode.pm, found by Marc Lehmann in RT #36949. Message-Id: <200511110357.jAB3vZcP023647@smtp3.ActiveState.com> ! Encode.xs Encode/encode.h bin/enc2xs encengine.c Change 25821: Mark more static Encode data structures as const. Change 25823: use more 'const' in the Encode data structures. Message-Id: <200510221243.j9MChTSu027711@smtp3.ActiveState.com> Message-Id: <200510221343.j9MDhTk9001245@smtp3.ActiveState.com> 2.13 2006/01/15 15:06:36 ! AUTHORS Miyagawa's mail address updated Message-Id: <693254b90601150535o767e10bai4f4732c275b4ebe0@mail.gmail.com> ! lib/Encode/MIME/Header.pm #16413: Encode::MIME::Headers patch to solve what is probably someone else's bug http://rt.cpan.org/NoAuth/Bug.html?id=16413 ! lib/Encode/MIME/Header.pm t/mime-header.t Applied: RT #16258: Support for RFC 2184 language tag http://rt.cpan.org/NoAuth/Bug.html?id=16258 ! Encode.pm Fixed RT #14559: fix for #8872 introduces new "bug" http://rt.cpan.org/NoAuth/Bug.html?id=14559 ! Encode.pm + t/from_to.t from_to() now makes use of $check more naturally. Message-Id: <693254b90601150535o767e10bai4f4732c275b4ebe0@mail.gmail.com> 2.12 2005/09/08 14:17:17 ! Encode.xs Encode.pm t/fallback.t Now accepts coderef for CHECK! ! ucm/8859-7.ucm Updated to newer version at unicode.org http://rt.cpan.org/NoAuth/Bug.html?id=14222 ! lib/Encode/Supported.pod More POD typo fixed. <42F5E243.80500@gmail.com> ! encoding.pm More POD typo leftover fixed. Message-Id: 2.11 2005/08/05 10:58:25 ! AUTHORS CHANGES To reflect changes below ! 
Encode.pm encoding.pm lib/Encode/Alias.pm lib/Encode/PerlIO.pod lib/Encode/Supported.pod Typo fixed by Piotr Fusik in Change 25261 & 25266 Message-ID: <001401c595bd$dccb5d80$0bd34dd5@piec> ! Encode.xs Addresses "BUG REPORT: panic in Encode.xs". Message-Id: <42EDDA97.2010608@hyper.to> + lib/Encode/MIME/Header/ISO_2022_JP.pm mime_header_iso2022jp.t ! lib/Encode/MIME/Header.pm lib/Encode/Config.pm Encoding 'MIME-Header-ISO_2022_JP' is introduced by Makamaka Message-Id: <200507311557.j6VFvE2K034605@www231.sakura.ne.jp> ! Encode/encode.h Encode.pm Encode.xs PerlIO's "encoding(utf-8-strict)" got a problem w/ partial character. Found and addressed by KONNO Hiroharu See also ext/PerlIO/encoding/encoding.pm Message-Id: 2.10 2005/05/16 18:46:36 ! Encode.pm fixed decode_utf8() accordingly to RT#8872 http://rt.cpan.org/NoAuth/Bug.html?id=8872 ! Encode.xs AUTHORS s/SvIVX/SvIV_set/ by Steve Peters. Message-Id: <2297.67.96.185.36.1114626315.squirrel@webmail3.pair.com> ! AUTHORS GAAS was missing! ! Encode.pm New Pod section: "UTF-8 vs utf8"; explains utf-8-strict + t/utf8strict.t Tests utf-8-strict, accordingly to UTF-8 decoder capability and stress test" by Markus Kuhn http://smontagu.damowmow.com/utf8test.html Note that malformed and overlong sequences are not test here because perl already does that for you, utf-8-strict or not. ! Encode.pm Encode/encode.h t/fallback.t Addressed "encode(..., Encode::LEAVE_SRC) does not work". Now FB_(PERLQQ|HTMLCREF|XMLCREF) implies LEAVE_SRC so you can (en|de)code constant strings with these fallbacks. http://rt.cpan.org/NoAuth/Bug.html?id=8736 ! Encode.pm Encode.xs lib/Encode/Alias.pm t/Aliases.t Make Encode.pm support the real UTF-8, by GAAS Message-Id: Message-Id: ! Encode.pm Encode.xs post-2.09 comment patches from GAAS applied. Message-Id: Message-Id: 2.09 2004/12/03 19:16:53 ! Encode.pm Encode.xs Addressed " :encoding(utf8) broken in perl-5.8.6". Message-Id: ! Encode.pm Addressed "(de|en)code($valid_encoding, undef) does not warn". 
http://rt.cpan.org/NoAuth/Bug.html?id=8723 ! Encode.pm t/Encode.t Addressed "Can't encode URI". When a reference is fed to (en|de)code, Encode now stringifies instead of returning undef. http://rt.cpan.org/NoAuth/Bug.html?id=8725 ! Encode.xs t/fallback.t Addressed "FB_HTMLCREF and FB_XMLCREF for the UTF-8 decoder". http://rt.cpan.org/NoAuth/Bug.html?id=8694 ! Encode.pm Addressed "s/digit/number/". http://rt.cpan.org/NoAuth/Bug.html?id=8695 ! Encode.pm Addressed "while (defined(read )) { ... } is an infinite loop". http://rt.cpan.org/NoAuth/Bug.html?id=8696 ! Encode.pm Addressed "What the heck is UCM?". Document fixed so that it no longer contains "UCM-Based Encodings". http://rt.cpan.org/NoAuth/Bug.html?id=8697 2.08 2004/10/24 13:00:29 ! Encode.xs lib/Encode/Encoding.pm Unicode/Unicode.{pm,xs} Resolved the issue that was raised by 2.07 -- Encode::utf8 fallbacks that was introduce messed up PerlIO::encoding. * To do so, ->renew() is renewed and ->renewed() was introduced to tell whether the caller is PerlIO or not. Message-Id: <94B2EB12-25B7-11D9-9E6A-000A95DBB50A@dan.co.jp> 2.07 2004/10/22 19:35:52 ! lib/Encode/Encoding.pm "Remove Carp from warnings.pm" that influences Encode, by Tels. Message-Id: <200410161618.29779@bloodgate.com> ! Encode.xs AUTHORS t/fallback.t Now Encode::utf8's fallbacks are compliant to Encode standard. Thank Bjoern Hoehrmann for persistently convincing me. Message-Id: <41a61aea.638409494@smtp.bjoern.hoehrmann.de> ! Encode.pm POD further revised. 2.06 2004/10/22 06:23:11 ! ucm/mac* RT #8083 reports that MacThai mapping was obsolete Updated all mac* encodings accordingly to the URI below. One remaining mystery is that MacRomanian vs. MacRumanian. MacRumanian is not found in unicode.org... http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/ ! Encode.pm t/Encode.t Fixed RT #8081: "decode(..., bless{},'x') segfault" Two more tests added to test that. http://rt.cpan.org/NoAuth/Bug.html?id=8081 ! 
Encode.pm
  POD revised according to RT #7966
  http://rt.cpan.org/NoAuth/Bug.html?id=7966
! Unicode/Unicode.pm
  POD updated explaining why Encode::Unicode always croaks on error
  rather than giving users choices.
  http://rt.cpan.org/NoAuth/Bug.html?id=7892

2.05 2004/10/19 04:55:01
! encoding.pm
  "unnuke" jhi's patch in bleedperl, with minor correction by dankogai.
  Message-ID: <41210A84.6060506@iki.fi>
  Message-ID: <20041018233442.7418113f@r2d2>
  Message-Id: <2BA3DAC4-218A-11D9-906D-000A95DBB50A@dan.co.jp>

2.04 2004/10/16 21:22:44
! Makefile.PL
  From: craigberry@mac.com
  Subject: [PATCH ext/Encode/Makefile.PL] make Encode.c dependency explicit
  Message-Id: <41716868.7000102@mac.com>

2.03 2004/10/06 05:07:20
! lib/Encode/Alias.pm
  Resolved some alias case sensitivity glitches reported via RT.
  http://rt.cpan.org/NoAuth/Bug.html?id=7835
! bin/piconv
  Resolved Win32 glitches reported via RT.
  (Fixed by dankogai and tested by Steve Hay)
  http://rt.cpan.org/Ticket/Display.html?id=7831
! JP/JP.pm lib/Encode/Alias.pm lib/Encode/Supported.pod AUTHORS
  /\bwindows-31j$/i is now an alias of CP932, by Steve Hay.
  http://rt.cpan.org/NoAuth/Bug.html?id=6695

2.02 2004/08/31 10:55:34
! ucm/big5-hkscs.ucm AUTHORS t/big5-hkscs.enc t/big5-hkscs.utf
  New map submitted by Deng Liu and Autrijus. Test data needed to be
  upgraded as well, done by dankogai
  Message-Id: <20040824204828.GB6999@aut.dyndns.org>
! bin/ucmsort
  Now works for characters U+10000 and above. This fix was needed to
  "tidy" the original map that was submitted.
! bin/enc2xs
  "ucmsort" now mentioned in pod

2.01 2004/05/25 16:27:14
! bin/enc2xs AUTHORS
  From: domo@computer.org
  Subject: [PATCH] Correct statistics from enc2xs
  <4AF60A4A-B8BB-11D8-BF99-000A27839BD6@computer.org>
! lib/Encode/Alias.pm
  Addressed "False [] range "\s-" in regex;" in Encode::Alias.pm
  <200405271148.i4RBm4KY026529@mail.mvnet.de>

2.01 2004/05/25 16:27:14
!
lib/Encode/CN/HZ.pm lib/Encode/Unicode/UTF7.pm "If someone thinks utf8::upgrade($1) should be croaked like chom?p($1),please try the following patch for Encode.pm." -- sadahiro-san <20040522212704.C068.BQW10602@nifty.com> 2.0 2004/05/16 20:55:15 * version updated to 2.00 -- sorry, no big feature change. I just hate version 1.100 :) ! lib/Encode/Guess.pm Unicode/Unicode.pm addressed UTF-(8|32LE) + BOM misguessing https://rt.cpan.org/Ticket/Display.html?id=6279 ! Encode.pm s/is_utif8/is_utf8/ in POD ! Encode/lib/Encode/CN/HZ.pm Fixes "make test" failure after the patch to pp_hot.c by Sadahiro-san Message-Id: <20040222182357.6B39.BQW10602@nifty.com> ! bin/piconv From: autrijus@autrijus.org Subject: [PATCH] "piconv -C 512" badly broken Message-Id: <1072870210.769.5.camel@localhost> 1.99 2003/12/29 02:47:16 ! Unicode/Unicode.xs find_encoding("UTF-16BE")->encode("abc") now null terminates http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2003-10/threads.html#00258 ! Encode.pm prototype bug in decode_utf8() fixed Message-Id: <600A4CDA-F004-11D7-B570-000393AE4244@dan.co.jp> ! Encode.pm /MANIFEST encoding.pm lib/Encode/Supported.pod t/at-cn.t t/at-tw.t t/gsm0338.t ucm/gsm0338.ucm + t/gsm0338.t Merged from maintperl@21987 1.98 2003/08/20 11:15:31 ! lib/Encode/MIME/Header.pm AUTHORS t/mime-header.t Dave Evans has found and corrected a bug in Encode::MIME::Header. Test suite added by Dan Kogai. Message-Id: <3F43440B.7060606@rudolf.org.uk> ! encoding.pm Typo fixes rolled back in from bleedperl ! t/at-cn.t t/at-tw.t v-strings, now depreciated in perl 5.8.1, is replaced by sadahiro Message-Id: <20030805002313.9880.BQW10602@nifty.com> ! bin/enc2xs argv case nit for VMS by Craig Message-ID: <3F2B02DE.10207@mac.com> ! t/enc_eucjp.t t/enc_utf8.t AUTHORS Encode test fixes for VMS by Peter Prymmer Message-ID: ! lib/Encode/Alias.pm t/Aliases.t koi-8 aliases bug detected and patched by sadahiro. 
  Further fix and test suite by dankogai
  Message-Id: <20030713102228.C76A.BQW10602@nifty.com>

1.97 2003/07/08 21:52:14
! encoding.pm lib/Encode/Guess.pm lib/Encode/Alias.pm
  lib/Encode/JP/JIS7.pm lib/Encode/Encoder.pm Encode.pm
  $DEBUG replaced with DEBUG() so perl optimizes better, by Rafael
  with further fixes by dankogai
  Message-Id: <20030705222023.1f24e041.rgarciasuarez@free.fr>
! lib/Encode/Aliases.pm
  Was: define_alias( qr/\bGB[-_ ]?2312(?:\D.*$|$)/i => '"euc-cn"' );
  Now: define_alias( qr/\bGB[-_ ]?2312(?!-?raw)/i => '"euc-cn"' );
  So new hash seeding introduced in bleedperl works.
  Message-Id: <20030629100937.GD20285@vipunen.hut.fi>
! lib/Encode/Guess.pm
  $Encode::Guess::NoUTFAutoGuess is added so you can turn off
  automatic utf(8|16|32) guessing -- originally by Autrijus
  Message-Id: <20030626162731.GA2077@not.autrijus.org>
! Encode.pm
  Addressed the following:
  Subject: [perl #22835] FB_QUIET doesn't work with Encode::encode
  Message-Id:

1.96 2003/06/18 09:29:02
! lib/Encode/JP/JP.pm t/guess.t
  m/(...)/ in void context then $1 is considered a Bad Thing
  Message-Id:
! Encode.pm
  Mentions in POD that as of perl 5.8.1 utf8::is_utf8() is also
  available.
! encengine.c
  More typecast from maintperl@19739
  Message-Id: <200306110645.h5B6j5D2009640@smtp3.ActiveState.com>
! t/perlio.t
  Tests 37 & 38 failed on Win32 -- yet another CRLF issue
  Message-Id: <200306090733.h597XQPA031646@smtp3.ActiveState.com>
! t/Encode.t
  Now skips for EBCDIC platform.
  Message-Id:
! t/perlio.t
  Craig's patch applied that addresses
  "Many systems (DOS, VMS) cannot have more than one C<.> in their
  filenames." -- perlport.
  Message-Id: <3ED79E01.8050401@mac.com>
! bin/piconv
  Found and fixed the bug that -p,--perlqq does not work.
  Induced by the change from Getopt::Std to Getopt::Long.
! encoding.pm
  Addressed [cpan #2629] Wrong assumption in numeric comparison
  Message-Id:
! Encode.pm Encode.xs Unicode/Unicode.pm Unicode/Unicode.xs
  lib/Encode/Encoding.pm t/perlio.t
  !
API Change: ->new_sequence() => ->renew() + Encode::Unicode makes use of it so it can handle BOM on PerlIO + Encode::XS and Encode::utf8 now supports ->renew() + Encode::Encoding now documents this with examples - Non-XS (en|de)code stripped out of Encode::Unicode Message-Id: <146957DB-8C39-11D7-9C91-000393AE4244@dan.co.jp> 1.95 2003/05/21 08:41:11 ! ucm/8859-*.ucm Since bogus entries were found in iso-8859-6, all entries are re-generated once again out of http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-*.TXT Thank David Graff for the discovery Message-Id: <200305201819.h4KIJRRU013746@unagi.cis.upenn.edu> + lib/Encode/Unicode/UTF7.pm ! lib/Encode/Config.pm lib/Encode/Alias.pm Unicode/Unicode.pm t/Unicode.t lib/Encode/Supported.pod UTF-7 support is now added. With this Encode now has all transcoding methods in Unicode::String. 1.94 2003/05/10 18:13:59 ! lib/Encode/MIME/Header.pm A more sophisticated solution for double-encoding by dankogai ! lib/Encode/MIME/Header.pm AUTHORS Two bugs fixed by Bjoern Jacke * "Double Encoding" was not possible i.e. encode("MIME-B" => "=?UTF-8?B?w4RwZmVs?=") * encode("MIME-Q") had UTF-8 flag on Message-Id: ! lib/Encode/MIME/Header.pm AUTHORS Two occurances of "croak ()" fixed as "croak qq()". Simon Cozens is added to AUTHORS as a result. Message-Id: <20030509103708.GA30664@deep-dark-truthful-mirror.pad> ! bin/piconv POD fixes that reflect enhancements by jhi ! bin/piconv Two enhancements by jhi. + Now uses Getopt::Long so it accepts long name options (--from for -f, for example) + New option: -r,--resolve Message-Id: <20030505114149.GA227075@kosh.hut.fi> ! MANIFEST META.yml META.yml added upon request of Schwern Message-Id: ! AUTHORS Enache Adrian removed upon request -- to live longer than Encode and/or FreeBSD (toy-)?thread :) Message-Id: <20030425015701.GA2069@ratsnest.hole> ! 
t/enc_module.t "close STDOUT unless $^O eq 'freebsd';" once again relocated to keep VMS happy in which case "$^O eq 'freebsd'" is required to keep FreeBSD+thread happy. Sigh. Message-Id: <3EA88ADC.3000300@mac.com> 1.93 2003/04/24 17:43:16 ! t/enc_eucjp.t added "no warnings 'pack'" in for loop to keep bleedperl from complaining "Character in 'C' format wrapped in pack". ! Makefile.PL More elegant perl core detection inspired by Ilya Zakharevich (but further elaborated for general cases). ! lib/Encode/Encoding.pm lib/Encode/PerlIO.pod POD fixes. ! t/euc-jp.ucm like cp9??, \x80-\x9F (control + 0x80) are zapped so they are less likely to be confused w/ ISO-8859-* ! t/CJKT.t RT tests added (vendor encodings are exemplified) -- that successfully found a flaw on iso-2022-kr before the patch. ! lib/Encode/CJKConstants.pm lib/Encode/KR/2022_KR.pm decode("ISO-2022-KR") has been buggy but no one ever sited that since no one seems to be using it. Bugs discovered by SADAHIRO-san Message-Id: <20030416231757.A545.BQW10602@nifty.com> ! lib/Encode/CN/HZ.pm t/perlio.t HZ is now perlio_ok, thanks to SADAHIRO-san. perlio.t modified so it adds test for HZ. Message-Id: <20030416231757.A545.BQW10602@nifty.com> ! lib/Encode/Guess.pm Now guesses UTF-(16|32)(BE|LE) when the string contains \x00. So long as the string contains \x{00}-\x{ff} it does not fail. See perldoc for details. Message-Id: 1.92 2003/03/31 03:27:27 ! ucm/big5-eten.ucm ucm/big5-hkscs.ucm Extraneous single-byte chars in range \x80-\xA0 and \xFA-\xFF removed. FYI, IBM's ICU has none of these for java-Big5-1.3_P.ucm but glibc-BIG5-2.1.2.ucm does. Message-Id: <20030325215213.4CA1.BQW10602@nifty.com> ! ucm/cp932.ucm ucm/cp936.ucm ucm/cp949.ucm ucm/cp950.ucm Maps regenerated again but this time based upon http://oss.software.ibm.com/cvs/icu/charset/data/ucm/ (But where is THE DOCUMENT by MICROSOFT?) ! t/enc_module.t AUTHORS failure with threaded Perl on FreeBSD addressed. Enache Adrian is added to AUTHORS for this. 
  Message-Id: <20030322230131.GA813@ratsnest.hole>
! lib/Encode/Guess.pm
  Some POD fixes.
! t/CJKT.t
  Change 18989: Make the :bytes conditional on PerlIO.
  Further modified by Dan Kogai
  <200303161730.h2GHU5B16265@smtp3.ActiveState.com>
! t/enc_module.t
  Change 18966: another fix for failing test on windows
  ("use encoding" puts STDIN in :raw mode, so chomp() wasn't
  stripping the CR), by gsar
  Message-Id: <200303140545.h2E5j5B08856@smtp3.ActiveState.com>
! t/CJKT.t
  Change 18970: Hopefully this works also in Win32, by jhi
  Message-Id: <200303140745.h2E7j6B22729@smtp3.ActiveState.com>
  Change 18965: fix CJKT.t failures on windows due to incorrect
  binmode(), by gsar
  Message-Id: <200303140530.h2E5U5B07046@smtp3.ActiveState.com>

1.91 2003/03/09 20:07:37
! encoding.pm
  even more proofread by jhi.
  Message-Id: <20030309194323.GT20843@kosh.hut.fi>
! t/enc_module.t
  -use lib 't';
  +use lib qw(t ext/Encode/t ../ext/Encode/t);
  Message-Id: <20030309182057.GR20843@kosh.hut.fi>
! AUTHORS
  s/Hirohito/Hiroto/ig; Sorry, Hiroto-san.
  Message-Id: <20030309181748.GP20843@kosh.hut.fi>
! encoding.pm
  s/logner/longer/
  Message-Id: <20030309181907.GQ20843@kosh.hut.fi>

1.90 2003/03/09 17:32:43
! encoding.pm
+ t/enc_data.t
  Inaba-san has added a patch for perl 5.8.1 or later that makes
  encoding.pm work for filehandle. t/enc_data.t is to test that.
  POD is further revised.
  Message-Id: <200303091515.h29FF6B03903@smtp3.ActiveState.com>
! encoding.pm t/enc_module.t
  encoding vs. ${^UNICODE} resolved. POD revised accordingly.
  Message-Id: <20030306112940.GN20652@kosh.hut.fi>

1.89 2003/02/28
! Encode.xs
  signed vs. unsigned issue discovered by Craig on OpenVMS
  Message-Id:
! encoding.pm AUTHORS
+ t/Mod_EUCJP.pm t/enc_module.enc t/enc_module.t
  Because binmode() stacks layers instead of overwriting them, you
  have to ":raw :encoding()" in encoding.pm or you are in trouble
  when you call encoding.pm multiple times. There are several
  workarounds but Inaba-san's idea is in.
  SUGAWARA Hajime, who was the first to address this problem, was
  added to AUTHORS. The test suite was added for this, which is a
  modified version of SUGAWARA-san's scripts
  Message-Id: <3E5CF695.6AE07852@st.rim.or.jp>

1.88 2003/02/20 14:42:34
! Encode.xs
  one signedness nit for Encode by jhi
  <200302161933.h1GJX876018710@kosh.hut.fi>
! ucm/viscii.ucm
  VISCII map was incorrect; fixed by Sadahiro-san
  Message-Id: <20030216120828.47D3.BQW10602@nifty.com>
! t/enc_eucjp.t t/enc_utf8.t AUTHORS
  You can't unlink files that are opened in cygwin but the last
  file handle opened in t/enc_*.t was left open. Patch submitted by
  Yitzchak and he was added to AUTHORS.
  Message-Id:
! t/CJKT.t
  now works with 'LC_ALL=en_US.UTF-8 PERL_UTF8_LOCALE=1'
  Message-Id: <20030206104513.GA11081@kosh.hut.fi>
! Unicode/Unicode.xs
  For 1.88: Unicode.xs =~ s/regog/recog/ -- jhi
  Message-Id: <20030206045153.GA6826@kosh.hut.fi>

1.87 2003/02/06 01:52:11
! AUTHORS
  * Inaba "Sensei" Hirohito added
    (I thought I had done so long ago but apparently I did not).
  * SUZUKI Norio added for various and useful bug reports.
! Byte/Byte.pm KR/KR.pm Unicode/Unicode.pm lib/Encode/Encoder.pm
  lib/Encode/CJKConstants.pm
  podchecked so all warnings are gone except for L.
! encoding.pm t/enc_eucjp.t
  * t/uni/tr_utf8.t now t ok on maintperl (sorry, jhi)
  * Filter option overhaul
  * POD revision
! Encode.pm Encode.xs encengine.c Encode/encode.h
  lib/Encode/Encoding.pm lib/Encode/JP/JIS7.pm
  Merged inaba-san's patch that fixes "use encoding 'shiftjis'"
  without filter. podchecked by Dan Kogai.
  Message-Id: <3E3BC46B.6C687CFD@st.rim.or.jp>
! lib/Encode/Alias.pm
  decode('alias', $1) went wild because of local $_ in find_alias().
  The evil local $_ is eradicated but that changes find_alias()
  format for coderef aliasing. See Encode::Alias for details
  Message-Id: <200302051704.AA00042@kipp0.nifty.com>

1.86 2003/01/22 03:29:07
! encoding.pm
  * Don't forget to canonize when you attempt an exact match!
Message-Id: <73E7F801-2DAA-11D7-BF9A-000393AE4244@dan.co.jp> * ${^ENCODING} exception is off for $] > 5.008 Message-Id: <20030122110617T.inaba.hiroto@toshiba-it.co.jp> ! t/enc_utf8.t $] check commented out so it runs on 5.8.0 1.85 2003/01/21 22:19:14 ! encoding.pm ${^ENCODING} exception is now explicit rather than handled by regex. + t/enc_eucjp.t t/enc_utf8.t Test suite for the better "encoding" pragma support for bleedperl. On 5.8.0, they will just be skipped. 1.84 2003/01/10 12:00:16 ! encoding.pm ${^ENCODING} is no longer set for utf so encoding is no longer fun :) (That is to prevent duplicate encoding first by IO then ${^ENCODING}) Message-Id: <20030108213737.GK331043@lyta.hut.fi> ! Unicode/Unicode.xs %_ fixes saves the resulting .so .05% smaller, by NC Message-Id: <20021226225709.GF284@Bagpuss.unfortu.net> ! Encode.pm Silence Encode on undef, by Andreas Message-Id: Message-Id: ! Unicode/Unicode.xs s/regognised/recognised/ . British spelling left intact to pay respect to two British Nicks :) Message-Id: <20021203020454.GK2274@kosh.hut.fi> 1.83 2002/11/18 17:28:49 ! Encode.xs lib/Encode/JIS7.pm Even more patches from Inaba-san has been applied. With this patch t/uni/tr_7jis.t and t/uni/t_utf8.t of bleedperl will work. Message-Id: <20021115105514D.inaba.hiroto@toshiba-it.co.jp> 1.82 2002/11/14 23:06:12 ! Encode.xs Encode::utf8 (XS Version) assertion botch first found in Cygwin, later found in perls w/ -Dusemymalloc was fixed by NC. Message-Id: <20021114210349.GA288@Bagpuss.unfortu.net> 1.81 2002/11/08 18:29:27 ! Encode.pm Encode.xs Non-XS version of Encode::utf8 is back (with XS being default). Encode::predefine_encodings(0) to turn off XS. This is primarily to cope w/ Cygwin smoke but Sadahiro-san has found that it was Test::More causing the problem, not Encode. But I have already made it configurable so it may be useful in some rare cases.... Message-Id: <20021107210110.2EE4.BQW10602@nifty.com>, et al. ! 
bin/enc2xs The ingenious patch by Nicholas Clark that reduces shlib sizes by 50% with no penalty and backward compatibility preserved, is in. Message-Id: <20021103231324.GE288@Bagpuss.unfortu.net> 1.80 2002/10/21 20:39:09 ! Encode.xs t/mime-header.t Even more patches from NI-XS regarding Encode::utf8->decode(). And one more test to t/mime-header.t to prove it Message-Id: 1.79 2002/10/21 06:05:37 ! Encode.xs Further patches from NI-XS. Encode::utf8->decode() now checks the value of the utf8 flag of the argument. As a result, the fix to lib/Encode/MIME/Header.pm is no longer necessary but since it did no harm (even speedwise) I'll leave it unreverted. ! ucm/cp949.ucm ucm/cp950.ucm U+20AC EURO SIGN and U+00AE REGISTERED SIGN were missing as a result of 1.78. Discovered by Moriyama-san. Moriyama-san has also developed a test script that compares (en|de)coded results to the corresponding Win32 API result and all cp9?? maps are now verified. Message-Id: <20021021025220.3AED.MSYK@mtg.biglobe.ne.jp> 1.78 2002/10/20 15:44:00 ! lib/Encode/MIME/Header.pm fixed so that it works with new Encode::utf8 ! Encode.pm Encode.xs Encode::utf8 is now in Encode.xs by Nick In-XS. This allows :encoding(UTF-8) to handle partial chars at end of buffers correctly. Message-Id: <20021020134935.2079.3@bactrian.ni-s.u-net.com> ! lib/Encode/Supported.pod More nitpickings applied. + t/rt.pl MANIFEST ! t/CJKT.t Moriyama-san has discovered a serious bug in t/CJKT.t; its roundtrip tests were completely useless. To redeem that and regain peace of mind, I wrote t/rt.pl to test ALL '|0' ENTRIES in all ucm/*.ucm. Since this script takes too long to finish (30 seconds on PIII-800MHz, FreeBSD), it is deliberately excluded from 'make test' but you can easily run it by either renaming it or: perl -Mblib t/rt.pl Message-Id: <20021019065420.0C48.MSYK@mtg.biglobe.ne.jp> ! ucm/cp936.ucm ucm/cp949.ucm ucm/cp950.ucm Other CJKT cp9??
also updated according to the URI below; http://www.microsoft.com/typography/unicode/cscp.htm + bin/ucmsort MANIFEST ucmsort is a crude utility that sorts CHARMAP entries in UCM files into proper order. Intended for hardcore developers only. ! ucm/cp932.ucm JP/JP.pm AUTHORS The CP932 mapping, which was based upon the mapping file at unicode.org, was found obsolete by MORIYAMA Masayuki <msyk@mtg.biglobe.ne.jp>. He has also supplied the patch so he was added to AUTHORS. ! lib/Encode/Supported.pod ISO-8859-11 != TIS 620 == TIS 620 + \xA0 ( ) Message-Id: 1.77 2002/10/06 03:27:02 ! t/jperl.t * Modified to accommodate the upcoming patch by Inaba-san that will fix tr/// needing eval qq{} Message-Id: <9F78A19C-D6C3-11D6-BAC6-0003939A104C@dan.co.jp> ! encoding.pm * pod fixes/enhancements to reflect the changes above ! lib/Encode/Alias.pm "Encode::TW is correct, Encode::Alias not." - /Autrijus/ Message-Id: <20021001015648.GB18710@not.autrijus.org> 1.76 2002/08/25 15:09:51 ! t/big5-eten.utf To reflect ucm change by Autrijus. t/big5-eten.enc was regenerated but naturally identical to previous version -- dankogai ! ucm/big5-eten.ucm Codepoint fixes -- autrijus Message-Id: <20020805040236.GC5220@not.autrijus.org> = * copied everything under perl-5.8.0/ext/Encode to make sure Encode is in sync w/ perl core ! t/CJKT.t t/guess.t Change 17175 by jhi@alpha on 2002/06/10 23:24:42 Now that binmode(FH) does implicit ":bytes" revisit the failing tests. The worrisome one is the Digest::MD5 test-- how will it fare in CRLF lands now? ! t/CJKT.t t/guess.t From: Radu Greab Date: Mon, 10 Jun 2002 00:40:34 +0300 Message-Id: <200206092140.g59LeYn15745@ix.netsoft.ro> Fixes for en_US.UTF-8 failures, all but ext/PerlIO/t/fallback.t ones which I cannot figure out. ! lib/Encode/Alias.pm Subject: [Encode PATCH] spurious warning From: Nicholas Clark Date: Sun, 2 Jun 2002 20:26:22 +0100 Message-ID: <20020602192619.GA320@Bagpuss.unfortu.net> 1.75 2002/06/01 18:07:49 !
lib/Encode/Alias.pm t/Alias.t lib/Encode/Supported.pod TW/TW.pm glibc compliance cited by Autrijus. http://www.li18nux.org/docs/html/CodesetAliasTable-V10.html ! bin/enc2xs bin/piconv Subject: Re: forewarning: usedevel and versiononly Message-Id: <20020529081515.D570.H.M.BRAND@hccnet.nl> 1.74 2002/05/28 18:33:15 + ucm/null.ucm ucm/ctrl.ucm ! Makefile.PL bin/enc2xs lib/Encode/Supported.pod "null" and "ascii-ctrl" encodings added upon the request of Autrijus Subject: Re: unicode -> &# notation Message-ID: <20020518193704.GB40272@not.autrijus.org> 1.73 2002/05/28 17:26:18 ! */Makefile.PL Makefile.PL bin/enc2xs Encode/Makefile_PL.e2x AUTHORS Chris Nandor has fixed Encode so that it works w/ MacPerl -- at least w/ PPC (68k need static linking which does not work due to 64k limit). pudge is added to AUTHORS (I'm surprised he was not there in the list). Encode/Makefile_PL.e2x was additionally fixed by dankogai to reflect changes in other Makefile.PL Message-Id: ! t/mime-header.t Subject: Change 16746: -Mutf8 cleanup. Message-Id: <200205222345.g4MNj7e10597@smtp3.ActiveState.com> 1.72 2002/05/20 15:49:56 ! Makefile.PL Subject: [PATCH] Encode should be in perl-core library path Message-Id: <86r8k7h738.wl@mail.edge.co.jp> Message-Id: <20020520161201.A11019@alpha.hut.fi> ! lib/Encode/MIME/Header.pm Subject: [PATCH] Encode::MIME::Header Message-Id: <86sn4nh7a8.wl@mail.edge.co.jp> ! Encode/Makefile_PL.e2x Subject: [PATCH] Make Makefile_PL.e2x happy on MSWin32 Message-Id: <20020519201031.GA1603@not.autrijus.org> ! CN/Makefile.PL Byte/Makefile.PL JP/Makefile.PL TW/Makefile.PL Symbol/Makefile.PL KR/Makefile.PL EBCDIC/Makefile.PL Makefile.PL AUTHORS @16628 and @16652 from Vadim. Vadim was added to AUTHORS. Subject: [PATCH] good day for WinCE port of perl. Message-ID: <001301c1fc68$e808e560$a95cc3d9@vad> ! Encode.xs ! Unicode/Unicode.xs Even more linting by Robin via @16532 ! Encode.xs Even more typecast by Sarathy in @16460 1.71 2002/05/07 16:22:42 ! 
Encode.xs even more typecasts by Robin Message-Id: <200205071513.QAA05846@tempest.npl.co.uk> ! bin/enc2xs A very strange bug that was causing bogus ucm -> C table generation was revealed by a UCM file that Andreas was working on. This is the king of the weirdest bugs I've encountered in the course of Encode maintenance. Message-Id: <6C04F0FA-61D4-11D6-B164-00039301D480@dan.co.jp> 1.70 2002/05/06 10:26:48 ! encoding.pm Made more 'module-safe' in conjunction w/ 'no encoding'. Message-Id: ! lib/Encode/Encoding.pm 'require Encode' because ->Define uses Encode::define_encoding(); problem and solution addressed by Miyagawa-kun Message-Id: <86znzdfvuh.wl@mail.edge.co.jp> ! t/Unicode.t Cuts the frill to make djgpp happier, as suggested by Laszlo Message-Id: <20020506105819.H17012@libra.eth.ericsson.se> ! bin/enc2xs enc2xs no longer overwrites files w/ -M option, as suggested by Andreas Message-Id: 1.69 2002/05/04 16:41:18 ! lib/Encode/MIME/Header Floating-point coerced for UNICOS (in integer arithmetic it folds lines one character too early). Verification by Mark is pending. Message-Id: ! Unicode/Unicode.pm more doc patch from Elizabeth Message-Id: <4.2.0.58.20020503210946.02f4ed30@mickey.dijkmat.nl> ! Encode/Makefile_PL.e2x More platform-independent patch from Benjamin Message-Id: <3CD31BE0.69F79B06@earthlink.net> ! lib/Encode/Guess AUTHORS split regex fix by Graham Barr. Adds him to AUTHORS. Message-Id: <20020504085419.E95940@valueclick.com> ! Encode/Makefile_PL.e2x enc2xs script discovery made smarter and more sensible, first cited by Miyagawa-kun and further suggestions by Rafael and Andreas ! Encode.pm lib/Encode/Guess.pm t/fallback.t t/guess.t t/mime-header.t "The EBCDIC remapping of the low 256 bites again" #16372 by jhi 1.68 2002/05/03 12:20:13 ! lib/Encode/Alias.pm lib/Encode/Supported.pod t/Alias.t AUTHORS UCS-4 added to aliases of UTF-32 by Elizabeth Mattijsen. Alias.t and Supported.pod modified to reflect the change. Elizabeth added to AUTHORS. And H.M.
is also added for forwarding her patch among other contributions (I was rather surprised to find his name was not there yet!) Message-Id: <20020503114901.D639.H.M.BRAND@hccnet.nl> 1.67 2002/05/02 07:33:09 ! Encode.xs Error message now consistent w/ perlqq (\N{U+} -> \x{}) done in perl@16308 but Philip linted me further. Now the error messages are macronized as ERR_ENCODE_NOMAP and ERR_DECODE_NOMAP ! lib/Encode/Guess.pm Sanity check for happier -w by Autrijus 1.66 2002/05/01 05:41:06 ! Encode.xs t/fallback.t WARN_ON_ERR no longer assumes RETURN_ON_ERR so you can issue a warning while fallback is in effect. This even came with a welcome side-effect of cleaner code with less nesting! Thank you, NI-XS. t/fallback.t is also modified to test this. And of course, the corresponding variables to UV[Xx]f are appropriately cast. This should've concluded NI-XS homework. ! Encode.pm encode(undef) does warn again! Repented upon suggestion by NI-XS. Document for unless vs. '' added Message-Id: <20020430171547.3322.13@bactrian.elixent.com> 1.65 2002/04/30 16:13:37 ! Encode.pm encode(undef) no longer warns for C. Suggested by Paul. Message-Id: ! lib/Encode/Supported.pod Encode::MIME::Header and Encode::Guess mentioned. Updated for Encode::HanExtra 0.05 and Encode::JIS2K ! lib/Encode/Guess.pm POD fix by Miyagawa-kun Message-Id: <86k7qqx8p7.wl@mail.edge.co.jp> 1.64 2002/04/29 06:54:06 ! ucm/euc-jp.ucm Now decodes euc-jisx0213 also. CAVEAT: encode("euc-jp"...) and encode("euc-jisx0213") are still DIFFERENT. Message-Id: ! Encode.xs A few white spaces corrected by NI-XS via PerlIO integration to Mainline Subject: Change 16247: Integrate perlio; ! Encode.pm Document fixes by Andreas Message-Id: 1.63 2002/04/27 18:59:50 ! lib/Encode/Encoding.pm ! Encoding.pm Unicode/Unicode.pm lib/Encode/Guess.pm lib/Encode/CN/HZ.pm ! lib/Encode/JP/JIS7.pm lib/Encode/MIME/Header.pm lib/Encode/KR/2022_KR.pm Make use of the Encode::Encoding base class!
And other cleanups in Encode.xs upon NI-XS suggestions Message-Id: <20020427160718.1290.15@bactrian.ni-s.u-net.com> 1.62 2002/04/27 11:17:39 ! Encode.pm encodings() now just check %ExtModule instead of eval{require} all of them for ":all" to conserve more memory. ! Encode.xs more "%x" -> "%" UVxf stuff. ! Encode.pm s/=over2/=over 2/g # oops. 1.61 2002/04/26 03:02:04 ! t/mime-header.t Now does decent tests besides use_ok() ! lib/Encode/Guess.pm t/guess.t UI streamlined, document added ! Unicode/Unicode.xs various signed/unsigned mismatch nits (#16173) http://public.activestate.com/cgi-bin/perlbrowse?patch=16173 ! Encode.pm POD: utf8-flag-related caveats added. A few sections completely rewritten. ! Encode.xs ! AUTHORS Thou shalt not assume %d works, either! Robin Baker added to AUTHORS for this Message-Id: <200204251132.MAA28237@tempest.npl.co.uk> ! t/CJKT.t "Change 16144 by gsar@onru on 2002/04/24 18:59:05" 1.60 2002/04/24 20:06:52 ! Encode.xs "Thou shalt not assume %x works." -- jhi Message-Id: <20020424210618.E24347@alpha.hut.fi> ! CN/Makefile.PL JP/Makefile.PL KR/Makefile.PL TW/Makefile.PL To make low-memory build machines happy, now *.c is created for each *.ucm (no table aggregation). You can still override this by setting $ENV{AGGREGATE_TABLES}. Message-Id: <00B1B3E4-579F-11D6-A441-00039301D480@dan.co.jp> + lib/Encode/Guess.pm + lib/Encode/JP/JIS7.pm Encoding-autodetect (mainly for Japanese encoding) added. In a course of development, JIS7.pm was improved. + lib/Encode/HTML/Header.pm + lib/Encode/Config.pm MIME B/Q Header Encoding Added! ! Encode.pm Encode.xs t/fallback.t new fallbacks; XMLCREF and HTMLCREF upon Bart's request. Message-Id: <20020424130709.GA14211@tanglefoot> 1.59 $ 2002/04/22 23:54:22 ! Encode.pm Encode.xs needs_lines() and perlio_ok() are added to Internal encodings such as utf8 so XML::SAX is happy. FB_* stub xsubs are now prototyped. 1.58 2002/04/22 23:54:22 ! TW/TW.pm s/MacChineseSimp/MacChineseTrad/ # ... oops. ! bin/ucm2text ! 
t/*.t - t/*.euc t/*.ref + t/*.enc t/*.utf Now all CJKT encodings go thru round-trip test via t/CJKT.t. t/(CN|TW).t by Autrijus are renamed at-(cn|tw).t t/(JP|KR).t are aggregated to t/CJKT.t test data are all remade via bin/ucm2text. And .... They are no longer skipped for -Uuseperlio ! 1.57 2002/04/22 20:27:30 ! t/JP.t t/KR.t t/perlio.t unless (find PerlIO::Layer 'perlio') ... line is back again. t/JP.t and t/KR.t were supposed to work but maybe '>:utf8' lines need PerlIO. Sigh.... ! Encode.xs Unicode/Unicode.pm lib/Encode/JP/JIS7.pm t/perlio.t ->perlio_ok now does eval{ require PerlIO::encoding } there so it correctly returns 1 when PerlIO::encoding is yet loaded. ! Encode.xs perl-current patch #16072 reflected 1.56 2002/04/22 09:48:07 ! Encode.pm encoding.pm t/perlio.t t/jperl.t New PerlIO::encoding 0.04 compliance met 1.55 2002/04/22 03:43:05 ! Encode.pm Encode.xs Unicode/Unicode.pm needs_lines() defined so Encode::Encoding is no longer needed for perlio 1.54 2002/04/22 02:50:01 ! Encode.pm! Encode.xs! Unicode/Unicode.pm t/perlio.t ! lib/Encode/Encoding.pm lib/Encode/CN/HZ.pm now perlio_ok is true by default if PerlIO::encoding->VERSION is 0.03 or larger. POD in Encode::Encoding revised to reflect this. Encode::XS and Encode::Unicode now has perlio_ok() method. ! lib/Encode/Supported.pod s/UP-UX/HP-UX/ by jhi ! AUTHORS Byte/Byte.pm CN/CN.pm Encode.pm JP/JP.pm KR/KR.pm README ! Symbol/Symbol.pm TW/TW.pm Unicode/Unicode.pm bin/enc2xs bin/piconv ! bin/ucmlint encoding.pm lib/Encode/Alias.pm lib/Encode/CN/HZ.pm ! lib/Encode/Config.pm lib/Encode/Encoder.pm lib/Encode/Encoding.pm ! lib/Encode/KR/2022_KR.pm lib/Encode/PerlIO.pod ! lib/Encode/Supported.pod Huge document fixes by Philip. ! AUTHORS ! t/JP.t s/compare\(/compare_text\(/o by Sarathy. Adds him to AUTHORS http://public.activestate.com/cgi-bin/perlbrowse?patch=16049 ! t/perlio.t binmode() after "<:encoding" to make Win32 happy, by Mattia. 
Mattia added to AUTHORS file Message-Id: <3CC3150F.5798.22A05AE@localhost> 1.52 2002/04/20 23:43:47 ! t/perlio.t TODO: is now SKIP:, as NI-XS requested. Also adds more elaborate failure analysis. ! bin/enc2xs A note on how to make sure of round-trip safety added to POD section (so Autrijus is happier) ! ucm/big5-hkscs.ucm ucm/big5-eten.ucm t/TW.pm big5-(eten|hkscs) is round-trip safe again! Message-Id: ! encoding.pm Typo fixes by Andreas ! Encode.pm Encode.xs Unicode/Unicode.xs Encode/Encoding.pm ! lib/Encode/JP/JIS7.pm lib/Encode/KR/2022_KR.pm t/perlio.t PerlIO coordination patches from NI-XS. Message-Id: <2769E572-54A1-11D6-B7E2-00039301D480@dan.co.jp> 1.51 2002/04/20 09:58:23 ! t/TW.t Updated test suite by Autrijus so "make test" is happy again Message-Id: <20020420082104.GA25037@not.autrijus.org> + ucm/big5-eten.ucm ! ucm/big5-hkscs.ucm lib/Encode/Alias.pm - ucm/big5.ucm TW/TW.pm TW/Makefile.PL Updates by Autrijus. 'big5' is no longer a canonical name but an alias to 'big5-eten'. big5-hkscs is now in 2001 edition. Message-Id: <20020419195346.GA19597@not.autrijus.org> ! Encode.xs Fix by NI-XS: fallback could cause SEGV w/ Perl/TK Message-Id: <20020419184509.1924.1@bactrian.ni-s.u-net.com> ! Encode.pm PerlIO detection a little bit smarter; no longer uses eval qq{} but eval {}. 1.50 2002/04/19 06:13:02 ! ! Encode.pm Encode.xs Encode/encoding.h + t/fallback.pm New Fallback API implemented and documented. See "perldoc Encode" for details ! lib/Encode/JP/JIS7.pm Encode.pm + lib/Encode/PerlIO.pod t/perlio.t API compliance met. However, it still does not work unless perlio implements line buffer. See BUGS section in perldoc Encode::PerlIO. As a sensible workaround, perlio_ok() added to Encode. ! encoding.pm ! lib/Encode/Supported.pod Doc fixes from jhi Message-Id: <20020418174647.J8466@alpha.hut.fi> ! CN/CN.pm Doc fixes from Autrijus Message-Id: <20020418144131.GA10987@not.autrijus.org> ! Encode.pm perlqq mode documented ! t/JP.t + t/jisx0201.euc t/jisx0201.ref !
t/jisx0208.euc t/jisx0208.ref t/JP.t tests more rigorously and with other encodings. t/jisx0201.* added to test JIS7 encodings. jisx0208 is now PURELY in jis0208 (used to contain jisx0201 part). ! Encode/Makefile_PL.e2x The resulting Makefile.PL that "enc2xs -M" creates now auto-discovers enc2xs and encode.h rather than hard-coding them. This makes the resulting module fully CPANizable. ! encoding.pm t/JP.t t/KR.t PerlIO detection simplified (checks %INC instead of eval{}) ! Encode.xs Encode/encode.h + Unicode/Makefile.PL Unicode/Unicode.pm Unicode/Unicode.xs - lib/Encode/Unicode.pm (en|de)code_xs relocated to where it belongs. Source reindented to my taste ! bin/enc2xs Additional (U8 *) cast added as suggested by jhi Message-Id: <20020417165916.A28599@alpha.hut.fi> 1.42 Date: 2002/04/17 - lib/Encode/XS.pm no-op module; Thought of adding a pod there but enc2xs has one so gone. ! encoding.pm ! t/JP.pm ! t/KR.pm Correct mechanism to detect whether the PerlIO::encoding layer is installed. ! Encode.xs PerlIO Layer detached. 1.41 2002/04/16 23:35:00 ! encoding.pm binmode(STDIN|STDOUT ...) done iff PerlIO is available ! t/*.t Cleaned up PerlIO skip conditions to prepare for the upcoming Encode - PerlIO forking. ! Encode.pm exported functions are now prototyped. ! lib/Encode/CN/HZ.pm ! bin/enc2xs ! Encode.xs fallback implemented # was /* FIXME */ affected programs revised to fit (only HZ was using the try-catch approach which needed to be fixed for API-compliance). ! Encode/Config.pm ! Encode/KR/2022_KR.pm ! Encode/KR/KR.pm can find =head1 NAME now, jhi Message-Id: <20020416083059.V30639@alpha.hut.fi> ! encoding.pm s/\{h\}/{$h}/g ;) ! Encode.xs now compiles with fewer warnings with the pickiest compilers. Suggested by Craig, fixed by Dan. ! Encode/Makefile_PL.e2x ! bin/enc2xs A bug that failed to find *.e2x in certain conditions fixed 1.40 2002/04/14 22:27:14 + Encode/ConfigLocal_PM.e2x ! lib/Encode/Config.pm ! bin/enc2xs "enc2xs -C" now generates/updates Encode::ConfigLocal.
ConfigLocal_PM.e2x is a skeleton thereof. ! lib/Encode/Config.pm ! CN/CN.pm "use Encode::CN::HZ;" was missing. ! t/Unicode.t ! t/unibench.t More rigorous tests added to test XS, especially on memory allocation. ! Encode.xs ! lib/Encode/Unicode.pm NI-S implemented an XS version -- merged Message-Id: <20020414154857.2066.4@bactrian.ni-s.u-net.com> ! encoding.pm ! t/jperl.t Source filter option added. With this option on, you can write perl 5.8-savvy scripts (such as UTF-8 identifiers) in legacy encodings. t/jperl.t enhanced to test this feature. ! t/Unicode.t ok() gotcha reported by Benjamin fixed. Though I didn't exactly apply his suggestion, this degree of nitting is enough to add him to the AUTHORS list. Message-Id: <3CB93223.291E5E2E@earthlink.net> ! JP/JP.pm + lib/Encode/JP/JIS7.pm - lib/Encode/JP/JIS.pm - lib/Encode/JP/2022_JP.pm - lib/Encode/JP/2022_JP1.pm 7bit-jis, iso-2022-jp and iso-2022-jp1 are all aggregated to JIS7.pm for better maintainability and performance ! encoding.pm Added caveat for non-ascii identifiers. ! encoding.pm fixes by jhi, the original author of this pragmatic module. Message-Id: <20020413231527.V1826@alpha.hut.fi> 1.34 2002/04/12 20:23:05 (Unreleased) ! Encode.pm ! t/Unicode.t EBCDIC fixes addressed by jhi. Message-Id: <20020412161844.D9383@alpha.hut.fi> ! lib/Encode/Encoder.pm POD fix by Miyagawa-kun Message-Id: <86bscqq4hu.wl@mail.edge.co.jp> 1.33 2002/04/10 22:28:40 ! AUTHORS Philip's mail address corrected. ! AUTHORS ! t/Encoder.t ! lib/Encode/Encoder.pm s/ = shift;/ = @_;/ # trivial but a common idiomatic typo :) This adds Miyagawa-kun to AUTHORS. * encoding() no longer exported by default but on demand * t/Encoder.t updated to test all these Message-Id: <86hemjpdn4.wl@mail.edge.co.jp> ! lib/Encode/Unicode.pm !
lib/Encode/Supported.pm Further doc fixes by Anton 1.32 2002/04/09 20:06:15 + bin/ucmlint + t/bogus.ucm - ucm/macDevanaga.ucm Unicode Character Map - ucm/macGujarati.ucm Unicode Character Map - ucm/macGurmukhi.ucm Unicode Character Map A utility to check integrity of .ucm files. t/bogus.ucm is a ucm that is deliberately bogus. Unused Indic mappings are removed for the time being. ! Encode.pm resolve_alias() added as suggested by jhi. Same as find_encoding("alias")->name. For convenience. This one is defined in Encode.pm instead of Alias.pm. Message-Id: <20020409215846.H17022@alpha.hut.fi> ! Encode.xs Memory allocation bug detected during the development of ucmlint -- fixed. Message-Id: ! lib/Encode/Unicode.pm valid_ucs2(0) is false but must be true. 3 patches from NI-S as follows. This also fixes the incident Andy has reported. ! lib/Encode/Alias.pm find_alias() recursion prevention ! t/Aliases.t Checks for the patch above ! t/Encode/Unicode.pm An extra "F" that caused valid_ucs2() to return a bogus value fixed Message-Id: <20020409133927.17803.1@bactrian.elixent.com> Message-Id: 2 Small Patches from jhi as follows: ! Encode.pm Encode->encodings() lists in case-insensitive order (as it was) ! bin/piconv -l option prints available encodings to STDOUT instead of STDERR ! lib/Encode/Aliases.pm s/defintion/definition/ Message-Id: <200204082306.CAA21033@alpha.hut.fi> ! AUTHORS ! lib/Encode/Supported.pod ! lib/Encode/Unicode.pm POD revision by Philip Newton. This adds Philip to the AUTHORS list. Thank you for the exact quote of Douglas Adams :) Message-Id: <22s3bu4gpvhhsses64nj3afuu0lo927rv3@4ax.com> 1.31 2002/04/08 18:08:07 ! lib/Encode/Encoder.pm + t/Encoder.t Encode::Encoder, once just a placeholder of an idea, is now much more practical. See t/Encode.t to find how practical it can be. + lib/Encode/Config.pm ! Encode.pm My false laziness in Encode.pm is fixed. Now %ExtModules are set in Encode::Config and they are all literally, not programmatically set.
My false laziness resulted in many encodings missing from %ExtModules. ! lib/Encode/Unicode.pm ! t/Unicode.t BOM for 32LE was bogus as noted by Anton. t/Unicode.t is fixed so that it does not rely on Encode::Unicode for BOM values Message-Id: 1.30 2002/04/08 02:34:51 + lib/Encode/Encoder.pm Object Oriented Encoder. I reckon something like this is needed. ! Encode.pm ! t/Unicode.pm ! lib/Encode/Supported.pod * autoloading bug that prevented upper-case canonicals such as UTF-16 is fixed. Now even UTF/UCS are autoloaded! * encodings() is now more intuitive. * t/Unicode.t fixed to explicitly use Unicode.pm -- BOM values are stored therein. * Obligatory fixes to the POD. ! lib/Encode/Supported.pod Patch from Anton applied. Message-Id: <66641479.20020408033300@motor.ru> ! Encode.pm ! lib/Encode/Unicode.pm Cosmetic changes: "bless $obj, $class" => "bless $obj => $class" 1.28 2002/04/07 18:58:42 ! MANIFEST + t/Unicode.t + t/grow.t Just a MANIFEST for those missing files. 1.26 Date: 2002/04/07 15:22:04 ! JP/Makefile.PL ! t/Aliases.PL Schwarn's patches against Makefile.PL have zapped jis*.ucm. Restored. And t/Aliases.t fixed to make sure they all exist. 1.25 2002/04/07 15:01:25 (Unreleased) ! Encode.pm ! lib/Encode/Unicode.pm More POD fixes.... ! Encode.pm - lib/Encode/UTF_EBCDIC.pm - lib/Encode/Internal.pm - lib/Encode/utf8.pm Integrated into Encode.pm as closures. That way the "one package, one file" rule is preserved, yet there are fewer files to require. ! encoding.pm commented out binmode(STDERR ... ! Makefile.PL ! Byte/Makefile.PL ! CN/Makefile.PL ! EBCDIC/Makefile.PL ! JP/Makefile.PL ! KR/Makefile.PL ! Symbol/Makefile.PL ! TW/Makefile.PL ! Encode/Makefile_PL.e2x Schwarn's MM-compliance patch merged Message-Id: <20020406082609.GA28758@blackrider> ! Encode.pm ! lib/Encode/Unicode.pm + lib/Encode/UTF_EBCDIC.pm + t/Unicode.t - lib/Encode/10646_1.pm - lib/Encode/ucs2_le.pm (UCS-2|UTF-(16|32))(LE|BE)? implementation and cleanups.
Instead of per-module (en|de)code, I saved a number of .pm files by reorganizing it on a per-object basis (well, this is what Encode::XS does under the hood). See Encode::Unicode for details. The original Unicode.pm is now correctly renamed to UTF_EBCDIC.pm. This module is used only on EBCDIC environments. 1.21 2002/04/05 14:46:34 (Not Released) ! JP/JP.pm ! Encode.pm + ucm/jis0201.ucm + ucm/jis0208.ucm + ucm/jis0212.ucm Are back to make Perl/Tk happy. Smile, NI-S. ! t/Alias.pm ! lib/Encode/Alias.pm ! lib/Encode/Supported.pm ! lib/Encode/10646_1.pm ! lib/Encode/ucs2_le.pm UCS-16BE is now canonical for UCS-2/ISO-10646-1. Leftover implicit aliases in ucs2_le.pm removed. Tests and documents updated to reflect changes. Message-Id: <20020405114024.1290.17@bactrian.ni-s.u-net.com> ! lib/Encode/Alias.pm ! lib/Encode/Supported.pm Anton's revision committed. Added Dan's own fixes as well. Message-Id: <159103166906.20020405161134@motor.ru> ! lib/Encode/Alias.pm 134c134 < qr/^UCS2-le$/i => '"UCS-2"', ); --- > qr/^UCS2-LE$/i => '"UTF-16LE"'); Sigh. Thank you, Anton. Message-Id: <14567692196.20020405062020@motor.ru> Message-Id: <69FEC0B4-483E-11D6-A045-00039301D480@dan.co.jp> 1.20 2002/04/04 19:50:52 + bin/unidump A last-minute addition. Just give it a try. Docs remain to be done. Not installed by default. ! lib/Encode/Supported.pod Enhanced greatly. ! t/Alias.t ! lib/Encode/Alias.pm ! lib/Encode/utf8.pm ! lib/Encode/10646_1.pm ! lib/Encode/ucs2_le.pm Canonical name for 'UCS-2le' is now "UTF-16LE". UCS-2 left unchanged but UTF-16BE is added as an alias. Implicit aliases moved to Encode::Alias so init_alias() works more as expected. Also, 'utf8' is now canonical with 'UTF-8' being an alias. Though pedantically wrong, this should make perl mongers happier. t/Alias.t is enhanced to test all these. Message-Id: <9C39BD58-47AF-11D6-9D82-00039301D480@dan.co.jp> ! Byte/Makefile.PL Now all .ucm are stacked in byte_t; they all share the ASCII part so 50% of the codepoints are common.
CJKT left as is because the saving is not significant. ! Byte/Makefile.PL ! CN/Makefile.PL ! EBCDIC/Makefile.PL ! Encode.xs ! Encode/Makefile_PL.e2x ! JP/Makefile.PL ! KR/Makefile.PL ! Makefile.PL ! Symbol/Makefile.PL ! TW/Makefile.PL ! bin/enc2xs ! AUTHORS All occurrences of _def.h replaced with .exh so djgpp works happily ever after! To credit this amazing discovery, Laszlo is now in the AUTHORS list Message-Id: <20020403181424.GA8778@freemail.hu> Message-Id: ! Makefile.PL ! */Makefile.PL ! Encode/Makefile_PL.skel bin/enc2xs No more @INC fiddling! Uses $ENV{PERL_CORE} instead Message-Id: <20020401222744.GX2000@blackrider>, et al. ! t/encoding.t Two more tests added by jhi Message-Id: <200204020000.DAA25121@alpha.hut.fi> + t/grow.t ! Encode.xs The showstopper fixed -- a memory reallocation bug was causing Encode::XS to fall into an infinite loop under certain conditions. t/grow.t tests that. Message-Id: <9572CAC4-463C-11D6-ABA5-00039301D480@dan.co.jp>, et al + bin/txt2ucm ! */Makefile.PL ! */*.ucm ! */XX.pm ! lib/Encode/Supported.pod Vendor encodings rebuilt out of original map files at unicode.org. Indic languages such as MacDevanagali remain unsupported due to the shortcomings of encengine (they need algorithmic conversion and I have no knowledge of that!). Pods fixed for added encodings. Oh, macJapan.ucm renamed to macJapanese.ucm. macROMnn is macRomanian and macRUMnn is macRumanian. txt2ucm is a crude script that is used to convert them. ! bin/enc2xs Unicode Compound Characters (used extensively on Mac) supported ! bin/piconv Typo fixes and improvements by jhi Message-Id: <200204010201.FAA03564@alpha.hut.fi>, et al. 1.11 2002/03/31 22:12:13 + t/encoding.t + t/jperl.t ! MANIFEST Missing files from the MANIFEST fixed. Message-Id: <20020401010156.H10509@alpha.hut.fi> Version incremented just to make CPAN happy. 1.10 2002/03/31 21:32:42 ! Makefile.PL ! README INSTALL_UCM option added to Makefile.PL so you can install *.ucm if you want.
This should make Autrijus happy. Also, piconv is added to the default install. + Encode/*.e2x ! bin/enc2xs Here-documented files that enc2xs generates are now exported to *.e2x. Much cleaner and easier to debug. ! encoding.pm encoding enhanced so you can make it act more like the (now prehistoric) "localized" variations of perl such as Jperl. + t/jperl.t Further test for encoding.pm. Written in euc-jp + encoding.pm + t/encoding.t Taken over from jhi. Message-Id: <20020330174618.B10154@alpha.hut.fi> - Encode/*.ucm + ucm/*.ucm ! Makefile.PL ! */Makefile.PL *.ucm relocated to ucm/ so MakeMaker will not install 'em by default. - ucm2table + bin/ucm2table *** ! AUTHORS ! Byte/Byte.pm ! Encode.pm ! Encode/macIceland.ucm ! lib/Encode/Alias.pm ! lib/Encode/Supported.pod MacIceland fixes and Pod Typo fixes. This adds Andreas to AUTHORS. Message-Id: 1.01 2002/03/29 20:59:39 ! Makefile.PL ! README s/USE_SCRIPTS/MORE_SCRIPTS/ ! Makefile.PL installs enc2xs by default for external Encode:: modules in CPAN, such as Encode::HanExtra ! t/*.t More sensible perl core detection via $ENV{PERL_CORE} suggested by Spider Message-Id: <200203291007.FAA07329@Orb.Nashua.NH.US> ! bin/enc2xs Perl core detection via $^X =~ m/\bminiperl$/o Message-Id: 1.00 Wed Mar 29 2002 ! * The version of all files is updated to 1.00 via "ci -f -l1.00", commemorating version 1.00. All files, including *.ucm are now under version control. - encode.h + Encode/encode.h encode.h moved to Encode/ so it will be installed for later use by enc2xs ! enc2xs h2xs-like feature added via "h2xs -M Name *.(enc|ucm)" ! Makefile.PL ! */Makefile.PL - compile + bin/enc2xs compile renamed to enc2xs. Affected Makefile.PL updated - lib/CN/2022_CN.pm "Punt it. HanExtra can take care of that later." -- Autrijus Message-Id: <20020328154338.GA7351@not.autrijus.org> ! Encode/johab.ucm ! Encode/euc-kr.ucm ! Encode/ksc5601.ucm ! lib/Encode/CJKConstants.pm !
lib/Encode/KR/2022_KR.pm Table patches for Euro Signs, 2022-KR fixups by Jungshik Message-Id: ! README ! Makefile.PL + bin/piconv bin/ added for example scripts. They are not installed by default. To install them, "perl Makefile.PL USE_SCRIPTS". piconv is iconv reinvented in perl. In addition to all features of iconv, it also adds perlish features. See L for more details. ! lib/Encode/Alias.pm qr/^ replaced with qr/\b so it directly matches locale names such as en_US.US-ASCII ! AUTHORS ! t/Aliases.t Patch by MJD to fix the following problem applied. Subject: [PATCH 5.7.3 Encode] Aliases.t not properly skipped when Encode extension not built Message-Id: <20020328091850.18677.qmail@plover.com> ! lib/Encode/KR/2022_KR.pm ! lib/Encode/CJKConstants.pm Another patch from Jungshik to make iso-2022-kr actually work Message-Id: ! Encode/Encode/euc-kr.ucm + Encode/Encode/johab.ucm ! Encode/Encode/ksc5601.ucm ! Encode/KR/KR.pm ! Encode/KR/Makefile.PL ! Encode/lib/Encode/Alias.pm ! t/Alias.t Johab support and complete revision of Korean Encoding by Jungshik Message-Id: + Encode.pm Revised to make up for the now-dropped Encode::Details. - lib/Encode/Details.pod Dropped. Besides being obsolete, the topics are now covered in the respective pods. ! AUTHORS ! t/Alias.t KR/KR.pm lib/Encode/Alias.pm Korean aliases fixed thanks to Jungshik Shin /ks[-_ ]?c[-_ ]?5601-1987$/i => cp936 Message-Id: ! *.pm =head1 NAME added to all modules to make buildtoc happy Message-Id: <20020327041151.A10618@alpha.hut.fi> - lib/Encode/CJKguide.pod Too controversial and dropped from the dist. Will be available separately on the web. ! Encode/*.ucm RCS tags added so table debugging gets easier (should that be needed! I hope they all stay 1.00!) + lib/Encode/CJKguide.pod A detailed guide to mainly, but not limited to, CJK multibyte encodings. - Encode/roman8.ucm + Encode/hp-roman8.ucm ! Byte/Makefile.PL ! Encode/Supported.pod All occurrences of "roman8" replaced with "hp-roman8" to avoid confusion !
Encode/Supported.pod ! Encode/mac*.ucm ! t/Alias.t Mac Encodings now comply the Inside Macintosh ! t/Alias.t Test for '-raw' conventions added. ! Encode/Alias.pm aliased gb2312 -> euc-cn, ksc5601 -> euc-kr ! Encode/gb12345.ucm ! Encode/gb2312.ucm ! Encode/ksc5601.ucm "-raw" appended to canonical names. File mames stay unchanged thanks to UCM format. ! lib/Encode/CN/HZ.pm Patch from Autrijus to fix gb2312 -> gb2312-raw + code linting Message-Id: <20020326035210.GA2091@not.autrijus.org> 0.99 Tue Mar 26 2002 - lib/Encode/JP/Const.pm + lib/Encode/CJKConstants.pm + lib/Encode/CN/2022_CN.pm + lib/Encode/KR/2022_KR.pm + t/KR.t + t/gb2312.euc + t/gb2312.ref + t/ksc5601.euc + t/ksc5601.ref + t/table.euc + t/table.ref + ucm2table * Support for ISO-2022-KR and ISO-2022-CN added. * t/KR.t added! * more t/*.{euc,ref} added, which was autogenerated from ucm2table * ucm2table autogenerates character table out of UCM files. - engine.c + encengine.c - lib/Encode/Supports.pod + lib/Encode/Supported.pod Names reverted due to popular demand. 8.3 rule applies only when there is a conflict. Message-Id: <20020325095924.GD44120@not.autrijus.org> ! */Makefile.PL - Encode/*.enc + Encode/*.ucm - lib/Tcl* - lib/Encode/Format/Enc.pod - t/Tcl.t * Character tables is now 100% ucm. * All files under Encode/ is now 8.3-compliant * some of missing encodings added (i.e. gsm0338 and nextstep) * Vendor mappings aggregated with appropriate national std in Makefile.PL, resulting smaller *.so especially for CJK. Following is result on Dan's FreeBSD box. 
                                                    Now        Then
    ---------------------------------------------------------------
    blib/arch/auto/Encode/Byte/Byte.so          157,279     171,042
    blib/arch/auto/Encode/CN/CN.so            1,634,476   1,626,685
    blib/arch/auto/Encode/EBCDIC/EBCDIC.so       18,476      18,476
    blib/arch/auto/Encode/Encode.so              27,791      27,791
    blib/arch/auto/Encode/JP/JP.so            1,408,056   1,832,811
    blib/arch/auto/Encode/KR/KR.so            1,156,518   1,329,587
    blib/arch/auto/Encode/Symbol/Symbol.so       23,940      20,990
    blib/arch/auto/Encode/TW/TW.so*             948,761   1,316,437
    ---------------------------------------------------------------
    Total                                     5,375,297   6,343,819
    Saving                                      968,522
  * As a result of ucm-transition, Encode::Tcl dropped because
    Encode::Tcl demands *.enc. Encode::Tcl will be supplied in a
    separate tarball with *.enc.
  Message-Id:
! compile
- encengine.c
+ encode.c
! Encode.pm
- lib/Encode/Supported.pod
+ lib/Encode/Supports.pod
- lib/Encode/iso10646_1.pm
+ lib/Encode/10646_1.pm
- lib/Encode/EncFormat.pod
+ lib/Encode/Format/Enc.pod
  Files renamed for 8.3 filename compliance. Affected modules/scripts
  revised.
- lib/Encode/JP/Constants.pm
+ lib/Encode/JP/Consts.pm
! lib/Encode/JP/JIS.pm
! lib/Encode/JP/H2Z.pm
  Version nit problem and 8.3 rule fix.
  > Package namespace         installed    latest  in CPAN file
  > Encode::JP::Constants          0.92      1.02  J/JH/JHI/perl-5.7.3.tar.gz
  was noted by jhi; Dan then discovered that "Constants.pm" does not
  comply with the 8.3 rule. Constants.pm renamed to Consts.pm and
  affected modules are fixed accordingly.
  In addition, legacy "use vars qw()..." declarations are replaced
  with "our".
  Message-Id: <20020325011248.D1561@alpha.hut.fi>
  Message-Id: <41023D51-3FB5-11D6-8347-00039301D480@dan.co.jp>
! JP/JP.pm
- lib/Encode/JP/ISO_2022_JP.pm
- lib/Encode/JP/ISO_2022_JP_1.pm
+ lib/Encode/JP/2022_JP.pm
+ lib/Encode/JP/2022_JP1.pm
                01234567.012
  8.3 naming conflict for vanilla FAT addressed by jhi
  Message-Id: <20020324201931.V22596@alpha.hut.fi>
! Encode.xs
  Typecast fix addressed by jhi
  Message-Id: <20020324185540.T22596@alpha.hut.fi>

0.98 Mon Mar 25 2002
! lib/Encode/Supported.pod
  Further pod fixes
+ lib/Encode/JP/ISO_2022_JP_1.pm
! lib/Encode/JP/ISO_2022_JP.pm
! lib/Encode/JP/JIS.pm
! JP/JP.pm
  Encode::JP is now stricter about the difference between ISO-2022-JP
  and ISO-2022-JP-1. See JP/JP.pm for details. I hope this move makes
  Anton happier :)
  FYI the previous version implemented ISO-2022-JP as ISO-2022-JP-1
  since it had X0212 support.
! lib/Encode/Supported.pod
  Further pod fixes
! Encode.xs
  Avoid core-dump in Encode with PERLIO=mmap by NI-S
  Message-Id: <20020324104139.1326.7@bactrian.ni-s.u-net.com>
! CN/CN.pm
! JP/JP.pm
! KR/KR.pm
! TW/TW.pm
! lib/Encode/Supported.pod
  pod fixes to replace F<> with L<>, as suggested by Autrijus in:
  Message-Id: <20020324083943.GA14901@not.autrijus.org>
! lib/Encode/Supported.pod
  fixes and enhancements by Anton
  Message-Id: <10632060120.20020324103753@motor.ru>
! lib/Encode/Alias.pm
  > define_alias( qr/^GB[- ]?(\d+)$/i => '"gb$1"' );
  added. Suggested by Anton then deobfuscated by Autrijus
  Message-Id: <20020324064455.GA3667@not.autrijus.org>
! compile
  Further fix by Nicholas Clark
  Message-Id: <20020323145840.GD304@Bagpuss.unfortu.net>
- lib/EncodeFormat.pod
+ lib/Encode/EncFormat.pod
! MANIFEST
  File renamed as suggested by Autrijus
! Encode.pm
! lib/Encode/Details.pod
! lib/Encode/Supported.pod     Sun Mar 24 13:29:35 2002
! Encode.pm                    Sun Mar 24 13:43:47 2002
  pod fixes by Autrijus.
  Message-Id: <20020324062804.GA3595@not.autrijus.org>
  Message-Id: <20020324075627.GB11986@not.autrijus.org>
! t/Alias.t
! lib/Encode/Alias.pm
! Encode.pm
  now more EBCDIC conscious; %ExtModules on EBCDIC systems excludes
  CJK so that you don't have to worry about a matched alias resulting
  in cloaking. t/Alias.t also revised to reflect changes.
  Verified by jhi
  Message-Id: <20020324022929.D22596@alpha.hut.fi>

0.97 Sun Mar 24 2002
! CN/CN.pm
! KR/KR.pm
! TW/TW.pm
  EBCDIC detection mechanism installed as in JP/JP.pm
  Message-Id: <20020323211847.G19148@alpha.hut.fi>
! Byte/Makefile.PL
! CN/Makefile.PL
! EBCDIC/Makefile.PL
! JP/Makefile.PL
! KR/Makefile.PL
! Symbol/Makefile.PL
! TW/Makefile.PL
  Now all table files used by compile are postfixed '_t' to avoid
  namespace collisions on case-insensitive file systems once and for
  all! Inspired by:
  Message-ID: <58290227735.20020323195659@familiehaase.de>
! t/Aliases.t
  Since the Encode::JP is unsupported under EBCDIC we cannot
  run this test (aliases as such should work fine) -- jhi
  Message-Id: <20020323202119.D19148@alpha.hut.fi>
! Byte/Makefile.PL
  duplicate occurrence of ascii.ucm and 8859-1.ucm causes MacOS X dyld
  to cloak
! t/CN.t
! t/Encode.t
! t/JP.t
! t/TW.t
! t/Tcl.t
  < chdir 't' if -d 't';
  ---
  > if (! -d 'blib' and -d 't'){ chdir 't' };
  When you are "make test"-ing in the Encode/ directory, you must not
  change $ENV{PWD}. t/JP.t had been fixed before but the others
  somehow remained unchanged. Also, the situation detection was made
  simpler in t/JP.t, which was originally:
  > chdir 't' if -d 't' and $ENV{PWD} !~ m,/Encode[^/]*$,o;
! Encode.pm
  "Use of uninitialized value in string eq at Encode.pm line 96."
! Symbol/Makefile.PL
! EBCDIC/Makefile.PL
! AUTHOR
  -- Problem on case insensitive file systems
  "coexist of ebcdic.c <> EBCDIC.c on Cygwin not possible"
  Message-ID: <88254111953.20020323095503@familiehaase.de>
! compile
! AUTHOR
  "So I think it's a bug in gcc, not perl. But it still needs to be
  worked around."
  Message-Id: <20020323145840.GD304@Bagpuss.unfortu.net>
  Message-Id: <20020323170509.C96475@plum.flirble.org>

0.96 Sat Mar 23 2002
! TW/TW.pm
! lib/Encode/Encoding.pm
! lib/Encode/Alias.pm
! lib/Encode/Supported.pod
! KR/KR.pm
  Pod Fixes by Michael G Schwern via jhi
  Message-ID: <20020322073908.GB10539@blackrider>
! Makefile.PL
! Encode.pm
  "...I think we should include ISO 8859-1 as well." -- NI-S
  Message-Id: <20020322120230.1332.8@bactrian.elixent.com>
! JP/JP.pm
! CN/CN.pm
! KR/KR.pm
! TW/TW.pm
! lib/Encode/Alias.pm
  alias definitions relocated to Encode::Alias so module autoloading
  works for aliases also.
! Encode.pm
  encodings() now accepts args to check ExtModules.
+ Byte/Byte.pm
+ Byte/Makefile.PL
+ EBCDIC/EBCDIC.pm
+ EBCDIC/Makefile.PL
+ Symbol/Makefile.PL
+ Symbol/Symbol.pm
! Encode.pm
! Encode.xs
  Latin and single byte encodings are reorganized so they are
  demand-loaded like Encode::XX. Now only ascii is compiled into
  Encode itself.
! lib/Encode/Alias.pm
  for my $k (keys %hash){ delete $hash{$k}; } is deprecated; fixed.

0.95 Fri Mar 22 2002
  In this update, pod rewrites and alias fixes are the main issues
+ lib/Encode/Supported.pod
  Describes supported encodings
! Makefile.PL
  streamlined compiled-in encodings.
! lib/Encode/Description.pod -> lib/Encode/Details.pod
  Renamed.
+ Encode/ibm-125?.ucm
  Added from the icu distribution, with every occurrence of "IBM-125?"
  changed to "cp125?". Filenames remain unchanged to pay some respect
  to icu staff, however.
+ lib/Encode/Alias.pm
! Encode.pm
  Alias definitions in Encode.pm relocated.
! AUTHORS
! Encode.xs
  packWARN patch from Paul Marquess via jhi
  Message-Id: <20020321010101.O28978@alpha.hut.fi>
  Paul added to AUTHORS as a result.
! t/CJKalias.t -> t/Aliases.t
  Renamed. Checks even more aliases and alias overloading
! Encode.pm
! CN/CN.pm
  duplicate alias for ujis => euc-jp removed (Encode::JP has one)
  gbk => cp936 relocated to CN.pm
! t/CJKalias.t
  Test::More with plans (by jhi)

0.94 Thu Mar 21 2002
+ lib/Encode/Description.pod
! lib/Encode/Encoding.pm
  Now the pod in Encode.pm is abridged as a programming reference.
  lib/Encode/Description.pod contains the original, detailed
  description and Encode::Encoding explains how to write your own
  module to add new encodings.
  So far, lib/Encode/Description.pod contains the whole pod that was
  once in Encode.pm. This is intentional.
! Encode.pm
  Pod revisions by Anton Tagunov
  Message-Id: <517178431.20020320174824@motor.ru>
! lib/Encode/Tcl.pm
  All occurrences of Encode::Tcl::Extended removed, including pod
! t/CJKalias.t
  test now checks $encoding->name only; $encoding->{name} is no
  longer checked to find the canonical name.
! lib/Encode/JP/JIS.pm
! lib/Encode/JP/ISO_2022_JP.pm
  ->name() added to be more compliant with API
! CN/CN.pm
! JP/JP.pm
! KR/KR.pm
! TW/TW.pm
! t/CJKalias.t
  Patch by Autrijus to add aliases to TW and fixes to POD
  Message-Id: <20020320090619.GA24774@not.autrijus.org>
! AUTHORS
  SADAHIRO Tomoyuki added, as he should have been. My apologies.

0.93 Wed Mar 20 2002
  * First release to be uploaded to CPAN.
    For prehistoric changes, please see the Changes file of the perl
    distribution as well as the perl-unicode@perl.org archive,
    available at:
      http://archive.develooper.com/perl-unicode@perl.org/
  Changes since 0.92 include:
+ Changes
+ AUTHORS
! Encode.pm
! README
  + Mention of perl-unicode@perl.org added
! JP/JP.pm
  + Encoding aliases added so you can feed locale names and MIME
    Charset="" directly.
  - Mention of JISX0212 removed because it's fixed
! CN/CN.pm
! KR/KR.pm
  + Encoding aliases added. Note TW is left untouched because euc-tw
    is not implemented in TW but in Encode::HanExtra. Autrijus, you
    may fix Encode::HanExtra.
+ t/CJKalias.t
  + to test encode aliases added