path: root/utf8.c
Commit message | Author | Date | Files | Lines

* [perl #22946] Bug in Unicode surrogate pair conversion in Perl_utf16_to_utf8 | Dinger, Tom | 2003-07-24 | 1 | -1/+2
  From: "Dinger, Tom" (via RT) <perlbug-followup@perl.org>
  Message-ID: <rt-22946-60715.1.00007189884266@rt.perl.org>
  p4raw-id: //depot/perl@20211
* Fix up Larry's copyright statements to my best knowledge. | Jarkko Hietaniemi | 2003-04-16 | 1 | -1/+1
  (Lots of Perl 5 source code archaeology was involved.) Larry didn't make strangled noises when I showed him the patch, either :-)
  p4raw-id: //depot/perl@19242
* Synchronize the specifications of the POSIX character | Jarkko Hietaniemi | 2003-04-16 | 1 | -2/+2
  classes alnum, graph, and print closer to the planned Unicode proposal.
  p4raw-id: //depot/perl@19231
* Update all copyrights to 2003, from Jarkko | Hugo van der Sanden | 2003-03-02 | 1 | -1/+1
  p4raw-id: //depot/perl@18801
* API doc tweaks. | Jarkko Hietaniemi | 2003-02-22 | 1 | -4/+7
  p4raw-id: //depot/perl@18760
* %_ (was Re: [PATCH] operation on `PL_na' may be undefined) | Nicholas Clark | 2003-01-07 | 1 | -2/+2
  Message-ID: <20021226211626.GD284@Bagpuss.unfortu.net>
  p4raw-id: //depot/perl@18456
* [PATCH] bug in utf8.c(?) | Marty Pauley | 2002-09-26 | 1 | -0/+1
  Subject: [PATCH] bug in utf8.c(?)
  p4raw-id: //depot/perl@17928
* Small speedup by inlining the easy bits of is_utf8_char() | Jarkko Hietaniemi | 2002-07-01 | 1 | -3/+11
  into is_utf8_string().
  p4raw-id: //depot/perl@17392
* good day for WinCE port of perl. | Vadim Konovalov | 2002-05-16 | 1 | -9/+0
  Message-ID: <001301c1fc68$e808e560$a95cc3d9@vad>
  p4raw-id: //depot/perl@16628
* WinCE several touches | Vadim Konovalov | 2002-05-13 | 1 | -1/+1
  Message-ID: <007b01c1fabe$cc8cbbf0$785cc3d9@vad>
  p4raw-id: //depot/perl@16582
* WinCE many fixes | Vadim Konovalov | 2002-04-28 | 1 | -0/+9
  Message-ID: <00bf01c1eedd$c0c62a00$d25cc3d9@vad>
  p4raw-id: //depot/perl@16251
* fixes for all the warnings reported by Visual C (most of this | Gurusamy Sarathy | 2002-04-21 | 1 | -82/+82
  change is from change#12026)
  p4raw-link: @12026 on //depot/maint-5.6/perl: ff42b73b40f5a895aef4bed81c794f468e0609bc
  p4raw-id: //depot/perl@16048
* my $utf8here, our $utf8here, and package variable $utf8here. | Jarkko Hietaniemi | 2002-04-16 | 1 | -8/+27
  The actual minimal fix is in utf8.c and from NI-S, the rest are the tests (in fresh_perl since I couldn't get them easily to work elsewhere) and a slight behaviour change: previously UTF-8 identifiers had to start with an alphabetic character. No more so, now they can start with an (Unicode) ID_Continue character (which however is not a (Unicode) digit). (Limiting the first character to ID_Start would be rather restrictive, since ID_Start allows only alphabetic letters.) TODO: use vars qw($utf8here). This I don't find to be a showstopper.
  p4raw-id: //depot/perl@15943
* Re: Change 15762: As noted by Philip Newton: nothing wrong with BOM, | Philip Newton | 2002-04-07 | 1 | -1/+1
  Message-ID: <1dnvau4j684hke2igk990f01nit8r2811s@4ax.com>
  p4raw-id: //depot/perl@15777
* As noted by Philip Newton: nothing wrong with BOM, | Jarkko Hietaniemi | 2002-04-06 | 1 | -14/+3
  but 0xFFFE quite wrong.
  p4raw-id: //depot/perl@15762
* What started as a small nit (the charnames test, nit found | Jarkko Hietaniemi | 2002-04-02 | 1 | -6/+10
  be Hugo), ballooned a bit... the goal is Larry's wish that illegal Unicode (such as U+FFFF) by default doesn't warn, since what if somebody WANTS to create illegal Unicode? Now getting close to this in the regex runtime. (Also, fix more of my fixation that BOM would be U+FFFE.)
  p4raw-id: //depot/perl@15689
* A little bit better error message for \pq, still | Jarkko Hietaniemi | 2002-03-28 | 1 | -1/+5
  not good because the script context is not shown.
  p4raw-id: //depot/perl@15581
* Warn instead of croak. | Jarkko Hietaniemi | 2002-03-27 | 1 | -8/+42
  p4raw-id: //depot/perl@15556
* B::perlstring and unicode | Rafael Garcia-Suarez | 2002-03-18 | 1 | -1/+1
  Message-ID: <20020318231431.A699@rafael>
  p4raw-id: //depot/perl@15308
* more warnings tidyup | Paul Marquess | 2002-03-11 | 1 | -4/+4
  From: "Paul Marquess" <paul_marquess@yahoo.co.uk>
  Message-ID: <AIEAJICLCBDNAAOLLOKLMEEGDPAA.paul_marquess@yahoo.co.uk>
  p4raw-id: //depot/perl@15155
* EBCDIC: this seems to calm the last of the | Jarkko Hietaniemi | 2002-02-24 | 1 | -12/+8
  Malformed UTF-8 warnings.
  p4raw-id: //depot/perl@14850
* In EBCDIC the UNI_TO_NATIVE() macro evaluates its argument | Jarkko Hietaniemi | 2002-02-20 | 1 | -2/+4
  twice, causing the loop to skip every other character.
  p4raw-id: //depot/perl@14800
* Misplaced block end. | Jarkko Hietaniemi | 2002-02-19 | 1 | -1/+1
  p4raw-id: //depot/perl@14766
* Oops. | Jarkko Hietaniemi | 2002-02-19 | 1 | -1/+1
  p4raw-id: //depot/perl@14762
* Try special casing first. | Jarkko Hietaniemi | 2002-02-19 | 1 | -27/+33
  p4raw-id: //depot/perl@14759
* Unused in ASCII, used in EBCDIC. | Jarkko Hietaniemi | 2002-02-18 | 1 | -2/+1
  p4raw-id: //depot/perl@14747
* EBCDIC: now the worst seems to be over for | Jarkko Hietaniemi | 2002-02-18 | 1 | -12/+8
  the "Malformed" warnings. Still a few of them, and plenty of test failures, but getting better.
  p4raw-id: //depot/perl@14739
* After much rewriting we are now pretty much | Jarkko Hietaniemi | 2002-02-18 | 1 | -44/+41
  back to where we started.
  p4raw-id: //depot/perl@14737
* Clearing up to_utf8_case() continues: this time use | Jarkko Hietaniemi | 2002-02-17 | 1 | -56/+58
  a single return, and EBCDICification for all paths.
  p4raw-id: //depot/perl@14734
* Tiny tweak. | Jarkko Hietaniemi | 2002-02-17 | 1 | -6/+2
  p4raw-id: //depot/perl@14732
* Redundant casts. | Jarkko Hietaniemi | 2002-02-17 | 1 | -18/+18
  p4raw-id: //depot/perl@14731
* EBCDIC: to_utf8_case() is supposed to get its low 256 | Jarkko Hietaniemi | 2002-02-17 | 1 | -2/+2
  input in native code points, not Unicode.
  p4raw-id: //depot/perl@14726
* The #14715 and #14716 were okay: they just revealed | Jarkko Hietaniemi | 2002-02-17 | 1 | -14/+30
  a bug in the EXACTF matching.
  p4raw-id: //depot/perl@14724
* Retreat, retreat! (retract #14715 and #14716) | Jarkko Hietaniemi | 2002-02-16 | 1 | -26/+14
  p4raw-id: //depot/perl@14723
* Tiny tweak. | Jarkko Hietaniemi | 2002-02-16 | 1 | -4/+5
  p4raw-id: //depot/perl@14716
* Restructure to_utf8_case() for simpler execution paths. | Jarkko Hietaniemi | 2002-02-16 | 1 | -14/+25
  p4raw-id: //depot/perl@14715
* Excise inexact blather. | Jarkko Hietaniemi | 2002-02-14 | 1 | -5/+0
  p4raw-id: //depot/perl@14687
* Iteration continues. | Jarkko Hietaniemi | 2002-02-13 | 1 | -17/+20
  p4raw-id: //depot/perl@14669
* Rewrite the "special mapping" part of to_utf8_case(), | Jarkko Hietaniemi | 2002-02-13 | 1 | -26/+46
  this time with fewer bugs. (See: The Law of Cybernetic Entymology.)
  p4raw-id: //depot/perl@14664
* EBCDIC: another "can't happen". | Jarkko Hietaniemi | 2002-02-12 | 1 | -1/+5
  p4raw-id: //depot/perl@14660
* format problem | Robin Barker | 2002-02-06 | 1 | -1/+1
  Message-Id: <200202061401.OAA25053@tempest.npl.co.uk>
  p4raw-id: //depot/perl@14570
* The Malformed UTF-8 Heisenbug seen by Merijn and NickC | Jarkko Hietaniemi | 2002-02-01 | 1 | -1/+5
  I got it in Tru64 + ithreads but only without -g, took some debugging by printf (which was no fun either since adding some debug printfs hid the error)
  p4raw-id: //depot/perl@14511
* Turn the I/O Unicode error by default on, but the | Jarkko Hietaniemi | 2002-01-31 | 1 | -1/+1
  character-generating Unicode error by default off, as Larry suggested.
  p4raw-id: //depot/perl@14505
* EBCDIC fix: t/op/lc.t failures 24-25, 29-30, 34-35, 39-40 | Jarkko Hietaniemi | 2002-01-30 | 1 | -0/+1
  p4raw-id: //depot/perl@14494
* EBCDIC tweaks-- no new test passes, but getting closer. | Jarkko Hietaniemi | 2002-01-29 | 1 | -3/+16
  p4raw-id: //depot/perl@14491
* Copyright++. (Not all the toplevel *.h have one, it seems.) | Jarkko Hietaniemi | 2002-01-23 | 1 | -1/+1
  p4raw-id: //depot/perl@14391
* In dumping use isPRINT() instead of isprint() so that locale | Jarkko Hietaniemi | 2002-01-09 | 1 | -2/+3
  does not come into play.
  p4raw-id: //depot/perl@14146
* Document the flags of pv_uni_display(). | Jarkko Hietaniemi | 2002-01-07 | 1 | -9/+17
  p4raw-id: //depot/perl@14117
* More regex and utf8 debug dumping. | Jarkko Hietaniemi | 2002-01-07 | 1 | -3/+26
  p4raw-id: //depot/perl@14114
* Finish up (ha!) the Unicode case folding; | Jarkko Hietaniemi | 2002-01-05 | 1 | -2/+6
  enhance regex dumping code.
  p4raw-id: //depot/perl@14096