path: root/utf8.c
Commit message | Author | Date | Files | Lines (-/+)
* [PATCH] bug in utf8.c(?) | Marty Pauley | 2002-09-26 | 1 | -0/+1
  Subject: [PATCH] bug in utf8.c(?) p4raw-id: //depot/perl@17928
* Small speedup by inlining the easy bits of is_utf8_char() | Jarkko Hietaniemi | 2002-07-01 | 1 | -3/+11
  into is_utf8_string(). p4raw-id: //depot/perl@17392
* good day for WinCE port of perl. | Vadim Konovalov | 2002-05-16 | 1 | -9/+0
  Message-ID: <001301c1fc68$e808e560$a95cc3d9@vad> p4raw-id: //depot/perl@16628
* WinCE several touches | Vadim Konovalov | 2002-05-13 | 1 | -1/+1
  Message-ID: <007b01c1fabe$cc8cbbf0$785cc3d9@vad> p4raw-id: //depot/perl@16582
* WinCE many fixes | Vadim Konovalov | 2002-04-28 | 1 | -0/+9
  Message-ID: <00bf01c1eedd$c0c62a00$d25cc3d9@vad> p4raw-id: //depot/perl@16251
* fixes for all the warnings reported by Visual C (most of this | Gurusamy Sarathy | 2002-04-21 | 1 | -82/+82
  change is from change#12026) p4raw-link: @12026 on //depot/maint-5.6/perl: ff42b73b40f5a895aef4bed81c794f468e0609bc p4raw-id: //depot/perl@16048
* my $utf8here, our $utf8here, and package variable $utf8here. | Jarkko Hietaniemi | 2002-04-16 | 1 | -8/+27
  The actual minimal fix is in utf8.c and from NI-S, the rest are the tests (in fresh_perl since I couldn't get them easily to work elsewhere) and a slight behaviour change: previously UTF-8 identifiers had to start with an alphabetic character. No more so, now they can start with an (Unicode) ID_Continue character (which however is not a (Unicode) digit). (Limiting the first character to ID_Start would be rather restrictive, since ID_Start allows only alphabetic letters.) TODO: use vars qw($utf8here). This I don't find to be a showstopper. p4raw-id: //depot/perl@15943
* Re: Change 15762: As noted by Philip Newton: nothing wrong with BOM, | Philip Newton | 2002-04-07 | 1 | -1/+1
  Message-ID: <1dnvau4j684hke2igk990f01nit8r2811s@4ax.com> p4raw-id: //depot/perl@15777
* As noted by Philip Newton: nothing wrong with BOM, | Jarkko Hietaniemi | 2002-04-06 | 1 | -14/+3
  but 0xFFFE quite wrong. p4raw-id: //depot/perl@15762
* What started as a small nit (the charnames test, nit found | Jarkko Hietaniemi | 2002-04-02 | 1 | -6/+10
  by Hugo), ballooned a bit... the goal is Larry's wish that illegal Unicode (such as U+FFFF) by default doesn't warn, since what if somebody WANTS to create illegal Unicode? Now getting close to this in the regex runtime. (Also, fix more of my fixation that BOM would be U+FFFE.) p4raw-id: //depot/perl@15689
* A little bit better error message for \pq, still | Jarkko Hietaniemi | 2002-03-28 | 1 | -1/+5
  not good because the script context is not shown. p4raw-id: //depot/perl@15581
* Warn instead of croak. | Jarkko Hietaniemi | 2002-03-27 | 1 | -8/+42
  p4raw-id: //depot/perl@15556
* B::perlstring and unicode | Rafael Garcia-Suarez | 2002-03-18 | 1 | -1/+1
  Message-ID: <20020318231431.A699@rafael> p4raw-id: //depot/perl@15308
* more warnings tidyup | Paul Marquess | 2002-03-11 | 1 | -4/+4
  From: "Paul Marquess" <paul_marquess@yahoo.co.uk> Message-ID: <AIEAJICLCBDNAAOLLOKLMEEGDPAA.paul_marquess@yahoo.co.uk> p4raw-id: //depot/perl@15155
* EBCDIC: this seems to calm the last of the | Jarkko Hietaniemi | 2002-02-24 | 1 | -12/+8
  Malformed UTF-8 warnings. p4raw-id: //depot/perl@14850
* In EBCDIC the UNI_TO_NATIVE() macro evaluates its argument | Jarkko Hietaniemi | 2002-02-20 | 1 | -2/+4
  twice, causing the loop to skip every other character. p4raw-id: //depot/perl@14800
* Misplaced block end. | Jarkko Hietaniemi | 2002-02-19 | 1 | -1/+1
  p4raw-id: //depot/perl@14766
* Oops. | Jarkko Hietaniemi | 2002-02-19 | 1 | -1/+1
  p4raw-id: //depot/perl@14762
* Try special casing first. | Jarkko Hietaniemi | 2002-02-19 | 1 | -27/+33
  p4raw-id: //depot/perl@14759
* Unused in ASCII, used in EBCDIC. | Jarkko Hietaniemi | 2002-02-18 | 1 | -2/+1
  p4raw-id: //depot/perl@14747
* EBCDIC: now the worst seems to be over for | Jarkko Hietaniemi | 2002-02-18 | 1 | -12/+8
  the "Malformed" warnings. Still a few of them, and plenty of test failures, but getting better. p4raw-id: //depot/perl@14739
* After much rewriting we are now pretty much | Jarkko Hietaniemi | 2002-02-18 | 1 | -44/+41
  back to where we started. p4raw-id: //depot/perl@14737
* Clearing up to_utf8_case() continues: this time use | Jarkko Hietaniemi | 2002-02-17 | 1 | -56/+58
  a single return, and EBCDICification for all paths. p4raw-id: //depot/perl@14734
* Tiny tweak. | Jarkko Hietaniemi | 2002-02-17 | 1 | -6/+2
  p4raw-id: //depot/perl@14732
* Redundant casts. | Jarkko Hietaniemi | 2002-02-17 | 1 | -18/+18
  p4raw-id: //depot/perl@14731
* EBCDIC: to_utf8_case() is supposed to get its low 256 | Jarkko Hietaniemi | 2002-02-17 | 1 | -2/+2
  input in native code points, not Unicode. p4raw-id: //depot/perl@14726
* The #14715 and #14716 were okay: they just revealed | Jarkko Hietaniemi | 2002-02-17 | 1 | -14/+30
  a bug in the EXACTF matching. p4raw-id: //depot/perl@14724
* Retreat, retreat! (retract #14715 and #14716) | Jarkko Hietaniemi | 2002-02-16 | 1 | -26/+14
  p4raw-id: //depot/perl@14723
* Tiny tweak. | Jarkko Hietaniemi | 2002-02-16 | 1 | -4/+5
  p4raw-id: //depot/perl@14716
* Restructure to_utf8_case() for simpler execution paths. | Jarkko Hietaniemi | 2002-02-16 | 1 | -14/+25
  p4raw-id: //depot/perl@14715
* Excise inexact blather. | Jarkko Hietaniemi | 2002-02-14 | 1 | -5/+0
  p4raw-id: //depot/perl@14687
* Iteration continues. | Jarkko Hietaniemi | 2002-02-13 | 1 | -17/+20
  p4raw-id: //depot/perl@14669
* Rewrite the "special mapping" part of to_utf8_case(), | Jarkko Hietaniemi | 2002-02-13 | 1 | -26/+46
  this time with fewer bugs. (See: The Law of Cybernetic Entomology.) p4raw-id: //depot/perl@14664
* EBCDIC: another "can't happen". | Jarkko Hietaniemi | 2002-02-12 | 1 | -1/+5
  p4raw-id: //depot/perl@14660
* format problem | Robin Barker | 2002-02-06 | 1 | -1/+1
  Message-Id: <200202061401.OAA25053@tempest.npl.co.uk> p4raw-id: //depot/perl@14570
* The Malformed UTF-8 Heisenbug seen by Merijn and NickC | Jarkko Hietaniemi | 2002-02-01 | 1 | -1/+5
  I got it in Tru64 + ithreads but only without -g, took some debugging by printf (which was no fun either since adding some debug printfs hid the error) p4raw-id: //depot/perl@14511
* Turn the I/O Unicode error by default on, but the | Jarkko Hietaniemi | 2002-01-31 | 1 | -1/+1
  character-generating Unicode error by default off, as Larry suggested. p4raw-id: //depot/perl@14505
* EBCDIC fix: t/op/lc.t failures 24-25, 29-30, 34-35, 39-40 | Jarkko Hietaniemi | 2002-01-30 | 1 | -0/+1
  p4raw-id: //depot/perl@14494
* EBCDIC tweaks-- no new test passes, but getting closer. | Jarkko Hietaniemi | 2002-01-29 | 1 | -3/+16
  p4raw-id: //depot/perl@14491
* Copyright++. (Not all the toplevel *.h have one, it seems.) | Jarkko Hietaniemi | 2002-01-23 | 1 | -1/+1
  p4raw-id: //depot/perl@14391
* In dumping use isPRINT() instead of isprint() so that locale | Jarkko Hietaniemi | 2002-01-09 | 1 | -2/+3
  does not come into play. p4raw-id: //depot/perl@14146
* Document the flags of pv_uni_display(). | Jarkko Hietaniemi | 2002-01-07 | 1 | -9/+17
  p4raw-id: //depot/perl@14117
* More regex and utf8 debug dumping. | Jarkko Hietaniemi | 2002-01-07 | 1 | -3/+26
  p4raw-id: //depot/perl@14114
* Finish up (ha!) the Unicode case folding; | Jarkko Hietaniemi | 2002-01-05 | 1 | -2/+6
  enhance regex dumping code. p4raw-id: //depot/perl@14096
* Missed the =head1 additions. | Jarkko Hietaniemi | 2002-01-03 | 1 | -2/+2
  p4raw-id: //depot/perl@14041
* One more iteration of the ibcmp_utf8() interface, | Jarkko Hietaniemi | 2002-01-02 | 1 | -39/+55
  hopefully this is a convergent iteration... p4raw-id: //depot/perl@14014
* Make ibcmp_utf8() optionally progress in either string for | Jarkko Hietaniemi | 2002-01-02 | 1 | -8/+33
  as long as it takes and optionally record how far it got. p4raw-id: //depot/perl@14010
* -Wall silencing. | Jarkko Hietaniemi | 2002-01-02 | 1 | -2/+2
  p4raw-id: //depot/perl@14008
* Make ibcmp_utf8() more robust and make regmatch() use it. | Jarkko Hietaniemi | 2002-01-01 | 1 | -30/+36
  p4raw-id: //depot/perl@14005
* Document the to_utf8_*() functions. | Jarkko Hietaniemi | 2002-01-01 | 1 | -1/+57
  p4raw-id: //depot/perl@14002
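
Note on the 2002-02-20 entry (p4raw-id //depot/perl@14800): the message says that in EBCDIC the UNI_TO_NATIVE() macro evaluated its argument twice, so a loop skipped every other character. The log does not show the macro or the patch itself, so the following is only a minimal sketch of that class of bug. DEMO_TO_NATIVE, demo_map, copy_buggy and copy_fixed are hypothetical names invented for illustration and are not perl's definitions. The remedy shown (read into a temporary first) is a common one and is assumed here, not confirmed by the log.

```c
#include <stddef.h>

/* Hypothetical placeholder for a native-code-point lookup; it returns its
 * input unchanged. It is not perl's EBCDIC table. */
static unsigned demo_map(unsigned c)
{
    return c;
}

/* Hypothetical macro that mentions its argument twice, standing in for the
 * kind of macro the log entry describes. */
#define DEMO_TO_NATIVE(ch) ((ch) > 255 ? (ch) : demo_map(ch))

/* Buggy loop: the side effect *p++ is expanded twice (once in the test, once
 * in the selected branch), so each pass consumes two bytes and stores only
 * the second. len must be even here, otherwise the second read would run
 * past the end of the buffer. */
size_t copy_buggy(const unsigned char *s, size_t len, unsigned *out)
{
    const unsigned char *p = s;
    const unsigned char *end = s + len;
    size_t n = 0;

    while (p < end)
        out[n++] = DEMO_TO_NATIVE(*p++);
    return n;
}

/* Assumed remedy: advance the pointer once, into a temporary, and pass the
 * side-effect-free temporary to the macro. */
size_t copy_fixed(const unsigned char *s, size_t len, unsigned *out)
{
    const unsigned char *p = s;
    const unsigned char *end = s + len;
    size_t n = 0;

    while (p < end) {
        unsigned ch = *p++;
        out[n++] = DEMO_TO_NATIVE(ch);
    }
    return n;
}
```

With the four-byte input "abcd", copy_buggy() stores only 'b' and 'd' and returns 2, while copy_fixed() stores all four bytes and returns 4.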