path: root/utf8.c
Commit message | Author | Age | Files | Lines
* Varargs don't always work too well if one puts an unsigned char on the stack and pop an unsigned quad off the stack.
  Subject: Re: [ID 20001103.002] Not OK: perl v5.7.0 +DEVEL7523 on os2-64int-ld-2.30 (UNINSTALLED)
  Message-ID: <pxzB6gzkgKXY092yn@efn.org>
  p4raw-id: //depot/perl@7584
  Author: Yitzchak Scott-Thoennes | Age: 2000-11-07 | Files: 1 | Lines: -1/+1
* printf UVs the correct way, noticed by Robin Barker.
  p4raw-id: //depot/perl@7509
  Author: Jarkko Hietaniemi | Age: 2000-11-01 | Files: 1 | Lines: -3/+3
* UTF-8 decoder tweak.
  p4raw-id: //depot/perl@7481
  Author: Jarkko Hietaniemi | Age: 2000-10-29 | Files: 1 | Lines: -1/+1
* Continue the internal UTF-8 API tweaking.
  Rename utf8_to_uv_chk() back to utf8_to_uv() because it's used much more than the simpler API, now called utf8_to_uv_simple(). Still not quite happy with API, too much partial duplication of functionality.
  p4raw-id: //depot/perl@7439
  Author: Jarkko Hietaniemi | Age: 2000-10-25 | Files: 1 | Lines: -30/+29
* Allow poking holes at the UTF-8 decoding strictness.
  p4raw-id: //depot/perl@7438
  Author: Jarkko Hietaniemi | Age: 2000-10-25 | Files: 1 | Lines: -16/+25
* Rename UTF8LEN() to be UNISKIP(), too confusing to have UTF8LEN() and UTF8SKIP().
  p4raw-id: //depot/perl@7437
  Author: Jarkko Hietaniemi | Age: 2000-10-25 | Files: 1 | Lines: -3/+3
* Fix the bug reported in
  Subject: Encode bug?
  Message-ID: <m3lmveqwh5.fsf@ak-71.mind.de>
  Also make is_utf8_char() stricter.
  p4raw-id: //depot/perl@7425
  Author: Andreas König | Age: 2000-10-24 | Files: 1 | Lines: -10/+28
* Make the UTF-8 decoding stricter and more verbose when malformation happens.
  This involved adding an argument to utf8_to_uv_chk(), which involved changing its prototype, and prefer STRLEN over I32 for the UTF-8 length, which as a domino effect necessitated changing the prototypes of scan_bin(), scan_oct(), scan_hex(), and reg_uni().
  The stricter UTF-8 decoding checking uses Markus Kuhn's UTF-8 Decode Stress Tester from http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
  p4raw-id: //depot/perl@7416
  Author: Jarkko Hietaniemi | Age: 2000-10-24 | Files: 1 | Lines: -48/+119
* Thinko in #7222.
  p4raw-id: //depot/perl@7223
  Author: Jarkko Hietaniemi | Age: 2000-10-13 | Files: 1 | Lines: -1/+1
* Use UTF8SKIP(), from Simon Cozens.
  p4raw-id: //depot/perl@7222
  Author: Jarkko Hietaniemi | Age: 2000-10-13 | Files: 1 | Lines: -7/+1
* The HINT_BYTE patch is apparently unnecessary, retracted.
  p4raw-id: //depot/perl@7156
  Author: Jarkko Hietaniemi | Age: 2000-10-06 | Files: 1 | Lines: -4/+0
* Patch from Peter Prymmer to disable utf8 in EBCDIC platforms.
  p4raw-id: //depot/perl@7152
  Author: Jarkko Hietaniemi | Age: 2000-10-06 | Files: 1 | Lines: -0/+4
* Re-instate Perl_utf8_to_uv without checking parameter - added in change 7075.
  i.e. rename Simon's function to Perl_utf8_to_uv_chk, change all calls to it to use new name and add Perl_utf8_to_uv() as a wrapper which calls it passing 0 to checking to get the warning.
  p4raw-id: //depot/perl@7096
  Author: Nick Ing-Simmons | Age: 2000-09-30 | Files: 1 | Lines: -14/+34
* utf8.c apidoc
  Message-ID: <20000914234657.A13953@deep-dark-truthful-mirror.perlhacker.org>
  p4raw-id: //depot/perl@7087
  Author: Simon Cozens | Age: 2000-09-14 | Files: 1 | Lines: -1/+1
* Replace #7084 with
  Subject: Re: perl@7078
  Message-Id: <200009142109.RAA03425@leggy.zk3.dec.com>
  p4raw-id: //depot/perl@7085
  Author: Spider Boardman | Age: 2000-09-14 | Files: 1 | Lines: -1/+2
* UTF8-encoded version of 256 is 0xc4 0x80; test that a char is convertible to bytes by checking it doesn't go above 0xc3
  Subject: Re: perl@7078
  Message-ID: <20000914205919.A11098@deep-dark-truthful-mirror.perlhacker.org>
  p4raw-id: //depot/perl@7084
  Author: Simon Cozens | Age: 2000-09-14 | Files: 1 | Lines: -2/+1
* Batch of UTF-8 patches from Simon Cozens.
  p4raw-id: //depot/perl@7075
  Author: Jarkko Hietaniemi | Age: 2000-09-14 | Files: 1 | Lines: -8/+37
* Fix for
  Subject: [ID 20000903.001] \w in utf8-strings
  Message-Id: <E13VUS5-0000cv-00.pgcc-forever-2000-09-03-09-44-29@fuji>
  and various related nits.
  p4raw-id: //depot/perl@7030
  Author: Marc Lehmann | Age: 2000-09-07 | Files: 1 | Lines: -1/+4
* small apidoc fix
  Message-ID: <20000903051206.A5909@cerebro.laendle>
  p4raw-id: //depot/perl@7021
  Author: Marc Lehmann | Age: 2000-09-07 | Files: 1 | Lines: -1/+1
* Fix vec() / utf8 (was Re: bitvec ops still broken with utf8 -- or not?)
  Message-Id: <E13Utuf-0004Bw-00@draco.cus.cam.ac.uk>
  p4raw-id: //depot/perl@6988
  Author: Mike Guy | Age: 2000-09-01 | Files: 1 | Lines: -13/+20
* various syntax errors and such (not fixed: comp/require.t#22 coredump on Windows)
  p4raw-id: //depot/perl@6476
  Author: Gurusamy Sarathy | Age: 2000-08-01 | Files: 1 | Lines: -1/+1
* The swallow_bom() saga continues. The #23 of require.t (UTF16-LE) still fails (silently, no output) but the #22 (UTF16-BE) seems to be working now. The root of the failure may be in sv_gets(): is it UTF-16LE-aware, especially when it comes to line endings?
  p4raw-id: //depot/perl@6469
  Author: Jarkko Hietaniemi | Age: 2000-07-31 | Files: 1 | Lines: -22/+18
* Tune the comments and hopefully stop a memory leak.
  p4raw-id: //depot/perl@6463
  Author: Jarkko Hietaniemi | Age: 2000-07-29 | Files: 1 | Lines: -0/+1
* Get UTF16 BOMs working. Patch from
  Subject: Re: [ID 20000719.001] Problem with bleadperl of 7/18/00
  Date: Tue, 25 Jul 2000 12:52:45 +0100
  Message-Id: <E13H3GP-0004MR-00@libra.cus.cam.ac.uk>
  and notes from
  Subject: Re: [ID 20000719.001] Problem with bleadperl of 7/18/00
  From: "M.J.T. Guy" <mjtg@cus.cam.ac.uk>
  Date: Tue, 25 Jul 2000 11:43:25 +0100
  Message-Id: <E13H2BJ-0002nG-00@libra.cus.cam.ac.uk>
  p4raw-id: //depot/perl@6435
  Author: M. J. T. Guy | Age: 2000-07-25 | Files: 1 | Lines: -1/+8
* integrate cfgperl changes#6242..6249 into mainline
  p4raw-link: @6249 on //depot/cfgperl: cab27d238e930b8cddb5b1fb3260355f913b86a6
  p4raw-link: @6242 on //depot/cfgperl: 1e72252ad7b8e23d1a1142285b8aa82986bd2491
  p4raw-id: //depot/perl@6359
  p4raw-integrated: from //depot/cfgperl@6358 'copy in' ext/DynaLoader/DynaLoader_pm.PL (@5953..) t/lib/peek.t (@6086..) t/lib/filefunc.t t/lib/filespec.t (@6230..) pod/perlintern.pod (@6237..) pod/perlapi.pod utf8.c (@6242..)
  p4raw-integrated: from //depot/cfgperl@6249 'copy in' lib/IPC/Open3.pm (@5937..)
  p4raw-integrated: from //depot/cfgperl@6248 'copy in' pod/perlfunc.pod (@6206..)
  p4raw-integrated: from //depot/cfgperl@6247 'ignore' lib/File/Spec.pm (@6230..)
  p4raw-integrated: from //depot/cfgperl@6244 'copy in' gv.c (@6217..) 'merge in' sv.c (@6196..)
  p4raw-integrated: from //depot/cfgperl@6243 'copy in' pp_proto.h (@6237..) 'ignore' embedvar.h perlapi.h (@6237..) 'merge in' embed.h objXSUB.h (@6237..) embed.pl perlapi.c proto.h (@6242..)
  Author: Gurusamy Sarathy | Age: 2000-07-11 | Files: 1 | Lines: -5/+36
* integrate cfgperl changes#6231..6240 into mainline
  p4raw-link: @6240 on //depot/cfgperl: 514e70b26394e6b272960ab8b9b8b7dbb1e2c068
  p4raw-link: @6231 on //depot/cfgperl: 7906debc4b99f108310cdade6e486754c15481e7
  p4raw-id: //depot/perl@6355
  p4raw-branched: from //depot/cfgperl@6353 'branch in' pod/perlutil.pod
  p4raw-integrated: from //depot/cfgperl@6353 'copy in' pod/roffitall (@5753..) op.h (@5833..) README.cygwin (@6096..) lib/ExtUtils/MM_VMS.pm (@6140..) lib/File/Find.pm (@6156..) Configure config_h.SH hints/solaris_2.sh (@6217..) Todo-5.6 (@6232..) keywords.h lib/warnings.pm opcode.h opnames.h pp.sym regnodes.h warnings.h (@6236..) 'ignore' ext/B/B/Asmdata.pm ext/ByteLoader/byterun.c ext/ByteLoader/byterun.h (@6236..)
  p4raw-integrated: from //depot/cfgperl@6240 'copy in' utils/h2xs.PL (@6192..)
  p4raw-integrated: from //depot/cfgperl@6238 'merge in' vms/vms.c (@6198..)
  p4raw-integrated: from //depot/cfgperl@6237 'copy in' utf8.c (@6221..) pod/perlapi.pod pod/perlintern.pod pp_proto.h (@6236..) 'ignore' embedvar.h perlapi.h (@6236..) 'merge in' embed.pl (@6225..) embed.h objXSUB.h perlapi.c proto.h (@6236..)
  p4raw-integrated: from //depot/cfgperl@6232 'copy in' pod/Makefile pod/perltoc.pod (@6161..) MANIFEST (@6227..)
  Author: Gurusamy Sarathy | Age: 2000-07-11 | Files: 1 | Lines: -2/+3
* integrate cfgperl changes#6220..6222 into mainline
  p4raw-link: @6222 on //depot/cfgperl: cb6e01d9fd93f1025bb60ed9c000931b2c8542a3
  p4raw-link: @6220 on //depot/cfgperl: 94414bfbc497e71da32f6edca513d34725e3cae6
  p4raw-id: //depot/perl@6350
  p4raw-integrated: from //depot/cfgperl@6349 'copy in' lib/Pod/Usage.pm (@5717..) win32/win32.h (@6026..) pod/perlop.pod (@6206..)
  p4raw-integrated: from //depot/cfgperl@6221 'copy in' utf8.c (@6174..) doop.c (@6193..) toke.c (@6196..) 'merge in' embed.pl (@6217..)
  p4raw-integrated: from //depot/cfgperl@6220 'merge in' makedef.pl (@6156..)
  Author: Gurusamy Sarathy | Age: 2000-07-11 | Files: 1 | Lines: -0/+66
* microperl changes from Simon Cozens; Makefile for microperl written from scratch; few casts added as microperl compilation doesn't have all prototypes available.
  p4raw-id: //depot/cfgperl@6174
  Author: Jarkko Hietaniemi | Age: 2000-05-31 | Files: 1 | Lines: -1/+1
* make the is_utf8_*() safe for use on invalid utf8 (they now return false on such input instead of emitting warnings)
  p4raw-id: //depot/perl@5700
  Author: Gurusamy Sarathy | Age: 2000-03-13 | Files: 1 | Lines: -0/+61
* demand-load utf8.pm in swash routines
  p4raw-id: //depot/perl@5622
  Author: Gurusamy Sarathy | Age: 2000-03-09 | Files: 1 | Lines: -0/+7
* allocate sufficient buffer sizes for 64-bit wide utf8 characters permitted by change#5011 (from Gisle Aas)
  p4raw-link: @5011 on //depot/perl: 3c77ea2bace63b1ad27d15a6366cb938bdd158cb
  p4raw-id: //depot/perl@5136
  Author: Gurusamy Sarathy | Age: 2000-02-19 | Files: 1 | Lines: -17/+17
* allow 64-bit utf8-encoded integers (from Ilya Zakharevich)
  p4raw-id: //depot/perl@5011
  Author: Gurusamy Sarathy | Age: 2000-02-07 | Files: 1 | Lines: -2/+7
* set SvUTF8 on vectors only if there are chars > 127; update copyright years (from Gisle Aas)
  p4raw-id: //depot/perl@5009
  Author: Gurusamy Sarathy | Age: 2000-02-06 | Files: 1 | Lines: -1/+1
* Continue qgcvt work; closer now but not yet there.
  p4raw-id: //depot/cfgperl@4806
  Author: Jarkko Hietaniemi | Age: 2000-01-16 | Files: 1 | Lines: -1/+1
* uv_to_utf8() could lose 37th bit on HAS_QUAD platforms
  p4raw-id: //depot/perl@4698
  Author: Gurusamy Sarathy | Age: 1999-12-20 | Files: 1 | Lines: -1/+1
* Turn on largefileness always if available and continue 64-bit fixes.
  p4raw-id: //depot/cfgperl@4552
  Author: Jarkko Hietaniemi | Age: 1999-11-11 | Files: 1 | Lines: -2/+2
* sundry cleanups for clean build on windows
  p4raw-id: //depot/perl@3659
  Author: Gurusamy Sarathy | Age: 1999-07-08 | Files: 1 | Lines: -0/+28
* fixes for logical bugs in the lexwarn patch; other tweaks to avoid type mismatch problems
  p4raw-id: //depot/perl@3658
  Author: Gurusamy Sarathy | Age: 1999-07-08 | Files: 1 | Lines: -1/+1
*   Integrate with Sarathy; one conflict in t/pragma/warn/recgomp resolved manually.
|\  p4raw-id: //depot/cfgperl@3648
| | Author: Jarkko Hietaniemi | Age: 1999-07-07 | Files: 1 | Lines: -3/+9
| * lexical warnings update (warning.t fails one test due to leaked scalar, investigation pending)
| | Message-ID: <5104D4DBC598D211B5FE0000F8FE7EB29C6C8E@mbtlipnt02.btlabs.bt.co.uk>
| | Subject: [PATCH 5.005_57] Lexical Warnings - mandatory warning are now default warnings
| | p4raw-id: //depot/perl@3640
| | Author: Paul Marquess | Age: 1999-07-07 | Files: 1 | Lines: -3/+9
* | POSIX [[:character class:]] support for standard, locale, and utf8. If both utf8 and locale are on, utf8 wins. I don't fully understand why so many tables changed in lib/unicode because of "make" -- maybe it was just overdue.
|/  p4raw-id: //depot/cfgperl@3624
|   Author: Jarkko Hietaniemi | Age: 1999-07-06 | Files: 1 | Lines: -1/+111
* more complete support for implicit thread/interpreter pointer, enabled via -DPERL_IMPLICIT_CONTEXT (all changes are noops without that enabled):
  - USE_THREADS now enables PERL_IMPLICIT_CONTEXT, so dTHR is a noop; tests pass on Solaris; should be faster now!
  - MULTIPLICITY has been tested with and without PERL_IMPLICIT_CONTEXT on Solaris
  - improved function database now merged with embed.pl
  - everything except the varargs functions have foo(a,b,c) macros to provide compatibility
  - varargs functions default to compatibility variants that get the context pointer using dTHX
  - there should be almost no source compatibility issues as a result of all this
  - dl_foo.xs changes other than dl_dlopen.xs untested
  - still needs documentation, fixups for win32 etc
  Next step: migrate most non-mutex variables from perlvars.h to intrpvar.h
  p4raw-id: //depot/perl@3524
  Author: Gurusamy Sarathy | Age: 1999-06-09 | Files: 1 | Lines: -6/+6
* initial stub implementation of implicit thread/this pointer argument; builds/tests on Solaris, win32 hasn't been fixed up yet; proto.h, global.sym and static function decls are now generated from a common database in proto.pl; some inconsistently named perl_foo() things are now Perl_foo(), compatibility #defines provided; perl_foo() (lowercase 'p') reserved for functions that take an explicit context argument; next step: generate #define foo(a,b) Perl_foo(aTHX_ a,b)
  p4raw-id: //depot/perl@3522
  Author: Gurusamy Sarathy | Age: 1999-06-07 | Files: 1 | Lines: -44/+45
* update copyright years
  p4raw-id: //depot/perl@3124
  Author: Gurusamy Sarathy | Age: 1999-03-22 | Files: 1 | Lines: -1/+1
* fix globals caught by change#1927; builds and tests on Solaris
  p4raw-link: @1927 on //depot/perl: eb07465ebe1238598e948058857ec948c6697f86
  p4raw-id: //depot/perl@1936
  Author: Gurusamy Sarathy | Age: 1998-10-06 | Files: 1 | Lines: -20/+20
* fix mismatched UV/U32 types for to_utf8_*()
  p4raw-id: //depot/perl@1805
  Author: Gurusamy Sarathy | Age: 1998-09-23 | Files: 1 | Lines: -3/+3
* various tweaks: fix signed vs. unsigned problems that prevented C++ builds; add sundry PERL_OBJECT scaffolding to get it to build; fix lexical warning testsuite for win32
  p4raw-id: //depot/perl@1777
  Author: Gurusamy Sarathy | Age: 1998-08-10 | Files: 1 | Lines: -34/+34
* Here are the long-expected Unicode/UTF-8 modifications.
  p4raw-id: //depot/utfperl@1651
  Author: Larry Wall | Age: 1998-07-24 | Files: 1 | Lines: -0/+638
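
The 2000-09-14 entry from Simon Cozens states that the UTF-8 encoding of 256 is 0xc4 0x80, so a character fits in a single byte only if its lead byte does not exceed 0xc3. The following is a minimal sketch of that arithmetic in standard C, not the code from utf8.c: the function names encode_utf8_small(), lead_byte_fits_in_byte(), and check_boundary() are hypothetical, and the lead-byte test assumes well-formed UTF-8 with the pointer positioned at the start of a character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode a code point below 0x800 as UTF-8.
 * Returns the number of bytes written (1 or 2), or 0 if cp is out of range. */
static size_t encode_utf8_small(uint32_t cp, unsigned char out[2])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));   /* top 5 bits */
        out[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* low 6 bits */
        return 2;
    }
    return 0;
}

/* Hypothetical helper: given the first byte of a well-formed UTF-8
 * character, report whether the character's value is below 256.
 * Assumes `lead` really is a start byte, not a continuation byte. */
static int lead_byte_fits_in_byte(unsigned char lead)
{
    return lead <= 0xC3;
}

/* Check the boundary described in the commit message: 255 vs. 256. */
static void check_boundary(void)
{
    unsigned char buf[2];

    assert(encode_utf8_small(255, buf) == 2);
    assert(buf[0] == 0xC3 && buf[1] == 0xBF);
    assert(lead_byte_fits_in_byte(buf[0]));

    assert(encode_utf8_small(256, buf) == 2);
    assert(buf[0] == 0xC4 && buf[1] == 0x80);
    assert(!lead_byte_fits_in_byte(buf[0]));
}
```

Calling check_boundary() from a test harness exercises both sides of the boundary: code point 255 encodes with lead byte 0xc3, and 256 is the first value whose lead byte is 0xc4.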