path: root/lib/utf8_heavy.pl
Commit message | Author | Age | Files | Lines
* Clarification and cleanup of the XS SWASHGET code | SADAHIRO Tomoyuki | 2005-12-05 | 1 | -19/+33
  Subject: Re: XS-assisted SWASHGET (esp. for t/uni/class.t speedup)
  Message-Id: <20051204162508.D726.BQW10602@nifty.com>
  p4raw-id: //depot/perl@26255
* Re: XS-assisted SWASHGET (esp. for t/uni/class.t speedup) | SADAHIRO Tomoyuki | 2005-11-30 | 1 | -7/+1
  Message-Id: <20051127170016.A786.BQW10602@nifty.com>
  p4raw-id: //depot/perl@26229
* XS-assisted SWASHGET (esp. for t/uni/class.t speedup) | SADAHIRO Tomoyuki | 2005-11-23 | 1 | -138/+3
  Message-Id: <20051123175603.FFD5.BQW10602@nifty.com>
  And : Message-Id: <20051123202935.4D9D.BQW10602@nifty.com>
  with some nits to use U8 instead of char more consistently
  p4raw-id: //depot/perl@26199
* replace the run time code in lib/utf8_pva.pl with data generated | Nicholas Clark | 2004-05-31 | 1 | -1/+1
  at build by mktables, stored in lib/unicore/PVA.pl
  p4raw-id: //depot/perl@22881
* Don't need to require utf8_pva.pl at top of file | Nicholas Clark | 2004-05-31 | 1 | -2/+1
  p4raw-id: //depot/perl@22880
* candidate for TR18 compliance | Jeff Pinyan | 2004-04-27 | 1 | -28/+54
  Date: Thu, 22 Apr 2004 14:31:30 -0400 (EDT)
  Message-ID: <Pine.LNX.4.44.0404221429040.10466-101000@perlmonk.org>
  Date: Mon, 26 Apr 2004 12:37:21 -0400 (EDT)
  Message-ID: <Pine.LNX.4.44.0404261222320.7154-400000@perlmonk.org>
  p4raw-id: //depot/perl@22744
* lib/utf8_heavy.pl -- cascading classes and '&' support | Jeff Pinyan | 2004-04-14 | 1 | -4/+20
  Message-ID: <Pine.LNX.4.44.0404122011160.3038-200000@perlmonk.org>
  p4raw-id: //depot/perl@22693
* For characters beyond the BMP the $bits will be undef, | Jarkko Hietaniemi | 2003-06-22 | 1 | -1/+1
  which will cause utf8_heavy.pl noise (reported by Daniel Yacob, analysis and fix from SADAHIRO Tomoyuki)
  p4raw-id: //depot/perl@19835
* Integrate from the maint-5.8/ branch : | Rafael Garcia-Suarez | 2002-12-10 | 1 | -15/+41
  changes 18219, 18236, 18242-3, 18247-8, 18253-5, 18257, 18273-6
  p4raw-id: //depot/perl@18280
  p4raw-branched: from //depot/maint-5.8/perl@18279 'branch in' t/op/lc_user.t
  p4raw-integrated: from //depot/maint-5.8/perl@18279 'copy in' lib/File/Copy.pm (@17645..) lib/utf8_heavy.pl pod/perlsec.pod (@18080..) hints/irix_6.sh (@18173..) t/uni/tr_utf8.t (@18197..) pod/perlunicode.pod (@18242..) t/op/pat.t (@18248..) t/op/split.t (@18274..) 'edit in' pod/perlguts.pod (@18242..) 'merge in' pp.c (@18126..) MANIFEST (@18234..)
  p4raw-integrated: from //depot/maint-5.8/perl@18254 'merge in' pod/perldiag.pod (@18234..)
* Re: [perl #17951] Strange UTF error | Jarkko Hietaniemi | 2002-10-20 | 1 | -2/+4
  Message-ID: <20021016155051.GB268437@lyta.hut.fi>
  p4raw-id: //depot/perl@18035
* perl #17453 | Jarkko Hietaniemi | 2002-09-26 | 1 | -15/+14
  Message-ID: <20020920142245.GG280265@lyta.hut.fi>
  p4raw-id: //depot/perl@17933
* Integrate #16353 from macperl; | Jarkko Hietaniemi | 2002-05-02 | 1 | -2/+2
  "fix" for utf8_heavy.pl, lexical UTF8 var crashed in test 92 of run/fresh_perl.t on MacOS
  (as pudge rightfully points out, this is voodoo programming at it best, the real bug is somewhere else, now we just happened to shake the chicken the right way)
  p4raw-id: //depot/perl@16355
  p4raw-integrated: from //depot/macperl@16354 'merge in' lib/utf8_heavy.pl (@16123..)
* Re: Encode, charnames and utf8heavy | Dan Kogai | 2002-05-02 | 1 | -1/+1
  Message-Id: <539D985A-5D1A-11D6-BB19-00039301D480@dan.co.jp>
  (plus a respective perlunicode tweak)
  p4raw-id: //depot/perl@16354
* Make writing user-defined character properties nicer. | Jarkko Hietaniemi | 2002-04-21 | 1 | -1/+7
  p4raw-id: //depot/perl@16054
* User-defined character properties were unintentionally | Jarkko Hietaniemi | 2002-04-20 | 1 | -13/+36
  removed, noticed by Dan Kogai.
  p4raw-id: //depot/perl@16012
* A little bit better error message for \pq, still | Jarkko Hietaniemi | 2002-03-28 | 1 | -1/+3
  not good because the script context is not shown.
  p4raw-id: //depot/perl@15581
* Jeffrey's Unicode adventure continues: unify the In/*.pl | Jarkko Hietaniemi | 2002-01-16 | 1 | -131/+58
  and Is/*.pl to lib/*.pl, remove In.pl and Is.pl, introduce Canonical.pl and Exact.pl.
  p4raw-id: //depot/perl@14294
* Additional utf8_heavy.pl tweak from Jeffrey. | Jarkko Hietaniemi | 2002-01-15 | 1 | -4/+11
  p4raw-id: //depot/perl@14272
* Big mktables rewrite from Jeffrey; | Jarkko Hietaniemi | 2002-01-14 | 1 | -84/+170
  documentation not yet updated.
  p4raw-id: //depot/perl@14254
* Future-proofing from Jeffrey Friedl (for conflicting | Jarkko Hietaniemi | 2002-01-13 | 1 | -2/+2
  In* and Is* names).
  p4raw-id: //depot/perl@14242
* RESENT - [PATCH] utf8_heavy.pl | Jeffrey Friedl | 2001-12-16 | 1 | -2/+2
  Message-Id: <200112160355.fBG3t1t84835@ventrue.corp.yahoo.com>
  p4raw-id: //depot/perl@13710
* Support \p{All}, \p{IsAssigned}, \p{IsUnassigned}. | Jarkko Hietaniemi | 2001-12-15 | 1 | -0/+2
  p4raw-id: //depot/perl@13706
* Unicode categories continue: | Jarkko Hietaniemi | 2001-10-19 | 1 | -6/+13
  implement Category=, Script=, Block= (these are based on an upcoming update of TR#18)
  Fix a bug where we got two In categories named "old italic", and another where shortcut for the Is categories wasn't taken.
  p4raw-id: //depot/perl@12500
* Document the problem with the swash_fetch() API that affects | Jarkko Hietaniemi | 2001-10-16 | 1 | -0/+1
  more complex case conversions.
  p4raw-id: //depot/perl@12450
* Rewrite mktables from scratch. | Jarkko Hietaniemi | 2001-10-13 | 1 | -29/+64
  - Cleaner.
  - Faster: 15-20 seconds as opposed to several minutes.
  - More dynamic: the names of the various categories such as the linebreak ones are dynamic, not static.
  - Is.pl: long names for the general category properties are now available.
  - Ranges (<... ,First>, <..., Last>) from the general categories work now.
  - No more mktables.PL because the mktables.PL is not and never has been run to create a mktables.
  - syllables.txt and Is/Syl*.pl removed: non-standard (not part of the Unicode), and the whole concept is being reworked (http://syllabary.sourceforge.net/), the old way wouldn't even work with the new Syllables.txt (it would result in 1000+ new categories)
  p4raw-id: //depot/perl@12427
* Enable more debugging. | Jarkko Hietaniemi | 2001-10-09 | 1 | -5/+5
  p4raw-id: //depot/perl@12373
* Unicode properties saga continues. | Jarkko Hietaniemi | 2001-10-04 | 1 | -1/+1
  p4raw-id: //depot/perl@12335
* Yet more Unicode properties. | Jarkko Hietaniemi | 2001-10-04 | 1 | -3/+4
  p4raw-id: //depot/perl@12334
* Unicode properties: fix L& (the #12319 didn't allow L&, | Jarkko Hietaniemi | 2001-10-03 | 1 | -2/+2
  only IsL&) and Inherited (negative lookahead good); add tests for Common, Inherited, and L&.
  p4raw-id: //depot/perl@12320
* Unicode properties: support \p{(?:Is)?L&} as an alias for \pL. | Jarkko Hietaniemi | 2001-10-03 | 1 | -7/+8
  (The Unicode standard uses L& quite often.)
  p4raw-id: //depot/perl@12319
* Further tweaks to the Unicode properties. | Jarkko Hietaniemi | 2001-10-01 | 1 | -0/+1
  p4raw-id: //depot/perl@12286
* Cleanup utf8_heavy; allow dropping the In prefix from | Jarkko Hietaniemi | 2001-09-30 | 1 | -41/+43
  Unicode script/block properties.
  p4raw-id: //depot/perl@12281
* #12272 wasn't right, it introduced an extra (). | Jarkko Hietaniemi | 2001-09-30 | 1 | -1/+1
  p4raw-id: //depot/perl@12278
* Nasty recursion trap if one would match Unicode. | Jarkko Hietaniemi | 2001-09-29 | 1 | -1/+1
  p4raw-id: //depot/perl@12272
* More leniency to the \p and \P: now can have whitespace | Jarkko Hietaniemi | 2001-09-29 | 1 | -1/+1
  between the property definition and the curlies; now can invert the property by having a caret between the open curly and the property.
  p4raw-id: //depot/perl@12269
* Allow for more flexibility in the \p{In...} names, now | Jarkko Hietaniemi | 2001-09-29 | 1 | -4/+13
  case doesn't matter, and any space or dash can be matched by any space, dash, underbar, or empty. (may be going too far on leniency)
  p4raw-id: //depot/perl@12264
* Rename lib/unicode files to lib/unicore to avoid | Jarkko Hietaniemi | 2001-08-09 | 1 | -1/+1
  conflicts between core lib/unicode and Unicode:: files in case-ignoring filesystems.
  p4raw-id: //depot/perl@11623
* More \p{In...} testing, combined with \N{...}. | Jarkko Hietaniemi | 2001-06-08 | 1 | -2/+3
  p4raw-id: //depot/perl@10481
* Salvage bits and pieces from the experimental 'utf8 everywhere' | Jarkko Hietaniemi | 2001-05-31 | 1 | -1/+3
  patch: rename HINT_BYTE and IN_BYTE to HINT_BYTES and IN_BYTES to match the pragma name; various robustness cleanups.
  p4raw-id: //depot/perl@10339
* Re: [ID 20010528.004] dual bug under utf8: $@ has UTF8 flag and \s+ does not match | Hugo van der Sanden | 2001-05-30 | 1 | -2/+0
  Message-Id: <200105301059.LAA03182@crypt.compulink.co.uk>
  localizing $@ has unfortunate semantics - if you die past a local $@, the die message is lost.
  p4raw-id: //depot/perl@10310
* Additional safeguard against $@ getting trampled; idea from Hugo. | Jarkko Hietaniemi | 2001-05-29 | 1 | -5/+10
  p4raw-id: //depot/perl@10279
* At least a partial fix for 20010528.004. | Jarkko Hietaniemi | 2001-05-29 | 1 | -1/+1
  p4raw-id: //depot/perl@10277
* Explain the \p{} and \P{} error message better and | Jarkko Hietaniemi | 2001-04-28 | 1 | -1/+1
  have prettier prettyprint in In.pl.
  p4raw-id: //depot/perl@9899
* Add a level of indirection to the implementation of \p{InFoo} | Jarkko Hietaniemi | 2001-04-28 | 1 | -1/+8
  so that we don't have to have long filenames. (Nothing changes in the user interface.) The indirection is defined in the file lib/unicode/In.pl and it is handled in lib/utf8_heavy.pl. Also rename some the character classes by removing '-' from the classnames, and finally renamed Block.pl as Blocks.pl.
  p4raw-id: //depot/perl@9897
* use warnings rather than fiddling with $^W (from Paul Marquess) | Gurusamy Sarathy | 2000-02-02 | 1 | -1/+1
  p4raw-id: //depot/perl@4954
* tr///d does not seem to work | Larry Wall | 1998-10-23 | 1 | -2/+1
  p4raw-id: //depot/perl@2039
* fix intolerance of SWASHes for blank lines | Gisle Aas | 1998-08-08 | 1 | -33/+31
  Message-ID: <m3emutkdeu.fsf@furu.g.aas.no>
  Subject: Re: Re[2]: another joyride begins
  p4raw-id: //depot/perl@1767
* kill bogus warning from -we 'use utf8; $_="\x{FF}"' | Gisle Aas | 1998-08-08 | 1 | -3/+3
  Message-ID: <m3yat4sbys.fsf@furu.g.aas.no>
  Subject: Re: another joyride begins
  p4raw-id: //depot/perl@1765
* Here are the long-expected Unicode/UTF-8 modifications. | Larry Wall | 1998-07-24 | 1 | -0/+224
  p4raw-id: //depot/utfperl@1651