path: root/lib/unicode
Commit message | Author | Age | Files | Lines
* Rename lib/unicode files to lib/unicore to avoid | Jarkko Hietaniemi | 2001-08-09 | 323 | -103743/+5
  conflicts between core lib/unicode and Unicode:: files in case-ignoring filesystems.
  p4raw-id: //depot/perl@11623
* Remove unicode::distinct, as per Inaba Hiroto. | Jarkko Hietaniemi | 2001-07-13 | 1 | -35/+0
  p4raw-id: //depot/perl@11342
* The #11132 missed singleton characters (not part | Jarkko Hietaniemi | 2001-07-04 | 28 | -1/+210
  of a unilo..unihi range) in Unicode scripts.
  p4raw-id: //depot/perl@11133
* Support preferentially the Unicode 'scripts' definition | Jarkko Hietaniemi | 2001-07-04 | 140 | -407/+1282
  in the \p{In...} notation since according to Unicode the scripts concept is more natural for matching than using the somewhat artificial block names. The block names are still available, though, and if there's a name conflict, the scripts one wins and the blocks one has to do with 'Block' appended to its name.
  For more information see http://www.unicode.org/unicode/reports/tr24/
  p4raw-id: //depot/perl@11132
* Forgot the latest mktables.PL from #9899. | Jarkko Hietaniemi | 2001-04-28 | 1 | -1/+1
  p4raw-id: //depot/perl@9900
* Explain the \p{} and \P{} error message better and | Jarkko Hietaniemi | 2001-04-28 | 2 | -98/+98
  have prettier prettyprint in In.pl.
  p4raw-id: //depot/perl@9899
* Add a level of indirection to the implementation of \p{InFoo} | Jarkko Hietaniemi | 2001-04-28 | 100 | -13/+403
  so that we don't have to have long filenames. (Nothing changes in the user interface.) The indirection is defined in the file lib/unicode/In.pl and it is handled in lib/utf8_heavy.pl. Also rename some the character classes by removing '-' from the classnames, and finally renamed Block.pl as Blocks.pl.
  p4raw-id: //depot/perl@9897
* Unicode ReadMe update for Unicode 3.1. | Jarkko Hietaniemi | 2001-04-01 | 1 | -15/+31
  p4raw-id: //depot/perl@9503
* Obsolete file (see #3938) | Jarkko Hietaniemi | 2001-03-31 | 1 | -18/+0
  p4raw-id: //depot/perl@9484
* Update to Unicode 3.1. | Jarkko Hietaniemi | 2001-03-31 | 14 | -412/+1004
  (Rename Names.txt to NamesList.txt.)
  p4raw-id: //depot/perl@9483
* Upgrade to Unicode 3.1 beta 2001-03-23. | Jarkko Hietaniemi | 2001-03-24 | 3 | -15/+16
  p4raw-id: //depot/perl@9326
* Upgrade to Unicode 3.1 beta 2001-03-01. | Jarkko Hietaniemi | 2001-03-08 | 46 | -541/+2393
  p4raw-id: //depot/perl@9077
* More tweakage on the Unicode character class descriptions. | Jarkko Hietaniemi | 2001-03-07 | 1 | -1/+2
  p4raw-id: //depot/perl@9062
* Upgrade to Unicode 3.1 beta 2001-02-11. | Jarkko Hietaniemi | 2001-02-11 | 260 | -27469/+38224
  Blocks-4d3.beta.txt
  CaseFolding-3d4.beta.txt
  CompositionExclusions-3d6.beta.txt
  EastAsianWidth-4d4.beta.txt
  LineBreak-6d3.beta.txt
  NamesList-3.1.0d1.beta.txt
  PropList-3.1.0d4.beta.txt
  SpecialCasing-4d1.beta.txt
  UnicodeData-3.1.0d6.beta.txt
  p4raw-id: //depot/perl@8771
* The first bug found by 1_compile.t. | Jarkko Hietaniemi | 2001-01-18 | 1 | -1/+1
  p4raw-id: //depot/perl@8472
* more UTF8 test suites and an UTF8 patch | Inaba Hiroto | 2000-12-30 | 1 | -0/+35
  Message-ID: <3A4D722D.243AFD88@st.rim.or.jp>
  Just the patch part for now, and the pragma renamed as unicode::distinct.
  p4raw-id: //depot/perl@8267
* Get the three different space character classes right under utf8. | Jarkko Hietaniemi | 2000-12-01 | 3 | -0/+31
  p4raw-id: //depot/perl@7940
* Various doc oddball characters. | Michael Somos | 2000-11-09 | 1 | -1329/+1329
  Subject: [ID 20001106.004] Perl 5.6.0 bugs
  Message-Id: <200011062244.RAA28632@grail.cba.csuohio.edu>
  p4raw-id: //depot/perl@7632
* Tweak the Is* definitions of Unicode character classes | Jarkko Hietaniemi | 2000-10-22 | 9 | -162/+427
  to better match the official categorizations; embrace the official categorizations; add the combining marks as alpha (and -numeric); fix DCinital (a typo and edito) to be DCmedial.
  p4raw-id: //depot/perl@7394
* Fix for | Marc Lehmann | 2000-09-07 | 1 | -3/+3
  Subject: [ID 20000903.001] \w in utf8-strings
  Message-Id: <E13VUS5-0000cv-00.pgcc-forever-2000-09-03-09-44-29@fuji>
  and various related nits.
  p4raw-id: //depot/perl@7030
* Missed one Unicode file. | Jarkko Hietaniemi | 2000-08-31 | 1 | -0/+1025
  p4raw-id: //depot/perl@6934
* Update to Unicode 3.0.1. | Jarkko Hietaniemi | 2000-08-30 | 256 | -2854/+1856
  p4raw-id: //depot/perl@6930
* Zero entries were skipped, fix from Adrian Goalby | Jarkko Hietaniemi | 2000-08-10 | 2 | -1/+23
  <argoalby@yahoo.co.uk>
  p4raw-id: //depot/perl@6565
* revise mktables.PL for bugs and newness in Unicode 3.0 | Gurusamy Sarathy | 2000-05-28 | 49 | -73/+1892
  (from James Bence <jbence@amgen.com>)
  p4raw-id: //depot/perl@6139
* Is{Alnum,Alpha,Word} don't match titlecase | Gurusamy Sarathy | 2000-04-30 | 4 | -39/+19
  TODO: IsSpace is defined recursively! (both spotted by Larry)
  p4raw-id: //depot/perl@6025
* add linebreak properties from unicode/LineBrk.txt (from | Gurusamy Sarathy | 2000-04-24 | 30 | -0/+1129
  Dave Hartnoll <Dave_Hartnoll@3b2.com>)
  p4raw-link: @3 on //depot/thrperl: a4f68e9b64464684b732bc17fd65ed4a1aa4708c
  p4raw-id: //depot/perl@5911
* See http://www.unicode.org/unicode/reports/tr15/ | Jarkko Hietaniemi | 2000-02-22 | 3 | -790/+0
  for in-depth description of the problem.
  p4raw-id: //depot/cfgperl@5216
* change#4641 needs perldiag.pod edit | Gurusamy Sarathy | 1999-12-06 | 1 | -10617/+0
  p4raw-link: @4641 on //depot/perl: b89fed5ff1fc43a68f98ebc06fd23230eb6697a8
  p4raw-id: //depot/perl@4657
* re-add missing Unicode database master | Gurusamy Sarathy | 1999-12-02 | 1 | -0/+10617
  p4raw-id: //depot/perl@4619
* Another Unicode update. | Jarkko Hietaniemi | 1999-11-14 | 192 | -198/+2429
  p4raw-id: //depot/cfgperl@4580
* Regen Unicode tables to include a warning: | Jarkko Hietaniemi | 1999-11-13 | 187 | -6/+651
  Thou Shalt Not Edit Them By Hand; add missing (Unicode 2.0 -introduced) tables to MANIFEST; convert the equivalence tables to be valid Perl code.
  p4raw-id: //depot/cfgperl@4563
* Tweak the equivalence tables once again. | Jarkko Hietaniemi | 1999-09-22 | 3 | -88/+147
  p4raw-id: //depot/cfgperl@4218
* Add description of the Unicode database files. | Jarkko Hietaniemi | 1999-09-18 | 1 | -0/+345
  p4raw-id: //depot/cfgperl@4190
* Update Unicode database and recompute the tables. | Jarkko Hietaniemi | 1999-09-14 | 48 | -6106/+35734
  Rename the .txt files to be more Unicode 3.0-like. Unihan-3.0.txt not included because it is 16 MB. syllables.txt is manually maintained. See ReadMe.txt for description of the .txt files. (not all of them are used yet)
  p4raw-id: //depot/cfgperl@4151
* Create the equivalence tables based on | Jarkko Hietaniemi | 1999-08-29 | 3 | -159/+621
  the real Unicode decomposition, not on the character name.
  p4raw-id: //depot/cfgperl@4037
* add missing Is/Syl*.pl files | Gurusamy Sarathy | 1999-08-20 | 12 | -0/+24
  p4raw-id: //depot/perl@4009
* Regenerate Unicode tables based on new syllable lists | Jarkko Hietaniemi | 1999-08-12 | 2 | -428/+1331
  from Daniel Yacob.
  p4raw-id: //depot/cfgperl@3965
* Remove blathering. | Jarkko Hietaniemi | 1999-08-10 | 1 | -22/+0
  p4raw-id: //depot/cfgperl@3943
* Regenerate the Unicode tables after having updated the Unicode | Jarkko Hietaniemi | 1999-08-09 | 63 | -1210/+6192
  database (change #3939).
  p4raw-link: @3939 on //depot/cfgperl: 1b840072c89904927826b140322b783653b204a1
  p4raw-id: //depot/cfgperl@3940
* Unicode data updated to be the latest beta of the Unicode 3.0. | Jarkko Hietaniemi | 1999-08-09 | 1 | -2076/+5758
  p4raw-id: //depot/cfgperl@3939
* Ethiopic changes via private email from Daniel Yacob, | Jarkko Hietaniemi | 1999-08-09 | 4 | -377/+446
  <dmulholl@cs.indiana.edu>. Ethiopic and Cherokee done, Canadian Syllabics and Yi under construction.
  p4raw-id: //depot/cfgperl@3938
* Move the equivalence class creation last. | Jarkko Hietaniemi | 1999-08-09 | 1 | -68/+67
  p4raw-id: //depot/cfgperl@3937
* Compute equivalence classes (diacritics stripping) only | Jarkko Hietaniemi | 1999-08-09 | 2 | -202/+125
  for letters, not for ligatures.
  p4raw-id: //depot/cfgperl@3936
* Todo update. | Jarkko Hietaniemi | 1999-08-06 | 1 | -2/+2
  p4raw-id: //depot/cfgperl@3931
* Character class equivalence tables. | Jarkko Hietaniemi | 1999-08-06 | 3 | -1/+342
  p4raw-id: //depot/cfgperl@3930
* POSIX [[:character class:]] support for standard, locale, | Jarkko Hietaniemi | 1999-07-06 | 22 | -369/+1235
  and utf8. If both utf8 and locale are on, utf8 wins. I don't fully understand why so many tables changed in lib/unicode because of "make" -- maybe it was just overdue.
  p4raw-id: //depot/cfgperl@3624
* add Ethiopic section to unicode master database (from Ken | Gurusamy Sarathy | 1999-05-06 | 1 | -0/+345
  Whistler <kenw@sybase.com>)
  p4raw-id: //depot/perl@3312
* sundry pod niggles | Gurusamy Sarathy | 1999-03-16 | 1 | -2/+2
  p4raw-id: //depot/perl@3110
* applied suggested patch (mailed to perl-unicode@perl.org) with minor tweaks | Daniel Yacob | 1999-03-15 | 23 | -0/+788
  Message-Id: <199902232113.QAA26135@drum.cs.indiana.edu>
  Subject: ../lib/unicode/ Unicode 3.0 Extensions for Ethiopic
  p4raw-id: //depot/perl@3107
* add trailing newline to file | Gurusamy Sarathy | 1998-07-26 | 1 | -1/+0
  p4raw-id: //depot/perl@1665