path: root/pod/perlunicode.pod
Commit message | Author | Age | Files | Lines
* perlunicode: Fix pod error | Karl Williamson | 2011-05-18 | 1 | -1/+1
* perlunicode.pod: Nits | Karl Williamson | 2011-04-18 | 1 | -2/+4
* Nits in perlunicode | Tom Christiansen | 2011-04-16 | 1 | -4/+5
* perlunicode: More 5.14 edits | Karl Williamson | 2011-04-15 | 1 | -11/+9
* perlunicode: Edits for 5.14 | Tom Christiansen | 2011-04-15 | 1 | -113/+116
* perlunicode: Update for 5.14 | Karl Williamson | 2011-04-13 | 1 | -210/+91
* /dual are available in 5.14 as suffix after all | Karl Williamson | 2011-03-30 | 1 | -7/+1
* perlunicode: mention quotemeta utf8 inconsistency | Karl Williamson | 2011-03-26 | 1 | -2/+9
* [perl #86994] perlunicode: Rebuilding databases needs a source tree | David Leadbeater | 2011-03-25 | 1 | -8/+8
    At some point enough files were installed that it was possible to rebuild perl's Unicode databases outside the source tree. This is no longer possible. (171f12bc in 2003 seems to have stopped installing Makefiles under lib/ so this doc is very outdated.)
* pods: Unicode::Casing is now available | Karl Williamson | 2011-03-24 | 1 | -1/+1
* perlunicode: Minor corrections | Karl Williamson | 2011-03-22 | 1 | -8/+21
* perlunicode: mention new Unicode::Casing | Karl Williamson | 2011-03-21 | 1 | -3/+9
* perlunicode: double space | Father Chrysostomos | 2011-03-12 | 1 | -1/+1
* make /\p{isUserDefined}/ die on taint | David Mitchell | 2011-02-22 | 1 | -0/+4
    If the string which contains the name of a user-defined character property function is tainted, then die rather than calling that function. See [perl #82616].
* perlunicode.pod: Remove false statement | Karl Williamson | 2011-02-19 | 1 | -6/+1
    In fact the code is such that changing an A to a control does work.
* Move ANYOF folding from regexec to regcomp | Karl Williamson | 2011-02-02 | 1 | -0/+27
    This is for security as well as performance. It allows Unicode properties to not be matched case sensitively. As a result the swash inversion hash is converted from having utf8 keys to numeric, code point, keys. It also, for the first time, fixes the bug where /i doesn't work for a code point that is not at the end of a range in a bracketed character class and has a multi-character fold.
* perlunicode: Add explanatory text | Karl Williamson | 2011-01-19 | 1 | -5/+10
* Typos and nits in pods | Karl Williamson | 2011-01-19 | 1 | -2/+2
* perlunicode.pod: Update for /a | Karl Williamson | 2011-01-19 | 1 | -17/+73
* Document the flip of problematic code points handling | Karl Williamson | 2011-01-09 | 1 | -18/+43
* Fix typos in pod/* | Peter J. Acklam) (via RT | 2011-01-07 | 1 | -1/+1
    # New Ticket Created by (Peter J. Acklam)
    # Please include the string: [perl #81906]
    # in the subject line of all future correspondence about this issue.
    # <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=81906 >
* Nit in perlunicode.pod | Karl Williamson | 2010-12-01 | 1 | -3/+3
* Document Unicode doc fix | Karl Williamson | 2010-12-01 | 1 | -40/+17
* Nit in perlunicode.pod | Karl Williamson | 2010-12-01 | 1 | -1/+1
* [bracketed char class] fixes | Karl Williamson | 2010-11-22 | 1 | -4/+25
    This patch adds two functions for setting the ANYOF node bitmaps. The one for dealing with folds has intelligence as to what to do if unicode semantics is in effect. Together with previous commits, this fixes the unicode bug for bracketed character classes, as far as known bugs go, so pods are updated as well.
* Nits in perlunicode.pod | Karl Williamson | 2010-11-22 | 1 | -2/+3
* perlunicode.pod: Add detail on utf8/locale conflicts | Karl Williamson | 2010-10-31 | 1 | -2/+7
* [:posix:] now works under /u | Karl Williamson | 2010-10-31 | 1 | -2/+3
    This patch is part of fixing the Unicode bug. The /u regex modifier now applies to posix character classes. This resolves [perl #18281]. The Todo tests in reg_posicc.t have all been made not todo.
* Subject: [perl #58182] partial: Add uni \s,\w matching | Karl Williamson | 2010-10-15 | 1 | -5/+8
    This commit causes regex sequences \b, \s, and \w (and complements) to match in the latin1 range in the scope of feature 'unicode_strings' or with the /u regex modifier. It uses the previously unused flags field in the respective regnodes to indicate the type of matching, and in regexec.c, uses that to decide which of the handy.h macros to use, native or Latin1.

    I chose this for now rather than create new nodes for each type of match. An earlier version of this patch did that, and in every case the switch case: statements were adjacent, offering no performance advantage. If regexec were modified to use in-line functions or more macros for various short sections of it, then it would be faster to have new nodes rather than using the flags field. But, using that field simplified things, as this change flies under the radar in a number of places where it would not if separate nodes were used.
* Fix a pod link I broke | Florian Ragwitz | 2010-09-16 | 1 | -2/+3
    Oooops!
* perlunicode.pod: Clarify user-defined casing. | Karl Williamson | 2010-09-15 | 1 | -30/+45
    I ran some experiments and found out that the user-defined casing worked in ways that were surprising to me. And thus, this brutally lays out its shortcomings.
* perlunicode.pod: Fix misleading info, expand | Karl Williamson | 2010-08-25 | 1 | -42/+92
    There was some misleading or, uncharitably, wrong text in this pod about user-defined casing. And, it jumped the gun, presuming that 5.14 would fix something for which there has not been a patch submitted yet. And, I realized there was a way around having to figure out the utf8 for a character.
* perlunicode.pod: Elaborate unicode bug for POSIX | Karl Williamson | 2010-08-11 | 1 | -1/+3
    Mention the POSIX character classes as being affected by the Unicode bug.
* Revert "perlunicode.pod: Elaborate unicode bug for POSIX" | Rafael Garcia-Suarez | 2010-08-11 | 1 | -2/+1
    This reverts commit d67647f5f40a7e78bffc92ff8600c67f95d3d7b0.
* perlunicode.pod: Elaborate unicode bug for POSIX | Karl Williamson | 2010-08-11 | 1 | -1/+2
    Mention the POSIX character classes as being affected by the Unicode bug.
* [perl 71764] Extend charnames to all of Unicode | Karl Williamson | 2010-07-13 | 1 | -2/+1
    This patch causes \N{}, vianame, and viacode to know the names of all Unicode code points. Previously the names that are algorithmically determinable were not handled. These include the Hangul syllables and many CJK characters. It simply adds using the routines that mktables inserts into Name.pl that handle these characters. mktables generates these algorithms from data in the Unicode data base. The routines have been there since 11/2009 in anticipation of this change, but have been unused until now. They probably have not been reviewed thoroughly.

    The major change to this is the .t file. Now that all code points are understood, the .t tests them all. But this would take too long each time, so it tests a random sample. If there is a failure, the seed is output so that the test can be reproduced. This idea came from Michael Schwern, and is the same he uses in Test::Sims. Various parameters about the sampling are easily adjustable.
* Document tricks, work-arounds for user-defined casing | Karl Williamson | 2010-05-30 | 1 | -7/+70
    And add a .t file to verify that it works.
* PATCH: correct misstatement, formats in perlunicode | Karl Williamson | 2010-05-25 | 1 | -10/+11
    This is suitable for 5.12.2, but not many people use this feature.
* perlunicode: fix for 80 col display | Karl Williamson | 2010-05-08 | 1 | -49/+56
* Nits in perlunicode.pod | Karl Williamson | 2010-04-26 | 1 | -27/+35
* Note that can be warned on implicit utf8 upgrade | Karl Williamson | 2010-03-11 | 1 | -1/+2
    The module encoding::warnings can be used to warn when two strings are concatenated where one is utf8 and the other is not and contains non-ASCII. Note the existence of this in the pod documentation.
* Note that \N{U+...} forces character semantics | Karl Williamson | 2010-02-28 | 1 | -5/+6
* Mention \N{U+...} in perlunicode.pod | Karl Williamson | 2010-02-28 | 1 | -7/+13
* * Em dash cleanup in pod/ | brian d foy | 2010-01-13 | 1 | -2/+2
    I looked at all the instances of spaces around -- and in most cases converted the sentences to use more appropriate punctuation. In general, the -- in the perl docs seem to be there only to make really complicated and really long sentences.

    I didn't look at the closed em-dashes. They probably have the same sentence-complexity problem.

    I left some open em-dashes in place. Those are the ones used in lists.
* Correct \p{print} to not match LINE SEPARATOR nor PARAGRAPH SEPARATOR | Karl Williamson | 2009-12-30 | 1 | -1/+1
    The Unicode Standard defines (as a recommendation) that Print be based on graphical characters and blank characters (minus controls). Perl's has been based on space rather than blank. The only practical effect this has is that Perl erroneously matches the LINE SEPARATOR and PARAGRAPH SEPARATOR, which clearly are not printable characters.

    Signed-off-by: Abigail <abigail@abigail.be>
* PATCH: correct grammatical error in perlunicode.pod | karl williamson | 2009-12-29 | 1 | -2/+2
    Attached
    From 75bb462da5f7ea844447dfdd7d9aadfe15f6dcf3 Mon Sep 17 00:00:00 2001
    From: Karl Williamson <khw@khw-desktop.(none)>
    Date: Tue, 29 Dec 2009 13:08:28 -0700
    Subject: [PATCH] Correct grammatical error in perlunicode.pod

    Signed-off-by: H.Merijn Brand <h.m.brand@xs4all.nl>
* PATCH: document all Perl Unicode \p{} extensions | karl williamson | 2009-12-28 | 1 | -28/+245
    This also changes some C<> constructs.
    From d01b049b3aa9bc3a394adb30d6db735f5dd52321 Mon Sep 17 00:00:00 2001
    From: Karl Williamson <khw@khw-desktop.(none)>
    Date: Mon, 28 Dec 2009 09:14:48 -0700
    Subject: [PATCH] Document all perl Unicode \p extensions

    Signed-off-by: H.Merijn Brand <h.m.brand@xs4all.nl>
* Update pods | Karl Williamson | 2009-12-25 | 1 | -3/+3
    Signed-off-by: Abigail <abigail@abigail.be>
* Update .pods | Karl Williamson | 2009-12-25 | 1 | -118/+181
    Signed-off-by: Abigail <abigail@abigail.be>
* Unicode documentation updates | Karl Williamson | 2009-12-20 | 1 | -430/+237