path: root/lib
Commit message | Author | Age | Files | Lines
* "no feature" now means reset to default | Ricardo Signes | 2012-02-22 | 1 | -7/+8
| | | | | | | | | See https://rt.perl.org/rt3/Ticket/Display.html?id=108776 "no feature" now resets to the default feature set. To disable all features (which is likely to be a pretty special-purpose request, since it presumably won't match any named set of semantics) you can now write "no feature ':all'"
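A minimal sketch of the two spellings, assuming perl 5.16 or later. The `compiles` helper is hypothetical and exists only for this illustration. Because `say` is not part of the default feature set, both forms leave it disabled here; the two forms differ only for features that are in the default set, and that set varies between perl versions.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the snippet compiles under string eval.
sub compiles {
    my ($code) = @_;
    local $@;
    local $SIG{__WARN__} = sub { };    # silence parser diagnostics in this demo
    return eval "$code; 1" ? 1 : 0;
}

# With the feature enabled, a sub that uses say() compiles.
my $enabled = compiles(q{ use feature 'say'; my $s = sub { say "hi" }; });

# "no feature" resets to the default set, which does not include say.
my $after_reset = compiles(q{ use feature 'say'; no feature; my $s = sub { say "hi" }; });

# "no feature ':all'" disables every feature, say included.
my $after_all = compiles(q{ use feature 'say'; no feature ':all'; my $s = sub { say "hi" }; });

print "enabled=$enabled after_reset=$after_reset after_all=$after_all\n";
```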
* Move Pod::Functions from lib/ to ext/ | Nicholas Clark | 2012-02-18 | 3 | -520/+1
|
* Bring the joy of strict and warnings to Functions.t | Nicholas Clark | 2012-02-18 | 1 | -5/+7
| | | | | This reveals that use_ok() was not in a BEGIN block, and in turn that the test count needs to be declared before this BEGIN block runs. Now fixed.
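The ordering issue described above can be sketched as follows. This is an illustration only, not the contents of Functions.t: `List::Util` stands in for the module under test, and the test count of 2 is an assumption for this example.

```perl
use strict;
use warnings;
use Test::More;

# Declare the plan at compile time, so it is in place before use_ok() runs.
BEGIN { plan tests => 2 }

# use_ok() inside BEGIN loads the module at compile time, which makes its
# imports visible to the code parsed below (required under strict).
BEGIN { use_ok('List::Util', 'sum') }

is(sum(1, 2, 3), 6, 'sum() was imported at compile time');
```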
* Terser code in Pod::Functions to generate $Type_Description and @Type_Order. | Nicholas Clark | 2012-02-18 | 1 | -48/+27
|
* Teach Pod::Functions that each, keys and values also operate on arrays. | Nicholas Clark | 2012-02-18 | 2 | -4/+4
| | | | | | | These were added to the section 'Functions for real @ARRAYs' in perlfunc.pod by commit a5ce339cb0c533c9 in Sep 2010. As ever, tweak the golden results in the test to match these changes.
* Add all missing functions to Pod::Functions. | Nicholas Clark | 2012-02-18 | 2 | -13/+22
| | | | | | | | | | | evalbytes was added to perlfunc.pod by commit 7289c5e6ca773d7c in Nov 2011. fc was added to perlfunc.pod by commit 628253b8ba8b9cbe in Jan 2012. say was added by commit 0d863452f5cac863 in Dec 2005. state was added by commit 36fb85f3330d45ee in Jul 2006. __FILE__, __LINE__ and __PACKAGE__ were added by commit cfa52385fa426b5e in Aug 2011, and __SUB__ by commit 84ed01088568ffe9 in Nov 2011. Again, tweak the golden results in the test to match these changes.
* Teach Pod::Functions about 'Keywords related to the switch feature'. | Nicholas Clark | 2012-02-18 | 2 | -4/+13
| | | | | | | Commit 0d863452f5cac863 in Dec 2005 added the switch feature, along with documentation in perlfunc.pod, but did not update Pod::Functions. Again, tweak the golden results in the test to match these changes.
* Update Pod::Functions with changes from perlfunc.pod | Nicholas Clark | 2012-02-18 | 2 | -31/+17
| | | | | | | | | | | | | | | | | | | | | | | Updated description of Binary from commit 5dac7880bdc47787 in Feb 2011. Updated description of Flow from commit cf2649810f00335b in Jul 2005, and added the "the" which has always been missing from Pod::Functions' version. Updated description of Modules from commit 3b10bc60979cfe9a in Jan 2010. Updated description of Objects from commit 353c650532037e40 in Oct 2007. The description of Namespaces had always differed from that in perlfunc.pod. Remove stray tabs from the descriptions of gets and sprintf. Commit 19799a22062ef658 (May 1999) added lock to perlfunc.pod without a it is the only function in "Threads", move it to "Misc", instead of creating a category just for it. use always had two entries with different descriptions in the __DATA__ section. This isn't actually sensible, as the code that builds the exported data structures ends up taking Types from both, and using the last description that it sees. So merge the two together to reflect this. Drop the CHANGES section from the Pod, which is both incomplete and redundant, given that version control does this job much better. Tweak the golden results in the test to match these changes.
* perl #77654: quotemeta quotes non-ASCII consistently | Karl Williamson | 2012-02-15 | 1 | -2/+2
| | | | | | | | | | As described in the pod changes in this commit, this changes quotemeta() to consistently quote non-ASCII characters when used under unicode_strings. The behavior is changed for these and UTF-8 encoded strings to more closely align with Unicode's recommendations. The end result is that we *could* at some future point start using other characters as metacharacters than the 12 we do now.
* mktables: Generate a table for quotemeta | Karl Williamson | 2012-02-15 | 1 | -0/+25
| | | | | This adds a new table generated by mktables consisting of the code points that should be escaped by quotemeta
* charnames.t: viacode doesn't return Unicode_1 name always | Karl Williamson | 2012-02-13 | 1 | -1/+7
| | | | There are now four characters which have a different preferred name.
* mktables: viacode() return unparenthesized names for 4 controls | Karl Williamson | 2012-02-13 | 2 | -6/+41
| This commit changes the viacode() returned name for four control characters, as follows:
|
|   Code point  Old Name              New Name
|   U+000A      LINE FEED (LF)        LINE FEED
|   U+000C      FORM FEED (FF)        FORM FEED
|   U+000D      CARRIAGE RETURN (CR)  CARRIAGE RETURN
|   U+0085      NEXT LINE (NEL)       NEXT LINE
|
| Only the return from viacode is affected. All the names are accepted as input, as they always have been.
|
| Unicode 6.1 now has official names for all the controls, and the new names match those. The old names were the ones that were recommended by TR18 prior to 6.1, and still are, sort of. This change uses the official names in preference to the TR18 ones. We probably wouldn't bother except that the old names were problematic--the only names in the whole universe of names containing parentheses, and not matching traditional usage. The new names have always been accepted as inputs by Perl.
|
| I actually doubt that Unicode ever grokked that they were recommending these ugly names, and they haven't paid much attention to TR18 anyway, breaking it in version 6.0 by encoding one of the recommended names (BELL) as an official name for another code point, and without realizing it. TR18 now is in limbo, still wrongly recommending BELL, with a rewrite being promised for many months now. It's unclear what will happen with it.
|
| It was agreed on p5p to go with the cleaner, now official names, instead of the older, likely obsolete, TR18 names. I did a search of CPAN; it was unclear if this change (which again is only for viacode()) mattered to any code there or not. There were a few instances of the old names, but none of those were apparently associated with viacode().
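A short way to see the names viacode() returns for these four controls, assuming a perl that includes this change (5.16 or later). The exact output depends on the Unicode version bundled with the perl in use, so none is shown here.

```perl
use strict;
use warnings;
use charnames ();

# Print the name viacode() returns for each of the four affected controls.
for my $cp (0x0A, 0x0C, 0x0D, 0x85) {
    printf "U+%04X => %s\n", $cp, charnames::viacode($cp);
}
```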
* mktables: Don't add exact duplicate to tables | Karl Williamson | 2012-02-13 | 1 | -1/+4
| | | | | | This was a bug in the case where there can be multiple entries in a table for a single code point. But there can be only one identical entry.
* unset PERLDB_OPTS environment variable or rt-61222 might hang. | Todd Rinaldo | 2012-02-12 | 1 | -0/+1
| | | | eg. PERLDB_OPTS='RemotePort=some.other.host:9000'
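One way a test script might clear the variable before starting a debugger child process. This is a sketch of the idea only; the one line the commit actually added is not shown in this log and may differ.

```perl
use strict;
use warnings;

# Simulate a setting inherited from the user's environment.
$ENV{PERLDB_OPTS} = 'RemotePort=some.other.host:9000';

# Remove it so that a child perl started afterwards does not try to
# connect to a remote port and hang.
delete $ENV{PERLDB_OPTS};

print exists $ENV{PERLDB_OPTS} ? "still set\n" : "cleared\n";
```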
* mktables: Update comments, variable names | Karl Williamson | 2012-02-11 | 1 | -36/+38
| | | | | | | | | Commit d11155ec2b4e3f6cf952e2a25615aec506a8e296 changed the format of some of the generated tables, but I left some of the old comments and variable names the same in order to not make this already large commit bigger. This updates these to reflect the new format. It also refactors one 'if' statement to not use a block.
* UCD.t: white-space only | Karl Williamson | 2012-02-10 | 1 | -13/+13
| | | | This outdents some statements that are no longer enclosed in a block
* mktables: Fix up some comments in the generated files | Karl Williamson | 2012-02-10 | 1 | -9/+28
| | | | | These were incorrectly stating that some tables are accessible via Unicode::UCD, and giving the wrong name in some instances.
* Unicode::UCD::prop_invmap: Store Nv property as adjusted type | Karl Williamson | 2012-02-10 | 3 | -25/+41
| | | | | By converting this property to requiring adjustments to get the proper values, its storage size decreases by more than half.
* Unicode::UCD::prop_invmap(): New improved API | Karl Williamson | 2012-02-10 | 3 | -207/+306
| | | | | | | | | | | | Thanks to Tony Cook for suggesting this. The API is changed from returning deltas of code points, to storing the actual correct values, but requiring adjustments for the non-initial elements in a range, as explained in the pod. This makes the data less confusing to look at, and gets rid of inconsistencies if we didn't make the same sort of deltas for entries that were, e.g. arrays of code points.
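The "adjusted" idea can be sketched with a hand-built inversion map. The arrays and the `lookup` function below are hypothetical and do not reproduce the data or the return format of prop_invmap(); they only show that each range stores the correct value for its first code point, and that later code points in the range add their offset from the range start.

```perl
use strict;
use warnings;

# Hypothetical lowercase map: @starts holds the first code point of each
# range, @values the mapping of that first code point (undef = maps to itself).
my @starts = (0x00,  0x41, 0x5B);
my @values = (undef, 0x61, undef);

sub lookup {
    my ($cp) = @_;

    # Find the last range whose start is <= $cp (linear scan for brevity).
    my $i = 0;
    $i++ while $i < $#starts && $starts[$i + 1] <= $cp;

    return $cp unless defined $values[$i];

    # The stored value is exact for the range start; non-initial elements
    # are adjusted by their distance from the start.
    return $values[$i] + ($cp - $starts[$i]);
}

printf "U+%04X => U+%04X\n", $_, lookup($_) for 0x30, 0x41, 0x43, 0x5A, 0x5B;
```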
* Unicode::UCD: move common directory to subroutine | Karl Williamson | 2012-02-10 | 1 | -15/+11
| | | | | | | All the files that should ever be read by the subroutine will be found in the unicore directory, so can specify it in the subroutine instead of in each call to it. This makes things slightly easier in future commits.
* Unicode::UCD: pod and comment nits | Karl Williamson | 2012-02-10 | 1 | -33/+30
| | | | | One comment was out of date. This also moves a line of code so that the comments flow better.
* Add regen/mk_invlists.pl, charclass_invlists.h | Karl Williamson | 2012-02-09 | 1 | -0/+4
| | | | | | | This will be used to generate compile-time inversion lists in a C header file that can be included in programs for initialization speed. Three simple inversion lists are included in this initial commit.
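For readers unfamiliar with inversion lists, here is a small sketch in Perl (the generated header itself is C). The list contents and the `in_set` function are assumptions made for illustration: each element starts a range, and ranges alternate between "in the set" (even index) and "not in the set" (odd index).

```perl
use strict;
use warnings;

# Hypothetical inversion list for the ASCII letters A-Z and a-z.
my @invlist = (0x41, 0x5B, 0x61, 0x7B);

sub in_set {
    my ($cp) = @_;
    return 0 if $cp < $invlist[0];

    # Binary search for the last element that is <= $cp.
    my ($lo, $hi) = (0, $#invlist);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi + 1) / 2);
        if ($invlist[$mid] <= $cp) { $lo = $mid }
        else                       { $hi = $mid - 1 }
    }

    # Even index: the range is in the set; odd index: it is not.
    return $lo % 2 == 0 ? 1 : 0;
}

printf "%s: %d\n", $_, in_set(ord $_) for qw(A Z [ a z { 0);
```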
* Move lib/Pod/t/eol.t to ext/Pod-Html, as it's testing Pod::Html. | Nicholas Clark | 2012-02-08 | 1 | -71/+0
|
* Convert triplicated code in lib/Pod/t/eol.t to a loop. | Nicholas Clark | 2012-02-08 | 1 | -37/+13
|
* Refactor lib/Pod/t/eol.t | Nicholas Clark | 2012-02-08 | 1 | -38/+39
| * use variables for the names of temporary files
| * use lexicals for file handles
| * check the return value of close
| * use is() rather than ok() with ==
|
| [possibly still dubious that it's using unpack checksums for comparison, instead of SHAs or simply File::Compare]
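The four points above can be sketched in a small standalone test. This is not the code from eol.t; the file name, the data and the test descriptions are made up for the example.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use Test::More tests => 3;

# Keep the temporary file name in a variable rather than repeating a literal.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'eol-demo.txt');
my $data = "line one\nline two\n";

# Lexical file handle, with the return value of close checked.
open my $out, '>', $file or die "Cannot open $file for writing: $!";
binmode $out;
print {$out} $data;
ok(close($out), 'close succeeded after writing');

open my $in, '<', $file or die "Cannot open $file for reading: $!";
binmode $in;
my $read = do { local $/; <$in> };
ok(close($in), 'close succeeded after reading');

# is() reports both values on failure, unlike ok() with ==.
is(unpack('%32C*', $read), unpack('%32C*', $data), 'checksums match');
```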
* The cleanup code in lib/Pod/t/eol.t needs updating to track Pod::Html changes. | Nicholas Clark | 2012-02-08 | 1 | -2/+1
| | | | | | eol.t gained code to clean up temporary files it generated as part of commit 0ec158f4b0db050a in 2002. The temporary file names used by Pod::Html were changed by commit 33869856bc668ad8 in 2003, but eol.t had never been updated.
* sync version.pm code with CPAN | David Golden | 2012-02-05 | 2 | -16/+18
| | | | | | Applied patch from John Peacock, but added whitespace fixes, corrected pod link error and updated known Pod issues to reflect a fix.
* warnings.pm docs: clarify categories are in perllexwarn | David Golden | 2012-02-04 | 1 | -1/+2
|
* Unicode::UCD move =item in pod | Karl Williamson | 2012-02-04 | 1 | -38/+38
| | | | | This merely moves a whole =item to another place, in preparation for future commits.
* Unicode::UCD::prop_invmap() compress digit results | Karl Williamson | 2012-02-04 | 3 | -28/+49
| | | | | | This changes the output of prop_invmap() for the Perl_Decimal_Digit property to use code point deltas, similar to other properties. This causes the output to be 1/10 what it used to be.
* UCD.t: White space only | Karl Williamson | 2012-02-04 | 1 | -25/+25
| | | | Indent properly to account for these being in a newly formed block
* Unicode::UCD::prop_invmap(): Make the NFKCCF property return deltas | Karl Williamson | 2012-02-04 | 2 | -25/+48
| | | | | | | The file for this property is stored in the old-style format for backward compatibility with any applications that might be reading it directly. But the values should be returned through the Unicode::UCD API as deltas for consistency with other, similar properties.
* Unicode::UCD::prop_invmap(): Return deltas for the 'dm' property | Karl Williamson | 2012-02-04 | 2 | -25/+122
| | | | | | | | Earlier commits caused the return of prop_invmap() for certain properties to return deltas from code points instead of the code points themselves, for compactness of storage and speed of searching. This causes the same for the 'dm' property, for consistency with the others, even though the space savings is not large for this one; essentially the same code can be used for the two types now, instead of an application having to have special cases.
* mktables: Generate some delta tables | Karl Williamson | 2012-02-04 | 3 | -75/+107
| | | | | | | | | | This commit has the effect of changing the non-legacy tables for the lc, uc, tc, and fc properties to use maps of deltas from the code points instead of the code points themselves, thus shortening them significantly, and hence the time required to search through them. Note that these tables are new, and currently used only by Unicode::UCD. A future commit will change the Perl core to use them.
* mktables: Change generated file comment | Karl Williamson | 2012-02-04 | 1 | -2/+4
| | | | | | | | | All the files that mktables generates that are for external-to-core use have now been changed so that the code requests explicitly for each that they have the comment that says they are for external use, but it is deprecated to use them. That means that any files that haven't been so explicitly set should have the comment instead that says they are for internal use only.
* mktables: Preserve old format in some tables | Karl Williamson | 2012-02-04 | 1 | -0/+18
| | | | | | | | Future commits will cause tables that map to code points to, in general, use deltas instead. This ensures that files that contain tables and have been mentioned publicly in the past continue to have their current contents and format, so that applications that read them (such as Unicode::Normalize) are unaffected.
* Unicode::UCD: pod and comment nits | Karl Williamson | 2012-02-04 | 1 | -19/+19
|
* mktables: Allow generation of delta tables | Karl Williamson | 2012-02-04 | 1 | -8/+123
| | | | | | | | | | | | | | | | | | | Delta tables are those in which the mapping is not stored as-is, but is modified to be the delta between the actual mapping and the code point it is for. This allows for smaller tables that are faster to search and require less memory to store. For example, consider the lower case mapping of A=>a, B=>b, ... Z=>z. Prior to this patch, this requires 26 entries in the table; now it requires just one. This is because A=65 and a=97. We store 97-65=32. And 32 is the same delta for each of A-Z, so we can store these as a single range each with the same value, 32. The delta tables tend to be half as large as the non-delta ones, or even smaller. This just enables the feature. No tables currently use it. For that, other changes in Unicode::UCD need to be coordinated.
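The arithmetic in the A-Z example can be checked with a short sketch. The `delta_ranges` function and its hash-of-ranges layout are made up for this illustration and are not how mktables represents its tables.

```perl
use strict;
use warnings;

# Collapse a "code point => mapping" hash into ranges that share one delta.
sub delta_ranges {
    my ($map) = @_;
    my @ranges;
    for my $cp (sort { $a <=> $b } keys %$map) {
        my $delta = $map->{$cp} - $cp;
        my $last  = $ranges[-1];
        if ($last && $last->{end} + 1 == $cp && $last->{delta} == $delta) {
            $last->{end} = $cp;    # same delta, adjacent: extend the range
        }
        else {
            push @ranges, { start => $cp, end => $cp, delta => $delta };
        }
    }
    return @ranges;
}

# Lowercase mapping of A..Z: 26 separate entries before collapsing.
my %lc = map { $_ => $_ + 32 } ord('A') .. ord('Z');
my @ranges = delta_ranges(\%lc);

printf "%d entries collapse to %d range(s): U+%04X..U+%04X, delta %d\n",
    scalar(keys %lc), scalar(@ranges), @{ $ranges[0] }{qw(start end delta)};
# prints "26 entries collapse to 1 range(s): U+0041..U+005A, delta 32"
```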
* mktables: White-space, comments only | Karl Williamson | 2012-02-04 | 1 | -163/+179
| | | | | | | | | | | A previous commit has added two nested blocks surrounding the affected code. This looks like a big change, but it is in fact only white space plus reflowing things to fit in an 80 column window, plus slight changes to comments. I verified that there were no code changes by using a diff command that can ignore leading white space changes, and hence gave a more accurate difference listing
* mktables: Refactor if-else series | Karl Williamson | 2012-02-04 | 1 | -6/+8
| | | | | | | This is a slight refactoring to avoid using 'next' in the loop, and to surround things with a bare block. Future commits will want to do common code at the bottom of the loop, including a redo of the bare block.
* Unicode::UCD::prop_invmap(): Use regex to get trie | Karl Williamson | 2012-02-04 | 1 | -2/+1
| | | | This should speed up this test slightly
* mktables: Don't generate no-longer used tables | Karl Williamson | 2012-02-04 | 1 | -36/+1
| | | | | Previous commits have removed all uses of these tables, so they are no longer needed.
* Unicode::UCD: Rmv uses of no-longer needed tables | Karl Williamson | 2012-02-04 | 1 | -74/+17
| | | | | | Previous commits have expanded what's in the full case mapping tables to include the simple maps as well. Thus the specially constructed tables need no longer be used, leading to simplification.
* UCD.t: white space only | Karl Williamson | 2012-02-04 | 1 | -3/+3
| | | | outdent now that surrounding block is removed
* mktables: Include simple mappings in full tables | Karl Williamson | 2012-02-04 | 2 | -56/+32
| | | | | | | | | | | | | | | | | | | This changes the case change mapping tables to include the simple mappings. This was done in 5.14 for the case folding table. The full mappings are contained, as before, in a hash. Now the simple mappings they override (when doing multi-char case changing) are added to the main body of the table, to the already existing simple mappings that aren't overridden. If the caller wants to do full mapping, it should look first in the hash, and only if not found, look in the main body. If the caller wants only simple mapping, it ignores the hash. This is already how the code in utf8.c that reads these tables is constructed. The .t is modified to take into account that these code points are now in the main table body.
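The lookup order described above (hash first for full mappings, main body otherwise) might look like this in miniature. The two hashes and both function names are hypothetical stand-ins for the generated table and for the code in utf8.c; the sample entries use the uppercase of U+00DF, whose full mapping is "SS" and which has no simple uppercase mapping.

```perl
use strict;
use warnings;

# Hypothetical miniature uppercase table.
my %main_body = (0x61 => 0x41, 0xDF => 0xDF);    # simple mappings
my %full_hash = (0xDF => [0x53, 0x53]);          # multi-char overrides

# Simple mapping: ignore the hash entirely.
sub uc_simple {
    my ($cp) = @_;
    return $main_body{$cp} // $cp;
}

# Full mapping: look in the hash first, and only if the code point is
# not found there, fall back to the main body.
sub uc_full {
    my ($cp) = @_;
    return @{ $full_hash{$cp} } if exists $full_hash{$cp};
    return uc_simple($cp);
}

for my $cp (0x61, 0xDF) {
    printf "U+%04X simple: %s full: %s\n", $cp,
        sprintf('U+%04X', uc_simple($cp)),
        join(' ', map { sprintf 'U+%04X', $_ } uc_full($cp));
}
```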
* mktables: Add duplicate tables | Karl Williamson | 2012-02-04 | 1 | -11/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | This is for backwards compatibility. Future commits will change these tables that are generated by mktables to be more efficient. But the existence of them was advertised in v5.12 and v5.14, as something a Perl program could use because the Perl core did not provide access to their contents. We can't change the format of those without some notice. The solution adopted is to have two versions of the tables, one kept in the original file name has the original format; and the other is free to change formats at will. This commit just creates copies of the original, with the same format. Later commits will change the format to be more efficient. We state in v5.16 that using these files is now deprecated, as the information is now available through Unicode::UCD in a stable API. But we don't test for whether someone is opening and reading these files; so the deprecation cycle should be somewhat long; they will be unused, and the only drawbacks to having them are some extra disk space and the time spent in having to generate them at Perl build time. This commit also changes the Perl core to use the original tables, so that the new format can be gradually developed in a series of patches without having to cut over the whole thing at once.
* mktables: avoid some extra work | Karl Williamson | 2012-02-04 | 1 | -8/+7
| | | | | | | The object is already known to us as the loop variable, so there is no need to derive it again. Also change the loop variable name and one other variable name to distinguish the full map table from the simple map one.
* mktables: Allow non-standard initializations of properties | Karl Williamson | 2012-02-04 | 1 | -5/+27
| | | | | | | | | | | | | | Some property tables have multiple values per code point. These include the final Name-equivalent property in which some code points have more than one synonym; and the full case changing property tables that are supersets of the simple case changing tables, in which some code points have a full mapping that differs from the simple mapping. Prior to this patch, these could not be initialized simply using the Initialize parameter to the constructor, as it was unable to handle multiple values per code point. This also preserves the range type.
* mktables: Comments, white-space and typo in message text only | Karl Williamson | 2012-02-04 | 1 | -11/+13
|
* mktables: Refactor populating simple case folding tables | Karl Williamson | 2012-02-04 | 1 | -15/+43
| | | | | | | | | | | | | These three tables are handled alike; this creates a loop to execute the same instructions on each of them. Currently there is so little to do, that it wouldn't be worth it, except that future commits will add complications, and this makes those easier to handle. There is now a test that the input data is sane, and instead of overwriting a value in a table with a known identical value, we skip that. This doesn't save much effort, because most of the work is looking up the value (which we can now check sanity for), but again will be useful for future commits.