| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "multiline" logic of diag.t was getting confused by define
statements that define a symbol to call an error function but do not
end in ";". It would then erroneously slurp potentially many lines,
possibly absorbing more than one message. The multi-line logic also
would undef $listed_as and lose the diag_listed_as data in some
circumstances.
Fixing those issues revealed some interesting cases. To fix one of them
I defined a new noop macro in perl.h to help: PERL_DIAG_WARN_SYNTAX(),
which helps the diag.t parser identify messages without needing to be
actually part of a specific message line. These macros are noops: they
just return their argument, but they help hint to diag.t what is going
on. Maybe in the future this can be reworked to be more generic, as
there are other similar cases that are not covered.
Interestingly fixing this bug meant that at least one message that used
to be erroneously picked up was no longer identified or tested. This was
replaced with a PERL_DIAG_DIE_SYNTAX() wrapper.
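As a rough illustration of the "noop macro" idea described above, the sketch below uses hypothetical stand-in names (DIAG_WARN_SYNTAX, DIAG_DIE_SYNTAX, pick_message) and illustrative message text. It is not the actual perl.h definition; it only assumes, as the message says, that the macro hands back its argument unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real perl.h macros: each one expands
 * to its argument unchanged, so it costs nothing at run time. Its only
 * purpose is to leave a marker that a source scanner such as diag.t can
 * recognize around a message that is not written inside the reporting
 * call itself. */
#define DIAG_WARN_SYNTAX(msg) (msg)
#define DIAG_DIE_SYNTAX(msg)  (msg)

/* Picks a message that some other code will report later. The message
 * texts are illustrative only. */
static const char *pick_message(int fatal)
{
    const char *msg = fatal
        ? DIAG_DIE_SYNTAX("Missing braces on \\o{}")
        : DIAG_WARN_SYNTAX("Illustrative warning text");

    assert(msg != NULL);    /* the wrapped literal is an ordinary string */
    return msg;
}
```

Because the wrapper is a plain pass-through, adding or removing it should not change the compiled behavior; only the test parser sees a difference.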
|
|
|
|
|
|
|
| |
This is a rebasing by @khw of part of GH #18792, which I needed to get
in now to proceed with other commits.
It also strips trailing white space from the affected files.
|
|
|
|
|
| |
This was the consensus in
http://nntp.perl.org/group/perl.perl5.porters/258489
|
|
|
|
|
| |
A future commit will need it to represent just the meaning of the new
name
|
|
|
|
|
|
|
|
|
|
|
| |
This just detabifies to get rid of the mixed tab/space indentation.
Applying consistent indentation and dealing with other tabs are separate issues.
Done with `expand -i`.
* vutil.* left alone, it's part of version.
* Left regen managed files alone for now.
|
|
|
|
|
| |
The remaining function in this file is moved to inline.h, just to not
have an extra file lying around with hardly anything in it.
|
|
|
|
|
|
|
|
| |
All the other messages raised when a construct is expecting a
terminating '}' but none is found include the '}' in the message. '\o{'
did not. Since these diagnostics are getting revised anyway, and I
didn't find any CPAN modules relying on the wording, this commit makes
the messages consistent by adding the '}' to the \o message.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit causes these functions to allow a caller to request any
messages generated to be returned to the caller, instead of always being
handled within these functions. The messages are somewhat changed from
previously to be clearer. I did not find any code in CPAN that relied
on the previous message text.
As with the previous commit for grok_bslash_c, there are two reasons to
do this, repeated here.
1) In pattern compilation this brings these messages into conformity
with the other ones that get generated in pattern compilation, where
there is a particular syntax, including marking the exact position in
the parse where the problem occurred.
2) These could generate truncated messages due to the (mostly)
single-pass nature of pattern compilation that is now in effect. It
keeps track of where during a parse a message has been output, and
won't output it again if a second parsing pass turns out to be
necessary. Prior to this commit, it had to assume that a message
from one of these functions did get output, and this caused some
out-of-bounds reads when a subparse (using a constructed pattern) was
executed. The possibility of those went away in commit 5d894ca5213,
which guarantees it won't try to read outside bounds, but that may
still mean it is outputting text from the wrong parse, giving
meaningless results. This commit should stop that possibility.
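The calling convention described above, where a message is handed back to the caller instead of being emitted inside the function, could look roughly like the following. This is a minimal sketch under stated assumptions: parse_octal_braces, its parameters, and the exact message strings are hypothetical and are not the real grok_bslash_o interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser for the body of "\o{...}". On entry *s points just
 * past the '{' and send is one past the last valid byte.
 * On success it returns 1, stores the value, and advances *s past '}'.
 * On failure it returns 0 and stores the diagnostic text in *message
 * instead of printing it, so the caller decides how to report it. */
static int parse_octal_braces(const char **s, const char *send,
                              unsigned long *value, const char **message)
{
    const char *start = *s;
    const char *rbrace;
    const char *p;
    unsigned long v = 0;

    assert(start <= send);
    *message = NULL;

    rbrace = (const char *)memchr(start, '}', (size_t)(send - start));
    if (rbrace == NULL) {
        *message = "Missing right brace on \\o{}";
        return 0;
    }
    if (rbrace == start) {
        *message = "Empty \\o{}";
        return 0;
    }
    for (p = start; p < rbrace; p++) {
        if (*p < '0' || *p > '7') {
            *message = "Non-octal character in \\o{}";
            return 0;
        }
        v = v * 8 + (unsigned long)(*p - '0'); /* overflow not checked here */
    }
    *value = v;
    *s = rbrace + 1;
    return 1;
}
```

A caller such as a pattern compiler can then attach its own position marker to the returned text, or drop the text entirely if the same message was already output on an earlier pass.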
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit causes this function to allow a caller to request any
messages generated to be returned to the caller, instead of always being
handled within this function.
As with the previous commit for grok_bslash_c, there are two reasons to
do this, repeated here.
1) In pattern compilation this brings these messages into conformity
with the other ones that get generated in pattern compilation, where
there is a particular syntax, including marking the exact position in
the parse where the problem occurred.
2) The messages could be truncated due to the (mostly) single-pass
nature of pattern compilation that is now in effect. It keeps track
of where during a parse a message has been output, and won't output
it again if a second parsing pass turns out to be necessary. Prior
to this commit, it had to assume that a message from one of these
functions did get output, and this caused some out-of-bounds reads
when a subparse (using a constructed pattern) was executed. The
possibility of those went away in commit 5d894ca5213, which
guarantees it won't try to read outside bounds, but that may still
mean it is outputting text from the wrong parse, giving meaningless
results. This commit should stop that possibility.
|
|
|
|
|
|
| |
In two functions, future commits will generalize this parameter to be
possibly a warning message instead of only an error message. Change its
name to reflect the added meaning.
|
| |
|
|
|
|
|
|
|
|
|
| |
This recently added assertion actually caught an error, which is a
potential read beyond end of buffer. This doesn't actually happen in
this case because this is a regular expression pattern, and the toker
makes sure there is a trailing NUL (that isn't counted).
The solution is to check the bounds before reading.
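The general shape of "check the bounds before reading" can be sketched as below. The helper name at_open_brace is hypothetical and the real fix is specific to the code it touches; the point is only the order of the two tests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether the byte at s is an opening
 * brace. The bounds test comes first, so *s is never evaluated once s
 * has reached send (one past the last valid byte). */
static bool at_open_brace(const char *s, const char *send)
{
    assert(s <= send);
    return s < send && *s == '{';
}
```

Thanks to short-circuit evaluation of `&&`, the dereference happens only when the pointer is known to be inside the buffer, so the code no longer depends on a trailing NUL being present.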
|
|
|
|
|
| |
This code read a byte that was potentially out-of-bounds. I don't know
how it could get this far, but maybe some fuzzing code could get it.
|
|
|
|
|
|
|
|
| |
An empty \x{} is unfortunately legal (returning a NUL) except in the
scope of "use re 'strict'". Since this is an experimental feature,
things like wording changes are allowed. It is unlikely anyone is
relying on the precise wording of this fatal error under 'strict', and
now all the messages for similar errors are similarly worded.
|
|
|
|
|
|
|
|
|
|
|
| |
An empty \o{} no longer says "Number with no digits" in favor of "Empty
\o{}" which is more consistent with errors raised for things like \b{},
\P{}.
There is a small risk of breakage with this change, as with any
diagnostic wording change. However, this construct is relatively new
and rarely used, and this is a fatal error, not a warning you might want
to trap on. There are no empty \o{} instances in CPAN.
|
|
|
|
|
| |
Otherwise malformed input could cause this to return a pointer outside
its buffer.
|
|
|
|
|
|
|
| |
This allows \x and \o to work properly in the face of embedded NULs.
A limit parameter is added to each function, and that is passed to
memchr (which replaces strchr). See the branch merge message for more
information.
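The difference between strchr and a length-limited memchr can be seen in a small sketch. The helper name find_close_brace is hypothetical; only the standard memchr call is taken as given.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the '}' that closes a "\x{...}" or
 * "\o{...}" construct, looking only at the bytes in [s, send).
 * Unlike strchr, memchr does not stop at an embedded NUL, and it never
 * looks past the limit it is given. Returns NULL when no '}' is found. */
static const char *find_close_brace(const char *s, const char *send)
{
    assert(s <= send);
    return (const char *)memchr(s, '}', (size_t)(send - s));
}
```

For the five bytes `a`, NUL, `b`, `}`, `c`, strchr gives up at the NUL and reports no brace, while the limited search finds it at offset 3.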
|
|
|
|
|
|
| |
assert() already does nothing unless -DDEBUGGING; no need to enclose
them in #ifdef DEBUGGING. And this adds another assertion that is
required to be true on entry to the function.
|
|
|
|
| |
This reverts commit bfdc8cd3d5a81ab176f7d530d2e692897463c97d.
|
|
|
|
|
|
|
|
|
|
|
| |
Starting in 5.14, we deprecated the use of "\cX" when this
results in a printable character. For instance, "\c:" is just
a fancy way of writing "z". Starting in 5.28, this will be a
fatal error.
This also includes certain usage in regular expressions with the
experimental (?[ ]) construct, or when "use re 'strict'" is in
effect (also experimental).
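To see why "\c:" comes out as "z", here is a small sketch of the mapping on ASCII platforms, where the control form of a character is its uppercased value with bit 0x40 flipped. The function names ctrl_of and ctrl_is_printable are hypothetical, and EBCDIC platforms are not covered.

```c
#include <assert.h>
#include <ctype.h>

/* ASCII-only sketch: the character named by "\cX" is X, uppercased,
 * with bit 0x40 flipped. */
static int ctrl_of(int c)
{
    assert(c >= 0 && c < 128);
    return toupper(c) ^ 64;
}

/* Nonzero when "\cX" would yield a printable character, which is the
 * usage this commit message says was deprecated. */
static int ctrl_is_printable(int c)
{
    return isprint(ctrl_of(c)) != 0;
}
```

With this mapping ':' (0x3A) becomes 'z' (0x7A), which is printable, while 'A' becomes 0x01, a genuine control character.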
|
|
|
|
|
|
|
|
|
|
| |
grok_bslash_x() is so large that no compiler will inline it. Move it to
dquote.c from dq_inline.c. Conversely, move form_octal_warning() to
dq_inline.c. It is so tiny that the function call overhead is scarcely
smaller than the function body.
This also moves things in embed.fnc so all these functions are not
visible outside the few files they are supposed to be used in.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code in toke.c assumes that the UTF8 expansion of the string
"\x{foo}" takes no more bytes than the original input text, which
includes the 4 bytes of overhead "\x{}". Similarly for "\o{}". The
functions that convert to the code point actually now assert for this.
The next commit will make this assumption definitely invalid on EBCDIC
platforms. Remove the assertions, and actually handle the case
properly. The other places that call the conversion functions do not
make this assumption, so there is no harm in removing them from there.
Since we believe that this can't happen except on EBCDIC, we
could #ifdef this code and use just an assert on non-EBCDIC. But it's
easier to maintain if #ifdef's are minimized. Parsing is not a
time-critical operation, like being in an inner loop, and the extra test
gives a branch prediction hint to the compiler.
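The size assumption discussed above can be made concrete with a sketch that compares the encoded length of a code point against the length of the source text that named it. This assumes standard UTF-8 for code points up to 0x10FFFF on an ASCII platform; Perl's extended UTF-8 and UTF-EBCDIC use different lengths, and the names utf8_len and fits_in_place are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes standard UTF-8 needs for a code point. */
static size_t utf8_len(unsigned long cp)
{
    assert(cp <= 0x10FFFFUL);
    if (cp < 0x80UL)    return 1;
    if (cp < 0x800UL)   return 2;
    if (cp < 0x10000UL) return 3;
    return 4;
}

/* Nonzero when the encoded character fits in the space occupied by its
 * source text (for example, the 6 bytes of "\x{FF}"). When this is
 * false, the output buffer would have to grow before the bytes are
 * written, rather than the case being ruled out by an assertion. */
static int fits_in_place(unsigned long cp, size_t src_len)
{
    return utf8_len(cp) <= src_len;
}
```

Under these assumptions the encoded form always fits: even "\x{10FFFF}" is 10 bytes of source for 4 bytes of output. The explicit test matters on platforms or encodings where that no longer holds.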
|
|
|
|
|
|
|
|
|
|
| |
UNI_SKIP is somewhat ambiguous. Perl has long used 'uvchr' as part of a
name to mean the unsigned values using the native character set plus
Unicode values for those above 255.
This also changes two calls (one in dquote_static.c and one in
dquote_inline.h) to use UVCHR_SKIP; they should not have been OFFUNI, as
they are dealing with native values.
|
|
Instead of #include-ing the C file, compile it normally.
|