| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8().
* encoding.c:
(UTF16LEToUTF8):
- Fix comment to describe what the code does.
(UTF16BEToUTF8):
- Fix undefined behavior which was applied to UTF16LEToUTF8() in
2f9382033e.
- Add bounds check to while() loop which was applied to
UTF16LEToUTF8() in be803967db.
- Do not return -2 when (in >= inend) to fix the bug. This was
applied to UTF16LEToUTF8() in 496a1cf592.
- Inline (<< 8) statements to match UTF16LEToUTF8().
Add the following tests and results:
test/text-4-byte-UTF-16-BE-offset.xml
test/text-4-byte-UTF-16-BE.xml
test/text-4-byte-UTF-16-LE-offset.xml
test/text-4-byte-UTF-16-LE.xml
|
|
|
|
|
|
|
| |
Fix regression introduced when reworking htmlParsePubidLiteral in
commit 93ce33c2.
Fixes #318.
|
|
|
|
|
|
|
|
| |
Readd the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was
removed in commit 62150ed2.
This commit also introduced a regression for direct users of
xmlParseContent. Unclosed tags weren't checked.
|
|
|
|
|
|
|
|
|
| |
Commit 62150ed2 introduced a small regression in the error messages for
mismatched tags. This typically only affected messages after the first
mismatch, but with custom SAX handlers all line numbers would be off.
This also fixes line numbers in the SAX push parser which were never
handled correctly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement section "4.6 Predefined Entities" of the XML 1.0 spec and
check whether redeclarations of predefined entities match the original
definitions.
Note that some test cases declared
<!ENTITY lt "<">
But the XML spec clearly states that this is illegal:
> If the entities lt or amp are declared, they MUST be declared as
> internal entities whose replacement text is a character reference to
> the respective character (less-than sign or ampersand) being escaped;
> the double escaping is REQUIRED for these entities so that references
> to them produce a well-formed result.
Also fixes #217 but the connection is only tangential. The integer
overflow discovered by fuzzing was more related to the fact that various
parts of the parser disagreed on whether to prefer predefined entities
over their redeclarations. The whole situation is a mess and even
depends on legacy parser options. But now that redeclarations are
validated, it shouldn't make a difference.
As noted in the added comment, this is also one of the cases where
overly defensive checks can hide interesting logic bugs from fuzzers.
|
|
|
|
|
|
|
| |
Abort parsing early to avoid an almost infinite loop in certain error
cases involving recursive entities.
Found with libFuzzer.
|
|
|
|
|
|
|
|
|
| |
Note that the caret in error messages generated during comment parsing
may have moved by one byte.
See guidance provided on incorrectly-closed comments here:
https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-closed-comment
|
|
|
|
|
|
| |
See guidance provided on incorrectly-closed comments here:
https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-closed-comment
|
|
|
|
|
| |
this establishes the baseline behavior so that subsequent commits
which modify this behavior are clear about what's being changed.
|
|
|
|
|
|
| |
The code wasn't dead after all, but I can see no reason in delaying
the XPointer evaluation. This could lead to nodes included earlier
appearing in XPointer results.
|
|
|
|
|
| |
xi:fallback could become empty after recursive expansion. Use a flag
to track whether nodes should be skipped.
|
|
|
|
|
|
|
|
|
| |
When creating XML_XINCLUDE_START nodes, the children of the original
xi:include node must be freed, otherwise fallback content is copied
twice, doubling runtime and memory consumption for each nested
xi:fallback/xi:include pair.
Found with libFuzzer.
|
|
|
|
|
|
|
| |
Otherwise, nested xi:include nodes might result in a use-after-free
if XML_PARSE_NOXINCNODE is specified.
Found with libFuzzer and ASan.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix a few remaining cases where the HTML push parser would scan more
content during lookahead than being parsed later.
Make sure that htmlParseDocTypeDecl consumes all content up to the
final '>' in case of errors. The old comment said "We shouldn't try to
resynchronize", but ignoring invalid content is also what the HTML5
spec mandates.
Likewise, make htmlParseEndTag skip to the final '>' in invalid end
tags even if not in recovery mode. This is probably the most visible
change in practice and leads to different output for some tests but is
also more in line with HTML5.
Make sure that htmlParsePI and htmlParseComment don't abort if invalid
characters are encountered but log an error and ignore the character.
Change some other end-of-buffer checks to test for a zero byte instead
of relying on IS_CHAR.
Fix usage of IS_CHAR macro in htmlParseScript.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Bug 757711: heap-buffer-overflow in xmlFAParsePosCharGroup
<https://bugzilla.gnome.org/show_bug.cgi?id=757711>
- Bug 783015 - Integer-overflow in xmlFAParseQuantExact
<https://bugzilla.gnome.org/show_bug.cgi?id=783015>
(Regexptests): Add support for checking stderr output when
running regexp tests. This makes it possible to check in test
cases that fail and not see false-positive error output when
running the tests. Unlike other libxml2 test suites, if there
is no stderr output, no *.err file needs to be created.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit eeb99329 removed an important optimization avoiding quadratic
runtime when repeatedly scanning the input buffer for terminating
characters in the HTML push parser. The related bug is
https://bugzilla.gnome.org/show_bug.cgi?id=444994
Make sure that ctxt->checkIndex is always written and store additional
parser state in ctxt->inSubset which is unused in the HTML parser.
Found by OSS-Fuzz.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Despite the comment, I can't see a reason why external entities must be
loaded in the SAX handler. For external entities, the handler is
typically first invoked via xmlParseReference which will later load the
entity on its own if it wasn't loaded yet.
The old code also lead to duplicated SAX events which makes it
basically impossible to reuse xmlSAX2GetEntity for a custom SAX parser.
See the change to the expected test output.
Note that xmlSAX2GetEntity was loading the entity via
xmlParseCtxtExternalEntity while xmlParseReference uses
xmlParseExternalEntityPrivate. In the previous commit, the two
functions were merged, trying to compensate for some slight differences
between the two mostly identical implementations.
But the more urgent reason for this change is that xmlParseReference
has the facility to abort early when recursive entities are detected,
avoiding what could practically amount to an infinite loop.
If you want to backport this change, note that the previous three
commits are required as well:
f9ea1a24 Fix copying of entities in xmlParseReference
5c7e0a9a Copy some XMLReader option flags to parser context
1a3e584a Merge code paths loading external entities
Found by OSS-Fuzz.
|
|
|
|
|
|
|
|
|
| |
Before, reader mode would end up in a branch that didn't handle
entities with multiple children and failed to update ent->last, so the
hack copying the "extra" reader data wouldn't trigger. Consequently,
some empty nodes in entities are correctly detected now in the test
suite. (The detection of empty nodes in entities is still buggy,
though.)
|
|
|
|
| |
Closes #109.
|
|
|
|
|
| |
Conditional sections are only allowed in *external* parameter entities
referenced from the internal subset.
|
|
|
|
|
|
| |
Avoid call stack overflow in deeply nested conditional sections.
Found by OSS-Fuzz.
|
|
|
|
|
|
| |
- One of the bug316338 test cases is expected to succeed.
- Memory leak in testRegexp.c.
- Refcount handling in xmlExpHashGetEntry.
|
|
|
|
|
|
|
| |
Fixes bug 649244:
https://bugzilla.gnome.org/show_bug.cgi?id=649244
Closes #57.
|
|
|
|
|
|
|
| |
Split xmlParseElement into subfunctions. Use nameNsPush to store prefix,
URI and nsNr on the heap, similar to the push parser.
Closes #84.
|
| |
|
|
|
|
|
|
|
|
| |
libxml2 correctly rejects any4_0.xsd as invalid schema. I can't figure
out what the intent behind this test case was. Simply adjust the
expected output to match the current behavior.
Closes #92.
|
|
|
|
|
|
|
| |
Non-compound (##local) and compound string atoms are always disjoint
regardless of whether the compound atom is negated (##other).
Closes #40.
|
|
|
|
| |
Closes #53.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, test/relaxng/ambig_name-class2.xml would fail to validate
against test/relaxng/ambig_name-class2.rng:
> test/relaxng/ambig_name-class2.rng:4:
> element attribute: Relax-NG parser error :
> Found anyName attribute without oneOrMore ancestor
> Relax-NG schema test/relaxng/ambig_name-class2.rng failed to compile
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Previously, test/relaxng/ambig_name-class.xml would fail to validate
for a simple reason -- interleave within "open-name-class" context
is supposed to be fine with whatever else is pending the consumption,
since effectively, it's unrelated from a higher parsing perspective.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
|
|
|
|
|
| |
It's defined behavior but -fsanitize=unsigned-integer-overflow is
useful to discover bugs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consolidate code paths evaluating XPath predicates and filters.
Don't push context node on stack when evaluating predicates. I have no
idea why this was done. It seems completely useless and trying to pop
the context node from a corrupted stack has already caused security
issues.
Filter nodesets in-place and don't create node sets with NULL gaps which
allows to simplify merging a great deal. Simply move matched nodes
backward and create a compact node set.
Merge xmlXPathCompOpEvalPositionalPredicate into
xmlXPathCompOpEvalPredicate.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rewrite conversion of double to int in xmlXPathSubstringFunction, adding
range checks to avoid undefined behavior. Make sure to add start and
length as floating-point numbers before converting to int. Fix a bug
when rounding negative start indices.
Remove unneeded calls to xmlXPathIs{Inf,NaN} and rely on IEEE math
instead. Avoid computing the string length. xmlUTF8Strsub works as
expected if the length of the requested substring exceeds the input.
Found with libFuzzer and UBSan.
|
|
|
|
|
|
| |
When including a grammar from another grammar, we need to make sure that any
redefines of starts and includes that that grammar does inside any of its
include elements are also removed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pattern nameClass allows for nested choice elements, for example
<name>
<choice>
<choice>
<name>a</name>
<name>b</name>
</choice>
<name>c</name>
</choice>
</name>
which is semantically equivalent to
<name>
<choice>
<name>a</name>
<name>b</name>
<name>c</name>
</choice>
</name>
The old code didn’t handle this correctly, as it never expected a choice inside
another choice. This patch fixes this by flattening any nested choices.
This pattern of nested choice elements comes up in RELAX NG simplification,
where all choice elements are rewritten in this nested manner, see section 4.12
of the RELAX NG specification.
|
|
|
|
|
|
| |
RELAX NG allows for div elements inside of include elements. We need to look
inside those div elements for start and define elements that may be redefining
start and define elements in the included grammar.
|
|
|
|
|
|
|
|
| |
This avoids miscalculation of available bytes.
Thanks to Yunho Kim for the report.
Closes: #26
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix two bugs in xmlXPathNodeValHash which could lead to errors when
comparing nodesets to strings:
- Only use contents of text nodes to compute the hash for element nodes.
Comments, PIs, and other node types don't affect the string-value and
must be ignored.
- Reset `string` to NULL for node types other than text.
Reported by Aleksei on the mailing list:
https://mail.gnome.org/archives/xml/2017-September/msg00016.html
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 79c8a6b which caused a serious regression in
streaming mode.
Also reverts part of commit 52ceced "Fix infinite loops with push
parser in recovery mode".
Fixes bug 786554.
|
|
|
|
|
|
|
|
|
| |
When expanding a parameter entity in a DTD, infinite recursion could
lead to an infinite loop or memory exhaustion.
Thanks to Wei Lei for the first of many reports.
Fixes bug 759579.
|
|
|
|
|
|
| |
Now that replacement of parameter entities goes exclusively through
xmlSkipBlankChars, we can account for the surrounding space characters
there and remove the "blanks wrapper" hack.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pop all extra input streams before resetting the input. Otherwise,
a call to xmlPopInput could make input available again.
Also set input->end to input->cur.
Changes the test output for some error tests. Unfortunately, some
fuzzed test cases were added to the test suite without manual cleanup.
This makes it almost impossible to review the impact of later changes
on the test output.
|
|
|
|
|
| |
Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other
misspellings.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure to finish all entities in the internal subset. Nevertheless,
readd a sanity check in xmlParseStartTag2 that was lost in my previous
commit. Also add a sanity check in xmlPopInput. Popping an input
unexpectedly was the source of many recent memory bugs. The check
doesn't mitigate such issues but helps with diagnosis.
Always base entity boundary checks on the input ID, not the input
pointer. The pointer could have been reallocated to the old address.
Always throw a well-formedness error if a boundary check fails. In a
few places, a validity error was thrown.
Fix a few error codes and improve indentation.
|
|
|
|
| |
This detects regressions like bug 760367.
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Silence test output.
- Clean up after doc/examples tests.
- Adjust expected output for script tests.
- Add missing results for relaxng/pattern3
There are still two test failures I can't comment on:
- regexp/bug316338
- schemas/any4_0
|
|
|
|
|
| |
This caused failures in the HTML push tests but the fix required to
change the expected output of the HTML SAX tests.
|