author: Nick Wellnhofer <wellnhofer@aevum.de>, 2023-03-31 16:47:48 +0200
committer: Nick Wellnhofer <wellnhofer@aevum.de>, 2023-03-31 17:08:43 +0200
commit: d7d0bc6581e332f49c9ff628f548eced03c65189 (patch)
tree: 5f7e0bcd90a4ee375d29caa946c37eb519cf4691 /result
parent: 0e42adce77a9c115402d7f24d8d3b0130f841ed1 (diff)
download: libxml2-d7d0bc6581e332f49c9ff628f548eced03c65189.tar.gz
SAX2: Ignore namespaces in HTML documents
In commit 21ca8829, we started to ignore namespaces in HTML element
names but we still called xmlSplitQName, effectively stripping the
namespace prefix. This caused elements like <o:p> to be parsed
as <p>. Now we leave the name untouched.
Fixes #508.
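
The effect of stripping the prefix can be illustrated with a small standalone sketch. This is not libxml2 code: `reported_name` and its `strip_prefix` flag are hypothetical names introduced only for illustration, and the splitting rule is a simplified assumption about what a QName split does (drop everything up to and including the first colon).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxml2: returns the element name
 * that a SAX-style handler would report for a raw tag name.
 *
 * With strip_prefix set, everything up to and including the first ':'
 * is dropped, which mimics the old behaviour from the commit message.
 * Without it, the name is returned untouched, which mimics the new
 * behaviour for HTML documents.
 */
static const char *
reported_name(const char *qname, bool strip_prefix)
{
    assert(qname != NULL);

    if (strip_prefix) {
        const char *colon = strchr(qname, ':');

        /* Split only when both prefix and local part are non-empty. */
        if (colon != NULL && colon != qname && colon[1] != '\0')
            return colon + 1;
    }

    return qname;
}
```

Under these assumptions, `reported_name("o:p", true)` yields `p`, while `reported_name("o:p", false)` keeps `o:p`, which matches the `SAX.startElement(o:p)` event in the expected test output.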
Diffstat (limited to 'result')
-rw-r--r-- | result/HTML/names.html | 6 |
-rw-r--r-- | result/HTML/names.html.err | 3 |
-rw-r--r-- | result/HTML/names.html.sax | 20 |
3 files changed, 29 insertions, 0 deletions
diff --git a/result/HTML/names.html b/result/HTML/names.html
new file mode 100644
index 00000000..dd7dcc2e
--- /dev/null
+++ b/result/HTML/names.html
@@ -0,0 +1,6 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
+<html>
+<body>
+  <o:p></o:p>
+</body>
+</html>
diff --git a/result/HTML/names.html.err b/result/HTML/names.html.err
new file mode 100644
index 00000000..4d91a5d2
--- /dev/null
+++ b/result/HTML/names.html.err
@@ -0,0 +1,3 @@
+./test/HTML/names.html:3: HTML parser error : Tag o:p invalid
+  <o:p></o:p>
+      ^
diff --git a/result/HTML/names.html.sax b/result/HTML/names.html.sax
new file mode 100644
index 00000000..12a107f8
--- /dev/null
+++ b/result/HTML/names.html.sax
@@ -0,0 +1,20 @@
+SAX.setDocumentLocator()
+SAX.startDocument()
+SAX.startElement(html)
+SAX.characters(
+, 1)
+SAX.startElement(body)
+SAX.characters(
+  , 3)
+SAX.startElement(o:p)
+SAX.error: Tag o:p invalid
+SAX.endElement(o:p)
+SAX.characters(
+, 1)
+SAX.endElement(body)
+SAX.characters(
+, 1)
+SAX.endElement(html)
+SAX.characters(
+, 1)
+SAX.endDocument()