Since we can't currently parse CDATA sections into CData objects, added a test for manually created CData objects, to keep the code from rotting.

author: Leonard Richardson <leonard.richardson@canonical.com> 2011-02-20 10:39:30 -0500
committer: Leonard Richardson <leonard.richardson@canonical.com> 2011-02-20 10:39:30 -0500
commit: 75cc2bc54252cdcdf05c669bb3e6ffaa12eb0b97 (patch)
tree: b9e23800d9f83e594bae121712a6b98208def20d /CHANGELOG
parent: 590ffbfd4f0b1ff656bb45ac6344b0f815ffa149 (diff)
download: beautifulsoup4-75cc2bc54252cdcdf05c669bb3e6ffaa12eb0b97.tar.gz
1 files changed, 19 insertions, 0 deletions
diff --git a/CHANGELOG b/CHANGELOG
index 96a9ed4..3fb4f36 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -74,6 +74,25 @@ Unicode character. There are no longer any smartQuotesTo or
 convert_entities arguments. (Unicode Dammit still has smart_quotes_to,
 but the default is now to turn smart quotes into Unicode.)
 
+== CDATA sections are normal text, if they're understood at all. ==
+
+Currently, both HTML parsers ignore CDATA sections in markup:
+
+ <p><![CDATA[foo]]></p> => <p></p>
+
+A future version of html5lib will turn CDATA sections into text nodes,
+but only within tags like <svg> and <math>:
+
+ <svg><![CDATA[foo]]></svg> => <svg>foo</svg>
+
+The default XML parser (which uses lxml behind the scenes) turns CDATA
+sections into ordinary text nodes:
+
+ <p><![CDATA[foo]]></p> => <p>foo</p>
+
+In theory it's possible to preserve the CDATA sections when using the
+XML parser, but I don't see how to get it to work in practice.
+
 = 3.1.0 =
 
 A hybrid version that supports 2.4 and can be automatically converted
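
The two XML behaviours described in the CHANGELOG entry above (CDATA flattened into ordinary text, versus CDATA preserved as its own node) can be reproduced with Python's standard library alone. This is only a sketch of the general idea: it uses xml.etree.ElementTree and xml.dom.minidom as stand-ins, not lxml or Beautiful Soup, and the helper names flatten_cdata and preserve_cdata are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET
from xml.dom import Node, minidom

MARKUP = "<p><![CDATA[foo]]></p>"


def flatten_cdata(markup):
    """Parse with ElementTree, which merges CDATA content into ordinary text.

    Stand-in for the "CDATA becomes normal text" behaviour; this is NOT
    the lxml-backed parser the CHANGELOG refers to.
    """
    root = ET.fromstring(markup)
    # Serializing shows that the CDATA wrapper is gone.
    return ET.tostring(root, encoding="unicode")


def preserve_cdata(markup):
    """Parse with minidom, which keeps the CDATA section as a distinct node.

    Returns a (is_cdata_node, serialized_markup) pair.
    """
    root = minidom.parseString(markup).documentElement
    is_cdata = root.firstChild.nodeType == Node.CDATA_SECTION_NODE
    return is_cdata, root.toxml()


if __name__ == "__main__":
    print(flatten_cdata(MARKUP))
    print(preserve_cdata(MARKUP))
```

Whether the lxml-based tree builder can be made to behave like the second function is exactly the open question the CHANGELOG entry raises; the sketch only shows that the distinction is visible at the parser level.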