author | Stefan Behnel <stefan_ml@behnel.de> | 2019-08-13 19:49:54 +0200 |
---|---|---|
committer | Stefan Behnel <stefan_ml@behnel.de> | 2019-08-13 19:49:54 +0200 |
commit | 2f64a0c52ff57c6116be436ddf7953895c344399 (patch) | |
tree | 02b6e7f658527d4cf078ccdffb70501ec644c679 | |
parent | 1781e48f8e51bb3eba8e31c3d7fbc47b4acfae26 (diff) | |
download | python-lxml-2f64a0c52ff57c6116be436ddf7953895c344399.tar.gz |
Clarify the usage of "element.clear(keep_tail=True)" in some examples.
-rw-r--r-- | CHANGES.txt | 6 |
-rw-r--r-- | doc/parsing.txt | 6 |
-rw-r--r-- | doc/tutorial.txt | 9 |
3 files changed, 12 insertions, 9 deletions
diff --git a/CHANGES.txt b/CHANGES.txt
index dc9f33ad..f157b6ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -20,9 +20,9 @@ Bugs fixed
 Features added
 --------------
 
-* ``Element.clear()`` accepts a new keyword argument ``keep_tail=True`` to
-  clear everything but the tail text. This is helpful in some document-style
-  use cases.
+* ``Element.clear()`` accepts a new keyword argument ``keep_tail=True`` to clear
+  everything but the tail text. This is helpful in some document-style use cases
+  and for clearing the current element in ``iterparse()`` and pull parsing.
 
 * When creating attributes or namespaces from a dict in Python 3.6+, lxml now
   preserves the original insertion order of that dict, instead of always sorting
diff --git a/doc/parsing.txt b/doc/parsing.txt
index a9664d67..a271dc03 100644
--- a/doc/parsing.txt
+++ b/doc/parsing.txt
@@ -654,14 +654,14 @@ that are no longer needed:
     >>> parser.feed('<element><child /></element>')
     >>> for action, elem in events:
     ...     print('%s: %d' % (elem.tag, len(elem)))  # processing
-    ...     elem.clear()                             # delete children
+    ...     elem.clear(keep_tail=True)               # delete children
     element: 0
     child: 0
     element: 1
     >>> parser.feed('<empty-element xmlns="http://testns/" /></root>')
     >>> for action, elem in events:
     ...     print('%s: %d' % (elem.tag, len(elem)))  # processing
-    ...     elem.clear()                             # delete children
+    ...     elem.clear(keep_tail=True)               # delete children
     {http://testns/}empty-element: 0
     root: 3
 
@@ -688,7 +688,7 @@ of the current element:
 
     >>> for event, element in parser.read_events():
     ...     # ... do something with the element
-    ...     element.clear()                 # clean up children
+    ...     element.clear(keep_tail=True)   # clean up children
     ...     while element.getprevious() is not None:
     ...         del element.getparent()[0]  # clean up preceding siblings
 
diff --git a/doc/tutorial.txt b/doc/tutorial.txt
index 18c4e97c..b98d3b4f 100644
--- a/doc/tutorial.txt
+++ b/doc/tutorial.txt
@@ -1004,7 +1004,10 @@ that the Element has been parsed completely.
 It also allows you to ``.clear()`` or modify the content of an Element to
 save memory. So if you parse a large tree and you want to keep memory
 usage small, you should clean up parts of the tree that you no longer
-need:
+need. The ``keep_tail=True`` argument to ``.clear()`` makes sure that
+(tail) text content that follows the current element will not be touched.
+It is highly discouraged to modify any content that the parser may not
+have completely read through yet.
 
 .. sourcecode:: pycon
 
@@ -1016,7 +1019,7 @@ need:
     ...         print(element.text)
     ...     elif element.tag == 'a':
     ...         print("** cleaning up the subtree")
-    ...         element.clear()
+    ...         element.clear(keep_tail=True)
     data
     ** cleaning up the subtree
     None
@@ -1041,7 +1044,7 @@ for data extraction.
 
     >>> for _, element in etree.iterparse(xml_file, tag='a'):
     ...     print('%s -- %s' % (element.findtext('b'), element[1].text))
-    ...     element.clear()
+    ...     element.clear(keep_tail=True)
     ABC -- abc
     MORE DATA -- more data
     XYZ -- xyz
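
To illustrate what the changed examples protect against, here is a minimal sketch that is not part of the commit. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml (an assumption, so that it runs without lxml installed), and `clear_keep_tail` is a hypothetical helper that only imitates the effect of lxml's `element.clear(keep_tail=True)`.

```python
import xml.etree.ElementTree as ET


def clear_keep_tail(elem):
    """Hypothetical helper: imitate lxml's elem.clear(keep_tail=True)."""
    tail = elem.tail   # text that follows the element's end tag
    elem.clear()       # drops children, attributes, text and tail
    elem.tail = tail   # put the following text back


root = ET.fromstring('<p>Hello <b>big</b> world, <i>again</i> and bye</p>')
bold, italic = root[0], root[1]

bold.clear()             # plain clear(): the tail " world, " is lost
clear_keep_tail(italic)  # the tail " and bye" survives

print(repr(bold.tail))    # tail after a plain clear()
print(repr(italic.tail))  # tail after the keep-tail variant
```

With lxml itself no helper is needed: the diff above simply passes `keep_tail=True` to `.clear()`, so that text following the current element stays in the document.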