author Leonard Richardson <leonardr@segfault.org> 2018-12-24 09:54:10 -0500
committer Leonard Richardson <leonardr@segfault.org> 2018-12-24 09:54:10 -0500
commit a36e7ac2a24bf8aa91b51da3c82ad11368adb146 (patch)
tree dd61de49cd0af70491b77f4c1b771b5caa0b2bb0 /doc/source
parent b3aa1fe88487ea8fbd4533d410d2fa26962ed608 (diff)
Keep track of the namespace abbreviations found while parsing the document. This makes select() work most of the time without requiring a value for 'namespaces'.
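
The idea in this commit message can be illustrated with a minimal sketch. It is not Beautiful Soup's implementation: it uses the standard library's ElementTree instead, and the helper names parse_with_namespaces and select_child are hypothetical. It records each namespace abbreviation the parser reports, then uses that mapping by default when resolving a "prefix|name" selector, while still accepting a caller-supplied mapping as an override.

```python
import io
import xml.etree.ElementTree as ET

XML = """<tag xmlns:ns1="http://namespace1/" xmlns:ns2="http://namespace2/">
 <ns1:child>I'm in namespace 1</ns1:child>
 <ns2:child>I'm in namespace 2</ns2:child>
</tag>"""


def parse_with_namespaces(xml_text):
    """Parse xml_text and return (root, {prefix: uri}) as seen while parsing.

    Hypothetical helper for illustration only.
    """
    namespaces = {}
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start", "start-ns")):
        if event == "start-ns":
            prefix, uri = payload
            # Keep the first URI seen for each abbreviation.
            namespaces.setdefault(prefix, uri)
        elif root is None:
            # The first "start" event belongs to the document root.
            root = payload
    return root, namespaces


def select_child(root, selector, namespaces):
    """Resolve a 'prefix|name' selector against the direct children of root.

    Hypothetical helper; handles only this one selector form.
    """
    prefix, _, name = selector.partition("|")
    return root.findall(f"{prefix}:{name}", namespaces)


root, found = parse_with_namespaces(XML)

# The abbreviations recorded during parsing are enough to resolve the selector.
print([el.text for el in select_child(root, "ns1|child", found)])

# A caller-supplied mapping can use different abbreviations for the same URIs.
custom = {"first": "http://namespace1/", "second": "http://namespace2/"}
print([el.text for el in select_child(root, "second|child", custom)])
```

The design choice mirrored here is that the recorded mapping is only a default: a selector is resolved against whichever mapping the caller passes, so custom abbreviations work as long as they point at the same namespace URIs.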
Diffstat (limited to 'doc/source')
-rw-r--r-- doc/source/index.rst 23
1 files changed, 15 insertions, 8 deletions
diff --git a/doc/source/index.rst b/doc/source/index.rst
index 2977029..9bf9cf1 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -1781,8 +1781,7 @@ first tag that matches a selector::
# <a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>
If you've parsed XML that defines namespaces, you can use them in CSS
-selectors. You just have to pass a dictionary of the namespace
-mappings into ``select()``::
+selectors::
from bs4 import BeautifulSoup
xml = """<tag xmlns:ns1="http://namespace1/" xmlns:ns2="http://namespace2/">
@@ -1794,15 +1793,23 @@ mappings into ``select()``::
soup.select("child")
# [<ns1:child>I'm in namespace 1</ns1:child>, <ns2:child>I'm in namespace 2</ns2:child>]
- namespaces = dict(ns1="http://namespace1/", ns2="http://namespace2/")
soup.select("ns1|child", namespaces=namespaces)
# [<ns1:child>I'm in namespace 1</ns1:child>]
-All of this is a convenience for people who know the CSS selector
-syntax. You can do all this stuff with the Beautiful Soup API. And if
-CSS selectors are all you need, you should parse the document
-with lxml: it's a lot faster. But this lets you `combine` CSS
-selectors with the Beautiful Soup API.
+When handling a CSS selector that uses namespaces, Beautiful Soup
+uses the namespace abbreviations it found when parsing the
+document. You can override this by passing in your own dictionary of
+abbreviations::
+
+ namespaces = dict(first="http://namespace1/", second="http://namespace2/")
+ soup.select("second|child", namespaces=namespaces)
+ # [<ns2:child>I'm in namespace 2</ns2:child>]
+
+All this CSS selector stuff is a convenience for people who already
+know the CSS selector syntax. You can do all of this with the
+Beautiful Soup API. And if CSS selectors are all you need, you should
+parse the document with lxml: it's a lot faster. But this lets you
+`combine` CSS selectors with the Beautiful Soup API.
Modifying the tree
==================