summaryrefslogtreecommitdiff
path: root/Doc/library/email.parser.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Doc/library/email.parser.rst')
-rw-r--r--Doc/library/email.parser.rst133
1 files changed, 85 insertions, 48 deletions
diff --git a/Doc/library/email.parser.rst b/Doc/library/email.parser.rst
index 49a59c0100..ee6af3fb39 100644
--- a/Doc/library/email.parser.rst
+++ b/Doc/library/email.parser.rst
@@ -7,7 +7,8 @@
Message object structures can be created in one of two ways: they can be created
from whole cloth by instantiating :class:`~email.message.Message` objects and
-stringing them together via :meth:`attach` and :meth:`set_payload` calls, or they
+stringing them together via :meth:`~email.message.Message.attach` and
+:meth:`~email.message.Message.set_payload` calls, or they
can be created by parsing a flat text representation of the email message.
The :mod:`email` package provides a standard parser that understands most email
@@ -16,8 +17,9 @@ or a file object, and the parser will return to you the root
:class:`~email.message.Message` instance of the object structure. For simple,
non-MIME messages the payload of this root object will likely be a string
containing the text of the message. For MIME messages, the root object will
-return ``True`` from its :meth:`is_multipart` method, and the subparts can be
-accessed via the :meth:`get_payload` and :meth:`walk` methods.
+return ``True`` from its :meth:`~email.message.Message.is_multipart` method, and
+the subparts can be accessed via the :meth:`~email.message.Message.get_payload`
+and :meth:`~email.message.Message.walk` methods.
There are actually two parser interfaces available for use, the classic
:class:`Parser` API and the incremental :class:`FeedParser` API. The classic
@@ -58,12 +60,18 @@ list of defects that it can find.
Here is the API for the :class:`FeedParser`:
-.. class:: FeedParser(_factory=email.message.Message)
+.. class:: FeedParser(_factory=email.message.Message, *, policy=policy.default)
Create a :class:`FeedParser` instance. Optional *_factory* is a no-argument
callable that will be called whenever a new message object is needed. It
defaults to the :class:`email.message.Message` class.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the parser's operation. The default policy maintains
+ backward compatibility.
+
+ .. versionchanged:: 3.3 Added the *policy* keyword.
+
.. method:: feed(data)
Feed the :class:`FeedParser` some more data. *data* should be a string
@@ -94,15 +102,18 @@ Parser class API
The :class:`Parser` class, imported from the :mod:`email.parser` module,
provides an API that can be used to parse a message when the complete contents
of the message are available in a string or file. The :mod:`email.parser`
-module also provides a second class, called :class:`HeaderParser` which can be
-used if you're only interested in the headers of the message.
-:class:`HeaderParser` can be much faster in these situations, since it does not
-attempt to parse the message body, instead setting the payload to the raw body
-as a string. :class:`HeaderParser` has the same API as the :class:`Parser`
-class.
+module also provides header-only parsers, called :class:`HeaderParser` and
+:class:`BytesHeaderParser`, which can be used if you're only interested in the
+headers of the message. :class:`HeaderParser` and :class:`BytesHeaderParser`
+can be much faster in these situations, since they do not attempt to parse the
+message body, instead setting the payload to the raw body as a string. They
+have the same API as the :class:`Parser` and :class:`BytesParser` classes.
+.. versionadded:: 3.3
+ The BytesHeaderParser class.
-.. class:: Parser(_class=email.message.Message, strict=None)
+
+.. class:: Parser(_class=email.message.Message, *, policy=policy.default)
The constructor for the :class:`Parser` class takes an optional argument
*_class*. This must be a callable factory (such as a function or a class), and
@@ -110,13 +121,13 @@ class.
:class:`~email.message.Message` (see :mod:`email.message`). The factory will
be called without arguments.
- The optional *strict* flag is ignored.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the parser's operation. The default policy maintains
+ backward compatibility.
- .. deprecated:: 2.4
- Because the :class:`Parser` class is a backward compatible API wrapper
- around the new-in-Python 2.4 :class:`FeedParser`, *all* parsing is
- effectively non-strict. You should simply stop passing a *strict* flag to
- the :class:`Parser` constructor.
+ .. versionchanged:: 3.3
+ Removed the *strict* argument that was deprecated in 2.4. Added the
+ *policy* keyword.
The other public :class:`Parser` methods are:
@@ -125,7 +136,8 @@ class.
Read all the data from the file-like object *fp*, parse the resulting
text, and return the root message object. *fp* must support both the
- :meth:`readline` and the :meth:`read` methods on file-like objects.
+ :meth:`~io.TextIOBase.readline` and the :meth:`~io.TextIOBase.read`
+ methods on file-like objects.
The text contained in *fp* must be formatted as a block of :rfc:`2822`
style headers and header continuation lines, optionally preceded by a
@@ -147,19 +159,25 @@ class.
Optional *headersonly* is as with the :meth:`parse` method.
-.. class:: BytesParser(_class=email.message.Message, strict=None)
+.. class:: BytesParser(_class=email.message.Message, *, policy=policy.default)
This class is exactly parallel to :class:`Parser`, but handles bytes input.
The *_class* and *strict* arguments are interpreted in the same way as for
- the :class:`Parser` constructor. *strict* is supported only to make porting
- code easier; it is deprecated.
+ the :class:`Parser` constructor.
+
+ The *policy* keyword specifies a :mod:`~email.policy` object that
+ controls a number of aspects of the parser's operation. The default
+ policy maintains backward compatibility.
+
+ .. versionchanged:: 3.3
+ Removed the *strict* argument. Added the *policy* keyword.
.. method:: parse(fp, headeronly=False)
Read all the data from the binary file-like object *fp*, parse the
resulting bytes, and return the message object. *fp* must support
- both the :meth:`readline` and the :meth:`read` methods on file-like
- objects.
+ both the :meth:`~io.IOBase.readline` and the :meth:`~io.IOBase.read`
+ methods on file-like objects.
The bytes contained in *fp* must be formatted as a block of :rfc:`2822`
style headers and header continuation lines, optionally preceded by a
@@ -190,39 +208,55 @@ in the top-level :mod:`email` package namespace.
.. currentmodule:: email
-.. function:: message_from_string(s, _class=email.message.Message, strict=None)
+.. function:: message_from_string(s, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure from a string. This is exactly equivalent to
- ``Parser().parsestr(s)``. Optional *_class* and *strict* are interpreted as
- with the :class:`Parser` class constructor.
+ ``Parser().parsestr(s)``. *_class* and *policy* are interpreted as
+ with the :class:`~email.parser.Parser` class constructor.
+
+ .. versionchanged:: 3.3
+ Removed the *strict* argument. Added the *policy* keyword.
-.. function:: message_from_bytes(s, _class=email.message.Message, strict=None)
+.. function:: message_from_bytes(s, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure from a byte string. This is exactly
equivalent to ``BytesParser().parsebytes(s)``. Optional *_class* and
- *strict* are interpreted as with the :class:`Parser` class constructor.
+ *strict* are interpreted as with the :class:`~email.parser.Parser` class
+ constructor.
.. versionadded:: 3.2
+ .. versionchanged:: 3.3
+ Removed the *strict* argument. Added the *policy* keyword.
-.. function:: message_from_file(fp, _class=email.message.Message, strict=None)
+.. function:: message_from_file(fp, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure tree from an open :term:`file object`.
- This is exactly equivalent to ``Parser().parse(fp)``. Optional *_class*
- and *strict* are interpreted as with the :class:`Parser` class constructor.
+ This is exactly equivalent to ``Parser().parse(fp)``. *_class*
+ and *policy* are interpreted as with the :class:`~email.parser.Parser` class
+ constructor.
+
+ .. versionchanged::
+ Removed the *strict* argument. Added the *policy* keyword.
-.. function:: message_from_binary_file(fp, _class=email.message.Message, strict=None)
+.. function:: message_from_binary_file(fp, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure tree from an open binary :term:`file
object`. This is exactly equivalent to ``BytesParser().parse(fp)``.
- Optional *_class* and *strict* are interpreted as with the :class:`Parser`
- class constructor.
+ *_class* and *policy* are interpreted as with the
+ :class:`~email.parser.Parser` class constructor.
.. versionadded:: 3.2
+ .. versionchanged:: 3.3
+ Removed the *strict* argument. Added the *policy* keyword.
Here's an example of how you might use this at an interactive Python prompt::
>>> import email
- >>> msg = email.message_from_string(myString)
+ >>> msg = email.message_from_string(myString) # doctest: +SKIP
Additional notes
@@ -232,32 +266,35 @@ Here are some notes on the parsing semantics:
* Most non-\ :mimetype:`multipart` type messages are parsed as a single message
object with a string payload. These objects will return ``False`` for
- :meth:`is_multipart`. Their :meth:`get_payload` method will return a string
- object.
+ :meth:`~email.message.Message.is_multipart`. Their
+ :meth:`~email.message.Message.get_payload` method will return a string object.
* All :mimetype:`multipart` type messages will be parsed as a container message
object with a list of sub-message objects for their payload. The outer
- container message will return ``True`` for :meth:`is_multipart` and their
- :meth:`get_payload` method will return the list of :class:`~email.message.Message`
- subparts.
+ container message will return ``True`` for
+ :meth:`~email.message.Message.is_multipart` and their
+ :meth:`~email.message.Message.get_payload` method will return the list of
+ :class:`~email.message.Message` subparts.
* Most messages with a content type of :mimetype:`message/\*` (e.g.
:mimetype:`message/delivery-status` and :mimetype:`message/rfc822`) will also be
parsed as container object containing a list payload of length 1. Their
- :meth:`is_multipart` method will return ``True``. The single element in the
- list payload will be a sub-message object.
+ :meth:`~email.message.Message.is_multipart` method will return ``True``.
+ The single element in the list payload will be a sub-message object.
* Some non-standards compliant messages may not be internally consistent about
their :mimetype:`multipart`\ -edness. Such messages may have a
:mailheader:`Content-Type` header of type :mimetype:`multipart`, but their
- :meth:`is_multipart` method may return ``False``. If such messages were parsed
- with the :class:`FeedParser`, they will have an instance of the
- :class:`MultipartInvariantViolationDefect` class in their *defects* attribute
- list. See :mod:`email.errors` for details.
+ :meth:`~email.message.Message.is_multipart` method may return ``False``.
+ If such messages were parsed with the :class:`~email.parser.FeedParser`,
+ they will have an instance of the
+ :class:`~email.errors.MultipartInvariantViolationDefect` class in their
+ *defects* attribute list. See :mod:`email.errors` for details.
.. rubric:: Footnotes
.. [#] As of email package version 3.0, introduced in Python 2.4, the classic
- :class:`Parser` was re-implemented in terms of the :class:`FeedParser`, so the
- semantics and results are identical between the two parsers.
+ :class:`~email.parser.Parser` was re-implemented in terms of the
+ :class:`~email.parser.FeedParser`, so the semantics and results are
+ identical between the two parsers.