Diffstat (limited to 'Doc/library/email.parser.rst')
-rw-r--r-- | Doc/library/email.parser.rst | 133 |
1 files changed, 85 insertions, 48 deletions
diff --git a/Doc/library/email.parser.rst b/Doc/library/email.parser.rst
index 49a59c0100..ee6af3fb39 100644
--- a/Doc/library/email.parser.rst
+++ b/Doc/library/email.parser.rst
@@ -7,7 +7,8 @@

 Message object structures can be created in one of two ways: they can be created
 from whole cloth by instantiating :class:`~email.message.Message` objects and
-stringing them together via :meth:`attach` and :meth:`set_payload` calls, or they
+stringing them together via :meth:`~email.message.Message.attach` and
+:meth:`~email.message.Message.set_payload` calls, or they
 can be created by parsing a flat text representation of the email message.

 The :mod:`email` package provides a standard parser that understands most email
@@ -16,8 +17,9 @@ or a file object, and the parser will return to you the root
 :class:`~email.message.Message` instance of the object structure. For simple,
 non-MIME messages the payload of this root object will likely be a string
 containing the text of the message. For MIME messages, the root object will
-return ``True`` from its :meth:`is_multipart` method, and the subparts can be
-accessed via the :meth:`get_payload` and :meth:`walk` methods.
+return ``True`` from its :meth:`~email.message.Message.is_multipart` method, and
+the subparts can be accessed via the :meth:`~email.message.Message.get_payload`
+and :meth:`~email.message.Message.walk` methods.

 There are actually two parser interfaces available for use, the classic
 :class:`Parser` API and the incremental :class:`FeedParser` API. The classic
@@ -58,12 +60,18 @@ list of defects that it can find.

 Here is the API for the :class:`FeedParser`:

-.. class:: FeedParser(_factory=email.message.Message)
+.. class:: FeedParser(_factory=email.message.Message, *, policy=policy.default)

    Create a :class:`FeedParser` instance. Optional *_factory* is a no-argument
    callable that will be called whenever a new message object is needed. It
    defaults to the :class:`email.message.Message` class.

+   The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+   number of aspects of the parser's operation. The default policy maintains
+   backward compatibility.
+
+   .. versionchanged:: 3.3 Added the *policy* keyword.
+
    .. method:: feed(data)

       Feed the :class:`FeedParser` some more data. *data* should be a string
@@ -94,15 +102,18 @@ Parser class API
 The :class:`Parser` class, imported from the :mod:`email.parser` module,
 provides an API that can be used to parse a message when the complete contents
 of the message are available in a string or file. The :mod:`email.parser`
-module also provides a second class, called :class:`HeaderParser` which can be
-used if you're only interested in the headers of the message.
-:class:`HeaderParser` can be much faster in these situations, since it does not
-attempt to parse the message body, instead setting the payload to the raw body
-as a string. :class:`HeaderParser` has the same API as the :class:`Parser`
-class.
+module also provides header-only parsers, called :class:`HeaderParser` and
+:class:`BytesHeaderParser`, which can be used if you're only interested in the
+headers of the message. :class:`HeaderParser` and :class:`BytesHeaderParser`
+can be much faster in these situations, since they do not attempt to parse the
+message body, instead setting the payload to the raw body as a string. They
+have the same API as the :class:`Parser` and :class:`BytesParser` classes.

+.. versionadded:: 3.3
+   The BytesHeaderParser class.

-.. class:: Parser(_class=email.message.Message, strict=None)
+.. class:: Parser(_class=email.message.Message, *, policy=policy.default)

    The constructor for the :class:`Parser` class takes an optional argument
    *_class*. This must be a callable factory (such as a function or a class), and
@@ -110,13 +121,13 @@ class.
    :class:`~email.message.Message` (see :mod:`email.message`). The factory will
    be called without arguments.

-   The optional *strict* flag is ignored.
+   The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+   number of aspects of the parser's operation. The default policy maintains
+   backward compatibility.

-   .. deprecated:: 2.4
-      Because the :class:`Parser` class is a backward compatible API wrapper
-      around the new-in-Python 2.4 :class:`FeedParser`, *all* parsing is
-      effectively non-strict. You should simply stop passing a *strict* flag to
-      the :class:`Parser` constructor.
+   .. versionchanged:: 3.3
+      Removed the *strict* argument that was deprecated in 2.4. Added the
+      *policy* keyword.

    The other public :class:`Parser` methods are:

@@ -125,7 +136,8 @@ class.

       Read all the data from the file-like object *fp*, parse the resulting
       text, and return the root message object. *fp* must support both the
-      :meth:`readline` and the :meth:`read` methods on file-like objects.
+      :meth:`~io.TextIOBase.readline` and the :meth:`~io.TextIOBase.read`
+      methods on file-like objects.

       The text contained in *fp* must be formatted as a block of :rfc:`2822`
       style headers and header continuation lines, optionally preceded by a
@@ -147,19 +159,25 @@ class.
       Optional *headersonly* is as with the :meth:`parse` method.


-.. class:: BytesParser(_class=email.message.Message, strict=None)
+.. class:: BytesParser(_class=email.message.Message, *, policy=policy.default)

    This class is exactly parallel to :class:`Parser`, but handles bytes input.
    The *_class* and *strict* arguments are interpreted in the same way as for
-   the :class:`Parser` constructor. *strict* is supported only to make porting
-   code easier; it is deprecated.
+   the :class:`Parser` constructor.
+
+   The *policy* keyword specifies a :mod:`~email.policy` object that
+   controls a number of aspects of the parser's operation. The default
+   policy maintains backward compatibility.
+
+   .. versionchanged:: 3.3
+      Removed the *strict* argument. Added the *policy* keyword.

    .. method:: parse(fp, headeronly=False)

       Read all the data from the binary file-like object *fp*, parse the
       resulting bytes, and return the message object. *fp* must support
-      both the :meth:`readline` and the :meth:`read` methods on file-like
-      objects.
+      both the :meth:`~io.IOBase.readline` and the :meth:`~io.IOBase.read`
+      methods on file-like objects.

       The bytes contained in *fp* must be formatted as a block of :rfc:`2822`
       style headers and header continuation lines, optionally preceded by a
@@ -190,39 +208,55 @@ in the top-level :mod:`email` package namespace.

 .. currentmodule:: email

-.. function:: message_from_string(s, _class=email.message.Message, strict=None)
+.. function:: message_from_string(s, _class=email.message.Message, *, \
+                                  policy=policy.default)

    Return a message object structure from a string. This is exactly equivalent to
-   ``Parser().parsestr(s)``. Optional *_class* and *strict* are interpreted as
-   with the :class:`Parser` class constructor.
+   ``Parser().parsestr(s)``. *_class* and *policy* are interpreted as
+   with the :class:`~email.parser.Parser` class constructor.
+
+   .. versionchanged:: 3.3
+      Removed the *strict* argument. Added the *policy* keyword.

-.. function:: message_from_bytes(s, _class=email.message.Message, strict=None)
+.. function:: message_from_bytes(s, _class=email.message.Message, *, \
+                                 policy=policy.default)

    Return a message object structure from a byte string. This is exactly
    equivalent to ``BytesParser().parsebytes(s)``. Optional *_class* and
-   *strict* are interpreted as with the :class:`Parser` class constructor.
+   *strict* are interpreted as with the :class:`~email.parser.Parser` class
+   constructor.

    .. versionadded:: 3.2
+   .. versionchanged:: 3.3
+      Removed the *strict* argument. Added the *policy* keyword.

-.. function:: message_from_file(fp, _class=email.message.Message, strict=None)
+.. function:: message_from_file(fp, _class=email.message.Message, *, \
+                                policy=policy.default)

    Return a message object structure tree from an open :term:`file object`.
-   This is exactly equivalent to ``Parser().parse(fp)``. Optional *_class*
-   and *strict* are interpreted as with the :class:`Parser` class constructor.
+   This is exactly equivalent to ``Parser().parse(fp)``. *_class*
+   and *policy* are interpreted as with the :class:`~email.parser.Parser` class
+   constructor.
+
+   .. versionchanged::
+      Removed the *strict* argument. Added the *policy* keyword.

-.. function:: message_from_binary_file(fp, _class=email.message.Message, strict=None)
+.. function:: message_from_binary_file(fp, _class=email.message.Message, *, \
+                                       policy=policy.default)

    Return a message object structure tree from an open binary :term:`file
    object`. This is exactly equivalent to ``BytesParser().parse(fp)``.
-   Optional *_class* and *strict* are interpreted as with the :class:`Parser`
-   class constructor.
+   *_class* and *policy* are interpreted as with the
+   :class:`~email.parser.Parser` class constructor.

    .. versionadded:: 3.2
+   .. versionchanged:: 3.3
+      Removed the *strict* argument. Added the *policy* keyword.

 Here's an example of how you might use this at an interactive Python prompt::

    >>> import email
-   >>> msg = email.message_from_string(myString)
+   >>> msg = email.message_from_string(myString) # doctest: +SKIP


 Additional notes
@@ -232,32 +266,35 @@ Here are some notes on the parsing semantics:

 * Most non-\ :mimetype:`multipart` type messages are parsed as a single message
   object with a string payload. These objects will return ``False`` for
-  :meth:`is_multipart`. Their :meth:`get_payload` method will return a string
-  object.
+  :meth:`~email.message.Message.is_multipart`. Their
+  :meth:`~email.message.Message.get_payload` method will return a string object.

 * All :mimetype:`multipart` type messages will be parsed as a container message
   object with a list of sub-message objects for their payload. The outer
-  container message will return ``True`` for :meth:`is_multipart` and their
-  :meth:`get_payload` method will return the list of :class:`~email.message.Message`
-  subparts.
+  container message will return ``True`` for
+  :meth:`~email.message.Message.is_multipart` and their
+  :meth:`~email.message.Message.get_payload` method will return the list of
+  :class:`~email.message.Message` subparts.

 * Most messages with a content type of :mimetype:`message/\*` (e.g.
   :mimetype:`message/delivery-status` and :mimetype:`message/rfc822`) will also be
   parsed as container object containing a list payload of length 1. Their
-  :meth:`is_multipart` method will return ``True``. The single element in the
-  list payload will be a sub-message object.
+  :meth:`~email.message.Message.is_multipart` method will return ``True``.
+  The single element in the list payload will be a sub-message object.

 * Some non-standards compliant messages may not be internally consistent about
   their :mimetype:`multipart`\ -edness. Such messages may have a
   :mailheader:`Content-Type` header of type :mimetype:`multipart`, but their
-  :meth:`is_multipart` method may return ``False``. If such messages were parsed
-  with the :class:`FeedParser`, they will have an instance of the
-  :class:`MultipartInvariantViolationDefect` class in their *defects* attribute
-  list. See :mod:`email.errors` for details.
+  :meth:`~email.message.Message.is_multipart` method may return ``False``.
+  If such messages were parsed with the :class:`~email.parser.FeedParser`,
+  they will have an instance of the
+  :class:`~email.errors.MultipartInvariantViolationDefect` class in their
+  *defects* attribute list. See :mod:`email.errors` for details.

 .. rubric:: Footnotes

 .. [#] As of email package version 3.0, introduced in Python 2.4, the classic
-   :class:`Parser` was re-implemented in terms of the :class:`FeedParser`, so the
-   semantics and results are identical between the two parsers.
+   :class:`~email.parser.Parser` was re-implemented in terms of the
+   :class:`~email.parser.FeedParser`, so the semantics and results are
+   identical between the two parsers.