summaryrefslogtreecommitdiff
path: root/Doc/reference/lexical_analysis.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Doc/reference/lexical_analysis.rst')
-rw-r--r--Doc/reference/lexical_analysis.rst163
1 files changed, 145 insertions, 18 deletions
diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst
index 37f25f1def..da7017afff 100644
--- a/Doc/reference/lexical_analysis.rst
+++ b/Doc/reference/lexical_analysis.rst
@@ -405,7 +405,8 @@ String literals are described by the following lexical definitions:
.. productionlist::
stringliteral: [`stringprefix`](`shortstring` | `longstring`)
- stringprefix: "r" | "u" | "R" | "U"
+ stringprefix: "r" | "u" | "R" | "U" | "f" | "F"
+ : | "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF"
shortstring: "'" `shortstringitem`* "'" | '"' `shortstringitem`* '"'
longstring: "'''" `longstringitem`* "'''" | '"""' `longstringitem`* '"""'
shortstringitem: `shortstringchar` | `stringescapeseq`
@@ -464,6 +465,11 @@ is not supported.
to simplify the maintenance of dual Python 2.x and 3.x codebases.
See :pep:`414` for more information.
+A string literal with ``'f'`` or ``'F'`` in its prefix is a
+:dfn:`formatted string literal`; see :ref:`f-strings`. The ``'f'`` may be
+combined with ``'r'``, but not with ``'b'`` or ``'u'``, therefore raw
+formatted strings are possible, but formatted bytes literals are not.
+
In triple-quoted literals, unescaped newlines and quotes are allowed (and are
retained), except that three unescaped quotes in a row terminate the literal. (A
"quote" is the character used to open the literal, i.e. either ``'`` or ``"``.)
@@ -554,6 +560,10 @@ is more easily recognized as broken.) It is also important to note that the
escape sequences only recognized in string literals fall into the category of
unrecognized escapes for bytes literals.
+ .. versionchanged:: 3.6
+ Unrecognized escape sequences produce a DeprecationWarning. In
+ some future version of Python they will be a SyntaxError.
+
Even in a raw literal, quotes can be escaped with a backslash, but the
backslash remains in the result; for example, ``r"\""`` is a valid string
literal consisting of two characters: a backslash and a double quote; ``r"\"``
@@ -583,7 +593,111 @@ comments to parts of strings, for example::
Note that this feature is defined at the syntactical level, but implemented at
compile time. The '+' operator must be used to concatenate string expressions
at run time. Also note that literal concatenation can use different quoting
-styles for each component (even mixing raw strings and triple quoted strings).
+styles for each component (even mixing raw strings and triple quoted strings),
+and formatted string literals may be concatenated with plain string literals.
+
+
+.. index::
+ single: formatted string literal
+ single: interpolated string literal
+ single: string; formatted literal
+ single: string; interpolated literal
+ single: f-string
+.. _f-strings:
+
+Formatted string literals
+-------------------------
+
+.. versionadded:: 3.6
+
+A :dfn:`formatted string literal` or :dfn:`f-string` is a string literal
+that is prefixed with ``'f'`` or ``'F'``. These strings may contain
+replacement fields, which are expressions delimited by curly braces ``{}``.
+While other string literals always have a constant value, formatted strings
+are really expressions evaluated at run time.
+
+Escape sequences are decoded like in ordinary string literals (except when
+a literal is also marked as a raw string). After decoding, the grammar
+for the contents of the string is:
+
+.. productionlist::
+ f_string: (`literal_char` | "{{" | "}}" | `replacement_field`)*
+ replacement_field: "{" `f_expression` ["!" `conversion`] [":" `format_spec`] "}"
+ f_expression: (`conditional_expression` | "*" `or_expr`)
+ : ("," `conditional_expression` | "," "*" `or_expr`)* [","]
+ : | `yield_expression`
+ conversion: "s" | "r" | "a"
+ format_spec: (`literal_char` | NULL | `replacement_field`)*
+ literal_char: <any code point except "{", "}" or NULL>
+
+The parts of the string outside curly braces are treated literally,
+except that any doubled curly braces ``'{{'`` or ``'}}'`` are replaced
+with the corresponding single curly brace. A single opening curly
+bracket ``'{'`` marks a replacement field, which starts with a
+Python expression. After the expression, there may be a conversion field,
+introduced by an exclamation point ``'!'``. A format specifier may also
+be appended, introduced by a colon ``':'``. A replacement field ends
+with a closing curly bracket ``'}'``.
+
+Expressions in formatted string literals are treated like regular
+Python expressions surrounded by parentheses, with a few exceptions.
+An empty expression is not allowed, and a :keyword:`lambda` expression
+must be surrounded by explicit parentheses. Replacement expressions
+can contain line breaks (e.g. in triple-quoted strings), but they
+cannot contain comments. Each expression is evaluated in the context
+where the formatted string literal appears, in order from left to right.
+
+If a conversion is specified, the result of evaluating the expression
+is converted before formatting. Conversion ``'!s'`` calls :func:`str` on
+the result, ``'!r'`` calls :func:`repr`, and ``'!a'`` calls :func:`ascii`.
+
+The result is then formatted using the :func:`format` protocol. The
+format specifier is passed to the :meth:`__format__` method of the
+expression or conversion result. An empty string is passed when the
+format specifier is omitted. The formatted result is then included in
+the final value of the whole string.
+
+Top-level format specifiers may include nested replacement fields.
+These nested fields may include their own conversion fields and
+format specifiers, but may not include more deeply-nested replacement fields.
+
+Formatted string literals may be concatenated, but replacement fields
+cannot be split across literals.
+
+Some examples of formatted string literals::
+
+ >>> name = "Fred"
+ >>> f"He said his name is {name!r}."
+ "He said his name is 'Fred'."
+ >>> f"He said his name is {repr(name)}." # repr() is equivalent to !r
+ "He said his name is 'Fred'."
+ >>> width = 10
+ >>> precision = 4
+ >>> value = decimal.Decimal("12.34567")
+ >>> f"result: {value:{width}.{precision}}" # nested fields
+ 'result: 12.35'
+
+A consequence of sharing the same syntax as regular string literals is
+that characters in the replacement fields must not conflict with the
+quoting used in the outer formatted string literal::
+
+ f"abc {a["x"]} def" # error: outer string literal ended prematurely
+ f"abc {a['x']} def" # workaround: use different quoting
+
+Backslashes are not allowed in format expressions and will raise
+an error::
+
+ f"newline: {ord('\n')}" # raises SyntaxError
+
+To include a value in which a backslash escape is required, create
+a temporary variable.
+
+ >>> newline = ord('\n')
+ >>> f"newline: {newline}"
+ 'newline: 10'
+
+See also :pep:`498` for the proposal that added formatted string literals,
+and :meth:`str.format`, which uses a related format string mechanism.
.. _numbers:
@@ -612,20 +726,24 @@ Integer literals
Integer literals are described by the following lexical definitions:
.. productionlist::
- integer: `decimalinteger` | `octinteger` | `hexinteger` | `bininteger`
- decimalinteger: `nonzerodigit` `digit`* | "0"+
+ integer: `decinteger` | `bininteger` | `octinteger` | `hexinteger`
+ decinteger: `nonzerodigit` (["_"] `digit`)* | "0"+ (["_"] "0")*
+ bininteger: "0" ("b" | "B") (["_"] `bindigit`)+
+ octinteger: "0" ("o" | "O") (["_"] `octdigit`)+
+ hexinteger: "0" ("x" | "X") (["_"] `hexdigit`)+
nonzerodigit: "1"..."9"
digit: "0"..."9"
- octinteger: "0" ("o" | "O") `octdigit`+
- hexinteger: "0" ("x" | "X") `hexdigit`+
- bininteger: "0" ("b" | "B") `bindigit`+
+ bindigit: "0" | "1"
octdigit: "0"..."7"
hexdigit: `digit` | "a"..."f" | "A"..."F"
- bindigit: "0" | "1"
There is no limit for the length of integer literals apart from what can be
stored in available memory.
+Underscores are ignored for determining the numeric value of the literal. They
+can be used to group digits for enhanced readability. One underscore can occur
+between digits, and after base specifiers like ``0x``.
+
Note that leading zeros in a non-zero decimal number are not allowed. This is
for disambiguation with C-style octal literals, which Python used before version
3.0.
@@ -634,6 +752,10 @@ Some examples of integer literals::
7 2147483647 0o177 0b100110111
3 79228162514264337593543950336 0o377 0xdeadbeef
+ 100_000_000_000 0b_1110_0101
+
+.. versionchanged:: 3.6
+ Underscores are now allowed for grouping purposes in literals.
.. _floating:
@@ -645,23 +767,28 @@ Floating point literals are described by the following lexical definitions:
.. productionlist::
floatnumber: `pointfloat` | `exponentfloat`
- pointfloat: [`intpart`] `fraction` | `intpart` "."
- exponentfloat: (`intpart` | `pointfloat`) `exponent`
- intpart: `digit`+
- fraction: "." `digit`+
- exponent: ("e" | "E") ["+" | "-"] `digit`+
+ pointfloat: [`digitpart`] `fraction` | `digitpart` "."
+ exponentfloat: (`digitpart` | `pointfloat`) `exponent`
+ digitpart: `digit` (["_"] `digit`)*
+ fraction: "." `digitpart`
+ exponent: ("e" | "E") ["+" | "-"] `digitpart`
Note that the integer and exponent parts are always interpreted using radix 10.
For example, ``077e010`` is legal, and denotes the same number as ``77e10``. The
-allowed range of floating point literals is implementation-dependent. Some
-examples of floating point literals::
+allowed range of floating point literals is implementation-dependent. As in
+integer literals, underscores are supported for digit grouping.
- 3.14 10. .001 1e100 3.14e-10 0e0
+Some examples of floating point literals::
+
+ 3.14 10. .001 1e100 3.14e-10 0e0 3.14_15_93
Note that numeric literals do not include a sign; a phrase like ``-1`` is
actually an expression composed of the unary operator ``-`` and the literal
``1``.
+.. versionchanged:: 3.6
+ Underscores are now allowed for grouping purposes in literals.
+
.. _imaginary:
@@ -671,7 +798,7 @@ Imaginary literals
Imaginary literals are described by the following lexical definitions:
.. productionlist::
- imagnumber: (`floatnumber` | `intpart`) ("j" | "J")
+ imagnumber: (`floatnumber` | `digitpart`) ("j" | "J")
An imaginary literal yields a complex number with a real part of 0.0. Complex
numbers are represented as a pair of floating point numbers and have the same
@@ -679,7 +806,7 @@ restrictions on their range. To create a complex number with a nonzero real
part, add a floating point number to it, e.g., ``(3+4j)``. Some examples of
imaginary literals::
- 3.14j 10.j 10j .001j 1e100j 3.14e-10j
+ 3.14j 10.j 10j .001j 1e100j 3.14e-10j 3.14_15_93j
.. _operators: