author Eksperimental <eksperimental@autistici.org> 2021-02-17 01:46:26 -0500
committer GitHub <noreply@github.com> 2021-02-17 07:46:26 +0100
commit 5343ed80eaf60c7b8da692934685f719e16f9f37
tree 3afe4ca755bb0ef5e8af252d98f4cb366f57e6d7
parent 6acd6694fb13bb76a2743eb0adeddc05d93825fa
Improve syntax docs (#10723)
-rw-r--r-- lib/elixir/pages/syntax-reference.md 2
-rw-r--r-- lib/elixir/pages/unicode-syntax.md 16
2 files changed, 10 insertions, 8 deletions
diff --git a/lib/elixir/pages/syntax-reference.md b/lib/elixir/pages/syntax-reference.md
index db1addd22..3c9865619 100644
--- a/lib/elixir/pages/syntax-reference.md
+++ b/lib/elixir/pages/syntax-reference.md
@@ -21,7 +21,7 @@ Integers (`1234`) and floats (`123.4`) in Elixir are represented as a sequence o
### Atoms
-Unquoted atoms start with a colon (`:`) which must be immediately followed by an underscore or a Unicode letter. The atom may continue using a sequence of Unicode letters, numbers, underscores, and `@`. Atoms may end in `!` or `?`. See [Unicode Syntax](unicode-syntax.md) for a formal specification. Valid unquoted atoms are: `:ok`, `:ISO8601`, and `:integer?`.
+Unquoted atoms start with a colon (`:`) which must be immediately followed by a Unicode letter or an underscore. The atom may continue using a sequence of Unicode letters, numbers, underscores, and `@`. Atoms may end in `!` or `?`. See [Unicode Syntax](unicode-syntax.md) for a formal specification. Valid unquoted atoms are: `:ok`, `:ISO8601`, and `:integer?`.
If the colon is immediately followed by a pair of double- or single-quotes surrounding the atom name, the atom is considered quoted. In contrast with an unquoted atom, this one can be made of any Unicode character (not only letters), such as `:'🌢 Elixir'`, `:"++olá++"`, and `:"123"`.
diff --git a/lib/elixir/pages/unicode-syntax.md b/lib/elixir/pages/unicode-syntax.md
index 4196e1ecc..963d0d8f6 100644
--- a/lib/elixir/pages/unicode-syntax.md
+++ b/lib/elixir/pages/unicode-syntax.md
@@ -18,13 +18,13 @@ where `<Start>` uses the same categories as the spec but restricts them to the N
> characters derived from the Unicode General Category of uppercase letters, lowercase letters, titlecase letters, modifier letters, other letters, letter numbers, plus `Other_ID_Start`, minus `Pattern_Syntax` and `Pattern_White_Space` code points
>
-> In set notation: `[\p{L}\p{Nl}\p{Other_ID_Start}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`
+> In set notation: `[\p{L}\p{Nl}\p{Other_ID_Start}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`.
and `<Continue>` uses the same categories as the spec but restricts them to the NFC form (see R6):
> ID_Start characters, plus characters having the Unicode General Category of nonspacing marks, spacing combining marks, decimal number, connector punctuation, plus `Other_ID_Continue`, minus `Pattern_Syntax` and `Pattern_White_Space` code points.
>
-> In set notation: `[\p{ID_Start}\p{Mn}\p{Mc}\p{Nd}\p{Pc}\p{Other_ID_Continue}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`
+> In set notation: `[\p{ID_Start}\p{Mn}\p{Mc}\p{Nd}\p{Pc}\p{Other_ID_Continue}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`.
`<Ending>` is an addition specific to Elixir that includes only the code points `?` (003F) and `!` (0021).
@@ -36,17 +36,19 @@ Elixir does not allow the use of ZWJ or ZWNJ in identifiers and therefore does n
Unicode atoms in Elixir follow the identifier rule above with the following modifications:
- * `<Start>` includes the code point `_` (005F)
- * `<Continue>` includes the code point `@` (0040)
+ * `<Start>` additionally includes the code point `_` (005F)
+ * `<Continue>` additionally includes the code point `@` (0040)
-> Note that all Elixir operators are also valid atoms. Therefore `:+`, `:@`, `:|>`, and others are all valid atoms. The full description of valid atoms is available in the Syntax Reference, this document covers only the rules for identifier-based atoms.
+> Note that all Elixir operators are also valid atoms. Therefore `:+`, `:@`, `:|>`, and others are all valid atoms. Atoms can also be quoted, which allows any character, such as `:"hello world"`. The full description of valid atoms is available in the Syntax Reference, this document covers only the rules for identifier-based atoms.
### Variables
Variables in Elixir follow the identifier rule above with the following modifications:
- * `<Start>` includes the code point `_` (005F)
- * `<Start>` must not include Lu (letter uppercase) and Lt (letter titlecase) characters
+ * `<Start>` additionally includes the code point `_` (005F)
+ * `<Start>` additionally excludes Lu (letter uppercase) and Lt (letter titlecase) characters
+
+In set notation: `[\u{005F}\p{Ll}\p{Lm}\p{Lo}\p{Nl}\p{Other_ID_Start}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`.
## R3. Pattern_White_Space and Pattern_Syntax Characters