author | Eksperimental <eksperimental@autistici.org> | 2021-02-17 01:46:26 -0500 |
---|---|---|
committer | GitHub <noreply@github.com> | 2021-02-17 07:46:26 +0100 |
commit | 5343ed80eaf60c7b8da692934685f719e16f9f37 (patch) | |
tree | 3afe4ca755bb0ef5e8af252d98f4cb366f57e6d7 | |
parent | 6acd6694fb13bb76a2743eb0adeddc05d93825fa (diff) | |
download | elixir-5343ed80eaf60c7b8da692934685f719e16f9f37.tar.gz |
Improve syntax docs (#10723)
-rw-r--r-- | lib/elixir/pages/syntax-reference.md | 2 |
-rw-r--r-- | lib/elixir/pages/unicode-syntax.md | 16 |
2 files changed, 10 insertions, 8 deletions
diff --git a/lib/elixir/pages/syntax-reference.md b/lib/elixir/pages/syntax-reference.md
index db1addd22..3c9865619 100644
--- a/lib/elixir/pages/syntax-reference.md
+++ b/lib/elixir/pages/syntax-reference.md
@@ -21,7 +21,7 @@ Integers (`1234`) and floats (`123.4`) in Elixir are represented as a sequence o

 ### Atoms

-Unquoted atoms start with a colon (`:`) which must be immediately followed by an underscore or a Unicode letter. The atom may continue using a sequence of Unicode letters, numbers, underscores, and `@`. Atoms may end in `!` or `?`. See [Unicode Syntax](unicode-syntax.md) for a formal specification. Valid unquoted atoms are: `:ok`, `:ISO8601`, and `:integer?`.
+Unquoted atoms start with a colon (`:`) which must be immediately followed by a Unicode letter or an underscore. The atom may continue using a sequence of Unicode letters, numbers, underscores, and `@`. Atoms may end in `!` or `?`. See [Unicode Syntax](unicode-syntax.md) for a formal specification. Valid unquoted atoms are: `:ok`, `:ISO8601`, and `:integer?`.

 If the colon is immediately followed by a pair of double- or single-quotes surrounding the atom name, the atom is considered quoted. In contrast with an unquoted atom, this one can be made of any Unicode character (not only letters), such as `:'🌢 Elixir'`, `:"++olá++"`, and `:"123"`.
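The atom rule touched by this hunk can be checked with a minimal sketch that parses a few literals through `Code.string_to_quoted/1`. The sample atoms mostly come from the documentation text above; `:_private` and the variable names `unquoted` and `quoted` are illustrative assumptions, not part of the commit.

```elixir
# Unquoted atoms: a colon followed by a Unicode letter or an underscore.
# (:_private is an illustrative example of the underscore case.)
unquoted = [":ok", ":ISO8601", ":integer?", ":_private"]

# Quoted atoms: the name is wrapped in quotes, so any character is allowed.
quoted = [~S(:"++olá++"), ~S(:"123")]

for source <- unquoted ++ quoted do
  # Each literal should parse to a plain atom; a failed match raises.
  {:ok, atom} = Code.string_to_quoted(source)
  true = is_atom(atom)
  IO.inspect(atom, label: source)
end
```

Pattern matching on `{:ok, atom}` is used here in place of a test framework, so the script simply raises a `MatchError` if one of the literals is rejected by the parser.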
diff --git a/lib/elixir/pages/unicode-syntax.md b/lib/elixir/pages/unicode-syntax.md
index 4196e1ecc..963d0d8f6 100644
--- a/lib/elixir/pages/unicode-syntax.md
+++ b/lib/elixir/pages/unicode-syntax.md
@@ -18,13 +18,13 @@ where `<Start>` uses the same categories as the spec but restricts them to the N

 > characters derived from the Unicode General Category of uppercase letters, lowercase letters, titlecase letters, modifier letters, other letters, letter numbers, plus `Other_ID_Start`, minus `Pattern_Syntax` and `Pattern_White_Space` code points
 >
-> In set notation: `[\p{L}\p{Nl}\p{Other_ID_Start}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`
+> In set notation: `[\p{L}\p{Nl}\p{Other_ID_Start}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`.

 and `<Continue>` uses the same categories as the spec but restricts them to the NFC form (see R6):

 > ID_Start characters, plus characters having the Unicode General Category of nonspacing marks, spacing combining marks, decimal number, connector punctuation, plus `Other_ID_Continue`, minus `Pattern_Syntax` and `Pattern_White_Space` code points.
 >
-> In set notation: `[\p{ID_Start}\p{Mn}\p{Mc}\p{Nd}\p{Pc}\p{Other_ID_Continue}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`
+> In set notation: `[\p{ID_Start}\p{Mn}\p{Mc}\p{Nd}\p{Pc}\p{Other_ID_Continue}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`.

 `<Ending>` is an addition specific to Elixir that includes only the code points `?` (003F) and `!` (0021).

@@ -36,17 +36,19 @@ Elixir does not allow the use of ZWJ or ZWNJ in identifiers and therefore does n

 Unicode atoms in Elixir follow the identifier rule above with the following modifications:

-  * `<Start>` includes the code point `_` (005F)
-  * `<Continue>` includes the code point `@` (0040)
+  * `<Start>` additionally includes the code point `_` (005F)
+  * `<Continue>` additionally includes the code point `@` (0040)

-> Note that all Elixir operators are also valid atoms. Therefore `:+`, `:@`, `:|>`, and others are all valid atoms. The full description of valid atoms is available in the Syntax Reference, this document covers only the rules for identifier-based atoms.
+> Note that all Elixir operators are also valid atoms. Therefore `:+`, `:@`, `:|>`, and others are all valid atoms. Atoms can also be quoted, which allows any character, such as `:"hello world"`. The full description of valid atoms is available in the Syntax Reference, this document covers only the rules for identifier-based atoms.

 ### Variables

 Variables in Elixir follow the identifier rule above with the following modifications:

-  * `<Start>` includes the code point `_` (005F)
-  * `<Start>` must not include Lu (letter uppercase) and Lt (letter titlecase) characters
+  * `<Start>` additionally includes the code point `_` (005F)
+  * `<Start>` additionally excludes Lu (letter uppercase) and Lt (letter titlecase) characters
+
+In set notation: `[\u{005F}\p{Ll}\p{Lm}\p{Lo}\p{Nl}\p{Other_ID_Start}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`.

 ## R3. Pattern_White_Space and Pattern_Syntax Characters
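As a rough illustration of the variable and operator-atom rules in this hunk, the sketch below inspects what `Code.string_to_quoted/1` returns for a few inputs. The identifiers `_unused`, `value`, `x1`, and `Value` are hypothetical examples chosen for this sketch; they do not appear in the commit.

```elixir
# A leading underscore or lowercase letter yields a variable node
# of the shape {name, meta, context}.
for source <- ["_unused", "value", "x1"] do
  {:ok, {name, _meta, context}} = Code.string_to_quoted(source)
  true = is_atom(name) and is_atom(context)
end

# An uppercase (Lu) start is excluded from variables, so the parser
# reads the identifier as an alias instead.
{:ok, {:__aliases__, _meta, [:Value]}} = Code.string_to_quoted("Value")

# Operators are valid atoms as well.
{:ok, :+} = Code.string_to_quoted(":+")
{:ok, :|>} = Code.string_to_quoted(":|>")

# Quoted atoms allow any character, including spaces.
{:ok, :"hello world"} = Code.string_to_quoted(~S(:"hello world"))
```

The alias result is the visible consequence of excluding Lu and Lt from `<Start>` for variables: the same identifier shape is still accepted by the parser, but it names a module alias and not a variable.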