author | Akim Demaille <akim.demaille@gmail.com> | 2018-12-09 19:58:34 +0100 |
---|---|---|
committer | Akim Demaille <akim.demaille@gmail.com> | 2018-12-16 12:27:28 +0100 |
commit | 1d5956f87fc6cdbe66d848f11c21d66cbacf8b12 (patch) | |
tree | 0806b35028e4102dd180c5ac0cb013290988503f /src/symlist.c | |
parent | fdceb6330f9e7e8dd7f6e43080c8976f216f6505 (diff) | |
download | bison-1d5956f87fc6cdbe66d848f11c21d66cbacf8b12.tar.gz |
symbols: clean up their parsing
Prompted by Rici Lake.
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html
We have four classes of directives that declare symbols: %nterm,
%type, %token, and the family of %left etc. Currently, not all of them
accept several type tags (`<type>`), and not all of them accept
having no type tag at all (%type). Let's unify this.
- %type
POSIX Yacc specifies that %type is for nonterminals only. However,
some Bison users want to use it for both tokens and nterms
(actually, Bison's own grammar does this in several places, e.g.,
CHAR). So it should accept char/string literals.
As a consequence, it cannot be used to declare tokens with their alias:
`%type foo "foo"` would be ambiguous (are we defining foo = "foo",
or are these two different symbols?)
POSIX specifies that it is OK to use %type without a type tag. I'm
not sure what it means, but we support it.
- %token
Accept token declarations with number and string literal:
(ID|CHAR) NUM? STRING?.
- %left, etc.
They cannot be the same as %token, because we allow declaring a
symbol with %token and then qualifying its precedence with %left.
Then `%left foo "foo"` would also be ambiguous: foo="foo", or two
symbols?
They cannot be simply a list of identifiers either, since POSIX Yacc
says we can declare token numbers here. I personally think this is a
bad idea: precedence management is tricky in itself and should not be
cluttered with token declaration issues.
We used to accept declaring a token number on a string literal here
(e.g., `%left "token" 1`). This is abnormal. Either the feature is
useful, and then it should be supported in %token, or it's useless
and we should not support it in corner cases.
- %nterm
Obviously cannot accept tokens, nor char/string literals. Does not
exist in POSIX Yacc, but since %type also works for terminals, it is
a nice option to have.
* src/parse-gram.y: Avoid relying on side effects. For instance, get
rid of current_type; rather, build the list of symbols and iterate
over it to assign the type.
It's not always possible/convenient. For instance, we still use
current_class.
Prefer "decl" to "def", since in the rest of the implementation we
actually "declare" symbols, we don't "define" them.
(token_decls, token_decls_for_prec, symbol_decls, nterm_decls): New.
Use them for %token, %left, %type and %nterm.
* src/symlist.h, src/symlist.c (symbol_list_type_set): New.
* tests/regression.at
(Token number in precedence declaration): We no longer accept
giving a number to string literals.
Diffstat (limited to 'src/symlist.c')
-rw-r--r-- | src/symlist.c | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/src/symlist.c b/src/symlist.c
index d7b61e24..7d9fe83f 100644
--- a/src/symlist.c
+++ b/src/symlist.c
@@ -81,6 +81,15 @@ symbol_list_type_new (uniqstr type_name, location loc)
 }
 
 
+symbol_list *
+symbol_list_type_set (symbol_list *syms, uniqstr type_name, location loc)
+{
+  for (symbol_list *l = syms; l; l = l->next)
+    symbol_type_set (l->content.sym, type_name, loc);
+  return syms;
+}
+
+
 /*-----------------------------------------------------------------------.
 | Print this list, for which every content_type must be SYMLIST_SYMBOL.  |
 `-----------------------------------------------------------------------*/
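
The "build the list, then iterate over it to assign the type" approach described for src/parse-gram.y can be tried in isolation. The following is only a minimal sketch: `sketch_symbol`, `sketch_list` and `sketch_list_type_set` are hypothetical, simplified stand-ins invented for illustration, not Bison's actual `symbol`, `symbol_list` or `symbol_list_type_set` (the real ones also carry locations and use `uniqstr` type names).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a grammar symbol.  */
typedef struct sketch_symbol
{
  const char *tag;        /* Symbol name, e.g. "exp".  */
  const char *type_name;  /* Semantic type tag, NULL if none.  */
} sketch_symbol;

/* Hypothetical, simplified stand-in for a list of symbols.  */
typedef struct sketch_list
{
  sketch_symbol *sym;
  struct sketch_list *next;
} sketch_list;

/* Assign TYPE_NAME to every symbol of SYMS, and return SYMS so that
   the call can be chained.  No global "current type" is involved:
   the type is passed explicitly once the whole list is known.  */
sketch_list *
sketch_list_type_set (sketch_list *syms, const char *type_name)
{
  for (sketch_list *l = syms; l; l = l->next)
    {
      assert (l->sym != NULL);
      l->sym->type_name = type_name;
    }
  return syms;
}
```

Returning the list itself presumably lets a grammar action type a freshly parsed list and pass it on in a single expression; an empty list (a null pointer) is handled by the loop condition and simply returned.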