Feature Request: More lenient parsing #200
Import cases:
While still producing an error, allowing the rpc or option AST nodes to still be generated. See #200. Will follow up with diffs of a similar pattern for the other body contexts.
Hey @Alfus 👋 I decided on changing the parser grammar to require semicolons to terminate most declarations, then inserting them from the lexer wherever they are technically required by the grammar. I found that having the grammar be as unambiguous as possible made it much less likely I'd run into shift/reduce problems, especially when combining this with other unrelated grammar changes like permitting trailing commas and extension names with mismatched parentheses. IIRC there were also a couple of places where I was unable to make the grammar unambiguous without doing something like treating '\n' as a token, which gets super weird.

There are some drawbacks to this method, mostly that handling syntax errors when they aren't actually syntax errors is trickier. But you might find this strategy easier overall. Technically the "best" solution is rolling your own parser, but that's obviously a lot of work. What are your thoughts?
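To make the lexer-side semicolon insertion concrete, here is a minimal sketch in Go. It is not the code from either project: the `Token` type, the `needsTerminator` rule, and the whitespace-based toy tokenizer are all assumptions for illustration, loosely modelled on Go's own automatic semicolon insertion.

```go
package main

import (
	"fmt"
	"strings"
)

// Token is a toy token: its text plus whether the lexer synthesized it.
type Token struct {
	Text      string
	Synthetic bool
}

// punct pads punctuation with spaces so strings.Fields can split on it.
// A real lexer would tokenize properly; this is only a stand-in.
var punct = strings.NewReplacer(
	";", " ; ", "{", " { ", "}", " } ", "=", " = ",
	"[", " [ ", "]", " ] ", ",", " , ",
)

// needsTerminator reports whether a line ending in this token should be
// closed with a synthetic ';'. Assumed rule: anything except punctuation
// that already opens, separates, or terminates a construct.
func needsTerminator(last string) bool {
	switch last {
	case ";", "{", "}", "[", "(", ",", "=":
		return false
	}
	return true
}

// lexWithSemicolons tokenizes src line by line and appends a synthetic ';'
// to every non-empty line whose last token needs one.
func lexWithSemicolons(src string) []Token {
	var out []Token
	for _, line := range strings.Split(src, "\n") {
		fields := strings.Fields(punct.Replace(line))
		if len(fields) == 0 {
			continue
		}
		for _, f := range fields {
			out = append(out, Token{Text: f})
		}
		if needsTerminator(fields[len(fields)-1]) {
			out = append(out, Token{Text: ";", Synthetic: true})
		}
	}
	return out
}

func main() {
	src := "message Foo {\n  int32 id = 1\n  string name = 2;\n}"
	for _, t := range lexWithSemicolons(src) {
		fmt.Printf("%q synthetic=%v\n", t.Text, t.Synthetic)
	}
}
```

Keeping a `Synthetic` flag on inserted tokens is one way to handle the drawback mentioned above: when the parser reports an error at a synthetic token, the tool can re-anchor the diagnostic on the preceding real token.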
While still producing an error. See #200
While still producing an error, see #200.
While still reporting an error, see: #200 A little different from the others, as oneof and extensions don't support empty decls.
While still producing an error, see: #200
While still producing an error, see: #200
I considered several techniques like this (for example, injecting a special cursor token), though anything that modifies the input will also make source positions inaccurate.
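The position concern can be illustrated with a small Go sketch. It assumes a hypothetical marker string and helper names (`cursorMarker`, `injectCursor`, `toOriginalOffset`), none of which come from protocompile. It shows that every offset at or after the injected marker shifts by the marker's length, and how a tool could map reported offsets back.

```go
package main

import "fmt"

// cursorMarker is a hypothetical sentinel injected at the cursor position.
const cursorMarker = "\x00CURSOR\x00"

// injectCursor returns src with the marker spliced in at byte offset cursor.
func injectCursor(src string, cursor int) string {
	return src[:cursor] + cursorMarker + src[cursor:]
}

// toOriginalOffset maps a byte offset in the modified text back to the
// original text. Offsets inside the marker collapse onto the cursor.
func toOriginalOffset(modified, cursor int) int {
	switch {
	case modified <= cursor:
		return modified
	case modified < cursor+len(cursorMarker):
		return cursor
	default:
		return modified - len(cursorMarker)
	}
}

func main() {
	src := "int32 foo [];"
	cursor := 11 // between '[' and ']'
	mod := injectCursor(src, cursor)
	// The ']' moved from offset 11 to 11+len(cursorMarker) in the modified text.
	shifted := cursor + len(cursorMarker)
	fmt.Println(len(src), len(mod), toOriginalOffset(shifted, cursor))
}
```

Every diagnostic and AST position produced from the modified text would need this correction, which is the cost being weighed against grammar changes.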
While still producing an error, see #200.
While still producing an error, see: #200
While still reporting an error. The important case for code completion is extension type names. See #200
Only remaining issue from the original list is making a field with only a type work:
Here is how I implemented that one:

messageFieldDecl : fieldCardinality notGroupElementTypeIdent identifier '=' _INT_LIT ';' {
$$ = ast.NewFieldNode($1.ToKeyword(), $2, $3, $4, $5, nil, $6)
}
| fieldCardinality notGroupElementTypeIdent identifier '=' _INT_LIT compactOptions ';' {
$$ = ast.NewFieldNode($1.ToKeyword(), $2, $3, $4, $5, $6, $7)
}
| msgElementTypeIdent identifier '=' _INT_LIT ';' {
$$ = ast.NewFieldNode(nil, $1, $2, $3, $4, nil, $5)
}
| msgElementTypeIdent identifier '=' _INT_LIT compactOptions ';' {
$$ = ast.NewFieldNode(nil, $1, $2, $3, $4, $5, $6)
}
// new code below
| fieldCardinality notGroupElementTypeIdent identifier '=' ';' {
$$ = ast.NewIncompleteFieldNode($1.ToKeyword(), $2, $3, $4, nil, nil, $5)
}
| fieldCardinality notGroupElementTypeIdent identifier ';' {
$$ = ast.NewIncompleteFieldNode($1.ToKeyword(), $2, $3, nil, nil, nil, $4)
}
| fieldCardinality notGroupElementTypeIdent ';' {
$$ = ast.NewIncompleteFieldNode($1.ToKeyword(), $2, nil, nil, nil, nil, $3)
}
| msgElementTypeIdent identifier '=' ';' {
$$ = ast.NewIncompleteFieldNode(nil, $1, $2, $3, nil, nil, $4)
}
| msgElementTypeIdent identifier ';' {
$$ = ast.NewIncompleteFieldNode(nil, $1, $2, nil, nil, nil, $3)
}
| msgElementTypeIdent ';' {
$$ = ast.NewIncompleteFieldNode(nil, $1, nil, nil, nil, nil, $2)
}

(NewIncompleteFieldNode here returns the same *ast.FieldNode, but handles missing nodes differently.)

Deciding what to do with invalid fields is tricky. I decided to skip them in the parser so they don't end up in descriptors. This means I can't ctrl-click or hover on them, etc., but I can still handle them in the formatter, generate semantic tokens for them, use them in completion logic, etc.

Regarding source positions, I was initially concerned about that too, but so far I have not encountered any issues.
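As a rough sketch of the "keep incomplete fields in the AST but skip them for descriptors" idea, assuming a simplified stand-in for the field node (the `FieldNode` struct, `Incomplete`, and `descriptorFields` below are hypothetical and do not mirror the real `ast` package):

```go
package main

import "fmt"

// FieldNode is a simplified, hypothetical stand-in for an AST field node.
// Missing parts are represented by zero values.
type FieldNode struct {
	Label string // "" when no cardinality keyword was written
	Type  string
	Name  string // "" when the name is missing
	Tag   *int   // nil when the tag number is missing
}

// Incomplete reports whether the field lacks a name or a tag.
func (f *FieldNode) Incomplete() bool {
	return f.Name == "" || f.Tag == nil
}

// descriptorFields returns only the fields that are complete enough to
// appear in a descriptor. The full slice stays available to the formatter,
// semantic tokens, and completion logic.
func descriptorFields(all []*FieldNode) []*FieldNode {
	var out []*FieldNode
	for _, f := range all {
		if !f.Incomplete() {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	one := 1
	fields := []*FieldNode{
		{Type: "int32", Name: "id", Tag: &one}, // int32 id = 1;
		{Type: "string", Name: "name"},         // string name;
		{Type: "foo.Bar"},                      // foo.Bar;
	}
	fmt.Println(len(fields), len(descriptorFields(fields)))
}
```

The trade-off matches the one described above: features driven by descriptors (hover, go-to-definition) never see the incomplete fields, while features driven by the AST still do.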
New use case:
While still returning an error. See #200. Note that the error occurs during validation instead of parsing, to eventually allow algorithmic field tag assignments.
To provide good completion suggestions, an AST is needed to know whether the cursor is in an option or not. However, common completion cases do not parse in protocompile, for example:
int32 foo [<cursor>]; (done)
int32 foo [bar<cursor>]; (done)
int32 foo [deprecated = true, <cursor>]; (done)
int32 foo [bar<cursor>] (done)
int32 foo [(bar.<cursor>)]; or int32 foo [foo.<cursor>] (done)
foo.<cursor>
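Once these inputs produce AST nodes, deciding whether the cursor sits inside an option list becomes a span check. The following Go sketch assumes hypothetical `Span` and `Field` types with byte offsets; it is not protocompile's API, only an illustration of why the node has to exist even when the input is erroneous.

```go
package main

import "fmt"

// Span is a half-open byte range [Start, End) in the source. For an option
// list, Start is the offset of '[' and End is one past ']'.
type Span struct{ Start, End int }

// containsCursor reports whether a cursor offset lies strictly between the
// brackets: after '[' and not past ']'.
func (s Span) containsCursor(cursor int) bool {
	return cursor > s.Start && cursor < s.End
}

// Field is a hypothetical AST node; Options is nil when no '[' was parsed.
type Field struct {
	Name    string
	Options *Span
}

// inOptions reports whether the cursor is inside any field's option list.
func inOptions(fields []Field, cursor int) bool {
	for _, f := range fields {
		if f.Options != nil && f.Options.containsCursor(cursor) {
			return true
		}
	}
	return false
}

func main() {
	// Source: `int32 foo [];` with '[' at offset 10 and ']' at offset 11.
	fields := []Field{{Name: "foo", Options: &Span{Start: 10, End: 12}}}
	for _, c := range []int{10, 11, 12} {
		fmt.Printf("cursor=%d inOptions=%v\n", c, inOptions(fields, c))
	}
}
```

If the parser drops the field entirely on a syntax error, `Options` is never populated and the check has nothing to work with, which is the motivation for the lenient parsing requested here.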