API feedback

I started using `pest-ast` for a programming language and hit many critical design issues:

## 1. Parsers should never panic.

The design of the derive macro encourages unavoidable panics, demonstrated by [the example CSV parser](https://github.com/pest-parser/ast/blob/4f45bb9c082dbe94d7b8c49e937753de5d75710e/examples/csv.rs#L29).

Given invalid input:

```patch
diff --git a/examples/csv.csv b/examples/csv.csv
index 20a98c9..f4d4042 100644
--- a/examples/csv.csv
+++ b/examples/csv.csv
@@ -1,5 +1,5 @@
 65279,1179403647,1463895090
 3.1415927,2.7182817,1.618034
 -40,-273.15
-13,42
+13,-.42.-
 65537
```

The example panics within the `FromPest` impl:

```text
thread 'main' (1902657) panicked at examples/csv.rs:26:21:
called `Result::unwrap()` on an `Err` value: ParseFloatError { kind: Invalid }
```

This is an acceptable design when panics can be statically proven impossible. Not for a parser that is routinely expected to accept "untrusted" input.

According to what I see in the issue tracker, panicking is absolutely by design. And it appears that more panics may be introduced later (assuming this project will continue evolving).

## 2. Error handling leaves a lot to be desired.

When the parser/converter fails, I want to present a human readable error message that helpfully highlights what went wrong.

In the case of the CSV example, a rustc-like error message would be superb:

```text
error: Expected `f64`
   --> csv.csv:4:4
  |
4 | 13,-.42.-
       ^^^^^^ help: invalid float literal
```

It does not appear to be possible to get this level of detail out of `ConversionError`. The error type does not carry a span that could be used to highlight the erroneous tokens. As far as I can tell, it is not possible to get this level of detail from the `FromPest` trait. #26 is related.

The derive macro adds another dimension of trouble for error handling: It only uses `ConversionError<Void>`, eliminating any possibility of returning the `FatalError` variant. Even worse, the `inner(with(...))` and `outer(with(...))` attributes cannot call a function that returns `Result<T, ConversionError<Void>>` for a field with type `T`.

The only possibility is unwrapping, as the CSV example does. I hopelessly tried `with(try!)` desperately and naively thinking I could sneak in a propagating unwrap by injecting the function-like macro into the token stream. There is no option to inject the `?` operator, though this would be ideal.

My actual use case for error handling was parsing decimal integers. As a concrete example, the following grammar cannot be parsed into `u8`:

```pest
integer = @{ ASCII_DIGIT{1, 3} }
```

The grammar can constrain the input to a maximum of 3 digits, but that allows numbers out of range such as `999`. The only thing for the parser to do is panic!

That leaves me with only one possible solution: Change the type on the AST field and process parsing errors in a second phase. (Maybe a `Span<'pest>` field can be added in addition to `Result<T, ConversionError<Void>>` for contextual errors? I didn't try it.) That means that I would end up with both a "half-parsed" AST type, and "parsed and validated" AST type. Which clearly goes against the ["parse, don't validate"](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/) mantra. But worse than that, it duplicates a lot of code (mostly structural).

## 3. A few non-critical issues I ran into.

The derive impl for enums does not do what I expect (unit variants are not really handled, especially for terminals; I think #37 hit this case as well). I had to hand-write `FromPest` impls for all of my enums.

More surprisingly, I could not get the derived impl to match unary expressions at all with the following grammar:

```pest
unary = { unary_op* ~ term }
```

Parsing into type:

```rust
#[derive(Debug)]
pub struct Unary<'pest> {
    pub ops: Vec<UnaryOp>,
    pub term: Term<'pest>,
}
```

The `FromPest` impl for `Vec<T>` is greedy. It consumes until it hits the `term` token and returns `Err(NoMatch)`. I had to write a non-greedy `FromPest` impl to handle this grammar. #34 looks related.

----

I know this project is sort of in light maintenance mode, now. Given my experience so far, I cannot see it being realistically usable in serious parsers.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API feedback #42

1. Parsers should never panic.

2. Error handling leaves a lot to be desired.

3. A few non-critical issues I ran into.

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

API feedback #42

Description

1. Parsers should never panic.

2. Error handling leaves a lot to be desired.

3. A few non-critical issues I ran into.

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions