Add initial error reporting framework #266

Merged: 34 commits merged into main from error-reporting on Feb 14, 2025

Conversation

@jeaye (Member) commented Feb 13, 2025

This branch is getting very far from main, which used to work fine when it was just me, but now we have a lot of people changing main each week. I've decided to merge this in batches.

Included in this PR

  • New error namespace, types, helpers
  • Lexer completely reworked to use new errors
  • Parser completely reworked to use new errors
  • Analysis partially reworked to use new errors
  • New error report namespace
  • Lexer-based highlighter
  • CMake summary output on ./configure
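
For readers unfamiliar with the pattern, the "error namespace, types, helpers" item can be pictured roughly as follows. This is a minimal sketch, not code from this PR: the `sketch::error` namespace, the `report`, `source`, and `source_position` types, and the `default_message` function are all hypothetical names chosen for illustration. Only the `kind` enumerator names and the message strings mirror snippets quoted later in this review.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

namespace sketch::error
{
  /* One enumerator per distinct error; the prefix names the compiler phase. */
  enum class kind
  {
    parse_invalid_reader_splice,
    analysis_invalid_recur_from_try
  };

  /* Hypothetical source types; real ones would likely also carry a file path. */
  struct source_position
  {
    std::size_t line{};
    std::size_t col{};
  };

  struct source
  {
    source_position start;
    source_position end;
  };

  /* Hypothetical error value: what went wrong, a message, and where. */
  struct report
  {
    kind k{};
    std::string message;
    source src;
  };

  /* A kind-level message for call sites with nothing more specific to say. */
  constexpr std::string_view default_message(kind const k)
  {
    switch(k)
    {
      case kind::parse_invalid_reader_splice:
        return "#?@ splice must not be top-level";
      case kind::analysis_invalid_recur_from_try:
        return "recur may not be used within a 'try'";
    }
    return "unknown error";
  }

  /* Helpers: one named constructor per kind keeps call sites short. */
  inline report parse_invalid_reader_splice(source const &src, std::string message)
  {
    return report{ kind::parse_invalid_reader_splice, std::move(message), src };
  }

  inline report analysis_invalid_recur_from_try(source const &src)
  {
    return report{ kind::analysis_invalid_recur_from_try,
                   std::string(default_message(kind::analysis_invalid_recur_from_try)),
                   src };
  }
}
```

A call site would then read something like `sketch::error::parse_invalid_reader_splice({ { 1, 1 }, { 1, 4 } }, "...")`, passing a start/end range rather than a single position.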

Remaining work

  • Add all necessary source info as meta during parsing
  • Finish rework of analysis error handling
  • Polish error reporting (jar support, horizontal centering, various edge cases, etc)
  • Add a test suite for the error reporter
  • Tackle the whole world of runtime errors

@jeaye requested a review from frenchy64 on February 13, 2025 20:06

@frenchy64 (Contributor) left a comment

Wow! First skim, I haven't looked at the tests or tried it yet.

     * the whole pretty function string, so we copy it into an array which contains only
     * the type name.
     *
     * Just do type_name<T>() and there's your string_view. */

@frenchy64 (Contributor):

Final piece: We could do this all in type_name, since we're under a function parameterized by T. But to prevent the entire __PRETTY_FUNCTION__ string from being compiled into the binary, we reassemble it into a smaller string via an array.

@jeaye (Member, Author):

I said that already, in the second to last paragraph.
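
The technique discussed in this thread can be sketched as below. This is a minimal illustration under stated assumptions, not the code from this PR: it assumes C++17 and GCC or Clang (since `__PRETTY_FUNCTION__` is a compiler extension), and the helper names `pretty`, `slice`, `copy_chars`, and `name_storage` are hypothetical. Only `type_name<T>()` returning a `string_view` comes from the quoted comment. Whether the full pretty-function string is actually dropped from the binary depends on the compiler and optimization level.

```cpp
#include <array>
#include <cstddef>
#include <string_view>
#include <utility>

namespace sketch
{
  /* The compiler-provided string for this instantiation; it embeds T's name. */
  template <typename T>
  constexpr std::string_view pretty()
  {
    return __PRETTY_FUNCTION__;
  }

  /* Locates the type name by probing with a known type (void) to measure
   * how much text comes before and after the name. */
  template <typename T>
  constexpr std::string_view slice()
  {
    constexpr std::string_view probe{ pretty<void>() };
    constexpr std::size_t prefix{ probe.find("void") };
    constexpr std::size_t suffix{ probe.size() - prefix - 4 };
    constexpr std::string_view full{ pretty<T>() };
    return full.substr(prefix, full.size() - prefix - suffix);
  }

  /* Copies just the name's characters into an array of exactly that size. */
  template <typename T, std::size_t... I>
  constexpr std::array<char, sizeof...(I)> copy_chars(std::index_sequence<I...>)
  {
    constexpr std::string_view name{ slice<T>() };
    return { { name[I]... } };
  }

  /* One small array per type; this is the only storage type_name refers to. */
  template <typename T>
  inline constexpr auto name_storage{ copy_chars<T>(
    std::make_index_sequence<slice<T>().size()>{}) };

  template <typename T>
  constexpr std::string_view type_name()
  {
    return std::string_view{ name_storage<T>.data(), name_storage<T>.size() };
  }
}
```

The returned view is not null-terminated, so it should be used with its size rather than passed to C string functions.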

    case kind::analysis_invalid_recur_from_try:
      return "recur may not be used within a 'try'";
    case kind::analysis_invalid_recur_args:
      return "The argument arity passed to 'recur' don't match the function's arity";

@frenchy64 (Contributor):

don't => doesn't

     * is collapsed by an ellipsis. Thus, we need to expand the ellipsis either partially
     * or fully.
     *
     * We do this by finding the relevant ellipsis (there may be multiple) and inserting

@frenchy64 (Contributor):

ellipsis => ellipses?

Ditto remove unneeded ellips[e]s, maybe a few other places.

I'm not familiar with the word ellipse in the variable names.

    /* Remove ellipsis if needed. */
    if(lines[i].kind == line::kind::ellipsis)
    {
      /* We can be confident there's no note right after an ellipsis. */

@frenchy64 (Contributor):

I'm just skimming but I didn't catch which invariants guarantee this?

@jeaye (Member, Author):

The top margin guarantees this.
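
To make the ellipsis handling in the two snippets above concrete, here is a minimal sketch of expanding an ellipsis either partially or fully. It is not jank's implementation: the `line` struct, its fields, and the `reveal` function are hypothetical, and the sketch assumes every ellipsis sits between two source lines. Notes and the top margin mentioned in this thread are not modeled.

```cpp
#include <cstddef>
#include <vector>

namespace sketch
{
  struct line
  {
    enum class kind
    {
      source,
      ellipsis
    };

    kind k{};
    std::size_t number{}; /* Source line number; unused for an ellipsis. */
  };

  /* Makes source line `target` visible if an ellipsis currently hides it.
   * Returns false if no ellipsis covers that line. */
  inline bool reveal(std::vector<line> &lines, std::size_t const target)
  {
    for(std::size_t i{ 1 }; i + 1 < lines.size(); ++i)
    {
      if(lines[i].k != line::kind::ellipsis)
      {
        continue;
      }

      /* The hidden range is everything strictly between the neighbors. */
      auto const before{ lines[i - 1].number };
      auto const after{ lines[i + 1].number };
      if(target <= before || after <= target)
      {
        continue;
      }

      /* Keep an ellipsis on each side only if lines are still hidden there. */
      std::vector<line> replacement;
      if(target - before > 1)
      {
        replacement.push_back({ line::kind::ellipsis, 0 });
      }
      replacement.push_back({ line::kind::source, target });
      if(after - target > 1)
      {
        replacement.push_back({ line::kind::ellipsis, 0 });
      }

      auto const it{ lines.erase(lines.begin() + static_cast<std::ptrdiff_t>(i)) };
      lines.insert(it, replacement.begin(), replacement.end());
      return true;
    }
    return false;
  }
}
```

Revealing line 2 in a view of lines 1, an ellipsis, and 10 leaves an ellipsis between 2 and 10 (a partial expansion). Revealing line 2 between lines 1 and 3 removes the ellipsis entirely (a full expansion).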

    - return err(
    -   error{ start_token.pos, native_persistent_string{ "value after #( must be present" } });
    + return error::parse_invalid_shorthand_function({ start_token.start, latest_token.end },
    +                                                "Value after #( must be present");

@frenchy64 (Contributor):

Something's not sitting right with me talking about "values" for all these cases. Is this reporting EOF while reading these tokens?

What does this #( case in particular correspond to?

    ;; main branch
    clojure.core=> #(
    Read error (1 - 1): Unterminated list
    clojure.core=> #'
    Read error (1 - 1): value after #' must be present

@jeaye (Member, Author):

I haven't combed through every parse error yet, to verify they're all reachable. I think this one is not reachable and so should be either removed or changed to an internal parse error instead.

    @@ -735,7 +727,8 @@ namespace jank::read::parse
       }
       else
       {
    -    return err(error{ start_token.pos, native_persistent_string{ "Unknown symbolic value" } });
    +    return error::parse_invalid_reader_symbolic_value("Unsupported symbolic value",

@frenchy64 (Contributor):

I'd vote "Invalid symbolic value".

    - return err(error{ start_token.pos,
    -                   native_persistent_string{ "#?@ splice must not be top-level" } });
    + return error::parse_invalid_reader_splice({ start_token.start, latest_token.end },
    +                                           "#?@ splice must not be top-level");

@frenchy64 (Contributor):

"Top-level #?@ not allowed"?

Or simplify #?@ splice to #?@? Ditto below. Sort of reads like "reader conditional splice splice".

    - return err(error{ token.pos, fmt::format("unknown namespace: {}", ns_portion) });
    + return error::parse_unresolved_namespace(
    +   fmt::format("Unresolved namespace '{}'", ns_portion),
    +   { start_token.start, latest_token.end });

@frenchy64 (Contributor):

FYI this case might not be an error due to #195, but I'm working on that.

I have this as

            ns = ns_portion;

in my WIP branch, but I don't have an example handy:

https://github.com/jank-lang/jank/pull/239/files#diff-bced15263f03bb66a379b6f9346fa47a66e15b168e39a710e7153d1c115090e2R1222

@jeaye (Member, Author):

Gotcha. Leaving as-is for now. Can be fixed in your PR.

     * to later review that for error reporting. We automatically clean it up
     * and we reuse the same file over and over. */
    auto const tmp{ boost::filesystem::temp_directory_path() };
    auto const path{ tmp / boost::filesystem::unique_path("jank-repl-%%%%.jank") };

@frenchy64 (Contributor):

Thank you!

@jeaye (Member, Author):

Was this bothering you somehow? 😁

@frenchy64 (Contributor):

I found this file littered around my directories at some point.

@jeaye (Member, Author):

Hmmm. We weren't writing to a file before. Must've been from something else.

@frenchy64 (Contributor):

Oh, probably .jank-repl-history.

@jeaye (Member, Author):

Ah, yes. You'll see that. We don't clean them up. Leiningen does the same with .lein-repl-history.

@frenchy64 (Contributor):

Never been bothered by Leiningen's behavior here. Looks like it uses the project root if project.clj is present, and a config directory otherwise:

https://github.com/technomancy/leiningen/blob/24fb93936133bd7fc30c393c127e9e69bb5f2392/src/leiningen/repl.clj#L152-L153

@jeaye (Member, Author):

Yeah, I'd prefer if we did something like that, too. Not sure how we'd want to do it, though. We could just use jank's cache dir, which is in ~/.cache/jank by default. That would be a global history for all projects.

@frenchy64 (Contributor), Feb 14, 2025:

Yes, maybe jank should take a path for saving history, which could be set by, e.g., lein-jank.
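
The two ideas in this thread (an explicit history path supplied by tooling, with a global fallback under the cache directory) could be combined roughly like this. It is a sketch only: `repl_history_path` and its parameters are hypothetical, the `repl-history` file name is made up for illustration, and the fallback hard-codes the `~/.cache/jank` default mentioned above rather than consulting any real jank configuration.

```cpp
#include <filesystem>
#include <optional>

namespace sketch
{
  namespace fs = std::filesystem;

  /* Picks where REPL history is stored.
   * explicit_path: a path supplied by the caller (e.g. by a build tool), if any.
   * home_dir: the user's home directory, passed in so the logic is easy to test. */
  inline fs::path repl_history_path(std::optional<fs::path> const &explicit_path,
                                    fs::path const &home_dir)
  {
    if(explicit_path.has_value())
    {
      return *explicit_path;
    }
    /* Fallback: one global history file shared by all projects. */
    return home_dir / ".cache" / "jank" / "repl-history";
  }
}
```

With this shape, a project-aware tool can keep per-project history by passing its own path, while a bare REPL falls back to the shared file.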

@frenchy64 self-requested a review on February 13, 2025 21:12

@jeaye (Member, Author) commented Feb 13, 2025

> Wow! First skim, I haven't looked at the tests or tried it yet.

Thanks for the review. No requirement to try it locally. You're welcome to, though.

@frenchy64 (Contributor) commented Feb 14, 2025

@jeaye just tried it, it's great. I can look at the tests another time, for now I'm happy.

@jeaye merged commit 9138db9 into main on Feb 14, 2025
13 checks passed
@jeaye deleted the error-reporting branch on February 14, 2025 03:01

2 participants