Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dhall-format: Detect character set from input #2108

Merged
merged 15 commits into from
Dec 31, 2020

Conversation

basile-henry
Copy link
Collaborator

This PR implements the suggestion from dhall-lang/dhall-lang#1086

dhall format can now detect which character set is used in the input instead of defaulting to Unicode.

  • Only Dhall syntax is considered for the detection (strings and comments don't affect the decision)
  • If no Unicode is used in expression default to ASCII
  • Add CLI switch to specify Unicode explicitly

@basile-henry basile-henry force-pushed the basile/dhall-format-detect branch 2 times, most recently from f0da6f1 to 43a65dd Compare December 7, 2020 17:27
Copy link
Collaborator

@Gabriella439 Gabriella439 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Just a few small suggestions

dhall-openapi/openapi-to-dhall/Main.hs Outdated Show resolved Hide resolved
dhall/src/Dhall/Pretty/Internal.hs-boot Show resolved Hide resolved
dhall/src/Dhall/Main.hs Outdated Show resolved Hide resolved
@Gabriella439
Copy link
Collaborator

@basile-henry: Note that I'd like to hold off on merging this until we first cut a small 1.37.1 release to push out the fix in #2112

@basile-henry
Copy link
Collaborator Author

@Gabriel439 No worries, it can wait. I was also wondering if we should get feedback/approval from others in the dhall-lang/dhall-lang#1086 thread.

Someone in that thread wanted to be able to configure the language server to use Unicode specifically. Is this something I can do in this PR/codebase or do I need to synchronize with the VS code LSP plugin somewhere else as well?

Unrelated: About cutting fixes for previous releases, do you know that Mergify can backports PR automatically? https://docs.mergify.io/commands.html#backport 😄 I think we just need to keep around a branch for the latest release (or however many releases we want to keep supporting) and then we can easily backport a PR to that branch.

@basile-henry basile-henry force-pushed the basile/dhall-format-detect branch from 6edad2b to fefd169 Compare December 29, 2020 15:12
Comment on lines 43 to 45
data ServerConfig = ServerConfig
{ asciiOnly :: Bool
-- ^ Use ASCII symbols rather than fancy unicode when formatting and linting
-- code.
{ chosenCharacterSet :: Maybe CharacterSet
} deriving Show
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PanAeon Do you know if I need to modify something else to make this new config format work? Maybe somewhere in https://github.com/PanAeon/vscode-dhall-lsp-server ? I haven't seen anything that sets the LSP config there though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it works just fine without modifying the VS code extension, and this config can still be set from settings.json manually. I might try to make this new option accessible from the extension's UI, but it's not necessary for now.

@timbertson
Copy link
Collaborator

timbertson commented Dec 30, 2020

If no Unicode is used in expression default to ASCII

Is that accurate? I am extremely in favour of ASCII mode so I'd love it. But I haven't seen discussion of it, and dhall has historically defaulted to unicode, so I don't want to get my hopes up ;)

If it is accurate, are there eventual plans to drop unicode, or would it be supported as opt-in forever?

@basile-henry
Copy link
Collaborator Author

Is that accurate?

Yes! The main way people use the Unicode syntax is by letting the formatter change their operators automatically (I don't think anyone types in Unicode, but maybe I'm wrong).
We don't want to completely lose that, so the logic used for detection is: if there is at least one Unicode operator used, then use Unicode.
However that does mean that the default has to change, otherwise ASCII would never be detected.

dhall has historically defaulted to unicode

Yes, and that's one of the reasons I've been so slow at merging this PR. Ideally I'd like more feedback on this new way of handling character sets. I know some people have become attached to the Unicode character set and now they have one extra step to use it (either type a Unicode operator once or use dhall format --unicode once).

are there eventual plans to drop unicode, or would it be supported as opt-in forever

As far as I know there are no plans to drop Unicode.

@Gabriella439
Copy link
Collaborator

@basile-henry: I'm fine merging this now. There's no need to hold off merging any longer, in my opinion

@timbertson: This comment explains why the default is Unicode if the file has a mixture of ASCII and unicode: dhall-lang/dhall-lang#1086 (comment). If the file is all ASCII then it will default to ASCII

@basile-henry
Copy link
Collaborator Author

I'm fine merging this now. There's no need to hold off merging any longer, in my opinion

Alright, let's do it! We can always revert/fixup as needed in future PRs 👍

@basile-henry basile-henry merged commit fa27901 into master Dec 31, 2020
@basile-henry basile-henry deleted the basile/dhall-format-detect branch December 31, 2020 12:30
@Gabriella439
Copy link
Collaborator

@basile-henry: Thank you for contributing this! 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants