fix(biome_console): fix printing emojis consisting of multiple codepoints #693
I see from the test run that the Windows character replacement stuff is not working fully. I'll try to figure out something there!
And here I thought that the issue was the formatter :) Great work here and thank you for fixing it!
/// Determines if a unicode grapheme consists only of code points
/// which are considered whitespace characters in ASCII
fn grapheme_is_whitespace(grapheme: &str) -> bool {
    grapheme.chars().all(|c| c.is_whitespace())
}
TIL "grapheme" 😄
Summary
Fixes #455
The source of the bug was that the code used the .chars() iterator, which yields individual Unicode code points. Some complex Unicode characters, such as emojis, can consist of multiple code points, which the code did not handle correctly. As suggested in the PR (and the conclusion I came to as well), I used the unicode_segmentation library to iterate over graphemes (a grapheme is basically a character as perceived by humans). This fixes the error by treating multiple code points that represent a single character as one unit.
Test Plan
I've added a new test where the input string contains emojis that the CLI couldn't print before, and I'm basically asserting that the output is the exact same as the input.
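To make the code point versus grapheme distinction from the summary concrete, here is a minimal std-only sketch. It is not the code from this PR: the helper names `code_point_count` and `is_split_by_chars` and the two sample constants are hypothetical, chosen only for illustration. It shows that `.chars()` breaks a single perceived emoji into several pieces, which is the situation the old code did not handle.

```rust
/// Counts Unicode code points, the unit that `.chars()` iterates over.
fn code_point_count(s: &str) -> usize {
    s.chars().count()
}

/// Hypothetical helper: true when a string that a reader perceives as one
/// character is split into more than one piece by `.chars()`.
fn is_split_by_chars(perceived_char: &str) -> bool {
    code_point_count(perceived_char) > 1
}

// Family emoji: man + zero-width joiner + woman + zero-width joiner + girl.
const FAMILY: &str = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}";

// Flag emoji: two regional indicator symbols (D and E).
const FLAG_DE: &str = "\u{1F1E9}\u{1F1EA}";
```

Under these assumptions, `code_point_count(FAMILY)` is 5 and `code_point_count(FLAG_DE)` is 2, although each is displayed as one character. Iterating with `UnicodeSegmentation::graphemes(s, true)` from the unicode_segmentation crate should instead yield each of them as a single unit, which is the behavior the fix relies on.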