-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch from colored to owo_colors/anstream #10784
Conversation
CodSpeed Performance ReportMerging #10784 will not alter performanceComparing Summary
|
|
55 |-nested_fstrings = f'{f'{f''}'}' | ||
55 |+nested_fstrings = f'\b{f'{f''}'}' | ||
55 |-nested_fstrings = f'{f'{f''}' | ||
55 |+nested_fstrings = f'\b{f'{f''}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to understand, why did the outer }
get stripped here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, I was just editing my comment about this in the PR summary. I don't quite understand this either yet. These are tests of files with invalid/non-printable characters. Locally I thought what was happening was that anstream was just stripping the invalid chars, but looking at it now in the github diff it looks different. So I need to investigate this a bit more tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh interesting! Yeah I did see the PR summary prior to writing this, but was trying to figure out what's up here.
This looks good to me! I'll hold off on approving while we figure out what's up with those snapshots. |
The anstream docs do say:
I don't fully understand what that means, but it gives me hope that it does support Windows 10. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me and I like the Colors::none
in tests. Much cleaner than my feature flag hack.
Would you mind running the CPython hyperfine benchmark and compare main with your changes. I remember that there was a perf regression when I worked on this last time (my solution was different because I used if_supports_colors
. I'm fine with a perf regression. I only want to understand the impact of the change.
See https://github.com/astral-sh/ruff/blob/main/CONTRIBUTING.md#cpython-benchmark
I'm sorry, but I only have a Win 11 installation. I added an inline comment regarding the snapshot changes. I think the changes are good because they match what users see when running Ruff (what we show in snapshots today is a lie! It's not what our users see; we've been tricked by insta).
cargo run --bin ruff -- check --select PLE /home/micha/astral/ruff/crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py --no-cache --preview
Finished dev [unoptimized + debuginfo] target(s) in 0.10s
Running `target/debug/ruff check --select PLE /home/micha/astral/ruff/crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py --no-cache --preview`
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:15:6: PLE2510 [*] Invalid unescaped character backspace, use "\b" instead
|
13 | # (Pylint, "C3002") => Rule::UnnecessaryDirectLambdaCall,
14 | #foo = 'hi'
15 | b = '
| PLE2510
16 | b = f'
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:16:7: PLE2510 [*] Invalid unescaped character backspace, use "\b" instead
|
14 | #foo = 'hi'
15 | b = '
16 | b = f'
| PLE2510
17 |
18 | b_ok = '\\b'
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:24:12: PLE2512 [*] Invalid unescaped character SUB, use "\x1A" instead
|
22 | cr_ok = f'\\r'
23 |
24 | sub = 'sub '
| PLE2512
25 | sub = f'sub '
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:25:13: PLE2512 [*] Invalid unescaped character SUB, use "\x1A" instead
|
24 | sub = 'sub '
25 | sub = f'sub '
| PLE2512
26 |
27 | sub_ok = '\x1a'
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:30:16: PLE2513 [*] Invalid unescaped character ESC, use "\x1B" instead
|
28 | sub_ok = f'\x1a'
29 |
30 | esc = 'esc esc
| PLE2513
31 | esc = f'esc esc
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:31:17: PLE2513 [*] Invalid unescaped character ESC, use "\x1B" instead
|
30 | esc = 'esc esc
31 | esc = f'esc esc
| PLE2513
32 |
33 | esc_ok = '\x1b'
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:37:5: PLE2514 [*] Invalid unescaped character NUL, use "\0" instead
|
36 | nul = '''
37 | nul '''
| PLE2514
38 | nul = f'''
39 | nul '''
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:39:5: PLE2514 [*] Invalid unescaped character NUL, use "\0" instead
|
37 | nul '''
38 | nul = f'''
39 | nul '''
| PLE2514
40 |
41 | nul_ok = '\0'
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:44:13: PLE2515 [*] Invalid unescaped character zero-width-space, use "\u200B" instead
|
42 | nul_ok = f'\0'
43 |
44 | zwsp = 'zerowidth'
| PLE2515
45 | zwsp = f'zerowidth'
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:45:14: PLE2515 [*] Invalid unescaped character zero-width-space, use "\u200B" instead
|
44 | zwsp = 'zerowidth'
45 | zwsp = f'zerowidth'
| PLE2515
46 |
47 | zwsp_ok = '\u200b'
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:50:36: PLE2515 [*] Invalid unescaped character zero-width-space, use "\u200B" instead
|
48 | zwsp_ok = f'\u200b'
49 |
50 | zwsp_after_multibyte_character = "ಫ"
| PLE2515
51 | zwsp_after_multibyte_character = f"ಫ"
52 | zwsp_after_multicharacter_grapheme_cluster = "ಫ್ರಾನ್ಸಿಸ್ಕೊ "
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:51:37: PLE2515 [*] Invalid unescaped character zero-width-space, use "\u200B" instead
|
50 | zwsp_after_multibyte_character = "ಫ"
51 | zwsp_after_multibyte_character = f"ಫ"
| PLE2515
52 | zwsp_after_multicharacter_grapheme_cluster = "ಫ್ರಾನ್ಸಿಸ್ಕೊ "
53 | zwsp_after_multicharacter_grapheme_cluster = f"ಫ್ರಾನ್ಸಿಸ್ಕೊ "
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:52:60: PLE2515 [*] Invalid unescaped character zero-width-space, use "\u200B" instead
|
50 | zwsp_after_multibyte_character = "ಫ"
51 | zwsp_after_multibyte_character = f"ಫ"
52 | zwsp_after_multicharacter_grapheme_cluster = "ಫ್ರಾನ್ಸಿಸ್ಕೊ "
| PLE2515
53 | zwsp_after_multicharacter_grapheme_cluster = f"ಫ್ರಾನ್ಸಿಸ್ಕೊ "
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:52:61: PLE2515 [*] Invalid unescaped character zero-width-space, use "\u200B" instead
|
50 | zwsp_after_multibyte_character = "ಫ"
51 | zwsp_after_multibyte_character = f"ಫ"
52 | zwsp_after_multicharacter_grapheme_cluster = "ಫ್ರಾನ್ಸಿಸ್ಕೊ "
| PLE2515
53 | zwsp_after_multicharacter_grapheme_cluster = f"ಫ್ರಾನ್ಸಿಸ್ಕೊ "
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:53:61: PLE2515 [*] Invalid unescaped character zero-width-space, use "\u200B" instead
|
51 | zwsp_after_multibyte_character = f"ಫ"
52 | zwsp_after_multicharacter_grapheme_cluster = "ಫ್ರಾನ್ಸಿಸ್ಕೊ "
53 | zwsp_after_multicharacter_grapheme_cluster = f"ಫ್ರಾನ್ಸಿಸ್ಕೊ "
| PLE2515
54 |
55 | nested_fstrings = f{f'{f'}'}'
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:53:62: PLE2515 [*] Invalid unescaped character zero-width-space, use "\u200B" instead
|
51 | zwsp_after_multibyte_character = f"ಫ"
52 | zwsp_after_multicharacter_grapheme_cluster = "ಫ್ರಾನ್ಸಿಸ್ಕೊ "
53 | zwsp_after_multicharacter_grapheme_cluster = f"ಫ್ರಾನ್ಸಿಸ್ಕೊ "
| PLE2515
54 |
55 | nested_fstrings = f{f'{f'}'}'
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:55:21: PLE2510 [*] Invalid unescaped character backspace, use "\b" instead
|
53 | zwsp_after_multicharacter_grapheme_cluster = f"ಫ್ರಾನ್ಸಿಸ್ಕೊ "
54 |
55 | nested_fstrings = f{f'{f'}'}'
| PLE2510
56 |
57 | # https://github.com/astral-sh/ruff/issues/7455#issuecomment-1741998106
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:55:25: PLE2512 [*] Invalid unescaped character SUB, use "\x1A" instead
|
53 | zwsp_after_multicharacter_grapheme_cluster = f"ಫ್ರಾನ್ಸಿಸ್ಕೊ "
54 |
55 | nested_fstrings = f{f'{f'}'}'
| PLE2512
56 |
57 | # https://github.com/astral-sh/ruff/issues/7455#issuecomment-1741998106
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:55:29: PLE2513 [*] Invalid unescaped character ESC, use "\x1B" instead
|
53 | zwsp_after_multicharacter_grapheme_cluster = f"ಫ್ರಾನ್ಸಿಸ್ಕೊ "
54 |
55 | nested_fstrings = f{f'{f'}'}'
| PLE2513
56 |
57 | # https://github.com/astral-sh/ruff/issues/7455#issuecomment-1741998106
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:58:12: PLE2512 [*] Invalid unescaped character SUB, use "\x1A" instead
|
57 | # https://github.com/astral-sh/ruff/issues/7455#issuecomment-1741998106
58 | x = f"""}}ab"""
| PLE2512
59 | # https://github.com/astral-sh/ruff/issues/7455#issuecomment-1741998256
60 | x = f"""}}a"""
|
= help: Replace with escape sequence
crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py:60:12: PLE2513 [*] Invalid unescaped character ESC, use "\x1B" instead
|
58 | x = f"""}}ab"""
59 | # https://github.com/astral-sh/ruff/issues/7455#issuecomment-1741998256
60 | x = f"""}}a"""
| PLE2513
|
= help: Replace with escape sequence
Found 21 errors.
[*] 21 fixable with the `--fix` option.
Note: We use labels to generate the changelog. Labeling our PRs helps the person doing the release tremendously. I added the internal
label because this is an internal refactor that, hopefully, isn't visible to users. You might want to change the label to CLI if there's some observable difference for users (e.g. do we support new environment variables?)
@@ -5,27 +5,27 @@ invalid_characters.py:30:16: PLE2513 [*] Invalid unescaped character ESC, use "\ | |||
| | |||
28 | sub_ok = f'\x1a' | |||
29 | | |||
30 | esc = 'esc esc ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of makes sense to me. I suspect that it is insta who renders the ESC
symbol for the whitespace only character. So the problem now is that anstream
filters the characters out before they're passed to anstream
.
This goes into the "show fixes" project but I think we should improve our code frames to replace non-printable characters with printable characters before printing.
https://github.com/biomejs/biome/blob/5505a1a0c7302f6aee1a9642bc83ed064e7820ea/crates/biome_diagnostics/src/display/frame.rs#L305-L406
I haven't checked if there's a Rust library that does that.
I'm okay merging this PR and creating an issue for this instead. I actually kind of like the output now better because I suspect it now shows what users see in the terminal. @carljm could you test if what the snapshot shows matches Ruff's (main) output when run on the console?
It does look like this is a regression; I presume that's because of the extra stream filtering done by
That's pretty significant; it's like 11%. Do we still want to go ahead with this change? I think we have two alternatives:
I can also dig into the cause of the regression more by generating a profile, but I'm not optimistic there will be much we can reasonably do to reduce the regression while still using the approach in this PR. It's not clear to me if there are strong motivations to move off of |
Oh wow. I'm surprised it's that significant -- thanks for benchmarking! (Maybe @MichaReiser will be less surprised?) I agree that 11% is way too much for something that's largely internal convenience with no user-facing benefit. Sorry for sending you on this path... That's my fault. For context, the motivation for moving to My vote would be to just implement |
I'm not surprised. I don't remember the exact numbers but I know it was significant enough that I didn't dare to put up a PR ;) (although my solution wasn't as good as @carljm's). I prefer to create a build of both versions and run hyperfine as a single command to reduce the chance that any other programs (or caches) influence the benchmark. I do this by building main and copy it to
It seems that the CPython is an edge case because it is a project with many diagnostics and Ruff prints all the diagnostics (it creates a 2MB diagnostic file using concise formatting). I strongly suspect that the slowest part of running ruff is my terminal rendering this wall of text. Do we see any regression when linting a project that has adopted Ruff (e.g. airflow?). I'm undecided how much the perf regression should bother me. I suspect that this use case will get slower when we switch to Long term, I would love to see us having our own @carljm would you mind creating an issue that the code frame and diff rendering should escape non-visible characters? |
Reading Micha's comment: I think it makes sense to do some basic testing to understand whether (1) the terminal detection is a contributor to the slowness (hardcode to always use a non-colored stream), and then (2) whether the stream-stripping is a contributor to the slowness (hardcode to always use a colored stream, which presumedly is zero-cost). But yeah, let's timebox it to an hour or so, since it seems unlikely to ship right now. |
Yeah, confirmed that the regression doesn't improve at all if we force no-color (always filter the streams):
And it (mostly) disappears if we run with color always on (no stream filtering):
I think this confirms that the If we really do like the For now I'll close this and put up a separate PR to just add I'll also open a separate issue about escaping non-printable chars in code frames / diffs. |
Just in case anyone comes back to this PR in future, I wanted to also record some of my discoveries about the invalid-characters behavior here. It is sort of true that this PR improves consistency between what users see and what insta sees, but only if the user is running in a no-color environment, so that anstream strips their output (like it does in tests). If the user is running in a color environment (probably the more common scenario), then this PR actually increases the divergence between what the user sees and what we see in Additionally, anstream's stripping is kind of unpredictable in the face of So I don't think the current behavior of this PR is something we'd want to merge, and I'm honestly not sure what we should do regarding non-printable chars (escaping them seems reasonable, but oddly neither |
Minor curiosity... why didn't Codspeed show this performance regression? |
I don't know the answer, but IIRC CodSpeed can't / doesn't measure the cost of allocations, so if this the regression came from increased allocations, that wouldn't have shown up. |
Spot-checking some of the benchmarks that Codspeed runs, it looks like they are library benchmarks, not full sub-process benchmarks (i.e. they benchmark calling some library function with some data), so I think they are completely blind to the performance of console I/O. I also grepped |
Oh yeah, that's interesting. |
That could be (I guess CodSpeed just measures CPU instructions?). But I would suspect that at least some of the cost here has to be actually scanning the strings for the ANSI escape codes (that can't be free!), and I would think CPU instrumentation would catch that. |
Some quick thoughts for anyone that comes back to this
|
Hi @epage It's an honor to see you chime in on this discussion. Regarding using
|
imo for something like Ruff, you likely don't need the generalization that was done in
While the original purpose of |
Refs #5499
Summary
This switches us from the
colored
crate to theowo-colors
andanstream
crates for handling terminal colors. The eventual goal here is to support theFORCE_COLOR
env var, but this PR doesn't actually do that yet; that'll come in a separate PR.The approach here is to separate concerns: we always write colored text (using
owo-colors
) and we wrap our output streams inanstream::AutoStream
, which can detect if the terminal supports colors, and strip out the ANSI color codes if not.Some points for reviewer attention:
colored
crate. It's also not clear what the equivalent of this workaround foranstream
would be. We need someone to test this on Windows 10, which I don't have available; if nobody else does either, I can get a VM.anstream
is doing something odd with them when it detects a non-terminal. This won't change things for most users who are running from a terminal. Need to look into this more to understand what's happening; I'd initially thought anstream was stripping out the invalid characters, but in the github diff view it doesn't look like that's quite what's happening. Not sure what we can do differently here, short of scrapping the whole approach of usinganstream
.let tip = "...".bold().green();
withlet tip = "...".bold(); let tip = tip.green();
. This is because withowo-colors
the former causes a compiler error:"...".bold()
creates a temporary that is freed beforetip
is used. Naming the temporary allows it to live long enough. This was only a problem one place, since most places we use chained styling methods, the result is used immediately, so there isn't a lifetime problem. Naming the temporary was the compiler-suggested fix; if there's a nicer way let me know!ruff_linter::colors;
module isn't strictly needed in this PR (at the moment it's purely passing through toanstream
), but I think it's more robust if we consolidate our color-choosing logic; this will be needed in the next PR that adds support forFORCE_COLOR
(see below).TODO:
Why doesn't this PR already support
FORCE_COLOR
?Anstream doesn't have any built-in support for
FORCE_COLOR
.owo-colors
has optional support for it, via thesupports-color
feature, but this requires using a less convenient (and likely less efficient)if-supports-color
API. And doing color support selection inowo-colors
is irrelevant if we are usinganstream
, because ifanstream
doesn't recognizeFORCE_COLOR
itself, it'll just happily strip out the color codes anyway. So my plan forFORCE_COLOR
is to add a check for it inruff_linter::colors::choice
function, forcing it to return "yes colors" regardless of terminal. But I think it's clearer to add that behavior change in a separate PR.Test Plan
Existing test suite passes.
I also verified that
cargo run -- check ...
does have colored output in my terminal by default, does not have colored output if I setNO_COLOR=1
, does not have colored output if I setCLICOLOR=0
, does not have colored output if I pipe toless
, and does have colored output if I setCLICOLOR_FORCE=1
and pipe toless
.