Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn about C-style octal literals #131309

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

GrigorenkoPV
Copy link
Contributor

@GrigorenkoPV GrigorenkoPV commented Oct 5, 2024

Uplifts clippy::zero_prefixed_literal as leading_zeros_in_decimal_literals, warn-by-default.


The leading_zeros_in_decimal_literals lint detects decimal integral literals with leading zeros.

Example

fn is_executable(unix_mode: u32) -> bool {
    unix_mode & 0111 != 0

Explanation

In some languages (including the infamous C language and most of its family), a leading zero marks an octal constant. In Rust however, a 0o prefix is used instead.
Thus, a leading zero can be confusing for both the writer and a reader.

In Rust:

fn main() {
    let a = 0123;
    println!("{}", a);
}

prints 123, while in C:

#include <stdio.h>

int main() {
    int a = 0123;
    printf("%d\n", a);
}

prints 83 (as 83 == 0o123 while 123 == 0o173).


Notes

Compared to the Clippy lint, this will not warn if the literal is unambiguous, i.e. it has 8's or 9's in it (clearly not octal) or it is <10 (in this case the value does not depend on the base).

Closes #33448
r? @ghost

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 5, 2024
@rust-log-analyzer

This comment has been minimized.

@jieyouxu jieyouxu added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Oct 6, 2024
@bors

This comment was marked as resolved.

@GrigorenkoPV GrigorenkoPV changed the title suspicious_leading_zero lint for detecting C-style octals Warn on C-style octal literals Nov 23, 2024
@GrigorenkoPV GrigorenkoPV changed the title Warn on C-style octal literals Warn about C-style octal literals Nov 23, 2024
@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from fe17bb5 to def1383 Compare November 23, 2024 22:50
@rust-log-analyzer

This comment has been minimized.

@bors

This comment was marked as resolved.

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from def1383 to f70a324 Compare December 3, 2024 23:34
@GrigorenkoPV
Copy link
Contributor Author

@rustbot ready

r? @Urgau

@GrigorenkoPV GrigorenkoPV marked this pull request as ready for review December 3, 2024 23:35
@rustbot
Copy link
Collaborator

rustbot commented Dec 3, 2024

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

@rust-log-analyzer

This comment has been minimized.

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from f70a324 to e7881ba Compare December 4, 2024 13:19
@rust-log-analyzer

This comment has been minimized.

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from e7881ba to ea90417 Compare December 4, 2024 16:16
@rust-log-analyzer

This comment has been minimized.

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from ea90417 to 2bfd5cf Compare December 4, 2024 18:04
@rust-log-analyzer

This comment has been minimized.

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from 2bfd5cf to 81b09ce Compare December 4, 2024 20:54
@rustbot
Copy link
Collaborator

rustbot commented Dec 4, 2024

The Miri subtree was changed

cc @rust-lang/miri

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from 81b09ce to 24e10ba Compare December 5, 2024 13:08
@Urgau
Copy link
Member

Urgau commented Dec 5, 2024

This looks fine to me. Let's nominate it for T-lang fcp (which could unfortunatly take months, due to T-lang backlog).

I've taken the liberty to adjust the PR description to include the lint description as well, in order to clearly see what the lint does. Feel free to update it if you update the one in the PR.

@rustbot labels -S-waiting-on-review +S-waiting-on-team +I-lang-nominated

@rustbot rustbot added I-lang-nominated Nominated for discussion during a lang team meeting. S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 5, 2024
@scottmcm
Copy link
Member

Maybe excluding numbers with 8's or 9's in them isn't a bad idea either.

That seems like a clear win to me, since that's obviously not coming from some copy-pasting octal constants from C.

In general, I'm strongly opposed to warning on any constant that isn't actually ambiguous. If it's not octal or it's interpreted the same in decimal and octal, any warning doesn't actually provide value.

Personally, add me to the "skeptical about doing this always" camp. Especially with rustfmt being so common and how it hates vertical alignment, using zeros like this seems an entirely reasonable way for people to tabulate.


In general, is this really a concern for anything other than 0755 and 0640 and such? Seems like maybe we could also restrict warning to only the zero-then-three-octal-digits case, because I don't think 010 is likely to be an octal mistake, for example.

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch 2 times, most recently from f2ba593 to 7ce1ef1 Compare January 13, 2025 17:57
@GrigorenkoPV
Copy link
Contributor Author

GrigorenkoPV commented Jan 13, 2025

I've relaxed the lint to not warn about numbers <10 and about literals with 8's or 9's in them.

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from 7ce1ef1 to 1ff80c4 Compare January 13, 2025 18:07
@joshtriplett
Copy link
Member

joshtriplett commented Jan 13, 2025

In general, is this really a concern for anything other than 0755 and 0640 and such? Seems like maybe we could also restrict warning to only the zero-then-three-octal-digits case, because I don't think 010 is likely to be an octal mistake, for example.

077 is reasonably likely to be an octal mistake, intended to be the group and other permission bits.

I would prefer to warn about every case that could be interpreted as octal and would have a different meaning.

We could also exclude any literals containing underscores.

I don't think it's unreasonable for people to have to allow a lint if they want to use zeroes for alignment. But also, it's easy to use rustfmt::skip and then align with spaces.

@rust-log-analyzer

This comment has been minimized.

@traviscross
Copy link
Contributor

traviscross commented Jan 13, 2025

Here's another thing. In C, I always found it mildly annoying you couldn't write decimals with leading zeros due to that weird octal literal rule. There may have been more times that I actually wanted to write leading zeros than that I wanted to write in octal.

So it'd be kind of darkly ironic if, in Rust, having not made the same mistake, we'd go out of our way to tell people these are discouraged due to C's weird octal literal rule. In a way, we would be teaching new generations of people -- who might not ever learn C themselves -- about C's weird octal literal rule.

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from 1ff80c4 to c3305be Compare January 13, 2025 20:12
@rust-log-analyzer

This comment has been minimized.

@joshtriplett
Copy link
Member

joshtriplett commented Jan 14, 2025

@traviscross I can appreciate that. On the other hand, UNIX is the gift that keeps on giving, and octal lives on in permission bits, file modes, and (as a consequence) Git repository internals, among other places. Because of that, octal remains just common enough to be a potential hazard.

If the consequence is that we need to write one allow if we actually want to use leading zeroes (or switch to using leading spaces with a rustfmt::skip), that seems reasonable. The failure mode of "my personal style produces a warning and I need to change it or add an allow" seems less hazardous than the failure mode of "my permissions are wrong in a way that's extremely difficult to spot when looking at the code".

I think the mitigation of allowing two-digit zero padding (e.g. for dates) will help greatly in reducing how often people encounter this as a "false" positive. For the remainder, I think there are enough C programmers or recovering C programmers in the world that we could be nice to them by not breaking their mental lexers when they read other people's code. :)

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jan 14, 2025 via email

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jan 14, 2025 via email

@joshtriplett
Copy link
Member

I feel that targeting this lint specifically at unix permission bits would make sense -- something like 0[0-7][0-7][0-7].

077 is also likely to be permissions, and 0100644 is likely to be a file mode.

@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from c3305be to eacb589 Compare January 14, 2025 12:56
@joshtriplett
Copy link
Member

So, proposal:

I think we should do two separate FCPs here. First, let's do an FCP for approving the lint, as allow-by-default. Then, separately, we can work on consensus for making it warn.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jan 14, 2025 via email

@joshtriplett
Copy link
Member

Could @rust-lang/lang folks who don't object to having the lint as allow-by-default please check a box at #131309 (comment) ? When that's done we can do a separate PR and FCP for whether to warn by default, either with the current lint or one with additional scope reduction to specific number lengths.

@GrigorenkoPV
Copy link
Contributor Author

Does an allow-by-default rustc lint has better discoverability than a warn-by-default clippy lint?

@nikomatsakis
Copy link
Contributor

@rfcbot reviewed

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jan 15, 2025
@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jan 15, 2025

@rfcbot concern allow-by-default

Per @joshtriplett's comment, the FCP has changed to allow-by-default, but the OP on the PR (and afaik the impl) is still warn-by-default. Filing this concern until those are updated.

@nikomatsakis
Copy link
Contributor

Regarding the lint, I think my ideal would probably be to have this lint as allow-by-default and a more narrowly targeted one as allow-by-default, with a "subtyping" relationship -- I guess this means a leading_zero lint group or something?

@bors
Copy link
Contributor

bors commented Jan 26, 2025

☔ The latest upstream changes (presumably #136070) made this pull request unmergeable. Please resolve the merge conflicts.

@traviscross traviscross added I-lang-radar Items that are on lang's radar and will need eventual work or consideration. and removed I-lang-nominated Nominated for discussion during a lang team meeting. labels Jan 26, 2025
@GrigorenkoPV GrigorenkoPV force-pushed the suspicious-leading-zero branch from eacb589 to b410dcc Compare January 28, 2025 15:11
Comment on lines +1 to +8
error: lint `clippy::zero_prefixed_literal` has been renamed to `leading_zeros_in_decimal_literals`
--> tests/ui/literals.rs:5:9
|
LL | #![warn(clippy::zero_prefixed_literal)]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use the new name: `leading_zeros_in_decimal_literals`
|
= note: `-D renamed-and-removed-lints` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(renamed_and_removed_lints)]`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests for the Clippy lint needs to be updated as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. I-lang-radar Items that are on lang's radar and will need eventual work or consideration. proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lint against leading 0 in integer literals