Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pythonlib: add linting with ruff #4072

Merged
merged 17 commits into from
Jan 31, 2025
Merged

Conversation

jodersky
Copy link
Member

@jodersky jodersky commented Dec 4, 2024

Add a RuffModule, which serves as a code formatter and linter.

Note: Ruff can also serve as a drop-in replacement for Black, so we'll use that for formatting too.

Part of #3928

@jodersky jodersky added the pythonlib Issues related to Mill's python support label Dec 4, 2024
@himanshumahajan138
Copy link
Contributor

@jodersky Sir are you working on this or I can take over this to finish line

@jodersky
Copy link
Member Author

jodersky commented Dec 26, 2024 via email

@jodersky jodersky force-pushed the jo/python-linting branch 2 times, most recently from 3ed34ba to 03ed33f Compare January 19, 2025 20:05
@jodersky jodersky force-pushed the jo/python-linting branch 2 times, most recently from 89178c1 to d4cf900 Compare January 24, 2025 18:26
@jodersky
Copy link
Member Author

@lihaoyi, I think I'm happy with the linting and formatting part, would you mind giving it a first review? I'll finish up the coverage before final review however.

One question I had was about adding something like a ScalafmtModule's reformatAll global task. I'm not sure why it exists for scalafmt, since you can also do everything with wildcard selectors, e.g. __.reformat? My assumption is that it's related to avoiding the performance overhead of starting a scalafmt process for every module. Since ruff is a very fast command line utility, I just went with implementing regular tasks, and telling user to call __.ruffFormat if they want to reformat everything.

@lihaoyi
Copy link
Member

lihaoyi commented Jan 25, 2025

Thanks will take a look

One question I had was about adding something like a ScalafmtModule's reformatAll global task. I'm not sure why it exists for scalafmt, since you can also do everything with wildcard selectors, e.g. __.reformat? My assumption is that it's related to avoiding the performance overhead of starting a scalafmt process for every module. Since ruff is a very fast command line utility, I just went with implementing regular tasks, and telling user to call __.ruffFormat if they want to reformat everything.

The goal of the global ScalafmtModule.reformatAll is to let people use ScalaFmt without needing to mix in the ScalafmtModule trait into their modules. Just a bit of convenience since it's such a common need to format stuff

Comment on lines 28 to 31
private def configArgs: Task[Seq[os.Shellable]] = Task.Anon {
val cfg = ruffConfigFile()
if (os.exists(cfg.path)) Seq[os.Shellable]("--config", cfg) else Seq.empty[os.Shellable]
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.Shellable is basically an implicit constructor for conveniently passing Seq[String]s, so we can probably just use T[Seq[String]] here instead

Copy link
Member Author

@jodersky jodersky Jan 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why I didn't go with a plain string seq, was to prevent caching: essentially, I want factor out the argument string --config, <path>, but also want to invalidate the result if the contents of the config file change. I didn't think this through indeed and somehow thought that shellable will handle this (but of course that's wrong).

Ideally this would be a helper function, not a task. However, I'm not sure how I can call others tasks in helper functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can have helper functions that return anonymous tasks, or anoynymous tasks which return a function type

If you want to make a task that invalidates whenever the file changes, you want a Task.Input

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can have helper functions that return anonymous tasks, or anoynymous tasks which return a function type

This sounds like what I'm looking for, thanks!

If you want to make a task that invalidates whenever the file changes, you want a Task.Input

But this wouldn't work if I want the dependents of the task to be invalidated, right?

def foo = Task.Input { "a" }
def bar = Task { foo() }

If I run bar(), then change "a" to "b", then re-run bar(), it would not actually be executed. Only foo would be reevaluated since it's an input, but since its output hasn't changed, bar would not be rerun afterwards.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use a Task.Source pointing at the config file, then the contents of the config file changing will cause the returned PathRef to have a different hash, and any task immediately downstream of the config file will be forced to re-evaluate.

Transitive downstream tasks then may or may not invalidate depending on whether the return value of the intermediate tasks changed, which is probably what you want most of the time

Copy link
Member Author

@jodersky jodersky Jan 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, what was confusing me was that if I factored out the the command line construction (which contains only the file path, NOT the pathref) as an intermediary to the actual file contents, then it would effectively absorb any changes in the cache chain.

However, after some more reading through the task source code and testing, I think that a Task.Anon is actually sufficient here; there's no need for fancy function-returning tasks.

What I missed was that the results of Anon tasks are never cached, so if an upstream change causes an Anon task to be re-run, its dependents will also be re-run.

Thanks for your explanations!

@lihaoyi
Copy link
Member

lihaoyi commented Jan 25, 2025

I think this looks good for me. Apart from coverage, last thing to do would be to add the global mill.pythonlib.RuffModule/reformatAll command, for convenient usage without needing to inherit from any trait

@lihaoyi lihaoyi marked this pull request as ready for review January 31, 2025 10:07
@jodersky
Copy link
Member Author

As I mentioned directly to you, I've moved coverage to a separate PR

@lihaoyi lihaoyi merged commit 99638a6 into com-lihaoyi:main Jan 31, 2025
31 checks passed
@lefou lefou added this to the 0.12.6 milestone Jan 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pythonlib Issues related to Mill's python support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants