feat(Cache): add miniget command, a minimal version of get #20568

Closed
wants to merge 13 commits into from

Conversation

grunweg
Collaborator

@grunweg grunweg commented Jan 8, 2025

Add a miniget command, a more minimal version of the get command. Instead of downloading the entire mathlib cache, it finds all .lean files in the current directory (excluding .lake directories), reads all "import Mathlib X.Y.Z" lines in them, and runs lake exe cache get X/Y/Z.lean for these lines. In short, this downloads only the cache necessary for a given project.
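
The procedure described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Lean code from this PR: the function name `mathlib_import_paths` and the regular expression are hypothetical, and it assumes one module per `import` line and that the resulting path keeps the leading `Mathlib/` component.

```python
import os
import re
from pathlib import Path

# Matches lines such as "import Mathlib.Topology.Basic".
# Assumption: only imports whose first component is "Mathlib" are of interest.
IMPORT_RE = re.compile(r"^import\s+(Mathlib(?:\.[A-Za-z0-9_']+)*)\s*$")


def mathlib_import_paths(root):
    """Return the sorted Mathlib file paths imported by .lean files under root."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .lake directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d != ".lake"]
        for name in filenames:
            if not name.endswith(".lean"):
                continue
            text = Path(dirpath, name).read_text(encoding="utf-8")
            for line in text.splitlines():
                match = IMPORT_RE.match(line)
                if match:
                    # Mathlib.X.Y.Z -> Mathlib/X/Y/Z.lean
                    found.add(match.group(1).replace(".", "/") + ".lean")
    return sorted(found)
```

The returned paths are the kind of file arguments that, per the description above, would be handed to lake exe cache get.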

Inspired by a zulip comment of kim-em.

A (hopefully less controversial) part of #20567.



This downloads only cache files for every Lean file in some subdirectory.
@grunweg grunweg requested review from kim-em, Kha and arthurpaulino and removed request for Kha January 8, 2025 14:43

github-actions bot commented Jan 8, 2025

PR summary d3593d5f13

Import changes for modified files

No significant changes to the import graph

Import changes for all files
Files Import difference

Declarations diff

No declarations were harmed in the making of this PR! 🐙

You can run this locally as follows
## summary with just the declaration names:
./scripts/declarations_diff.sh <optional_commit>

## more verbose report:
./scripts/declarations_diff.sh long <optional_commit>

The doc-module for scripts/declarations_diff.sh contains some details about this script.


No changes to technical debt.

You can run this locally as

./scripts/technical-debt-metrics.sh pr_summary
  • The relative value is the weighted sum of the differences with weight given by the inverse of the current value of the statistic.
  • The absolute value is the relative value divided by the total sum of the inverses of the current values (i.e. the weighted average of the differences).
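
As a reading aid, the two definitions above can be written out as a small computation. This is a Python sketch based only on the wording of the bullets, not on the script itself: the function name `debt_summary` is hypothetical, and it assumes at least one statistic and that every current value is nonzero.

```python
def debt_summary(current, diffs):
    """Return (relative, absolute) following the two bullet definitions.

    current: current value of each statistic (assumed nonzero, non-empty).
    diffs:   change of each statistic, in the same order as current.
    """
    # Each difference is weighted by the inverse of the statistic's current value.
    weights = [1.0 / c for c in current]
    relative = sum(w * d for w, d in zip(weights, diffs))
    # Dividing by the total weight turns the weighted sum into a weighted average.
    absolute = relative / sum(weights)
    return relative, absolute
```

With equal differences for every statistic, the absolute value equals that common difference, which matches the "weighted average" reading.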

Member

@eric-wieser eric-wieser left a comment


Why not just change get to behave in this way?

@grunweg grunweg changed the title feat(Cache): add miniget command feat(Cache): make get only download the cache for every file imported from a .lean file Jan 8, 2025
@grunweg
Collaborator Author

grunweg commented Jan 8, 2025

Good idea! I have changed get (and get! and get-) to behave this way instead, if they are passed no arguments.
get with additional arguments is unchanged.

Cache/Main.lean Outdated
Comment on lines 51 to 52
If no arguments are given, 'get', 'get!' and 'get-' download information about all files imported
in some .lean file in the current directory or a subdirectory thereof (ignoring .lake folders).
Member


Isn't this the same as lake exe cache get .?

Collaborator Author


Good question! Running this in sphere-eversion (i.e., a project downstream of mathlib) yields uncaught exception: Unknown package directory for . - so I would say no :-)

@grunweg
Collaborator Author

grunweg commented Jan 8, 2025

By the way: running this in sphere-eversion confirms the code works. The new version only wants to download the cache for 2248 files, as opposed to 5833 files before.

@kim-em
Contributor

kim-em commented Jan 9, 2025

This step adds about a second to my lake exe cache get.

I think we should disable it if we're in Mathlib.

@kim-em kim-em added the awaiting-author A reviewer has asked the author a question or requested changes label Jan 9, 2025
@grunweg
Collaborator Author

grunweg commented Jan 9, 2025

> This step adds about a second to my lake exe cache get.
>
> I think we should disable it if we're in Mathlib.

Fair point. I don't think Cache knows about the current workspace, though - so I don't see how to easily disable this. Perhaps it's better to make this a separate command, after all? What do you think?

@@ -83,8 +102,22 @@ def main (args : List String) : IO Unit := do
  let goodCurl ← pure !curlArgs.contains (args.headD "") <||> validateCurl
  if leanTarArgs.contains (args.headD "") then validateLeanTar
  let get (args : List String) (force := false) (decompress := true) := do
    let hashMap ← if args.isEmpty then pure hashMap else hashMemo.filterByFilePaths (toPaths args)
Member


Doesn't hashMemo already contain the dependency information?

Collaborator Author


It only contains this for all files it has memoized this for. This is mathlib and all extraRoots (i.e., arguments passed to it). Running just lake exe cache get in a dependent project does not parse the files in the downstream project.
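
To make the point about dependency information concrete, here is a rough Python sketch of restricting a path-to-hash map to the files reachable from a set of roots. It is not how Cache implements `filterByFilePaths`: the function `filter_by_roots` and the shape of the `imports` mapping are hypothetical and only illustrate the idea. A file that was never parsed has no entry in `imports`, so nothing beyond it is discovered.

```python
def filter_by_roots(hash_map, imports, roots):
    """Keep only the entries of hash_map reachable from roots via imports.

    hash_map: dict mapping file path -> hash (hypothetical stand-in for the memoized map).
    imports:  dict mapping file path -> list of directly imported file paths.
    roots:    iterable of file paths to start from.
    """
    seen = set()
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        # Files with no recorded imports (for example, unparsed ones) contribute nothing.
        stack.extend(imports.get(path, []))
    return {p: h for p, h in hash_map.items() if p in seen}
```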

Member


Can we reuse the same code that populates this in miniget?

@grunweg
Collaborator Author

grunweg commented Jan 9, 2025

Thinking about this again and having discussed with @fpvandoorn, it seems better not to change the default behaviour of cache, and to make this a separate command instead. I'll reinstate miniget.

@grunweg grunweg changed the title feat(Cache): make get only download the cache for every file imported from a .lean file feat(Cache): add miniget command, a minimal version of get Jan 9, 2025
@grunweg grunweg removed the awaiting-author A reviewer has asked the author a question or requested changes label Jan 9, 2025
mathlib-bors bot pushed a commit that referenced this pull request Jan 9, 2025
- document two definitions
- fix description of `get-`
- mention `get-`, which also takes arguments

Split from #20568.
@grunweg grunweg mentioned this pull request Jan 11, 2025
2 tasks
grunweg added a commit that referenced this pull request Jan 11, 2025
- document two definitions
- fix description of `get-`
- mention `get-`, which also takes arguments

Split from #20568.
@leanprover-community-bot-assistant leanprover-community-bot-assistant added the merge-conflict The PR has a merge conflict with master, and needs manual merging. (this label is managed by a bot) label Feb 11, 2025
@grunweg
Collaborator Author

grunweg commented Feb 16, 2025

Closing in favour of #21238.

@grunweg grunweg closed this Feb 16, 2025
@grunweg grunweg deleted the MR-cache-miniget branch February 16, 2025 08:48