Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(Cache): add miniget command, a minimal version of get #20568

Closed
wants to merge 13 commits into from
33 changes: 28 additions & 5 deletions Cache/Main.lean
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,8 @@ Authors: Arthur Paulino
-/

import Cache.Requests
import Lean.Elab.ParseImportsFast
import Batteries.Data.String.Matcher

def help : String := "Mathlib4 caching CLI
Usage: cache [COMMAND]
Expand Down Expand Up @@ -33,9 +35,9 @@ Commands:
* Linked files refer to local cache files with corresponding Lean sources
* Commands ending with '!' should be used manually, when hot-fixes are needed

# The arguments for 'get' and 'get!'
# The arguments for 'get', 'get!' and 'get-'

'get' and 'get!' can process list of paths, allowing the user to be more
'get', 'get!' and 'get-' can process a list of paths, allowing the user to be more
specific about what should be downloaded. For example, with automatic glob
expansion in shell, one can call:

Expand All @@ -44,7 +46,11 @@ $ lake exe cache get Mathlib/Algebra/Field/*.lean Mathlib/Data/*.lean
Which will download the cache for:
* Every Lean file inside 'Mathlib/Algebra/Field/'
* Every Lean file inside 'Mathlib/Data/'
* Everything that's needed for the above"
* Everything that's needed for the above

If no arguments are given, 'get', 'get!' and 'get-' download information about all files imported
in some .lean file in the current directory or a subdirectory thereof (ignoring .lake folders).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this the same as lake exe cache get .?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! Running this in sphere-eversion (i.e., a project downstream of mathlib) yields uncaught exception: Unknown package directory for . - so I would say no :-)

"

open Lean System in
/-- Note that this normalizes the path strings, which is needed when running from a unix shell
Expand All @@ -57,13 +63,16 @@ def toPaths (args : List String) : List FilePath :=
else
mkFilePath (arg.toName.components.map Name.toString) |>.withExtension "lean"

/-- Commands which (potentially) call `curl` for downloading files -/
def curlArgs : List String :=
["get", "get!", "get-", "put", "put!", "put-unpacked", "commit", "commit!"]

/-- Commands which (potentially) call `leantar` for decompressing downloaded files -/
def leanTarArgs : List String :=
["get", "get!", "pack", "pack!", "unpack", "lookup"]

open Cache IO Hashing Requests System in
open System.FilePath in
def main (args : List String) : IO Unit := do
if Lean.versionString == "4.8.0-rc1" && Lean.githash == "b470eb522bfd68ca96938c23f6a1bce79da8a99f" then do
println "Unfortunately, you have a broken Lean v4.8.0-rc1 installation."
Expand All @@ -83,8 +92,22 @@ def main (args : List String) : IO Unit := do
let goodCurl ← pure !curlArgs.contains (args.headD "") <||> validateCurl
if leanTarArgs.contains (args.headD "") then validateLeanTar
let get (args : List String) (force := false) (decompress := true) := do
let hashMap ← if args.isEmpty then pure hashMap else hashMemo.filterByFilePaths (toPaths args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't hashMemo already contain the dependency information?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only contains this for all files it has memoized this for. This is mathlib and all extraRoots (i.e., arguments passed to it). Running just lake exe cache get in a dependent project does not parse the files in the downstream project.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reuse the same code that populates this in miniget?

getFiles hashMap force force goodCurl decompress
let mut getArgs := toPaths args
if args.isEmpty then
-- Find all .lean files in subdirectories (excluding .lake) of the current directory.
let allFiles := System.FilePath.walkDir (← IO.Process.getCurrentDir)
(fun p ↦ pure (p.fileName != some ".lake"))
let leanFiles := (← allFiles).filter (fun p ↦ p.extension == some "lean")
-- For each file, find all imports starting with Mathlib.
let mut allModules := #[]
for fi in leanFiles do
let imports ← Lean.parseImports' (← IO.FS.readFile fi) ""
allModules := allModules.append <|
imports.map (fun imp ↦ imp.module) |>.filter (·.getRoot == `Mathlib)
-- and turn each "import Mathlib.X.Y.Z" into an argument "Mathlib.X.Y.Z.lean" to `get`.
getArgs := allModules.toList.map
fun mod ↦ mkFilePath (mod.components.map (fun s ↦ s.toString)) |>.addExtension "lean"
getFiles (← hashMemo.filterByFilePaths getArgs) force force goodCurl decompress
let pack (overwrite verbose unpackedOnly := false) := do
packCache hashMap overwrite verbose unpackedOnly (← getGitCommitHash)
let put (overwrite unpackedOnly := false) := do
Expand Down
Loading