Detect import dependencies during setup phase #2750
janmasrovira added a commit that referenced this issue on May 14, 2024
- Contributes to #2750

# New commands:

1. `dev import-tree scan FILE`. Scans a single file and lists all the imports in it.
2. `dev import-tree print`. Scans all files in the package and its dependencies. Builds an import dependency tree and prints it to stdout. If the `--stats` flag is given, it reports the number of scanned modules, the number of unique imports, and the length of the longest import chain.

Example: this is the truncated output of `juvix dev import-tree print --stats` in the `juvix-stdlib` directory.

```
[...]
Stdlib/Trait/Partial.juvix imports Stdlib/Data/String/Base.juvix
Stdlib/Trait/Partial.juvix imports Stdlib/Debug/Fail.juvix
Stdlib/Trait/Show.juvix imports Stdlib/Data/String/Base.juvix
index.juvix imports Stdlib/Cairo/Poseidon.juvix
index.juvix imports Stdlib/Data/Int/Ord.juvix
index.juvix imports Stdlib/Data/Nat/Ord.juvix
index.juvix imports Stdlib/Data/String/Ord.juvix
index.juvix imports Stdlib/Prelude.juvix

Import Tree Statistics:
=======================
• Total number of modules: 56
• Total number of edges: 193
• Height (longest chain of imports): 15
```

Both commands support the `--scan-strategy` flag, which determines which parser we use to scan the imports. The possible values are:

1. `flatparse`. It uses the low-level [FlatParse](https://hackage.haskell.org/package/flatparse-0.5.1.0/docs/FlatParse-Basic.html) parsing library. This parser is made specifically to parse only imports and ignore the rest, so we expect it to perform much better. It does not produce error messages.
2. `megaparsec`. It uses the normal juvix parser and we simply collect the imports from it.
3. `flatparse-megaparsec` (default). It uses the flatparse backend and falls back to megaparsec if it fails.

# Internal changes

## Megaparsec Parser (`Concrete.FromSource`)

In order to be able to run the parser during the scanning phase, I've adjusted some of the effects used in the parser:

1. I've removed the `NameIdGen` and `Files` constraints, which were unused.
2. I've removed `Reader EntryPoint`. It was used to get the `ModuleId`. Now the `ModuleId` is generated during scoping.
3. I've replaced `PathResolver` by the `TopModuleNameChecker` effect. This new effect, as the name suggests, only checks the name of the module (same rules as we had in the `PathResolver` before). It is also possible to ignore the effect, which is needed if we want to use this parser without an entrypoint.

## `PathResolver` effect refactor

1. The `WithPath` command has been removed.
2. New command `ResolvePath :: ImportScan -> PathResolver m (PackageInfo, FileExt)`. Useful for resolving imports during the scanning phase.
3. New command `WithResolverRoot :: Path Abs Dir -> m a -> PathResolver m a`. Useful for switching package context.
4. New command `GetPackageInfos :: PathResolver m (HashMap (Path Abs Dir) PackageInfo)`, which returns a table with all packages. Useful to scan all dependencies.

The `Package.PathResolver` has been refactored to be more like the normal `PathResolver`. We've discussed with @paulcadman the possibility of unifying both implementations in the near future.

## Misc

1. `Package.juvix` no longer ends up in `PackageInfo.packageRelativeFiles`.
2. I've introduced string definitions for `--`, `{-` and `-}`.
3. I've fixed a bug where `.juvix.md` was detected as an invalid extension.
4. I've added `LazyHashMap` to the prelude. I've also added `ordSet` to create ordered Sets, `ordMap` for ordered maps, etc.

# Benchmarks

I've profiled `juvix dev import-tree --scan-strategy [megaparsec | flatparse] --stats` with optimization enabled. In the images below we see that in the megaparsec case, the scanning takes 54.8% of the total time, whereas in the flatparse case it only takes 9.6% of the total time.

- **Megaparsec** (profile image)
- **Flatparse** (profile image)

## Hyperfine

```
hyperfine --warmup 1 'juvix dev import-tree print --scan-strategy flatparse --stats' 'juvix dev import-tree print --scan-strategy megaparsec --stats' --min-runs 20
Benchmark 1: juvix dev import-tree print --scan-strategy flatparse --stats
  Time (mean ± σ):      82.0 ms ±   4.5 ms    [User: 64.8 ms, System: 17.3 ms]
  Range (min … max):    77.0 ms … 102.4 ms    37 runs

Benchmark 2: juvix dev import-tree print --scan-strategy megaparsec --stats
  Time (mean ± σ):     174.1 ms ±   2.7 ms    [User: 157.5 ms, System: 16.8 ms]
  Range (min … max):   169.7 ms … 181.5 ms    20 runs

Summary
  juvix dev import-tree print --scan-strategy flatparse --stats ran
    2.12 ± 0.12 times faster than juvix dev import-tree print --scan-strategy megaparsec --stats
```

In order to compare (almost) only the parsing, I've forced the scanning of each file to be performed 50 times (so that the cost of the other parts gets swallowed). Here are the results:

```
hyperfine --warmup 1 'juvix dev import-tree print --scan-strategy flatparse --stats' 'juvix dev import-tree print --scan-strategy megaparsec --stats' --min-runs 10
Benchmark 1: juvix dev import-tree print --scan-strategy flatparse --stats
  Time (mean ± σ):     189.5 ms ±   3.6 ms    [User: 161.7 ms, System: 27.6 ms]
  Range (min … max):   185.1 ms … 197.1 ms    15 runs

Benchmark 2: juvix dev import-tree print --scan-strategy megaparsec --stats
  Time (mean ± σ):      5.113 s ±  0.023 s    [User: 5.084 s, System: 0.035 s]
  Range (min … max):    5.085 s …  5.148 s    10 runs

Summary
  juvix dev import-tree print --scan-strategy flatparse --stats ran
   26.99 ± 0.52 times faster than juvix dev import-tree print --scan-strategy megaparsec --stats
```
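To make the `--stats` figures from the example output concrete, here is a minimal Python sketch (not the Haskell implementation in this commit) that reads lines of the form `A imports B` and computes the three numbers. The function names `parse_edges` and `import_tree_stats` are hypothetical, and two points are assumptions: that "height" counts the modules on the longest chain rather than the edges, and that the import graph is acyclic.

```python
from functools import lru_cache


def parse_edges(text):
    """Extract (importer, imported) pairs from lines shaped like 'A imports B'.

    Lines without the ' imports ' separator (e.g. '[...]' or the statistics
    footer) are skipped.
    """
    edges = []
    for line in text.splitlines():
        if " imports " in line:
            src, dst = line.split(" imports ", 1)
            edges.append((src.strip(), dst.strip()))
    return edges


def import_tree_stats(edges):
    """Return module count, unique edge count and height of the import graph.

    Assumption: height is the number of modules on the longest import chain,
    and the graph has no cycles (a cycle would recurse forever here).
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)  # a set deduplicates repeated imports
        graph.setdefault(dst, set())           # make sure leaf modules are counted

    @lru_cache(maxsize=None)
    def height(node):
        # 1 for the node itself plus the tallest chain among its imports
        return 1 + max((height(n) for n in graph[node]), default=0)

    return {
        "modules": len(graph),
        "edges": sum(len(targets) for targets in graph.values()),
        "height": max((height(n) for n in graph), default=0),
    }
```

Feeding the captured output of `juvix dev import-tree print` into `parse_edges` and then into `import_tree_stats` should give figures comparable to the statistics footer, provided the assumptions above match how the tool counts.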
The first step towards #2749 is to build the import tree during the setup phase. The strategy is to create a fast parser that scans all import statements in a Juvix file. In the pipeline, once we've added all dependencies, we'll use this parser to create the import tree. The import tree should have the paths already resolved (i.e. for each import statement, we know exactly which other file in the system it points to).
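The strategy can be illustrated with a rough Python sketch (the real scanner is part of the Haskell compiler, so this is only a stand-in). It assumes imports are written as `import Dotted.Module.Name`, that `--` starts a line comment and `{- ... -}` is a non-nested block comment, and that a dotted module name maps to a relative path by replacing `.` with `/` and appending `.juvix`. The names `scan_imports`, `resolve` and `build_import_tree` are hypothetical, and real resolution would also have to take the package and its dependencies into account.

```python
import re

# Assumed shape of an import statement: the keyword followed by a dotted name.
IMPORT_RE = re.compile(r"\bimport\s+([A-Za-z_][\w']*(?:\.[A-Za-z_][\w']*)*)")


def strip_comments(src):
    """Drop block comments, then line comments (no nesting or string handling)."""
    src = re.sub(r"\{-.*?-\}", " ", src, flags=re.S)
    return re.sub(r"--[^\n]*", "", src)


def scan_imports(src):
    """Return the dotted module names imported by one source file."""
    return [m.group(1) for m in IMPORT_RE.finditer(strip_comments(src))]


def resolve(module_name, ext=".juvix"):
    """Map a dotted module name to a relative file path (assumed convention)."""
    return module_name.replace(".", "/") + ext


def build_import_tree(sources):
    """Map each file path to the resolved paths it imports.

    `sources` is a dict from relative path to file contents, standing in for
    the files of a package once all dependencies have been added.
    """
    return {
        path: [resolve(name) for name in scan_imports(text)]
        for path, text in sources.items()
    }
```

Because the scanner only looks for import statements and skips everything else, it never needs to understand the rest of the file, which is the property that makes a dedicated scanning pass cheap compared with a full parse.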