Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs.Digest is declared in Rust #10905

Merged
merged 2 commits into from
Oct 7, 2020

Conversation

stuhood
Copy link
Member

@stuhood stuhood commented Oct 5, 2020

Problem

Our fundamental filesystem types are declared in Python, but declaring them in Rust would cut down on FFI costs, and allow for better error messages by removing some usage of the externs::project_* functions.

Solution

Port fs.Digest to Rust. An advantage in the case of Digest is that unless the Python side calls Digest.fingerprint, we'll never bother to encode the fingerprint as hex (it stays binary in the private hashing::Digest instance. And likewise, if the engine is given back a Digest object that it had previously given out, it doesn't have to decode hex at the boundary. While it would have been possible to do this with Python code (by having a private bytes field for the fingerprint rather than a string), it would have involved redundant logic for validation.

Additionally adds initial Python typestubs for the native_engine module, which will likely be the eventual home for documentation on native types.

Result

For 100 runs:

before:
    real    22m40.378s
    user    24m58.706s
    sys     20m28.889s

after:
    real    20m37.329s
    user    23m0.305s
    sys     17m34.746s

[ci skip-build-wheels]
@jsirois
Copy link
Contributor

jsirois commented Oct 5, 2020

LGTM.

A (significant) disadvantage

I don't see how that's significant. The universe of intrinsic types will be small by definition. This is a disadvantage we should be willing to put up with since it really only affects a few core developers unlike...

The only real disadvantage I can see would be losing easy to find / idiomatic API information for rule authors. That's already non-ideal since there is no easy way to know what intrinsic input -> output mappings there are without looking at high level docs. Here we lose a natural place to hang field and method docs for data types (unless ecosystem tooling knows to pick docs off .pyi stubs when otherwise missing?). These are clearly solvable problems though.

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! I agree with John. My main concern was that users can no longer look at engine/fs.py to see a simple dataclass to see how the type is implemented. The type stubs mitigate my concerns.

I'm not too concerned about the custom eq and hash. Sure, it's some boilerplate, but we had lots of boilerplate to begin with.

sources=['native.py'],
sources=[
'native.py',
'native_engine.pyi',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woot! This is exciting to see used.

Comment on lines +83 to +85
product=get.output_type,
declared_subject=get.input_type,
subject=get.input,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, fine to not fix, but I thought I had fixed most of the Rust code to call this input and output.

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@stuhood stuhood marked this pull request as ready for review October 6, 2020 04:14
@stuhood
Copy link
Member Author

stuhood commented Oct 6, 2020

Reviewable.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 0.0% when pulling 77bfe5f on stuhood:stuhood/add-and-use-file-digest into 923f461 on pantsbuild:master.

@stuhood stuhood merged commit 478110e into pantsbuild:master Oct 7, 2020
@stuhood stuhood deleted the stuhood/add-and-use-file-digest branch October 7, 2020 02:36
@Eric-Arellano Eric-Arellano mentioned this pull request Oct 11, 2020
Eric-Arellano added a commit that referenced this pull request Oct 12, 2020
Internal only changes left off from the changelog:

* Use cpython types in Rust functions that manipulate python objects (#10942)
  `PR #10942 <https://github.com/pantsbuild/pants/pull/10942>`_

* update libz-sys version to fix macOS compile error (#10941)
  `PR #10941 <https://github.com/pantsbuild/pants/pull/10941>`_

* Upgrade to Rust stable 1.47.0. (#10933)
  `PR #10933 <https://github.com/pantsbuild/pants/pull/10933>`_

* Finish CreateDigest Directory cleanup. (#10935)
  `PR #10935 <https://github.com/pantsbuild/pants/pull/10935>`_

* Hotfix broken import from merge conflict (#10934)
  `PR #10934 <https://github.com/pantsbuild/pants/pull/10934>`_

* Revert "Port nailgun client to rust (#10865)" (#10929)
  `PR #10929 <https://github.com/pantsbuild/pants/pull/10929>`_

* An ExternalTool for downloading the grpc_python_plugin. (#10927)
  `PR #10927 <https://github.com/pantsbuild/pants/pull/10927>`_

* Port nailgun client to rust (#10865)
  `PR #10865 <https://github.com/pantsbuild/pants/pull/10865>`_

* print stacktraces during import errors (#10906)
  `PR #10906 <https://github.com/pantsbuild/pants/pull/10906>`_

* fs.Digest is declared in Rust (#10905)
  `PR #10905 <https://github.com/pantsbuild/pants/pull/10905>`_

* add requests_ca_bundle to settable_env_vars (#10909)
  `PR #10909 <https://github.com/pantsbuild/pants/pull/10909>`_

[ci skip-rust]
stuhood added a commit to stuhood/pants that referenced this pull request Dec 17, 2020
…antsbuild#10905.

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
stuhood added a commit that referenced this pull request Dec 17, 2020
### Problem

Our fundamental filesystem types are declared in Python, but declaring them in Rust would cut down on FFI costs, and allow for better error messages by removing some usage of the `externs::getattr*` functions.

Additionally, the previous change to move `fs.Digest` to Rust in #10905 missed the optimization opportunity of directly extracting the inner Rust `Digest` instance.

### Solution

In two commits: port `fs.Snapshot` to Rust, and optimize the conversion of a `PyDigest` to a `Digest`.

### Result

6% faster runs of `./pants dependencies --transitive ::`.

[ci skip-build-wheels]

fingerprint: str
serialized_bytes_length: int
You can use `await Get(Snapshot, Digest)` to set the file names referred to, or use `await
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/set/get/ ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants