-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Git repo cache #7424
Git repo cache #7424
Conversation
Inspired by the following pull request and done in collaboration with @aehlig (Klaus Aehlig) at the Bazel Hackathon 2018. #5928 Lots of open issues remaining: * Unit tests needs to be adapted to the fact that there is now a cache. * Klaus would like it to be optional only (I agree). * 'git worktree add ...' does not work on second run. --> tests failing because of that.
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the |
@siedentop could you please confirm the CLA - thanks :) |
Hi @Globegitter and @googlebot , I have confirmed the CLA a few years back. This was under my old handle "unapiedra" but still the same GitHub account. Can I do anything to confirm it again? |
@Globegitter I've just seen your comment regarding the TODO in the An alternative to hashing would be simply to sanitize the remote URL to work as a file name. This has the benefit that it stays readable. |
@siedentop I cannot find any Google CLAs signed by this current username. I believe you'll have to sign it again at https://cla.developers.google.com/ while logged in to the email used by this GitHub account, and for the commits. |
@jin I had to update the username. Could you please check again? |
Looks good! |
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) |
I am seeing the tests are just failing on Ubuntu 14.04 - is there any way I can look at the logs @aehlig |
@Globegitter you can find the logs under the "Artifacts" tab on Build Kite: And the specific log is here: https://storage.googleapis.com/bazel-untrusted-buildkite-artifacts/35f70797-c976-405f-9098-c84e461744e7/src/test/shell/bazel/skylark_git_repository_test/attempt_3.log |
Great thanks for that pointer @siedentop |
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the Googlers can find more info about SignCLA and this PR by following this link. |
The error on 14.04 is |
The test is now skipped if |
Is skipping the test entirely the right thing to do? Yes, we only expect caching, if the version of git supports it, but we still should ensure that we can fall back to a normal checkout if an older version of git is installed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your pull request and I'm sorry for the long delay. I generally like the idea of having a cache for git repositories. However, I think your patch does not yet provide what one usually expects of a cache; can you please address the comments.
Thanks,
Klaus
|
||
# Do not run tests if the git worktree command does not exist. | ||
# Testing without if check - is windows timing out due to that? | ||
# if git worktree --help; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this condition commented out? Should it be there? Or should we actually run the test even in this case, and just skip the verification that the correct hash ended up in the cache.
do_git_repository_test $commit_hash | ||
|
||
cache_commit_hash="$(cat $cache_dir/worktrees/pluto/HEAD)" | ||
assert_equals $commit_hash $cache_commit_hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A verification of caching usually verifies that we can do a clone from cache only in an "offline" situation, if we successfully filled the cache first, and attempt to check out a fixed hash (not a branch). I don't see your test verifying that main property of a cache.
_populate_cache(ctx, git_cache, remote_name, ref, shallow) | ||
st = _get_repository_from_cache(ctx, directory, ref, git_cache) | ||
|
||
if st.return_code: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we fail or fall back to using no cache?
st = _get_repository_from_cache(ctx, directory, ref, git_cache) | ||
|
||
if st.return_code: | ||
_populate_cache(ctx, git_cache, remote_name, ref, shallow) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach has the problem, that git fetch
is called unconditionally, and git fetch
tries to contact the remote site, even if we only need a commit already in cache. That means, we cannot use the cache when being offline, which would be a main use case for a cache.
This can be avoided by first trying to check out the correct commit and only if that fails, trying to update the cache.
Closing as activity on this seem to have stalled. PR authors: please address the review comments and rebase off master. |
I unfortunately will not have time in the foreseeable future to address the PR comments and bring this in a mergable state. So feel free to close for now. If I do hopefully get time again I am happy to reopen this as a new PR. But would be great if anyone else interested in this could pick this up. |
Closing again for staleness. |
This picks up the work mentioned here #5928 (comment) and should fix #5116
I refactored things a bit and made sure that the git cache actually works offline as well as without a cache and added a simple test. I have tested it in our monorepo and it seems to be working well.
Not sure what the
TODO needs a proper implementation.
in the_hash
method refers to maybe you now @aehlig or @siedentop ?