GitHub Actions cache is surprisingly slow #1485
Comments
I might be wrong but I think it also includes the compilation time. |
The From what I've seen, it looks like compile time is included in the write time. Cache reads are fast at
|
Also, did you set the environmental variable From the README:
|
FWIW, these logs are wrong. PR #1495 will hopefully fix that. |
I do not set (Also, I'm not receiving emails for updates in this repository for some reason. I suspect they may be getting routed to my defunct mozilla.com email. But I don't see any references to that email in my GitHub account. So unsure what's going on. Apologies if I'm tardy to reply - I probably didn't see the activity!) |
I'm seeing similar issues with HTTP error 429, and can confirm that we're setting
I'm surprised that it still reports that the cache write is finished though 🤔 |
@rajivshah3 in the future, please strip the color string, it makes the log harder to read ;) For cache write, we don't need to be super fast. Maybe a queue would do. |
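The queue idea above could look roughly like the following. This is a minimal sketch in Python rather than sccache's actual Rust code, and `WriteQueue`, `submit`, and the `store` callable are hypothetical names chosen for illustration. It assumes cache writes may be dropped when the queue is full, since a lost write only costs a later cache miss.

```python
import queue
import threading


class WriteQueue:
    """Hypothetical bounded queue that moves cache writes off the compile path."""

    def __init__(self, store, maxsize=64):
        self._store = store  # assumed callable(key, value) doing the slow upload
        self._q = queue.Queue(maxsize=maxsize)
        self.dropped = 0
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, key, value):
        """Enqueue a write without blocking; drop it if the queue is full."""
        try:
            self._q.put_nowait((key, value))
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _run(self):
        while True:
            item = self._q.get()
            if item is None:  # sentinel pushed by close()
                return
            key, value = item
            try:
                self._store(key, value)
            except Exception:
                pass  # a failed write only costs a future cache miss

    def close(self):
        """Flush everything already queued, then stop the worker."""
        self._q.put(None)
        self._worker.join()
```

The caller returns from `submit` immediately, so a slow or throttled upload no longer adds to compile latency; `close` is what makes shutdown wait for pending writes.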
Sure! I will add rate limit handling after implementing apache/opendal#1111 |
apache/opendal#1111 has passed all of opendal's tests now! It will come into sccache soon. |
@Xuanwo are you auto-approving your own patches in opendal? |
Yep, most PRs are merged by me after all checks have passed. |
ok :( |
I expect to improve this part once we have more maintainers~ |
yes, please do, this should not be done on an important project |
Hi, @indygreg, I reproduced your problem with the v0.4.0-pre.5 release at https://github.com/Xuanwo/apple-platform-rs/actions/runs/3871176193/jobs/6598713731 The real problem here is that ghac will return a very long The full error message here:
Returning a cache miss or giving up on this cache write may be a better choice. |
Hi, @indygreg. Here are our updates on addressing this issue. I ran a test with my two patches, #1557 & #1550, on apple-platform-rs. The biggest lesson I have learned from this is that sccache is a cache service, and we don't need to ensure every write/read succeeds. So instead of waiting to retry rate-limited requests, we will skip this step and keep going. Here are the differences: Before / After. No failed builds anymore. The action is here: Xuanwo/apple-platform-rs#1 |
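The "skip instead of retry" approach described in this comment can be sketched as follows. This is an illustration in Python, not the code from #1557 or #1550; `RateLimited`, `BestEffortCache`, and `FlakyBackend` are hypothetical names, and the backend is assumed to raise `RateLimited` where the real service answers with HTTP 429.

```python
class RateLimited(Exception):
    """Stand-in for an HTTP 429 response from the cache service."""


class BestEffortCache:
    """Wrap a backend so that rate-limited requests are skipped, not retried."""

    def __init__(self, backend):
        self._backend = backend  # assumed to expose get(key) and put(key, value)
        self.skipped_reads = 0
        self.skipped_writes = 0

    def get(self, key):
        try:
            return self._backend.get(key)
        except RateLimited:
            self.skipped_reads += 1
            return None  # report a cache miss; the caller just compiles

    def put(self, key, value):
        try:
            self._backend.put(key, value)
            return True
        except RateLimited:
            self.skipped_writes += 1
            return False  # give up on this write and keep going


class FlakyBackend:
    """Toy in-memory backend that rejects every request once `budget` is spent."""

    def __init__(self, budget):
        self._budget = budget
        self._data = {}

    def _spend(self):
        if self._budget <= 0:
            raise RateLimited("429 Too Many Requests")
        self._budget -= 1

    def get(self, key):
        self._spend()
        return self._data.get(key)

    def put(self, key, value):
        self._spend()
        self._data[key] = value
```

The trade-off is a lower hit rate while throttled in exchange for builds that never stall waiting on the cache.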
(I'm still not receiving email notifications from Mozilla orgs due to a GitHub bug involving an orphaned email forwarding rule leftover from when I worked at Mozilla.) The updates here look great and I hope to test out the GHA backend again soon. The new times from the previous comment appear no worse than what I'm getting with 0.3.3 on S3 today. Considering GHA enables me to enable sccache on PRs without leaking my cloud provider credentials, this looks like a very compelling feature now! Thanks for all your work! |
@indygreg Please give v0.4.0-pre.6 a try~ |
Hi @Xuanwo , is there anything that can be done to improve this? On v0.4.0-pre.6 we're still seeing a lot of 429 errors in the logs (on both reads and writes) and there isn't much of a difference in compilation time (33 min uncached vs 29 min cached). The logs from |
This is GitHub's limit and I don't know how to overcome it so far.
Sadly, sccache can't cache the heaviest dep
The time could be very long( |
Hi! Did the caching behavior change a lot from 0.3.3 to 0.4.0-pre.7? Locally, running
Yields these stats:
But on CI (see paradigmxyz/reth#1355) these numbers are vastly different:
The cache seems to be populated, writes and reads are reported as pretty fast, but the build is a lot slower than using rust-cache - some of our jobs went from 5 mins build time to 12-13 mins (see main CI). I based the workflow changes off of Xuanwo/apple-platform-rs#1 |
I have tried this command over
Stopping sccache server...
Compile requests 733
Compile requests executed 537
Cache hits 0
Cache misses 535
Cache misses (C/C++) 24
Cache misses (Rust) 511
Cache timeouts 0
Cache read errors 0
Forced recaches 0
Cache write errors 0
Compilation failures 2
Cache errors 0
Non-cacheable compilations 0
Non-cacheable calls 189
Non-compilation calls 7
Unsupported compiler calls 0
Average cache write 0.776 s
Average cache read miss 0.776 s
Average cache read hit 0.000 s
Failed distributed compilations 0
Non-cacheable reasons:
crate-type 168
unknown source language 11
- 6
argument parse 3
missing input 1
Cache location Local disk: "/home/xuanwo/.cache/sccache"
Cache size 680 MiB
Max cache size 10 GiB
The compile requests are the same. Are you running the same command? By the way, maybe it's better to build tests with sccache, upload them to artifacts, and then run them in the different hash parts. |
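To compare a local run against CI, it can help to reduce a stats dump like the one above to a few numbers. The sketch below is an illustration only: `parse_stats` and `hit_rate` are hypothetical helpers, and the parser assumes each counter line is a label followed by whitespace and an integer, as in the output pasted in this comment.

```python
import re

# Label, whitespace, then an integer at the end of the line.
COUNTER_RE = re.compile(r"^(.*\S)\s+(\d+)$")


def parse_stats(text):
    """Collect integer counters from a stats dump, keyed by their label."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line.strip())
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def hit_rate(counters):
    """Fraction of cacheable compilations served from the cache."""
    hits = counters.get("Cache hits", 0)
    misses = counters.get("Cache misses", 0)
    total = hits + misses
    return hits / total if total else 0.0
```

Lines with units, such as the average timings or cache sizes, do not end in a bare integer and are ignored by this parser. For the dump above, 0 hits against 535 misses gives a hit rate of 0.0, which is what a first run on an empty cache would produce.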
@Xuanwo Unfortunately using artifacts is a lot slower since the artifact upload speed is abysmal :( The artifact upload takes as long as (or longer than) the build does. I am running the same command locally. I'll try a fresh clone; maybe something is messed up with my local copy of the repository I am building. Do you know/have any pointers on what could make sccache slower than rust-cache for the linked PR? |
No ideas so far. Maybe we are hitting the github action cache's rate limit. |
I thought so too, but |
I was excited to see the introduction of the GitHub Actions cache backend. However, when I tried it on a test branch, the results were surprisingly slow.
See the initial run at https://github.com/indygreg/apple-platform-rs/actions/runs/3723171428/jobs/6314513850. Here, a build job took ~1 hour. (It normally takes ~10m.) The Stop sccache step displays sccache server stats. This particular job shows 44 cache errors. Unsure if that is possibly the cause of problems?
I did another CI run where I cranked up logging to debug and printed the log file after the build. See e.g. https://github.com/indygreg/apple-platform-rs/actions/runs/3723354206/jobs/6314814370 for a slow build. Interestingly, some of the jobs in this run were fast! See https://github.com/indygreg/apple-platform-rs/actions/runs/3723354206/jobs/6314814955 for a fast one. Spot checking the logs, I think there is a positive correlation between the cache errors count and job time.
Poking around the logs, I see occurrences of
DEBUG sccache::compiler::compiler] [aws_http]: Cache write error: Cache service responded with 429 Too Many Requests: {"$id":"1","innerException":null,"message":"Request was blocked due to exceeding usage of resource 'Count' in namespace ''.","typeName":"Microsoft.TeamFoundation.Framework.Server.RequestBlockedException, Microsoft.TeamFoundation.Framework.Server","typeKey":"RequestBlockedException","errorCode":0,"eventId":3000}
So I guess my project is too big to use GitHub Actions cache???
More worrisome though is that some of the cache writes take 10+s. e.g.
And
I'm not sure how to interpret these logs. But multi-second values for Cache write finished in seem wrong: I would expect a cache write to effectively be an HTTP request and for inserts to take dozens to hundreds of milliseconds. But maybe the GitHub Actions cache latency is just really high?
I'm unsure what actions are needed for this issue. I'm unsure if there are any code-level bugs. Maybe GitHub Actions cache is just prohibitively slow? Maybe this slowness/behavior should be documented?
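One way to check the suspected correlation between cache errors and job time is to count the rate-limit errors in each job's debug log. The following is a rough sketch, assuming log lines shaped like the 429 example quoted in this issue; `summarize_write_errors` is a hypothetical helper, not part of sccache.

```python
import json
import re

# Matches the write-error prefix seen in the debug log quoted in this issue.
WRITE_ERROR_RE = re.compile(
    r"Cache write error: Cache service responded with (\d{3})"
)


def summarize_write_errors(log_text):
    """Count cache write errors, grouped by (HTTP status, typeKey)."""
    decoder = json.JSONDecoder()
    counts = {}
    for line in log_text.splitlines():
        match = WRITE_ERROR_RE.search(line)
        if not match:
            continue
        status = int(match.group(1))
        type_key = None
        start = line.find("{", match.end())
        if start != -1:
            try:
                # raw_decode tolerates trailing text after the JSON body.
                body, _ = decoder.raw_decode(line, start)
                if isinstance(body, dict):
                    type_key = body.get("typeKey")
            except ValueError:
                pass  # body was truncated or not JSON; keep the status only
        key = (status, type_key)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Running this over the log of a slow job and of a fast job from the same workflow run would show whether the `(429, "RequestBlockedException")` count tracks the difference in build time.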