Test dataset downloadability on a schedule #2617
Comments
@seemethere what do you think would be a good way for a scheduled CI job to signal to us that a test stopped working (due to an external server going down or no longer responding)? Sending an e-mail is one option, but is there a better one?
We could also have it just send a message to a Slack channel. That way community members as well as maintainers can see the signal in a more public space. Another good middle ground is a GitHub Action that posts a comment on a specific issue every time a dataset cannot be downloaded.
I'm currently looking into
It's entirely possible, but we haven't had much success trying to create an alerting system through CircleCI
A simple solution could be this: https://github.com/JasonEtco/create-an-issue. If we test the download as part of a GitHub Actions workflow, this could simply create an issue from a given template if the workflow fails. @seemethere @fmassa Is GitHub Actions an alternative to CircleCI for this?
I've created a proof-of-concept repo. With this minimal setup it will create an issue like https://github.com/pmeier/test-issue-on-fail/issues/9 every time the workflow fails. I think that might already be enough.
Re-opening this, since #2665 only laid the groundwork, but did not add the actual testing on a schedule.
I think this looks pretty good! One thing to check is whether the bot creates a repeated issue every day if the CI is not fixed the day it fails.
Unfortunately, right now it will. I will have more time at the end of September / early October to build a proper bot. Until then, we have to stick to closing the issue manually. That being said, with the retry functionality and the wait time between individual requests, I don't think this will fail often. Actual dead links or broken downloads are quite rare. Should I send a PR for this?
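To illustrate the retry-with-wait idea mentioned above, here is a minimal sketch. It is not the code from the PR; the name `retry` and its parameters (`attempts`, `wait`, `exceptions`, `sleep`) are assumptions made for this example.

```python
import time


def retry(func, attempts=3, wait=1.0, exceptions=(OSError,), sleep=time.sleep):
    """Call func() up to `attempts` times, pausing `wait` seconds between tries.

    Hypothetical helper: names and defaults are illustrative only.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            # Re-raise the last error once all attempts are used up.
            if attempt == attempts:
                raise
            # Otherwise wait before the next attempt to avoid hammering the server.
            sleep(wait)
```

Passing `sleep` as a parameter keeps the helper testable without real delays; a scheduled job would simply use the default `time.sleep`.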
Hmm, it would be a bit annoying to be spammed by the bot about a known problem. But I suppose an initial PR would be good to have.
Given that I'm responsible for the datasets, it will mostly spam me 😉
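One way the repeated-issue problem discussed above might be avoided is to check for an already open issue with the same title before creating a new one. The sketch below only shows that decision; `should_open_issue` is a hypothetical name, and it assumes the workflow has some way (not shown here) to obtain the titles of the currently open issues.

```python
def should_open_issue(title, open_issue_titles):
    """Return True only if no open issue already has this title.

    Hypothetical helper: comparison ignores case and surrounding whitespace.
    """
    normalized = title.strip().lower()
    return all(
        normalized != existing.strip().lower() for existing in open_issue_titles
    )
```

With a check like this, a failure that persists for several days would produce one issue instead of one per scheduled run.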
Currently we rely on user feedback to detect failing dataset downloads. In #2610 I've included a downloadability test that asserts that a HEAD request to every URL used in the dataset is successful. We should expand these tests to all downloadable datasets.
That being said, I don't think we should run this as part of every PR or push, but rather on a schedule (for example daily). In that case we need some way to inform us about failed tests.
cc @seemethere @pmeier
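The HEAD-request check described in the issue text could look roughly like the following sketch, which uses only the standard library. It is not the test from #2610; `url_is_downloadable` and the injectable `opener` parameter are assumptions made for this example.

```python
import urllib.error
import urllib.request


def url_is_downloadable(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if a HEAD request to `url` answers with a 2xx status.

    Hypothetical helper: `opener` is injectable so the check can be
    exercised without network access.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, timeouts, and connection errors are all OSError.
        return False
```

A scheduled job could call this for every dataset URL and report the ones that return `False`.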