Further improve how we minimize the possibility of GitHub AbuseException #188
Comments
Probably yes, according to GitHub's docs. So maybe a random wait between requests could help. One other thing you might want to consider is to read the `Retry-After` response header, which the docs describe as GitHub's way of telling you how long to wait once you've been rate limited.
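A rough sketch of both ideas (a randomized pause plus honoring `Retry-After`), assuming plain `HttpClient` calls; the 1-3 second jitter range and `maxAttempts` are arbitrary example values, not anything the discoverer actually uses:

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class ThrottledSender
{
    private static readonly Random Jitter = new Random();

    // Sends a request, honoring Retry-After on a 403 and adding a small
    // random pause so consecutive write calls don't land in the same instant.
    // A fresh HttpRequestMessage is created per attempt because a message
    // cannot be sent twice.
    public static async Task<HttpResponseMessage> SendWithBackoffAsync(
        HttpClient client, Func<HttpRequestMessage> createRequest, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            var response = await client.SendAsync(createRequest());

            // GitHub reports abuse/secondary rate limiting as 403 + Retry-After.
            if (response.StatusCode == HttpStatusCode.Forbidden &&
                response.Headers.RetryAfter?.Delta is TimeSpan retryAfter &&
                attempt < maxAttempts)
            {
                await Task.Delay(retryAfter);
                continue;
            }

            // Random 1-3 second wait before the caller issues its next request.
            await Task.Delay(TimeSpan.FromSeconds(1 + 2 * Jitter.NextDouble()));
            return response;
        }
    }
}
```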
@patriksvensson correct me if I'm wrong, but I think the headers returned from the GitHub API contain rate limit info, so one might not need to resort to random waits, but know when to retry. Edit: saw @augustoproiete already linked to those docs.
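For what it's worth, a sketch of that idea using Octokit.net's `GetLastApiInfo()` (which the discoverer already calls): sleep until the reported reset time instead of guessing. The threshold of 50 remaining calls is an arbitrary example value:

```csharp
using System;
using System.Threading.Tasks;
using Octokit;

static class RateLimitGuard
{
    // Pause until the quota resets when we're close to the limit.
    public static async Task WaitIfNearLimitAsync(IGitHubClient client, int threshold = 50)
    {
        // GetLastApiInfo returns the rate limit headers from the last response.
        var rateLimit = client.GetLastApiInfo()?.RateLimit;
        if (rateLimit != null && rateLimit.Remaining < threshold)
        {
            // Reset is the UTC time at which the hourly quota refills.
            var wait = rateLimit.Reset - DateTimeOffset.UtcNow;
            if (wait > TimeSpan.Zero)
            {
                await Task.Delay(wait);
            }
        }
    }
}
```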
@devlead Correct. If we're doing many calls against the same endpoint and resource, we can also cache the ETag between calls. Using an ETag in a GitHub API call counts as 0 (zero) against the rate limit.
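Since Octokit.net's typed client doesn't surface ETags directly, here is a rough sketch of the conditional-request mechanism with plain `HttpClient` (assuming the client is already configured with the `User-Agent` and auth headers the GitHub API requires); a `304 Not Modified` reply doesn't count against the quota:

```csharp
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

static class EtagCache
{
    private static EntityTagHeaderValue _cachedEtag;
    private static string _cachedBody;

    public static async Task<string> GetAsync(HttpClient client, string url)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);

        // Send the ETag from the previous response, if we have one.
        if (_cachedEtag != null)
        {
            request.Headers.IfNoneMatch.Add(_cachedEtag);
        }

        var response = await client.SendAsync(request);
        if (response.StatusCode == HttpStatusCode.NotModified)
        {
            return _cachedBody; // served from cache; 0 calls against the quota
        }

        _cachedEtag = response.Headers.ETag;
        _cachedBody = await response.Content.ReadAsStringAsync();
        return _cachedBody;
    }
}
```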
Yes, headers contain rate limit info and, as I previously indicated, I already respect them. Also, there's nothing to cache because we trigger the abuse detection mostly when committing, creating an issue, or submitting a PR. The issue is not retrieving from the API, it's creating, committing, etc. I wish the exception from GitHub said something more explicit about what actually triggered it.
Maybe the solution is to run it more often, so fewer PRs are created at a time?
@devlead that might indeed be a good solution to fall back on if I can't prevent AbuseExceptions: reduce the number of yaml files synchronized per run but run the process more often. GitHub's rate limits are on a "per-hour" basis, therefore we couldn't run our process more often than once an hour. Maybe 4 times a day would be more reasonable? I still want to figure out why we get this exception and how to prevent it in the first place. It's bothering me! By the way, one theory I mentioned in my original comment is that there may be another process running under the same `cake-contrib-bot` user.
Could having a separate user for the addin discoverer be an alternative?
Original issue description:

We seem to be triggering GitHub's abuse detection more frequently as of late, and the result is that our automated process raises an issue to add/modify/delete a given yaml file but we don't submit the corresponding PR.

It could be caused by another process running under the `cake-contrib-bot` user and making GitHub API calls around the same time the discoverer is scheduled (need to ask the @cake-contrib/cake-team if any new process is running), or maybe GitHub has changed the heuristics they use to determine if a process is abusing their API. To my knowledge, they don't disclose what these heuristics are, and therefore there's not much we can do about it. Either @devlead or @gep13 (it's been a long time, so I don't remember who it was) talked to someone at GitHub about it at some point, and they acknowledged that they had heuristics in place, above and beyond the publicly disclosed "rate limits", but they have not disclosed what they are.

That's why I added some defense mechanisms in the discoverer, such as checking the rate limit info returned by `.GetLastApiInfo()` and stopping the process if there are less than an arbitrary but reasonable (as determined by me) number of calls remaining.

But evidently, these are not sufficient, so I propose the following: log additional information when we catch an `AbuseException`. e.g.: what was the remaining number of API calls prior to the exception, how many yaml files were processed prior to the exception, etc. Maybe this could help us figure out why we are triggering it.

Now that I think about it, I should implement the change to display additional info in the log before changing anything else. Maybe this new information will help me understand the root cause of this problem.
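A sketch of what that logging could look like, assuming Octokit.net's `AbuseException` (which exposes `RetryAfterSeconds`); `work`, `filesProcessed`, and `log` are hypothetical stand-ins for the discoverer's own synchronization step, counter, and logger:

```csharp
using System;
using System.Threading.Tasks;
using Octokit;

static class AbuseDiagnostics
{
    public static async Task RunWithDiagnosticsAsync(
        IGitHubClient client, Func<Task> work, int filesProcessed, Action<string> log)
    {
        try
        {
            await work();
        }
        catch (AbuseException e)
        {
            // Capture the rate limit state GitHub reported on the last response.
            var rateLimit = client.GetLastApiInfo()?.RateLimit;
            log($"AbuseException after {filesProcessed} yaml files were processed.");
            log($"Remaining API calls prior to the exception: {rateLimit?.Remaining.ToString() ?? "unknown"}.");
            log($"Retry-After reported by GitHub: {e.RetryAfterSeconds?.ToString() ?? "unspecified"} seconds.");
            throw; // rethrow so the failure is still visible to the caller
        }
    }
}
```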