Hex badges show up as "invalid" #1285
Comments
Confirmed, these are all showing invalid.
Probably an API change. Would someone like to look into it? Here are the tests. |
I'll have a look into this issue sometime this week. |
I run Shields locally and all Hex.pm badges are working correctly. Service-tests are working as well. |
Could you try the deployed commit too? It's possible, if unlikely, that it's been fixed since the deploy. |
I checked the API response, and it seems to be consistent with the processing done in the code. Tests are working fine as well. I also fired a server up with the currently deployed commit on my local machine (Node 8.9.1), and Hex badges are generated as expected. Therefore I'm unsure why we are getting such errors on these badges. After a quick look into the hexpm repository, throttling/address blocking seems to be implemented on their side. Could we be hitting the rate limits and trying to parse bogus responses, leading to "invalid" badges? Making an API request directly from the production server may help us out here. |
If you provide a list of IPs that your production servers use, I can check if they have hit the rate limiting on Hex.pm. |
Making a request to the production servers is a good idea. I don't have access yet, however. Here are the three IPs:
|
All of these IP addresses have been blocked because they consistently exceeded 100 requests/min to the Hex.pm API. We can unblock them, but my guess is that they will hit the rate limiting again. I have suggested before that shields.io should do conditional HTTP requests and request collapsing. As an example, when I load http://shields.io/, 3 individual API requests are made for each Hex.pm plug package badge. Why are these requests not collapsed into a single request, and why are the cache-control, etag, and last-modified headers ignored? This is ignoring the hundreds of other badges that are loaded from different services. Refreshing http://shields.io/ is a great way to have your own little DoS service. If shields will not improve its caching, I guess we have to build a special endpoint that is cheaper, only returns the data you need, and does not have to be rate limited. If you let us know what endpoints you hit on the Hex.pm API and the fields you need, we can build this endpoint for you. |
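To make the two suggestions above concrete, here is a minimal sketch of request collapsing combined with ETag revalidation. It is not Shields code: `CollapsingClient`, the injected `fetch` coroutine, the fake response body, and the `"v1"` ETag are all hypothetical names and values chosen for illustration.

```python
import asyncio


class CollapsingClient:
    """Collapse concurrent requests for one URL and revalidate with ETags.

    `fetch` is a hypothetical coroutine supplied by the caller:
    fetch(url, headers) -> (status, response_headers, body).
    """

    def __init__(self, fetch):
        self._fetch = fetch
        self._in_flight = {}  # url -> asyncio.Task shared by concurrent callers
        self._stored = {}     # url -> (etag, body) from the last 200 response

    async def get(self, url):
        task = self._in_flight.get(url)
        if task is None:
            # First caller starts the upstream request; later callers reuse it.
            task = asyncio.create_task(self._revalidate(url))
            self._in_flight[url] = task
            task.add_done_callback(lambda _t: self._in_flight.pop(url, None))
        return await task

    async def _revalidate(self, url):
        headers = {}
        stored = self._stored.get(url)
        if stored is not None:
            headers["If-None-Match"] = stored[0]  # conditional request
        status, response_headers, body = await self._fetch(url, headers)
        if status == 304 and stored is not None:
            return stored[1]  # not modified: reuse the stored body
        etag = response_headers.get("ETag")
        if status == 200 and etag is not None:
            self._stored[url] = (etag, body)
        return body


async def demo():
    calls = []

    async def fake_fetch(url, headers):
        # Stand-in for a real HTTP call; the response data is invented.
        calls.append(dict(headers))
        await asyncio.sleep(0)  # yield so the concurrent callers overlap
        if headers.get("If-None-Match") == '"v1"':
            return 304, {}, None
        return 200, {"ETag": '"v1"'}, {"name": "plug"}

    client = CollapsingClient(fake_fetch)
    url = "https://hex.pm/api/packages/plug"
    results = await asyncio.gather(*(client.get(url) for _ in range(3)))
    again = await client.get(url)
    return calls, results, again
```

In the demo, three concurrent `get` calls share one upstream request, and the later call sends `If-None-Match` so the upstream can answer with a cheap 304 instead of a full body.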
I’m happy to discuss solutions. I joined the project several months ago so I wasn't part of the previous discussion. The Shields servers serve about 10k requests per minute, and while I don’t have per-service stats, I’m not surprised that the servers could at times make more than 100 req/min to Hex.pm. The caching in Shields is based on the request. That means subsequent requests for the same badge will be cached for a while, though requests for different badges (e.g. license vs. version vs. downloads) will not. So the home page probably is not the problem. Once those badges have generated once, they will not make new requests until they are invalidated. It would be nice to add caching for the service requests! Since a lot of projects will display multiple badges which pull the same data, we could save a sizable number of requests this way. I could see implementing it as part of the service rewrite. Again I don't have exact numbers, but my impression is that Shields gets by on a tiny hosting budget, relying on optimized code, and avoiding any compute-intensive work. See this conversation on Twitter. The in-memory cache is size-limited to avoid OOM conditions, so I’m guessing to add a more sizable cache we’d need to add some hosting budget and back it with Redis or Memcache. The data the current badges use is:
You could make a batch process that dumps this to a static file, which we could grab once an hour (or once a day) and keep in the cache. Another thought… would it be possible to add caching behind your endpoint? What makes this request so expensive? |
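The idea of caching service requests, so that several badges for one package share a single upstream response, could look roughly like this. This is a sketch under assumptions, not the Shields implementation: `PackageCache`, the `fetch` callable, the default sizes, and the field names in the fake response are all invented for the example and do not describe the Hex.pm API schema.

```python
import time
from collections import OrderedDict


class PackageCache:
    """Size-limited, time-limited cache of upstream responses, keyed by package."""

    def __init__(self, fetch, max_entries=1000, ttl_seconds=300.0,
                 clock=time.monotonic):
        self._fetch = fetch            # hypothetical callable: package -> dict
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # package -> (stored_at, data), oldest first

    def get(self, package):
        now = self._clock()
        entry = self._entries.get(package)
        if entry is not None and now - entry[0] < self._ttl:
            self._entries.move_to_end(package)  # mark as recently used
            return entry[1]
        data = self._fetch(package)
        self._entries[package] = (now, data)
        self._entries.move_to_end(package)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)   # evict the least recently used
        return data


def demo():
    fetched = []

    def fake_fetch(package):
        fetched.append(package)
        # Invented fields for illustration only, not the Hex.pm schema.
        return {"version": "1.0.0", "license": "MIT", "downloads": 42}

    cache = PackageCache(fake_fetch, max_entries=2)
    labels = [
        cache.get("plug")["version"],
        cache.get("plug")["license"],
        str(cache.get("plug")["downloads"]),
    ]
    return fetched, labels
```

Here three different badges for the same package trigger one upstream fetch. The entry limit keeps memory bounded, which matches the concern above about avoiding OOM conditions.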
You might also consider issuing an API key for Shields, as some other services have done. |
The Shields.io webpage currently displays about 320 badges. If it's visited frequently (more frequent requests from shields.io than from other pages), all badges should be in cache. Do you know how much RAM the Shields servers have? I would like to compare the in-memory caching done by the Shields code with Varnish Cache https://varnish-cache.org/intro/ |
I have whitelisted the shields.io IPs so they should not be blocked in the future.
Thanks for this. I will create an optimized endpoint that only returns the data that shields uses, I will let you know when it's live.
It's not super expensive, it's just that if we make a specific endpoint that ignores the rate limiting I feel like it should be optimized.
Sounds like a good idea, this way we don't have to rely on whitelisting specific IPs.
This is not what I am seeing, if I grep our access logs and reload http://shields.io I see 3 new requests every time. |
Thanks for this info! So they are not in shields cache (but we have 3 servers - 3 caches). |
Is that still happening after the whitelisting? As @platan aptly observes, we do have three servers, and they don't share a cache. The requests should trickle to zero, as each of the three accumulates the five badges in its cache. There are five Hex.pm badges on the page, so if nothing were being cached I'd expect to see five.
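The expectation that requests trickle to zero can be checked with a toy model. The sketch below assumes strict round-robin routing across servers and cache entries that never expire. Both are simplifications made for the example, not a description of how Shields is deployed, and `simulate_page_loads` is a hypothetical helper.

```python
def simulate_page_loads(num_servers=3, num_badges=5, num_loads=5):
    """Count upstream requests per page load with one private cache per server.

    Assumes strict round-robin routing and entries that never expire;
    both are simplifications, not how Shields is actually deployed.
    """
    caches = [set() for _ in range(num_servers)]
    upstream_per_load = []
    request_index = 0
    for _ in range(num_loads):
        upstream = 0
        for badge in range(num_badges):
            cache = caches[request_index % num_servers]
            request_index += 1
            if badge not in cache:
                cache.add(badge)   # first time this server sees this badge
                upstream += 1
        upstream_per_load.append(upstream)
    return upstream_per_load
```

Under these assumptions the total number of upstream requests is bounded by servers times badges (3 × 5 = 15), after which every reload is served from cache. Seeing new requests on every reload would therefore suggest that the entries expire quickly or are not being cached at all.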
Thank you! |
@paulmelnikow @platan (cc @ericmj) The badges are still showing up as invalid, I think it hasn't really changed since last year. This affects all Erlang and Elixir projects that are using Shields.io for Hex.pm 🙁 |
It was definitely fixed last year, and they work for me when I go to shields.io, although the badges load slowly. Which badges are invalid for you? |
Badges on https://github.com/eproxus/meck/blob/master/README.md. Looking at the requests, it seems that requests to camo.githubusercontent.com time out:
Looking further, GitHub generates the following HTML for the README:
<a href="https://hex.pm/packages/meck" rel="nofollow">
<img src="https://camo.githubusercontent.com/cc57adb0caefa2a016a54b863eded43f96dcd269/68747470733a2f2f696d672e736869656c64732e696f2f686578706d2f762f6d65636b2e7376673f7374796c653d666c61742d737175617265"
alt="Hex.pm Version"
data-canonical-src="https://img.shields.io/hexpm/v/meck.svg?style=flat-square"
style="max-width:100%;">
</a>
Accessing https://img.shields.io/hexpm/v/meck.svg?style=flat-square actually works, but I think the issue is that it is too slow (13.69 s) so probably GitHub's in-between caching layer breaks. A request to https://hex.pm/api/packages/meck is very fast, 137 ms (I assume this is the endpoint Shields.io is using). Looks like the issue is the request speed to img.shields.io. |
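As a quick check that the proxied image really points at the Shields URL, the last path segment of the camo link in the HTML above can be hex-decoded. This is a small sketch: `decode_camo_target` is a hypothetical helper, and treating the final path segment as the hex-encoded target is an inference from this one URL rather than documented behavior.

```python
def decode_camo_target(camo_url):
    """Decode the hex-encoded final path segment of a camo-style URL."""
    hex_part = camo_url.rstrip("/").rsplit("/", 1)[-1]
    return bytes.fromhex(hex_part).decode("ascii")


CAMO_URL = (
    "https://camo.githubusercontent.com/"
    "cc57adb0caefa2a016a54b863eded43f96dcd269/"
    "68747470733a2f2f696d672e736869656c64732e696f2f686578706d2f762f"
    "6d65636b2e7376673f7374796c653d666c61742d737175617265"
)

target = decode_camo_target(CAMO_URL)
print(target)
```

The decoded value matches the `data-canonical-src` attribute, so the proxy is fetching the same img.shields.io URL that works, slowly, when requested directly.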
Seems to be the case of #1568. Closing this since it was just related to Hex.pm, which is now no longer the problem. |
See examples on http://shields.io or in https://github.com/eproxus/meck/blob/master/README.md