Use a CDN #1880

Closed
chris48s opened this issue Aug 10, 2018 · 10 comments
Labels
performance-improvement Related to performance or throughput of the badge servers

Comments

@chris48s
Member

Since we started setting cache-control headers, our badges are now cached by GitHub's camo proxy when badges are viewed on GitHub and in the user's browser when viewed directly. This has given us a noticeable step forward in performance.

Some significant proportion of our traffic goes through GitHub's camo proxy but a lot of traffic comes from sources such as NPM, PyPI, Packagist etc which don't independently proxy requests.

We can see by making requests to https://img.shields.io/servertime.svg using a client which does not cache (e.g. curl) that these direct requests aren't being cached by CloudFlare.

Truncated example:

$ curl "https://img.shields.io/servertime.svg"
<svg ..... textLength="2410">Fri Aug 10 2018 07:45:19 GMT-0400 (EDT)</text></g> </svg>

wait a few seconds..

$ curl "https://img.shields.io/servertime.svg"
<svg ..... textLength="2410">Fri Aug 10 2018 07:45:25 GMT-0400 (EDT)</text></g> </svg>
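
The manual check above could be scripted along these lines. This is only a minimal sketch, not something that exists in Shields: `badge_text`, `looks_cached` and `fetch` are hypothetical helper names, the regular expression assumes the time sits in a `<text>` element as in the truncated output above, and the custom User-Agent is an assumption in case the default one is rejected.

```python
import re
import time
import urllib.request

URL = "https://img.shields.io/servertime.svg"


def badge_text(svg: str) -> str:
    """Return the contents of the last <text> element found in the SVG."""
    matches = re.findall(r">([^<>]*)</text>", svg)
    if not matches:
        raise ValueError("no <text> element found")
    return matches[-1].strip()


def looks_cached(first_svg: str, second_svg: str) -> bool:
    """Two fetches a few seconds apart only show the same time if cached."""
    return badge_text(first_svg) == badge_text(second_svg)


def fetch(url: str = URL) -> str:
    """Fetch the badge without any client-side caching."""
    # Assumption: a non-default User-Agent avoids being blocked by the proxy.
    request = urllib.request.Request(url, headers={"User-Agent": "cache-check-sketch"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    first = fetch()
    time.sleep(5)  # "wait a few seconds.."
    second = fetch()
    print("cached" if looks_cached(first, second) else "not cached")
```

With the two responses shown above, `looks_cached` would report that the requests were not cached, since the times differ by six seconds.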

Using a CDN would probably let us make another step forward in performance, similar to the one we've just made, by further reducing the number of requests made for badges relating to popular projects.

One option would be CloudFlare. I think we are currently on a free CloudFlare account, but if we mail [email protected] with details of the project they offer to upgrade qualifying Open Source projects using their service to a pro plan for free (see https://blog.cloudflare.com/cloudflare-open-source-your-upgrade-is-on-the-house/ ). This would allow us to use them as a CDN.

So far:

Given the impact we've seen from some badges being cached by camo, I think it's worth trying to re-open the conversation around this as there is a lot of potential for positive impact on users. If we're concerned about CF's use of cookies, are there other solutions we could consider here?

@chris48s chris48s added the performance-improvement Related to performance or throughput of the badge servers label Aug 10, 2018
@paulmelnikow
Member

I have been pleasantly surprised at how much impact we've seen from camo's and other caches. I'd of course like to shorten the expiry time a bit so users see fresh results (e.g. for tests and when new versions are published), though we could also lengthen it for badges like license and static badges which rarely change.
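
To make the idea of per-badge expiry times concrete, here is a rough sketch. Everything in it is hypothetical: the category names, the `cache_control_for` helper and all of the numbers are placeholders for illustration, not values taken from Shields.

```python
# Hypothetical lifetimes in seconds; none of these values come from Shields.
MAX_AGE_BY_CATEGORY = {
    "build": 60,      # test results should look fresh
    "version": 60,    # new releases should show up quickly
    "license": 3600,  # rarely changes
    "static": 86400,  # never changes for a given URL
}

# Placeholder fallback for categories not listed above.
DEFAULT_MAX_AGE = 120


def cache_control_for(category: str) -> str:
    """Build a Cache-Control header value for a badge category."""
    max_age = MAX_AGE_BY_CATEGORY.get(category, DEFAULT_MAX_AGE)
    return f"max-age={max_age}"


print(cache_control_for("build"))   # short lifetime for fast-changing badges
print(cache_control_for("static"))  # long lifetime for badges that never change
```

The point of a lookup table like this is that shortening one category does not force the same trade-off onto badges that rarely change.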

Another option to consider, as we weigh my proposal to move to a PaaS, is the Zeit CDN. It's Cloudflare under the hood, but it's tightly integrated into Zeit. It would be free to us. The person I've spoken with at Zeit says the CDN would be really effective for our application, and also conveyed that, while we can put Zeit behind our own Cloudflare, we lose the benefits of their integration like automatic flushing on deploys.

In the meantime, turning on caching in our Cloudflare makes sense to me. I believe our Cloudflare account is not dedicated to Shields, but shared with other services Thaddée runs. I wonder if we can still use their pro plan for free.

@paulmelnikow
Member

Have been discussing this with @espadrine today. This is now turned on experimentally!

@paulmelnikow
Member

paulmelnikow commented Aug 18, 2018

Can confirm using @chris48s's test that the second servertime request shows the same time:

$ curl "https://img.shields.io/servertime.svg"
<svg ...>Sat Aug 18 2018 13:48:42 GMT-0400 (EDT)</text></g> </svg>

(wait a few seconds)

$ curl "https://img.shields.io/servertime.svg"
<svg ...>Sat Aug 18 2018 13:48:42 GMT-0400 (EDT)</text></g> </svg>

After waiting a few minutes, if I re-fetch I get a new time on the first request, then the same time on subsequent requests.

$ curl -v "https://img.shields.io/servertime.svg"
*   Trying 104.27.188.207...
* TCP_NODELAY set
* Connected to img.shields.io (104.27.188.207) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-ECDSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
*  subject: OU=Domain Control Validated; OU=PositiveSSL Multi-Domain; CN=sni89405.cloudflaressl.com
*  start date: Apr 27 00:00:00 2018 GMT
*  expire date: Nov  3 23:59:59 2018 GMT
*  subjectAltName: host "img.shields.io" matched cert's "*.shields.io"
*  issuer: C=GB; ST=Greater Manchester; L=Salford; O=COMODO CA Limited; CN=COMODO ECC Domain Validation Secure Server CA 2
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7ffedc803200)
> GET /servertime.svg HTTP/2
> Host: img.shields.io
> User-Agent: curl/7.54.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 200
< date: Sat, 18 Aug 2018 17:51:28 GMT
< content-type: image/svg+xml;charset=utf-8
< set-cookie: __cfduid=d371cbce0f250eb847e3dbfa0b361cf661534614688; expires=Sun, 18-Aug-19 17:51:28 GMT; path=/; domain=.shields.io; HttpOnly
< cache-control: max-age=120
< expires: Sat, 18 Aug 2018 17:53:28 GMT
< cf-cache-status: EXPIRED
< expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
< server: cloudflare
< cf-ray: 44c63a8a5e02923c-EWR
<
* Connection #0 to host img.shields.io left intact
<svg ...>Sat Aug 18 2018 13:51:28 GMT-0400 (EDT)</text></g> </svg>
$ curl -v "https://img.shields.io/servertime.svg"
<svg ...>Sat Aug 18 2018 13:51:28 GMT-0400 (EDT)</text></g> </svg>
$ curl -v "https://img.shields.io/servertime.svg"
<svg ...>Sat Aug 18 2018 13:51:28 GMT-0400 (EDT)</text></g> </svg>
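
The behaviour seen above (a fresh time after a few minutes, then the same time on repeat requests) is what a simple time-to-live cache would produce. The following toy model is a sketch under stated assumptions, not a description of how Cloudflare works internally: `TtlCache` and `servertime_origin` are hypothetical names, the 120 second lifetime mirrors the `max-age=120` header in the output, and the status strings borrow the `cf-cache-status` values `HIT` and `EXPIRED` seen in this thread, with `MISS` used for a first-ever request.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class TtlCache:
    """Toy edge cache: keeps one response per URL for max_age seconds."""

    origin: Callable[[str, float], str]
    max_age: float = 120.0
    _store: Dict[str, Tuple[float, str]] = field(default_factory=dict)

    def get(self, url: str, now: float) -> Tuple[str, str]:
        """Return (status, body), where status is MISS, HIT or EXPIRED."""
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self.max_age:
            return "HIT", entry[1]
        # Nothing stored yet, or the stored copy is too old: ask the origin.
        status = "MISS" if entry is None else "EXPIRED"
        body = self.origin(url, now)
        self._store[url] = (now, body)
        return status, body


def servertime_origin(url: str, now: float) -> str:
    # Stand-in for the badge server: the body is simply the request time.
    return f"t={now:.0f}"


cache = TtlCache(origin=servertime_origin)
print(cache.get("/servertime.svg", 0.0))    # ('MISS', 't=0')
print(cache.get("/servertime.svg", 5.0))    # ('HIT', 't=0')
print(cache.get("/servertime.svg", 200.0))  # ('EXPIRED', 't=200')
print(cache.get("/servertime.svg", 205.0))  # ('HIT', 't=200')
```

Passing the clock in as an argument keeps the model deterministic, so the expiry behaviour can be checked without actually waiting two minutes.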

Cache headers look good:

< date: Sat, 18 Aug 2018 17:51:28 GMT
< cache-control: max-age=120
< expires: Sat, 18 Aug 2018 17:53:28 GMT
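
"Look good" here means that `expires` is exactly `date` plus the `max-age` lifetime. A small sketch of that check, using only the Python standard library (`max_age_seconds` and `headers_agree` are hypothetical helper names):

```python
import re
from email.utils import parsedate_to_datetime


def max_age_seconds(cache_control: str) -> int:
    """Extract the max-age directive from a Cache-Control header value."""
    match = re.search(r"max-age=(\d+)", cache_control)
    if match is None:
        raise ValueError("no max-age directive")
    return int(match.group(1))


def headers_agree(date: str, expires: str, cache_control: str) -> bool:
    """True when expires - date equals the max-age lifetime."""
    lifetime = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
    return lifetime.total_seconds() == max_age_seconds(cache_control)


print(headers_agree(
    "Sat, 18 Aug 2018 17:51:28 GMT",
    "Sat, 18 Aug 2018 17:53:28 GMT",
    "max-age=120",
))  # True: 17:53:28 is 120 seconds after 17:51:28
```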

Added: I've added additional monitors:

https://status.shields-server.com/

@espadrine
Member

How to know whether we hit the CloudFlare cache:

curl -vk 'https://img.shields.io/gitter/room/nwjs/nw.js.svg?maxAge=60' 2>&1 | grep -i cf-cache-status
< CF-Cache-Status: HIT
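
The same check can be done programmatically on saved `curl -v` output. This is a minimal sketch with hypothetical helper names (`response_headers`, `served_by_cdn`); it assumes response header lines start with `< ` as in curl's verbose output, and it lowercases header names because they may arrive in either case depending on the HTTP version.

```python
from typing import Dict


def response_headers(curl_verbose_output: str) -> Dict[str, str]:
    """Collect response headers (lines starting with '< ') by lowercase name."""
    headers: Dict[str, str] = {}
    for line in curl_verbose_output.splitlines():
        # Skip request lines, TLS chatter and the status line (no colon).
        if not line.startswith("< ") or ":" not in line:
            continue
        name, _, value = line[2:].partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def served_by_cdn(curl_verbose_output: str) -> bool:
    """True when the response carries cf-cache-status: HIT."""
    status = response_headers(curl_verbose_output).get("cf-cache-status", "")
    return status.upper() == "HIT"


sample = """< HTTP/2 200
< cache-control: max-age=120
< cf-cache-status: EXPIRED
< server: cloudflare
"""
print(response_headers(sample)["cf-cache-status"])  # EXPIRED
print(served_by_cdn(sample))                        # False
```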

Small note that the behavior detailed in this comment from a CloudFlare support engineer has changed as we requested:

curl -vk 'https://img.shields.io/gitter/room/nwjs/nw.js.svg?maxAge=60' 2>&1 | grep Cache-Control
< Cache-Control: max-age=120

@paulmelnikow
Member

paulmelnikow commented Aug 18, 2018

We're seeing a significant number of requests handled by the CDN.

[Screenshot, 2018-08-18 4:32 pm]

I'd suggest we re-consider the cache headers on the static badges. That would be a good way to drive that number up.

That code is in flight in #1802 though the changes are unrelated.

Note also that the website https://shields.io/ is no longer going "through" CloudFlare; the DNS is at CloudFlare but points directly to GitHub Pages. We may see a slight dip in traffic as those numbers are no longer included, though come to think of it, the traffic on the website probably dwarfs the traffic on the badge servers.

@espadrine
Member

[Image]

@paulmelnikow
Member

The trend is continuing with peak traffic:

[Screenshot, 2018-08-20 12:45 pm]

@chris48s
Member Author

Looks like we've seen a big drop in average response time since this was enabled on Saturday :) Is there any follow-up outstanding on this, or should we close this issue?

@paulmelnikow
Member

Yea, it seems really stable! Think this can be closed.

The follow-up work could be tracked elsewhere:

@badges badges locked as resolved and limited conversation to collaborators Feb 18, 2019