Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Predict *RateLimitError, return immediately without network call (9700x speedup when rate limit exceeded). #347

Merged
merged 3 commits into from
May 4, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 78 additions & 7 deletions github/github.go
Original file line number Diff line number Diff line change
Expand Up @@ -86,8 +86,9 @@ type Client struct {
// User agent used when communicating with the GitHub API.
UserAgent string

rateMu sync.Mutex
rate Rate // Rate limit for the client as determined by the most recent API call.
rateMu sync.Mutex
rateLimits [categories]Rate // Rate limits for the client as determined by the most recent API calls.
mostRecent rateLimitCategory

// Services used for talking to different parts of the GitHub API.
Activity *ActivityService
Expand Down Expand Up @@ -319,11 +320,13 @@ func parseRate(r *http.Response) Rate {

// Rate specifies the current rate limit for the client as determined by the
// most recent API call. If the client is used in a multi-user application,
// this rate may not always be up-to-date. Call RateLimits() to check the
// current rate.
// this rate may not always be up-to-date.
//
// Deprecated: Use the Response.Rate returned from most recent API call instead.
// Call RateLimits() to check the current rate.
func (c *Client) Rate() Rate {
c.rateMu.Lock()
rate := c.rate
rate := c.rateLimits[c.mostRecent]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems potentially problematic, since if a single client is being used for both search and non-search calls, the "most recent category" could get switched out from under you by another go routine before you have a chance to call Rate. What if Rate always returns the core rate (like it did before), and we provide a different mechanism to access to other rates? Or alternatively, we just instruct callers to pull that rate off the response? checkRateLimitBeforeDo would still handle both categories as it does currently.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually yeah, we don't really need to provide another mechanism... we already instruct users to call client.RateLimits for the most up to date values. That should be sufficient, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a single client is being used for both search and non-search calls, the "most recent category" could get switched out from under you by another go routine before you have a chance to call Rate.

If I'm not mistaken, that cannot happen. Both rateLimits and mostRecent are in a critical section protected by rateMu mutex. That mutex is very important to prevent this problem. Is that not the case?

Copy link
Member Author

@dmitshur dmitshur May 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think I see what you meant. You weren't referring to a race condition in the code, but a higher-level "logical" race condition. E.g., a client makes a core category API call and then calls Rate expecting to see rate limit for that category, but may instead get rate limit for the search category.

I agree that's potentially bad and should be considered. I'll think about this and get back to you later.

One observation is that each API call already returns a Response which contains the rate limit from that very API call. So there's really no need for client to call Rate. You mentioned that in "we just instruct callers to pull that rate off the response".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if Rate always returns the core rate (like it did before)

Now that I think about it more, are you sure that's what Rate currently does? Doesn't it already always return the latest rate limit from either of two categories, whichever was called last? In other words, this PR might be currently preserving previous behavior.

If you're sure that the current Rate only contains core category rate limits, can you show me where that happens and what prevents a call to search from updating rate?

It looks to me that search API calls would update it also. See here:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, looks like you're right... search is already updating the client rate, which leads to this same problem. Maybe we should just deprecate this method entirely since it's likely to cause problems, and just instruct people to use the rate limit on the response, or to call client.RateLimits? What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should just deprecate this method entirely since it's likely to cause problems, and just instruct people to use the rate limit on the response, or to call client.RateLimits? What do you think?

I am absolutely in favor of doing that, given those 2 better alternatives exist, and the current behavior is pretty unreliable (it can give you core or search rate limit, effectively randomly).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Could you go ahead and update the doc for this method then as part of this PR?

Copy link
Member Author

@dmitshur dmitshur May 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 13c2701. (I've also squashed the previous commits that were implementing code review suggestions to have a clean, logical history.)

c.rateMu.Unlock()
return rate
}
Expand All @@ -332,8 +335,16 @@ func (c *Client) Rate() Rate {
// JSON decoded and stored in the value pointed to by v, or returned as an
// error if an API error has occurred. If v implements the io.Writer
// interface, the raw response body will be written to v, without attempting to
// first decode it.
// first decode it. If rate limit is exceeded and reset time is in the future,
// Do returns *RateLimitError immediately without making a network API call.
func (c *Client) Do(req *http.Request, v interface{}) (*Response, error) {
rateLimitCategory := category(req.URL.Path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about pulling this entire section (~22 lines) into its own separate function call?

As it stands, I think it draws attention away from someone looking at Do to find out what it does... but if it were in a separate function like checkRateLimit (or whatever you think is better) then I think this section would be much easier to read.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good idea, I'll do it. It'll look something like this:

if err := c.checkRateLimit(req); err != nil {
    return nil, err
}

Copy link
Member Author

@dmitshur dmitshur Apr 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in faca548.


// If we've hit rate limit, don't make further requests before Reset time.
if err := c.checkRateLimitBeforeDo(req, rateLimitCategory); err != nil {
return nil, err
}

resp, err := c.client.Do(req)
if err != nil {
return nil, err
Expand All @@ -348,7 +359,8 @@ func (c *Client) Do(req *http.Request, v interface{}) (*Response, error) {
response := newResponse(resp)

c.rateMu.Lock()
c.rate = response.Rate
c.rateLimits[rateLimitCategory] = response.Rate
c.mostRecent = rateLimitCategory
c.rateMu.Unlock()

err = CheckResponse(resp)
Expand All @@ -372,6 +384,33 @@ func (c *Client) Do(req *http.Request, v interface{}) (*Response, error) {
return response, err
}

// checkRateLimitBeforeDo does not make any network calls, but uses existing knowledge from
// current client state in order to quickly check if *RateLimitError can be immediately returned
// from Client.Do, and if so, returns it so that Client.Do can skip making a network API call unneccessarily.
// Otherwise it returns nil, and Client.Do should proceed normally.
func (c *Client) checkRateLimitBeforeDo(req *http.Request, rateLimitCategory rateLimitCategory) error {
c.rateMu.Lock()
rate := c.rateLimits[rateLimitCategory]
c.rateMu.Unlock()
if !rate.Reset.Time.IsZero() && rate.Remaining == 0 && time.Now().Before(rate.Reset.Time) {
// Create a fake response.
resp := &http.Response{
Status: http.StatusText(http.StatusForbidden),
StatusCode: http.StatusForbidden,
Request: req,
Header: make(http.Header),
Body: ioutil.NopCloser(strings.NewReader("")),
}
return &RateLimitError{
Rate: rate,
Response: resp,
Message: fmt.Sprintf("API rate limit of %v still exceeded until %v, not making remote request.", rate.Limit, rate.Reset.Time),
}
}

return nil
}

/*
An ErrorResponse reports one or more errors caused by an API request.

Expand Down Expand Up @@ -528,6 +567,8 @@ type RateLimits struct {
// The rate limit for non-search API requests. Unauthenticated
// requests are limited to 60 per hour. Authenticated requests are
// limited to 5,000 per hour.
//
// GitHub API docs: https://developer.github.com/v3/#rate-limiting
Core *Rate `json:"core"`

// The rate limit for search API requests. Unauthenticated requests
Expand All @@ -542,6 +583,25 @@ func (r RateLimits) String() string {
return Stringify(r)
}

type rateLimitCategory uint8

const (
coreCategory rateLimitCategory = iota
searchCategory

categories // An array of this length will be able to contain all rate limit categories.
)

// category returns the rate limit category of the endpoint, determined by Request.URL.Path.
func category(path string) rateLimitCategory {
switch {
default:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious... why do you list default: first? I don't remember seeing it that way before.

Copy link
Member Author

@dmitshur dmitshur Apr 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a style I've iterated towards after using switch statements more for multiplexing across multiple (usually non-overlapping) branching decisions in Go. I prioritize the order of cases, returning coreCategory and searchCategory, to be more consistent with how they're defined, over the conditions that lead to those return values.

I read it as:

For all paths, return coreCategory, except when it has /search/ prefix, then return searchCategory.

I personally like it because I think the cases themselves are more important than the conditions to trigger them, but I know this is a less common style. If people feel this is worse, I can revert to a more traditional style of sorting by conditions (putting default last).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For other people reading, I want to note that the order of case expressions in Go is important... they are evaluated left-to-right and top-to-bottom and the first one that matches wins (except for default which wins when no others match).
https://golang.org/ref/spec#Switch_statements

Keeping it this way is fine with me.

return coreCategory
case strings.HasPrefix(path, "/search/"):
return searchCategory
}
}

// Deprecated: RateLimit is deprecated, use RateLimits instead.
func (c *Client) RateLimit() (*Rate, *Response, error) {
limits, resp, err := c.RateLimits()
Expand All @@ -567,6 +627,17 @@ func (c *Client) RateLimits() (*RateLimits, *Response, error) {
return nil, nil, err
}

if response.Resources != nil {
c.rateMu.Lock()
if response.Resources.Core != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious... did you add any unit test for this section? I was interested in seeing how these .Core and .Search responses were returned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't yet. Should I?

I was interested in seeing how these .Core and .Search responses were returned.

What do you mean by "how they were returned"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sounds weird... not sure what I meant by that... I guess it would just be nice to see a test case for one or both of these. Your call.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in faca548.

c.rateLimits[coreCategory] = *response.Resources.Core
}
if response.Resources.Search != nil {
c.rateLimits[searchCategory] = *response.Resources.Search
}
c.rateMu.Unlock()
}

return response.Resources, resp, err
}

Expand Down
69 changes: 68 additions & 1 deletion github/github_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,13 @@ func TestNewClient(t *testing.T) {
}
}

// Ensure that length of Client.rateLimits is the same as number of fields in RateLimits struct.
func TestClient_rateLimits(t *testing.T) {
if got, want := len(Client{}.rateLimits), reflect.TypeOf(RateLimits{}).NumField(); got != want {
t.Errorf("len(Client{}.rateLimits) is %v, want %v", got, want)
}
}

func TestNewRequest(t *testing.T) {
c := NewClient(nil)

Expand Down Expand Up @@ -478,6 +485,60 @@ func TestDo_rateLimit_rateLimitError(t *testing.T) {
}
}

// Ensure a network call is not made when it's known that API rate limit is still exceeded.
func TestDo_rateLimit_noNetworkCall(t *testing.T) {
setup()
defer teardown()

reset := time.Now().UTC().Round(time.Second).Add(time.Minute) // Rate reset is a minute from now, with 1 second precision.

mux.HandleFunc("/first", func(w http.ResponseWriter, r *http.Request) {
w.Header().Add(headerRateLimit, "60")
w.Header().Add(headerRateRemaining, "0")
w.Header().Add(headerRateReset, fmt.Sprint(reset.Unix()))
w.Header().Set("Content-Type", "application/json; charset=utf-8")
w.WriteHeader(http.StatusForbidden)
fmt.Fprintln(w, `{
"message": "API rate limit exceeded for xxx.xxx.xxx.xxx. (But here's the good news: Authenticated requests get a higher rate limit. Check out the documentation for more details.)",
"documentation_url": "https://developer.github.com/v3/#rate-limiting"
}`)
})

madeNetworkCall := false
mux.HandleFunc("/second", func(w http.ResponseWriter, r *http.Request) {
madeNetworkCall = true
})

// First request is made, and it makes the client aware of rate reset time being in the future.
req, _ := client.NewRequest("GET", "/first", nil)
client.Do(req, nil)

// Second request should not cause a network call to be made, since client can predict a rate limit error.
req, _ = client.NewRequest("GET", "/second", nil)
_, err := client.Do(req, nil)

if madeNetworkCall {
t.Fatal("Network call was made, even though rate limit is known to still be exceeded.")
}

if err == nil {
t.Error("Expected error to be returned.")
}
rateLimitErr, ok := err.(*RateLimitError)
if !ok {
t.Fatalf("Expected a *RateLimitError error; got %#v.", err)
}
if got, want := rateLimitErr.Rate.Limit, 60; got != want {
t.Errorf("rateLimitErr rate limit = %v, want %v", got, want)
}
if got, want := rateLimitErr.Rate.Remaining, 0; got != want {
t.Errorf("rateLimitErr rate remaining = %v, want %v", got, want)
}
if rateLimitErr.Rate.Reset.UTC() != reset {
t.Errorf("rateLimitErr rate reset = %v, want %v", rateLimitErr.Rate.Reset.UTC(), reset)
}
}

func TestDo_noContent(t *testing.T) {
setup()
defer teardown()
Expand Down Expand Up @@ -628,7 +689,6 @@ func TestRateLimit(t *testing.T) {
if m := "GET"; m != r.Method {
t.Errorf("Request method = %v, want %v", r.Method, m)
}
//fmt.Fprint(w, `{"resources":{"core": {"limit":2,"remaining":1,"reset":1372700873}}}`)
fmt.Fprint(w, `{"resources":{
"core": {"limit":2,"remaining":1,"reset":1372700873},
"search": {"limit":3,"remaining":2,"reset":1372700874}
Expand Down Expand Up @@ -684,6 +744,13 @@ func TestRateLimits(t *testing.T) {
if !reflect.DeepEqual(rate, want) {
t.Errorf("RateLimits returned %+v, want %+v", rate, want)
}

if got, want := client.rateLimits[coreCategory], *want.Core; got != want {
t.Errorf("client.rateLimits[coreCategory] is %+v, want %+v", got, want)
}
if got, want := client.rateLimits[searchCategory], *want.Search; got != want {
t.Errorf("client.rateLimits[searchCategory] is %+v, want %+v", got, want)
}
}

func TestUnauthenticatedRateLimitedTransport(t *testing.T) {
Expand Down