Retry on proxy errors according to retry schedule
jwolski committed Feb 24, 2015
1 parent 57d7565 commit 9a007f8
Showing 9 changed files with 428 additions and 41 deletions.
59 changes: 30 additions & 29 deletions README.md
@@ -59,31 +59,23 @@ ringpop.on('changed', function() {
});
```

# Forwarding a request
Ringpop will typically be used by handing over request routing to it. As described earlier, the "process or forward" pattern is used to decide whether a request should be processed by the node that received the request or by another node. As an alternative, `handleOrForward` can be used to encapsulate that repetitive pattern. Here's an example of its use:
# Request Proxying
ringpop provides request routing as a convenience. When a request arrives at one of your service's public endpoints, use ringpop to decide whether it should be processed on the node that received it or elsewhere. If elsewhere, ringpop proxies the request to the correct destination.

```javascript
// Let's say this is an endpoint handler in your web application
function endpoint(incoming, opts, cb) {
function handle() {
// process request
}

function forwarded(err, resp, body) {
cb(err, {
statusCode: resp && resp.statusCode
});
}

var requestToForward = {
method: 'POST',
path: '/supply/1',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ /* fill body here */ }),
timeout: 1000
};

ringpop.handleOrForward(incoming.params.uuid, handle, requestToForward, forwarded);
When a proxied request arrives at its destination, the membership checksums of the sender and receiver are compared. The request is refused if the checksums differ. Mismatches are to be expected when nodes are entering or exiting the cluster due to deploys, added or removed capacity, or failures. The cluster will eventually converge on one membership checksum, so refused requests are best handled by retrying them.

ringpop's request proxy has retries built in. It can be tuned with two parameters, set either when ringpop is instantiated (`requestProxyMaxRetries` and `requestProxyRetrySchedule`) or per request (`maxRetries` and `retrySchedule`). Per-request values override the instance-level ones. The first parameter is an integer: the maximum number of times a request is retried. The second is an array of integer or floating-point values, in seconds, giving the delay before each consecutive retry.

ringpop has codified this routing pattern in the `handleOrProxy` function. It returns `true` when `key` hashes to the "current" node and `false` otherwise. When it returns `false`, the request is proxied to the correct destination. Its usage looks like this:

```js
var opts = {
maxRetries: 3,
retrySchedule: [0, 0.5, 2]
};

if (ringpop.handleOrProxy(key, req, res, opts)) {
// handle request
}
```
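
A minimal sketch of how a retry schedule could translate into delays, assuming schedule values are in seconds and that attempts past the end of the schedule reuse its last entry. `retryDelayMs` is a hypothetical helper for illustration, not a ringpop function:

```js
// Hypothetical helper (not part of ringpop): maps a retry attempt
// index to a delay in milliseconds.
function retryDelayMs(retrySchedule, attempt) {
    // Assumption: attempts beyond the schedule reuse its last entry.
    var last = retrySchedule[retrySchedule.length - 1];
    var seconds = typeof retrySchedule[attempt] === 'number' ?
        retrySchedule[attempt] : last;

    // Schedule entries are in seconds; timers expect milliseconds.
    return seconds * 1000;
}

var schedule = [0, 0.5, 2];
var delays = [0, 1, 2, 3].map(function eachAttempt(attempt) {
    return retryDelayMs(schedule, attempt);
});

console.log(delays); // [ 0, 500, 2000, 2000 ]
```

With the schedule from the example above, the first retry fires immediately, the second after half a second, and the third after two seconds.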

@@ -121,6 +113,9 @@ These counts are emitted when:
* `ping.send` - a ping is sent
* `ping-req.recv` - a ping-req is received
* `ping-req.send` - a ping-req is sent
* `requestProxy.retry.attempted` - a proxied request retry is attempted
* `requestProxy.retry.failed` - a proxied request is retried up to the maximum number of retries and fails
* `requestProxy.retry.succeeded` - a proxied request is retried and succeeds

## Gauges
These gauges represent:
@@ -158,11 +153,17 @@ All other properties should be considered private. Any mutation of properties no
* `whoami()` - Returns the address of the running node

## Events

* `ready` - Ringpop is ready
* `changed` - Ring state has changed (DEPRECATED)
* `membershipChanged` - Membership state has changed for one or more members, either their status or incarnation number. A membership change may result in a ring change.
* `ringChanged` - Ring state has changed for one or more nodes: a node has been added to or removed from the cluster. All ring changes are also member changes, but not vice versa.
These events are emitted when:

* `ready` - ringpop has been bootstrapped
* `changed` - ring or membership state is changed (DEPRECATED)
* `membershipChanged` - membership state has changed (status or incarnation number). A membership change may result in a ring change.
* `requestProxy.checksumsDiffer` - a proxied request arrives at its destination and source/destination checksums differ
* `requestProxy.retryAttempted` - a scheduled retry expires and a retry is attempted
* `requestProxy.retryScheduled` - a retry is scheduled, but not yet attempted
* `requestProxy.retrySucceeded` - a request that is retried succeeds
* `requestProxy.retryFailed` - a request is retried up to the maximum number of retries and fails
* `ringChanged` - ring state has changed: one or more nodes have joined or left the cluster. All ring changes are member changes, but not vice versa.
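
As a rough illustration, the retry events could be tallied with ordinary listeners. The sketch below uses a plain Node.js `EventEmitter` as a stand-in for a ringpop instance and emits the events by hand; the `counts` object and the `track` helper are hypothetical:

```js
var EventEmitter = require('events').EventEmitter;

// Stand-in for a ringpop instance. A real instance would emit
// these events itself while proxying requests.
var ringpop = new EventEmitter();

var counts = { scheduled: 0, attempted: 0, succeeded: 0, failed: 0 };

// Register a listener that bumps the named counter.
function track(eventName, counter) {
    ringpop.on(eventName, function onEvent() {
        counts[counter]++;
    });
}

track('requestProxy.retryScheduled', 'scheduled');
track('requestProxy.retryAttempted', 'attempted');
track('requestProxy.retrySucceeded', 'succeeded');
track('requestProxy.retryFailed', 'failed');

// Simulate one request that is retried once and then succeeds.
ringpop.emit('requestProxy.retryScheduled');
ringpop.emit('requestProxy.retryAttempted');
ringpop.emit('requestProxy.retrySucceeded');

console.log(counts);
```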

## Installation

7 changes: 6 additions & 1 deletion index.js
@@ -95,7 +95,11 @@ function RingPop(options) {
this.membershipUpdateFlushInterval = options.membershipUpdateFlushInterval ||
MEMBERSHIP_UPDATE_FLUSH_INTERVAL;

this.requestProxy = new RequestProxy(this);
this.requestProxy = new RequestProxy({
ringpop: this,
maxRetries: options.requestProxyMaxRetries,
retrySchedule: options.requestProxyRetrySchedule
});
this.ring = new HashRing();
this.dissemination = new Dissemination(this);
this.membership = new Membership(this);
@@ -129,6 +133,7 @@ RingPop.prototype.destroy = function destroy() {
this.gossip.stop();
this.suspicion.stopAll();
this.membershipUpdateRollup.destroy();
this.requestProxy.destroy();

this.clientRate.m1Rate.stop();
this.clientRate.m5Rate.stop();
135 changes: 131 additions & 4 deletions lib/request-proxy.js
@@ -26,6 +26,8 @@ var TypedError = require('error/typed');

var safeParse = require('./util').safeParse;

var RETRY_SCHEDULE = [0, 1, 3.5];

var InvalidCheckSumError = TypedError({
type: 'ringpop.request-proxy.invalid-checksum',
message: 'Expected the remote checksum to match local ' +
@@ -36,19 +38,55 @@ var InvalidCheckSumError = TypedError({
expected: null
});

var MaxRetriesExceeded = TypedError({
type: 'ringpop.request-proxy.max-retries-exceeded',
message: 'Max number of retries ({maxRetries}) to fulfill request has been exceeded',
maxRetries: null
});

module.exports = RequestProxy;

function RequestProxy(ringpop) {
this.ringpop = ringpop;
function numOrDefault(num, def) {
return typeof num === 'number' ? num : def;
}

function RequestProxy(opts) {
this.ringpop = opts.ringpop;
this.retrySchedule = opts.retrySchedule || RETRY_SCHEDULE;
this.maxRetries = numOrDefault(opts.maxRetries, this.retrySchedule.length);

// Simply a convenience over having to index into `retrySchedule` each time
this.maxRetryTimeout = this.retrySchedule[this.retrySchedule.length - 1];

this.retryTimeouts = [];
}

var proto = RequestProxy.prototype;

proto.clearRetryTimeout = function clearRetryTimeout(timeout) {
var indexOf = this.retryTimeouts.indexOf(timeout);

if (indexOf !== -1) {
this.retryTimeouts.splice(indexOf, 1);
}
};

proto.destroy = function destroy() {
this.retryTimeouts.forEach(function eachTimeout(timeout) {
clearTimeout(timeout);
});

this.retryTimeouts = [];
};

proto.proxyReq = function proxyReq(opts) {
var self = this;
var keys = opts.keys;
var dest = opts.dest;
var req = opts.req;
var res = opts.res;
var maxRetries = opts.maxRetries;
var retrySchedule = opts.retrySchedule;

var ringpop = this.ringpop;
var url = req.url;
@@ -86,8 +124,9 @@ proto.proxyReq = function proxyReq(opts) {
ringpop.logger.trace('requestProxy sending tchannel proxy req', {
url: req.url
});
ringpop.channel.send(options, '/proxy/req',
head, rawBody, onProxy);

var sendIt = sendHandler();
sendIt(options, head, rawBody, onProxy);
}

function onProxy(err, res1, res2) {
@@ -122,6 +161,93 @@ proto.proxyReq = function proxyReq(opts) {
});
res.end(res2);
}

function sendHandler() {
// Per-request parameters override instance-level
maxRetries = numOrDefault(maxRetries, self.maxRetries);
retrySchedule = retrySchedule || self.retrySchedule;

var numRetries = 0;
var reqStartTime = Date.now();
var retryStartTime = null;

return function sendIt(options, head, body, callback) {
if (maxRetries === 0) {
ringpop.channel.send(options, '/proxy/req',
head, body, callback);
return;
}

ringpop.channel.send(options, '/proxy/req', head, body, onSend);

function onSend(err, res1, res2) {
if (!err) {
if (numRetries > 0) {
ringpop.stat('increment', 'requestProxy.retry.succeeded');
ringpop.logger.info('ringpop request proxy retry succeeded', {
numRetries: numRetries,
maxRetries: maxRetries,
reqMethod: req.method,
reqUrl: req.url,
reqStartTime: reqStartTime,
reqTotalTime: Date.now() - reqStartTime,
retrySchedule: retrySchedule,
retryStartTime: retryStartTime,
retryTotalTime: Date.now() - retryStartTime
});
ringpop.emit('requestProxy.retrySucceeded');
}

callback(null, res1, res2);
return;
}

if (numRetries >= maxRetries) {
ringpop.stat('increment', 'requestProxy.retry.failed');
ringpop.logger.warn('ringpop request proxy retry exceeded max retries and failed', {
maxRetries: maxRetries,
maxRetryTimeout: self.maxRetryTimeout,
reqMethod: req.method,
reqUrl: req.url,
reqStartTime: reqStartTime,
reqTotalTime: Date.now() - reqStartTime,
retrySchedule: retrySchedule,
retryStartTime: retryStartTime,
retryTotalTime: Date.now() - retryStartTime
});
ringpop.emit('requestProxy.retryFailed');

callback(MaxRetriesExceeded({
maxRetries: maxRetries
}));
return;
} else {
if (numRetries === 0) {
retryStartTime = Date.now();
}

// TODO Can we make this timeout value responsive/adaptive to previously
// recorded cluster converge-times?
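// Schedule values are in seconds. Apply the fallback before converting
// to milliseconds: `undefined * 1000` is NaN, which has type 'number'
// and would bypass the default.
var delay = numOrDefault(retrySchedule[numRetries],
self.maxRetryTimeout) * 1000;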
var timeout = setTimeout(function onTimeout() {
numRetries++;

self.clearRetryTimeout(timeout);

sendIt(options, head, body, callback);

ringpop.stat('increment', 'requestProxy.retry.attempted');
ringpop.emit('requestProxy.retryAttempted');
}, delay);

self.retryTimeouts.push(timeout);

ringpop.emit('requestProxy.retryScheduled');
}
}
};
}
};

proto.handleRequest = function handleRequest(head, body, cb) {
@@ -141,6 +267,7 @@ proto.handleRequest = function handleRequest(head, body, cb) {
err: err,
url: url
});
ringpop.emit('requestProxy.checksumsDiffer');
return cb(err);
}
