Bluebird leaks domains on exception thrown in a then() function. #1125

jdmarshall · 2016-06-08T00:20:09Z

See https://gist.github.com/jdmarshall/e16424c6f039fb6bf4dab1951cb32129

I've tried this with node 0.10, 0.12, 4.4 and 6.0, and bluebird 2.9 and 3.4.

At process exit there are several domains remaining on the stack, and any data associated with the domain is kept around.

In all of the versions listed above, this code results in 2 orphaned domains, but in 0.10 it results in only one. This causes a pretty big issue if, for instance, you are using domains and promises to service requests in express.

The problem is that domain.bind(), which bluebird uses to service promises that were created in a domain, does not have a try/finally block around the domain.enter(), domain.exit() pairing.

Bluebird needs to do its own cleanup on errors, which it doesn't do.

petkaantonov · 2016-06-13T09:47:19Z

It sounds like a bug in domain.bind implementation and the best solution would be to fix it at the source.

jdmarshall · 2016-06-14T18:19:33Z

I disagree.

bind() clearly expects to receive a function that doesn't throw unhandled exceptions. It has always been thus and through 3 refactorings of that code it has always been so. That consistency to me implies intent.

Or to look at it another way, fixing it in Node means it only gets fixed for future versions of Node. If bluebird only works properly on some future version of Node, and Node is planning to replace Bluebird with an ES promises implementation anyway, then the niche for bluebird is effectively tiny. If you iron out the error handling on this end, then bluebird continues to be indispensible for everyone using Node LTS or older.

petkaantonov · 2016-06-14T18:37:30Z

The point of bind is that exceptions thrown from the bound function are forwarded to the domain. That's what the docs say: https://nodejs.org/api/domain.html#domain_domain_bind_callback.

benjamingr · 2016-06-14T18:41:38Z

Domains are deprecated anyway.

petkaantonov · 2016-06-14T18:42:44Z

Still seems pretty critical bug that it's unfit for its only purpose

jdmarshall · 2016-06-14T19:32:31Z

True enough on the bind() docs, but domains are undead at this point. They exist and there is no viable replacement, and without them Node's async I/O code is more of a footnote than a killer feature, because if you're running one request per node instance then who cares if your IO is non-blocking? Who are you blocking?

But that still leaves the practical matter that nobody will ever be able to use Bluebird cleanly with any version of Node that currently exists, unless they use one request per process.

spion · 2016-06-14T23:30:25Z

Why one request per process?

Are you using domains as continuation local storage?

benjamingr · 2016-06-15T10:17:17Z

@petkaantonov it is unfit for its only purpose. @trevnorris wrote about it in nodejs/node#6159 - basically, the problem is that you can't escape a domain once you entered it when going to node code.

For example - if express wraps requests in a domain and then you call fs and you are still inside the domain inside its callback then your domain is catching errors from fs without giving it a chance to clean up. Promises for example - don't have that problem since you never throw into code running in another context. Domains are therefor inherently broken for error handling.

Work on Zones is undergoing to replace domains at the language level - but that's still very early on and might suffer from the same problem. async_wrap on the other hand is safe and well thought out (IMO) but it doesn't do (or pretend to do) error handling at all.

jdmarshall · 2016-06-27T18:21:01Z

@spion After a fashion, yes. It's an internal framework, and we're doing the pattern of 'protect the framework from client code misbehaving'. There's a bit of request state that accumulates and we have services that accumulate and/or repackage data from other services, so the promise chains tend to get big enough to warrant a diagram.

As for my one request per process comment, I sometimes jump too many steps in a logical progression, apologies for not being more detail oriented. This is getting a little off-track but I should explain myself anyway. Without domains, then you're accepting that a 500 error in one request should cause a 500 error in all of the other in-flight requests on the same process, and I'm in agreement with my team that this is not acceptable. So the only ways I'm aware of to isolate an error to a single request are domains, or one process per request.

Most of the errors we get should be treated at 400 errors, anything from missing auth to missing cookie to missing resource. They're intended to be caught and handled within our promise callback chains and indeed they usually are. Usually.

But my thinking on the matter is that node and OSS can deprecate libraries all they want, but until they've provided an alternative it's just a blame assignment trick and doesn't actually address the pain points.

My read of the bluebird code is that the problem comes from a bit of inverted logic in the domain code in bluebird. Rather than bind()ing the function and then wrapping it in a try/catch, you should be wrapping it in a try/catch and then bind()ing it. Then your fail() code happens independent of what Node is or isn't doing.

spion · 2016-06-28T00:33:35Z

Domains aren't deprecated because they are "not good enough". They are a horribly broken design. If you catch a domain error, node internals and/or other callback based libraries that build on top of them can get into an undefined state. By undefined state, I mean something like letting a dangling file handle or connection open, or some other kind of internal state being bad (non-working event emitter on a server, etc). After a few of these your process may become unusable, consume 100% cpu, refuse to process further requests... the possibilities are endless.

For a contrived example, read http://blog.izs.me/post/65712662830/restart-nodejs-servers-on-domain-errors-sensible

In projects where we use bluebird, we settled for

allowing crashes in non-promise code, and
using promises everywhere we possibly can

as a good-enough combination to deal with this.

No sensible alternative is possible for current callback-based code.

petkaantonov · 2016-08-31T05:28:41Z

Reverted in 3.4.5, this is impossible to implement

petkaantonov added bug 3.0.0 labels Aug 24, 2016

petkaantonov closed this as completed in 40ed56c Aug 26, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bluebird leaks domains on exception thrown in a then() function. #1125

Bluebird leaks domains on exception thrown in a then() function. #1125

jdmarshall commented Jun 8, 2016 •

edited

Loading

petkaantonov commented Jun 13, 2016

jdmarshall commented Jun 14, 2016

petkaantonov commented Jun 14, 2016

benjamingr commented Jun 14, 2016

petkaantonov commented Jun 14, 2016

jdmarshall commented Jun 14, 2016 •

edited

Loading

spion commented Jun 14, 2016 •

edited

Loading

benjamingr commented Jun 15, 2016

jdmarshall commented Jun 27, 2016

spion commented Jun 28, 2016 •

edited

Loading

petkaantonov commented Aug 31, 2016

Bluebird leaks domains on exception thrown in a then() function. #1125

Bluebird leaks domains on exception thrown in a then() function. #1125

Comments

jdmarshall commented Jun 8, 2016 • edited Loading

petkaantonov commented Jun 13, 2016

jdmarshall commented Jun 14, 2016

petkaantonov commented Jun 14, 2016

benjamingr commented Jun 14, 2016

petkaantonov commented Jun 14, 2016

jdmarshall commented Jun 14, 2016 • edited Loading

spion commented Jun 14, 2016 • edited Loading

benjamingr commented Jun 15, 2016

jdmarshall commented Jun 27, 2016

spion commented Jun 28, 2016 • edited Loading

petkaantonov commented Aug 31, 2016

jdmarshall commented Jun 8, 2016 •

edited

Loading

jdmarshall commented Jun 14, 2016 •

edited

Loading

spion commented Jun 14, 2016 •

edited

Loading

spion commented Jun 28, 2016 •

edited

Loading