Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CommandHandler.exceptionCaught does incorrect behavior when occuring exception in addToStack() #788

Open
hyunwookwi opened this issue Jun 5, 2018 · 11 comments
Labels
type: bug A general bug
Milestone

Comments

@hyunwookwi
Copy link

hyunwookwi commented Jun 5, 2018

lettuce version: 4.4.2.Final

when exception occured before stack.add(commandToUse) in addToStack(), Netty call CommandHandler.exceptionCaught().
CommandHandler.exceptioCaught() process stack.poll();

I think this behavior seems to be wrong. because The command that failed due to an exception is not yet on the stack.
Please check your code.

Currently, I am using lettuce to get/set cache data with redis.
My application send set command using lettuce, and I expected "OK" response.
But sometimes, my app received response of previously get command.
And previously sent get command occur TimeoutException.

@mp911de
Copy link
Collaborator

mp911de commented Jun 5, 2018

Good catch. Do you have an example of an exception that may happen in the code path before the command is added to the protocol stack?

@mp911de mp911de added the type: bug A general bug label Jun 5, 2018
@hyunwookwi
Copy link
Author

If my application doesn't have enough memory, JRE throw OutOfMemory while processing stack.add(commandToUse);

Thank you.

@mp911de
Copy link
Collaborator

mp911de commented Jun 5, 2018

Lettuce can't account for VM errors. If an Error is thrown, then the whole driver gets into an undefined state.

@hyunwookwi
Copy link
Author

However, I think lettuce should handle it when all kinds of exceptions are occured.
Otherwise, any subsequent Commands cannot receive a expected response anymore.

Or you can disconnect channel and rebuild it.

@mp911de
Copy link
Collaborator

mp911de commented Jun 5, 2018

To quote a recommendation:

It may work, but it is generally a bad idea. There is no guarantee that your application will succeed in recovering, or that it will know if it has not succeeded. For example:

There really may be not enough memory to do the requested tasks, even after taking recovery steps like releasing block of reserved memory. In this situation, your application may get stuck in a loop where it repeatedly appears to recover and then runs out of memory again.

The OOME may be thrown on any thread. If an application thread or library is not designed to cope with it, this might leave some long-lived data structure in an incomplete or inconsistent state.

If threads die as a result of the OOME, the application may need to restart them as part of the OOME recovery. At the very least, this makes the application more complicated.

Suppose that a thread synchronizes with other threads using notify/wait or some higher level mechanism. If that thread dies from an OOME, other threads may be left waiting for notifies (etc) that never come ... for example. Designing for this could make the application significantly more complicated.

However, let's see whether we can do anything here at all.

@hyunwookwi
Copy link
Author

I hope my application receive an error or exception at all rather than an wrong response value.

have a nice day.

@734839030
Copy link

734839030 commented Jan 3, 2023

I hope my application receive an error or exception at all rather than an wrong response value.

have a nice day.

i agree. Incorrect data is very dangerous to business programs.

@be-hase
Copy link
Contributor

be-hase commented Dec 8, 2023

I strongly agree with @hyunwookwi and @734839030 .

@mp911de
In fact, I have seen a few cases of mixed incorrect response data due to OOME.
This is a very dangerous situation in business. It could lead to the leakage of personal information.

In the case of OOME due to heap exhaustion, it can be avoided by specifying ExitOnOutOfMemoryError.
However, in the case of OOME due to direct buffer exhaustion, ExitOnOutOfMemoryError does not work, and the application will continue to operate with incorrect response data returned.

Of course, it is difficult for a library to do the right thing for OOME, but can we stop returning wrong responses?
It would be preferable for the application to crush rather than return incorrect response data.

@mp911de
Copy link
Collaborator

mp911de commented Dec 8, 2023

The exception handling and connection handling have been revised quite a few times since the original report. Do you have stack traces handy?

@be-hase
Copy link
Contributor

be-hase commented Dec 8, 2023

Sorry, unfortunately I do not have the stack traces at hand.
I am trying to reproduce it, but I can't seem to write a small reproduction code.

Maybe it is the same as this one.
#2528

For example, I think both heap and direct buffer could be OOME during encode.
https://github.com/lettuce-io/lettuce-core/blob/c8d9011d5299dd6816c52cc404945fbfcf1157ac/src/main/java/io/lettuce/core/codec/StringCodec.java#L134-L148
https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/Unpooled.java#L185-L211

@be-hase
Copy link
Contributor

be-hase commented Dec 8, 2023

Hmmm. But maybe it's different because when it's OOME, it doesn't go into this block.
https://github.com/lettuce-io/lettuce-core/blob/main/src/main/java/io/lettuce/core/protocol/CommandEncoder.java#L95-L97

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug A general bug
Projects
None yet
Development

No branches or pull requests

4 participants