java.util.concurrent.TimeoutException: Futures timed out after [10 seconds] #85
Comments
Hi @alejandrod, well, it is hard to tell, but if you say there is Brando under the hood, that means you are using an old version of this library. Recently, I was forced to switch the engine to scredis because of a serious bug and the lack of support in Brando. Since you are running on AWS, this might be the same case. Look at #29 and #44, they might help. Let me know the result. Best, |
Thanks for the reply! We are using |
Well, only one other thing occurs to me that might be related. How big is the traffic? I believe that all requests to Do you think this is the case? That could explain the non-deterministic behavior. If so, do you know how Akka works internally? Is there some need for manual dispatching to run the actors in multiple threads? Otherwise, I'll have to look at it. |
Oh, maybe I am wrong about the default configuration, but still, there is no load balancer or dispatcher, and I don't know how the default Akka dispatcher behaves. I believe it runs in just one thread. |
So, brando should not be the problem as we use scredis:
Will check the dispatcher thing. Even so, one thread with a 10-second timeout for a single server with a single node and 100 keys sounds like enough. |
I don't think that's it, but you may try. I use this library in something like 10-20 on-premise web apps and have never seen anything like that. I use a timeout of 3 seconds but my
How long do these run? Is it like seconds or so? Or is it much faster, like milliseconds? This is where I would start digging. Multiple invocations of this and it times out right away. The only situation in which I sometimes run into timeouts is running all tests at once, which might be related to low throughput. |
Mainly DB requests, so it should be milliseconds. But there might be a few HTTP calls. Now, depending on how this is executed internally, one long request might cause issues for other requests. |
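To illustrate the concern raised here, a minimal sketch using only the Scala standard library: one long-running task on a single-threaded executor makes an unrelated, fast future time out. The names `StarvationSketch` and `fastTaskStarved` are made up for illustration, and nothing in it is taken from play-redis, scredis, or Akka. Whether the library's dispatcher really behaves like this is the open question in this thread.

```scala
import java.util.concurrent.{Executors, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object StarvationSketch {
  /** Returns true if a fast task times out because a slow task occupies the only thread. */
  def fastTaskStarved(): Boolean = {
    // Assumption for the sketch: everything runs on one single-threaded executor.
    val executor = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(executor)
    try {
      // Slow "application logic" takes the only thread for 500 ms.
      val slow = Future { Thread.sleep(500); "slow" }
      // The fast task is queued behind it and cannot start until the slow one ends.
      val fast = Future { "fast" }
      try {
        Await.result(fast, 100.millis) // the caller only waits 100 ms
        false
      } catch {
        case _: TimeoutException => true
      }
    } finally executor.shutdown() // lets the queued tasks finish, then stops the thread
  }
}
```

With more threads in the pool, the fast task would normally complete well within the 100 ms, which is why the size and type of the dispatcher matters for this kind of symptom.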
Alright. If the dispatcher is the solution, it has to be part of play-redis, not scredis. The timeout would be caused by this library. Scredis is asynchronous and its requests should take only milliseconds. Those long-running tasks are executed within play-redis. Let me know when you find out how to support multiple threads, thanks! |
For sure. Will take a look and be back to you |
Ok, after a bit more digging I think that the problem is that scredis reconnects every now and then.
In any case, it recovers. Now, I'd like to address a more important factor. Maybe you can help me with the following questions: 1 - How do I make sure that the call to the cache never affects the app. Worst case we perform the underlying request/app logic. Thanks! |
Hi, I am sorry to hear about the issue with scredis. Honestly, I cannot imagine what causes it. However, due to several other issues, I am thinking of changing the connector again. So, in the upcoming release there won't be scredis anymore, I believe. To the second part of your question: yes, I agree that the cache should not affect the application flow. To resolve your first question, there is the Recovery policy exactly for this reason. I checked and the default value is As for your second question, you are right. I'll look at it and try to figure out some solution. However, this might be very difficult. The timeout is applied to convert the async API to the sync API on all futures, regardless of the operation. Honestly, I don't think it is possible to apply it only to the underlying calls. And I think (not sure) that the Scala API requires some timeout to be defined when waiting for futures to be resolved, which is your case. The only thing I have in mind is to split the timeout into a connection timeout and a synchronization timeout. What do you think? Would any of these help you? |
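For readers following along, a minimal sketch of the mechanism described in this comment, using only the Scala standard library: a sync call blocks on a future with a timeout, and a failure is turned into a cache miss instead of an exception. `SyncBridgeSketch` and `syncGet` are hypothetical names, and treating every failure as a miss is an assumption about what a recovery policy might do, not the actual play-redis implementation.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.util.control.NonFatal

object SyncBridgeSketch {
  /** Blocks on an asynchronous lookup for at most `timeout`.
    * A timeout or any other non-fatal failure is reported as a cache miss (None).
    */
  def syncGet[A](lookup: Future[Option[A]], timeout: FiniteDuration): Option[A] =
    try Await.result(lookup, timeout) // Await.result always needs some upper bound
    catch {
      // TimeoutException is non-fatal, so it is covered by this case as well.
      case NonFatal(_) => None
    }
}
```

Note that `Await.result` requires a duration argument, which matches the point above that some timeout has to exist once an async API is exposed as a sync one.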
I can confirm it fails and it doesn't call the Regarding the timeout. Just to be clear, the timeout should be for the cache retrieval part I think. Too bad that the connector has to be changed. |
I'll try to set up some scenario with a timeout and we will see what happens. Well, there is one issue here: there has to be some timeout. However, I'll browse the code and try to figure out how to introduce another internal timeout just for the cache connector. Regarding changing the connector, I really dislike the solution, but it turned out there are significant issues with the repository maintenance: the lib hasn't been released for a long time, there are unreleased features and bugfixes, and it does not seem to be getting better. I think there is no choice, Scala 2.12 is |
Alright, I reproduced your issue, I hope. Can you check it? Note: Default timeout in tests is 3 secs.
"long running orElse" in {
Cache.get[ String ]( s"$prefix-test-timeout-in-orelse" ) must beNone
Cache.getOrElse[ String ]( s"$prefix-test-timeout-in-orelse" ) {
Thread.sleep( 5000 ) // sleep 5 secs to timeout
"some value"
} must beEqualTo( "some value" )
}
This test used to fail; now I fixed it a bit (locally, not published) and it works although it shouldn't. I have to investigate it.
Regarding the different connection timeout and invocation timeout. As you are using The internal timeout for communication with Redis can be set directly in the scredis configuration. I don't see any need to override this in If I am not wrong, the only thing left to deal with is the reconnection to Redis, which I am worried I cannot help you with. It seems like an issue related to your environment; there are no other reported issues in either this or the scredis repository right now.
However, when I finished writing this answer, one thing occurred to me. The Is it clear or should I explain something better? Does it help you and resolve your issues? |
I believe I could improve the documentation clarity when speaking of the timeout. |
I have investigated the scredis code base and the only place where the |
Ok, just to be sure we are in sync.
// This test should always pass. Even assuming redis is not working
"long running orElse" in {
Cache.get[ String ]( s"$prefix-test-timeout-in-orelse" ) must beNone
Cache.getOrElse[ String ]( s"$prefix-test-timeout-in-orelse" ) {
// Regardless of how long it takes, this should not fail. It's the app code.
// If a timeout is needed, should be set to a long value.
Thread.sleep( 5000 )
"some value"
} must beEqualTo( "some value" )
}
|
This is not possible with |
I think the solution you are looking for is the following:
Do this and you are fine I think. |
Those are the default values. I haven't changed them. I only play with The problem might be that the whole call is wrapped in a Future: both the Redis call and the app logic.
class MySyncCache extends CacheApi {
val someAsyncCache = new CacheAsyncApi
val redisCacheTimeout = 3 seconds
override def getOrElse[A](key: String, expiration: Duration)(orElse: => A)
(implicit evidence$1: ClassManifest[A]): A = {
try {
Await.result(someAsyncCache.get(key), redisCacheTimeout).get
} catch {
case _: Exception =>
// Some error getting from async cache
val v = orElse
setCacheEntrySomehowSafeNoError(v)
v
}
}
} |
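The idea in the snippet above can be tried out without Redis. Below is a self-contained sketch under stated assumptions: `AsyncStore`, `InMemoryStore`, `HangingStore`, and `GuardedCache` are hypothetical stand-ins and not the play-redis `CacheApi`. Values are restricted to `String` for brevity. The timeout guards only the cache lookup; the `orElse` block runs without any timeout, and the write-back is fire-and-forget.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.util.control.NonFatal

object GuardedCacheSketch {
  /** Hypothetical minimal async cache interface (not the play-redis API). */
  trait AsyncStore {
    def get(key: String): Future[Option[String]]
    def set(key: String, value: String): Future[Unit]
  }

  /** In-memory stand-in for a healthy cache. */
  final class InMemoryStore extends AsyncStore {
    private val data = TrieMap.empty[String, String]
    def get(key: String): Future[Option[String]] = Future.successful(data.get(key))
    def set(key: String, value: String): Future[Unit] =
      Future.successful { data.update(key, value); () }
  }

  /** Stand-in for an unreachable cache: no call ever completes. */
  final class HangingStore extends AsyncStore {
    def get(key: String): Future[Option[String]] = Promise[Option[String]]().future
    def set(key: String, value: String): Future[Unit] = Promise[Unit]().future
  }

  final class GuardedCache(store: AsyncStore, cacheTimeout: FiniteDuration) {
    def getOrElse(key: String)(orElse: => String): String = {
      // Only the cache lookup is bounded by the timeout; failures count as a miss.
      val cached: Option[String] =
        try Await.result(store.get(key), cacheTimeout)
        catch { case NonFatal(_) => None }
      cached.getOrElse {
        val value = orElse    // application logic, deliberately not bounded here
        store.set(key, value) // fire-and-forget: the result is not awaited
        value
      }
    }
  }
}
```

With `HangingStore`, a call such as `new GuardedCache(new HangingStore, 50.millis).getOrElse("k") { Thread.sleep(200); "computed" }` still returns `"computed"`: the caller pays the 50 ms lookup timeout plus the computation time, but never sees a `TimeoutException`.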
Well, you are hitting a significant limitation here - the whole library is intentionally async, including the By the way, you could reach the same effect without changing the code, just with the proper configuration. Right now, there are 3 timeouts.
|
Can you review PR #86? |
@alejandrod, FYI, I'm implementing the API for Play 2.6.x right now and found this. The Play framework implements |
Well, the thing is what |
It's not important what it does because, in the end, the whole statement is wrapped into |
Will be released as |
Thanks! |
Hi folks,
I'm hoping to get some help. Maybe it's an issue, maybe I'm missing something.
I'm using the sync version of the API and I get the timeout errors quite often (10 seconds timeout). Not always.
At the beginning I thought it was just slow code calculating the entry on cache.getOrElse. But then I even saw a timeout on removing a key. Why would we get a timeout here at all?
We run on AWS and use a single ElastiCache node. Before, the cache was in memory and it was just fine.
I know Brando is used under the hood. I'm wondering if I should change something in the Akka system.
P/S: I can't provide a test for this as it's not deterministic. It's a simple use case with a small single server with a single redis node.
Thanks for your help