Did release 2.17.0 break call concurrency? #206

Abdullah-l · 2024-02-22T20:13:40Z

Please describe the issue

Hey guys @splittingred @leonid-shevtsov, check this out:
Create two RPCs

Client streaming
Anything else (I tried server streaming and unary)

Place a log in both RPCs
Call RPC 1 then 2

On gruf 2.16.2 it behaves normally by printing both logs

On gruf 2.17.0 it does not print the log in RPC 2. The server does not acknowledge the call at all, the client keeps loading indefinitely unless I cancel RPC 1.

This blocking behaviour I suspect was introduced in #178 as it's the only PR in that release that touches threading.

The issue is reproducible on the latest version too (2.19.0)

Anything else we should know?

gRPC version - 1.61
Ruby version - All, 2.7, 3.0, 3.1
Rails version 6.0 & 7.1
OS - macOS 14.1

splittingred · 2024-02-22T20:38:26Z

Hi @Abdullah-l - I'm not able to replicate this; can you provide a repository/code sample of this occurring?

splittingred · 2024-02-22T20:54:10Z

Further illustrated no apparent issue here: #207

Abdullah-l · 2024-02-22T23:33:53Z

Hey @splittingred, I managed to reproduce it in your rake task :)

I created two branches with the exact same changes

The issue is easily seen in the logs
Before 2.17.0

I, [2024-02-23T02:24:05.015640 #20703]  INFO -- : Received get1 at 2024-02-23 02:24:05 +0300
I, [2024-02-23T02:24:06.221235 #20703]  INFO -- : Received get2 at 2024-02-23 02:24:06 +0300
I, [2024-02-23T02:24:06.325923 #20703]  INFO -- : Received get3 at 2024-02-23 02:24:06 +0300

After 2.17.0

I, [2024-02-23T02:25:00.553379 #21885]  INFO -- : Received get1 at 2024-02-23 02:25:00 +0300
I, [2024-02-23T02:25:30.554698 #21885]  INFO -- : Received get2 at 2024-02-23 02:25:30 +0300
I, [2024-02-23T02:25:30.554768 #21885]  INFO -- : Received get3 at 2024-02-23 02:25:30 +0300

As you can see after 2.17.0 get2 and get3 requests are blocked for a while there. They arrived 30 seconds later (after get1 timed out).

splittingred · 2024-02-22T23:47:21Z

@Abdullah-l can you provide more information about your working environment:

Ruby version
Rubygems version
gRPC version installed
Anything else you feel relevant

Abdullah-l · 2024-02-22T23:55:14Z

Ruby version - I tried 2.7, 3.0, 3.1
Rubygems version - 3.3.3
gRPC version installed - 1.61

I'm like 99.99% sure that 2.17.0 broke it

I hope you can reproduce it :)

You can just checkout the two branches I linked in the previous message and run:

./spec/demo_server
bundle exec rake gruf:demo:concurrent

on each one and see the logs

leonid-shevtsov · 2024-03-05T11:07:16Z

And to clarify, is this under development mode or production mode? (What does Gruf.development? say?)

Basically the new code uses a read-write lock; any number of requests can concurrently "read", that is, access the controller - but when the controller is reloaded the lock is "write" locked and no concurrency is allowed.

I'd add some logging statements to see the order of operations around reload, the write lock acquire/release and the read lock acquire release.

Abdullah-l · 2024-03-05T15:43:15Z

Using the rake task @splittingred wrote, Gruf.development? is true

However the issue still occurs when I used it with rails on a staging environment.

@leonid-shevtsov regarding adding logs, I didn't fully understand where to add them and what are we looking for 😅
is it here?

leonid-shevtsov · 2024-03-06T10:09:47Z

is it here?

yeah, here and also in lines 54-56; before the block (locking), at the beginning of the block (locked), and after it (unlocked); normally, you should not see a very long write locked period (that would be breaking concurrency indeed)

hmm... perhaps what's happening is the first request locks it for reading, and then the second wants to lock it for writing and hence has to wait until the first one completes? but this should only happen in a development environment.

could it be possible that you're running it in dev mode on staging? dev mode enables code reloading and possibly other things that you generally don't need or want outside of a dev environment.

and looks like the fallback environment is development:

gruf/lib/gruf/configuration.rb

Lines 205 to 222 in 2a583e5

    
           def development? 
        
             environment == 'development' 
        
           end 
        
           private 
        
           ## 
        
           # Automatically determine environment 
        
           # 
        
           # @return [String] The current Ruby environment 
        
           # 
        
           def environment 
        
             if defined?(::Rails) 
        
               ::Rails.env.to_s 
        
             else 
        
               ENV.fetch('RACK_ENV') { ENV.fetch('RAILS_ENV', 'development') }.to_s 
        
             end 
        
           end

Abdullah-l · 2024-07-19T19:10:07Z

Closing because I gave up on this 😅

splittingred added the cannot-reproduce label Feb 22, 2024

Abdullah-l closed this as completed Jul 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Did release 2.17.0 break call concurrency? #206

Did release 2.17.0 break call concurrency? #206

Abdullah-l commented Feb 22, 2024 •

edited

Loading

splittingred commented Feb 22, 2024

splittingred commented Feb 22, 2024

Abdullah-l commented Feb 22, 2024

splittingred commented Feb 22, 2024

Abdullah-l commented Feb 22, 2024

leonid-shevtsov commented Mar 5, 2024

Abdullah-l commented Mar 5, 2024

leonid-shevtsov commented Mar 6, 2024 •

edited

Loading

Abdullah-l commented Jul 19, 2024

Did release 2.17.0 break call concurrency? #206

Did release 2.17.0 break call concurrency? #206

Comments

Abdullah-l commented Feb 22, 2024 • edited Loading

Please describe the issue

Anything else we should know?

splittingred commented Feb 22, 2024

splittingred commented Feb 22, 2024

Abdullah-l commented Feb 22, 2024

splittingred commented Feb 22, 2024

Abdullah-l commented Feb 22, 2024

leonid-shevtsov commented Mar 5, 2024

Abdullah-l commented Mar 5, 2024

leonid-shevtsov commented Mar 6, 2024 • edited Loading

Abdullah-l commented Jul 19, 2024

Abdullah-l commented Feb 22, 2024 •

edited

Loading

leonid-shevtsov commented Mar 6, 2024 •

edited

Loading