Concurrency issues in 1.3.0+ #315

Closed
lomeroe opened this issue May 2, 2017 · 3 comments

lomeroe commented May 2, 2017

I'm seeing concurrency issues in 1.3.0+ that do not occur with 1.2.0. They seem to be related to the "wait_until_volumes_ready" function (or at least that function is where things stall).

The instances get created and then the process hangs "waiting for volumes to be ready". This continues until the timeout is reached, at which point the instances are deleted. Checking the EC2 instances manually (via the console, etc.) shows they are alive and ready.

Waited 350/600s for instance <i-xxx1> volumes to be ready.
<snip>
Ran out of time waiting for the server with id [i-xxx1] volumes to be ready, attempting to destroy it
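
For context, "waiting for volumes to be ready" boils down to repeatedly calling describe_volumes for the new instance until every attachment reports ready, giving up at the timeout shown above. A rough sketch of that kind of poll, assuming a plain aws-sdk v2 Aws::EC2::Client rather than kitchen-ec2's exact code:

    # Rough sketch of a volume-readiness poll (illustrative, not kitchen-ec2's
    # actual implementation). ec2 is assumed to be an Aws::EC2::Client and
    # instance_id the id of the newly created instance.
    def volumes_ready?(ec2, instance_id)
      volumes = ec2.describe_volumes(
        filters: [{ name: "attachment.instance-id", values: [instance_id] }]
      ).volumes
      # The check can only pass if the filtered query actually finds the
      # instance's volumes and every attachment reports "attached".
      !volumes.empty? &&
        volumes.all? { |v| v.attachments.all? { |a| a.state == "attached" } }
    end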

For example, if I start with a concurrency of 5, 2 may work and 3 may hang waiting for volumes (I've gone down to a concurrency of 2 and the problem remains). It only seems to happen with the initial set of EC2 instances created. For example, say I have 10 platform/suite combinations with a concurrency of 5: some number of the first 5 will fail, but the remaining 5 will all start and run successfully. I'm guessing that whatever race condition is occurring is masked by the slight delays introduced by terminating the "failed" instances and starting up the next ones.

Is anyone else seeing this?

lomeroe commented May 3, 2017

This seems to be due to my use of multiple regions. Adding a second describe_volumes call with no filter inside wait_until_volumes_ready returns all of the volumes from the wrong region, which suggests the cached EC2 client ends up scoped to a region other than the one configured for the suite.
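
A quick way to see which region the cached client is actually talking to (a debugging sketch only; ec2_client here stands for the underlying Aws::EC2::Client, not the driver's own wrapper):

    # Debugging sketch: an unfiltered describe_volumes lists volumes from
    # whatever region the client is currently scoped to.
    resp = ec2_client.describe_volumes
    puts resp.volumes.map(&:availability_zone).uniq
    # If the availability zones printed belong to a different region than
    # config[:region], the per-instance poll in wait_until_volumes_ready never
    # finds the new instance's volumes and the wait times out.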

I can work around it by always recreating the ec2 client (there is probably a better way to do it than just blindly re-instantiating the object each time):

      # Workaround: build a fresh client with the region from config[:region]
      # on every call, instead of reusing a cached one.
      def ec2
        @ec2 = Aws::Client.new(
          config[:region],
          config[:shared_credentials_profile],
          config[:aws_access_key_id],
          config[:aws_secret_access_key],
          config[:aws_session_token],
          config[:http_proxy],
          config[:retry_limit],
          config[:ssl_verify_peer]
        )
      end

naunga commented Sep 21, 2017

I have seen this behavior as well. It seems to occur when I am trying to test an instance created from an instance-store backed AMI.

If I use an EBS-backed AMI there are no problems.

@cheeseplus
Contributor

This should also be fixed by #343, at least for the root issue being seen here; there are other concurrency demons, but they are already tracked elsewhere.
