10 Second Polling on Elastic Beanstalk causes API limit issues. #6870

Closed
tecnobrat opened this issue May 25, 2016 · 7 comments

@tecnobrat

In 0.6.16, when you update an Elastic Beanstalk environment, Terraform polls every 10 seconds while waiting for the environment to finish updating. If you have many environments that need updating at the same time, it ends up hitting API limits.

Can we get this as a configurable value?

@dharrisio
Contributor

I think that this would be a good thing to add as well. Any thoughts on the name for that attribute? Maybe something like wait_for_ready_poll_frequency?

@tecnobrat
Author

Sounds good. However, I wonder whether a more global setting would be better, since this could be needed for any number of resources. Perhaps someone like @catsby could comment on whether it should be a globally named variable that controls the polling interval, like in here.

@catsby
Contributor

catsby commented Jul 6, 2016

Hey Friends – I'm sorry to hear that you're hitting API limits here. Can you give me an idea how many Beanstalk Environments you're managing, and what kind of updates you're doing? I'm trying to see this in person with 20 environments but I'm not yet hitting API limits. Of course, I'm only testing Environments and working in a test environment, so this isn't exactly your situation(s).

Regardless, there are some ideas here.

The easiest and most isolated change is to increase the MinTimeout in both the create and update methods.

It's currently set at 10; by bumping it to 20 we can cut the API calls in half just for Beanstalk Environments.
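To make the "cut the API calls in half" claim concrete, here is a minimal sketch of a fixed-interval wait loop driven by a simulated clock. It is not Terraform's implementation: `wait_until_ready`, its parameters, and the 15-minute update duration are all assumptions chosen for illustration.

```python
def wait_until_ready(ready_after_s, poll_interval_s):
    """Count the status checks a fixed-interval poller makes.

    Uses a simulated clock: no real sleeping and no real API calls.
    One check is issued per interval until the (hypothetical)
    environment would report that it is ready.
    """
    elapsed = 0
    checks = 0
    while True:
        elapsed += poll_interval_s  # wait one interval
        checks += 1                 # then issue one "are you done yet" call
        if elapsed >= ready_after_s:
            return checks


update_duration_s = 15 * 60  # assumed duration of one environment update
calls_at_10s = wait_until_ready(update_duration_s, 10)
calls_at_20s = wait_until_ready(update_duration_s, 20)
print(calls_at_10s, calls_at_20s)
```

Under these assumptions the number of checks scales inversely with the interval, at the cost of noticing completion up to one interval later.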

The suggestion of a configuration for the polling timeouts is a request we've received before. We faced the same dilemma as you touched on here: is this a global timeout configuration (the aws provider), per resource type (aws_elasticbeanstalk_environment), or per resource (aws_elasticbeanstalk_environment.env-1)? In your case, maybe it's just the latter, but that requires implementation in each resource. In other cases, our users seem to want to change this in one place for their large infrastructure.

So, which option do you think would suit you best?
I opened #7523 as a band-aid for ElasticBeanstalk specifically, but as I mentioned I wasn't able to hit this limit myself so perhaps not.

@catsby catsby added the thinking label Jul 6, 2016
@tecnobrat
Author

Hey @catsby we have about 25 environments. The limit is hit more often when we do a mass update to all of them, for instance updating to the latest solution_stack_name.

But it's also compounded by deploys, which we do to the same environment throughout the day (that one has a 90 second delay between checks, however). Both of these things call DescribeEnvironments and DescribeEnvironmentConfigurations every X seconds, and that's where we hit the issue. It's not so much the update / create calls, but the read calls, which are the "are you done yet" calls.

Because the rate limit is global for an entire AWS account (I have confirmed this with AWS), when you have 20+ environments updating, each taking 15+ minutes (due to the deployment strategy we have on AWS), and every poll makes two API calls, it ends up making about 3000-4000 API calls.
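The estimate above can be reproduced with some quick arithmetic. This sketch plugs in the lower bounds mentioned in the thread (20 environments, 15 minutes, a 10 second interval, two read calls per poll); the function name is hypothetical and real numbers will vary with how long each update actually takes.

```python
import math


def estimated_read_calls(environments, update_minutes, poll_interval_s, calls_per_poll):
    """Rough count of "are you done yet" calls during a mass update."""
    polls_per_environment = math.ceil(update_minutes * 60 / poll_interval_s)
    return environments * polls_per_environment * calls_per_poll


# Figures taken from the comment above; all are approximate lower bounds.
baseline = estimated_read_calls(
    environments=20, update_minutes=15, poll_interval_s=10, calls_per_poll=2
)
# Same update with the polling interval doubled to 20 seconds.
doubled_interval = estimated_read_calls(
    environments=20, update_minutes=15, poll_interval_s=20, calls_per_poll=2
)
print(baseline, doubled_interval)
```

With these inputs the baseline falls inside the reported 3000-4000 range, and doubling the interval halves it.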

AWS uses a token bucket model for their API rate limits, so using this many calls upfront can cause issues with API requests even hours later in some cases, due to the way they have their token buckets configured.
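For readers unfamiliar with token buckets, here is a small simulation of the general technique. The capacity and refill rate are made-up numbers, not AWS's real configuration (which is not stated in this thread), and `TokenBucket` and `simulate` are hypothetical names.

```python
class TokenBucket:
    """Generic token bucket: each request spends one token."""

    def __init__(self, capacity, refill_per_s):
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self.tokens = float(capacity)  # the bucket starts full

    def advance(self, seconds):
        # Tokens trickle back over time, never exceeding the capacity.
        self.tokens = min(self.capacity, self.tokens + seconds * self.refill_per_s)

    def try_request(self):
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a real API would answer with a throttling error


def simulate(bucket, requests_per_s, duration_s):
    """Issue requests_per_s requests every second; return (accepted, throttled)."""
    accepted = throttled = 0
    for _ in range(duration_s):
        bucket.advance(1)
        for _ in range(requests_per_s):
            if bucket.try_request():
                accepted += 1
            else:
                throttled += 1
    return accepted, throttled


# Hypothetical limits: 100-token burst capacity, 1 token refilled per second.
steady = simulate(TokenBucket(100, 1), requests_per_s=1, duration_s=60)
burst = simulate(TokenBucket(100, 1), requests_per_s=4, duration_s=60)
print(steady, burst)
```

In this model a client that stays at or below the refill rate is never throttled, while a burst drains the bucket and every later caller on the same account is limited to the refill rate until it recovers. The slower the refill relative to the capacity, the longer that recovery takes.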

@tecnobrat
Author

@catsby so this is a little out of context with this issue, and I'm happy to open another issue to track this specific issue.

However, it appears that Atlas / Terraform makes a query that looks like this:

{"eventVersion":"1.04","userIdentity": {"type":"IAMUser","principalId":"--REMOVED--","arn":"arn:aws:iam::--REMOVED--:user/atlas","accountId":"--REMOVED--","accessKeyId":"--REMOVED--","userName":"atlas"},"eventTime":"2016-07-15T17:25:21Z","eventSource":"elasticbeanstalk.amazonaws.com","eventName":"DescribeConfigurationSettings","awsRegion":"us-east-1","sourceIPAddress":"--REMOVED--","userAgent":"terraform/0.6.16 () aws-sdk-go/1.1.23 (go1.6.2; linux; amd64)","requestParameters":{"environmentName":"tf-cog","applicationName":"tf-cog"},"responseElements":null,"requestID":"1b761982-4ab1-11e6-b00b-cf3e6fb0a28d","eventID":"d8351152-cff5-4533-96ec-7fbd6232305e","eventType":"AwsApiCall","recipientAccountId":"--REMOVED--"},
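To see which calls dominate, records like the one above can be tallied by `eventName`. The sketch below assumes the records have already been loaded into a list of dictionaries; the sample data is invented and trimmed to the three fields used (`userAgent`, `eventSource`, `eventName`), and `count_terraform_calls` is a hypothetical helper.

```python
from collections import Counter


def count_terraform_calls(records):
    """Tally Elastic Beanstalk API calls made by Terraform, keyed by eventName."""
    counts = Counter()
    for record in records:
        made_by_terraform = record.get("userAgent", "").startswith("terraform/")
        is_beanstalk = record.get("eventSource") == "elasticbeanstalk.amazonaws.com"
        if made_by_terraform and is_beanstalk:
            counts[record["eventName"]] += 1
    return counts


# Invented sample records for illustration only.
TERRAFORM_UA = "terraform/0.6.16 () aws-sdk-go/1.1.23 (go1.6.2; linux; amd64)"
sample_records = [
    {"userAgent": TERRAFORM_UA, "eventSource": "elasticbeanstalk.amazonaws.com",
     "eventName": "DescribeConfigurationSettings"},
    {"userAgent": TERRAFORM_UA, "eventSource": "elasticbeanstalk.amazonaws.com",
     "eventName": "DescribeEnvironments"},
    {"userAgent": TERRAFORM_UA, "eventSource": "elasticbeanstalk.amazonaws.com",
     "eventName": "DescribeConfigurationSettings"},
    {"userAgent": "console.amazonaws.com", "eventSource": "elasticbeanstalk.amazonaws.com",
     "eventName": "DescribeEnvironments"},
]

counts = count_terraform_calls(sample_records)
for name, total in counts.most_common():
    print(name, total)
```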

This appears to be during the "refresh state" phase. This is the DescribeConfigurationSettings call, but it also does the same for DescribeApplications, DescribeEnvironments and DescribeEnvironmentResources.

Only DescribeConfigurationSettings has a required attribute (applicationName), which means for DescribeApplications, DescribeEnvironments and DescribeEnvironmentResources you can do a SINGLE API call and get the entire state of the AWS account returned. This would reduce the number of API calls pretty significantly.
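As a rough illustration of the saving being proposed, the sketch below compares one filtered call per environment with a single unfiltered call. `FakeBeanstalkClient` is a hypothetical stand-in that only counts calls; its `describe_environments` method and the `Environments`, `EnvironmentName` and `Status` response keys are modeled on the DescribeEnvironments API, but this is not how Terraform reads state, and pagination is ignored.

```python
class FakeBeanstalkClient:
    """Hypothetical stand-in for an Elastic Beanstalk client; counts calls."""

    def __init__(self, environments):
        self._environments = environments
        self.calls = 0

    def describe_environments(self, EnvironmentNames=None):
        self.calls += 1
        envs = self._environments
        if EnvironmentNames is not None:
            envs = [e for e in envs if e["EnvironmentName"] in EnvironmentNames]
        return {"Environments": envs}


def statuses_one_call_each(client, names):
    """One filtered request per environment."""
    result = {}
    for name in names:
        response = client.describe_environments(EnvironmentNames=[name])
        result[name] = response["Environments"][0]["Status"]
    return result


def statuses_single_call(client, names):
    """One unfiltered request, then filter locally."""
    wanted = set(names)
    response = client.describe_environments()
    return {
        e["EnvironmentName"]: e["Status"]
        for e in response["Environments"]
        if e["EnvironmentName"] in wanted
    }


names = [f"env-{i}" for i in range(20)]
environments = [{"EnvironmentName": n, "Status": "Ready"} for n in names]

per_env_client = FakeBeanstalkClient(environments)
batched_client = FakeBeanstalkClient(environments)
per_env = statuses_one_call_each(per_env_client, names)
batched = statuses_single_call(batched_client, names)
print(per_env_client.calls, batched_client.calls)
```

Both approaches return the same statuses; only the request count differs.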

We have 20ish environments, and we open PRs against our Terraform repo very often. This causes a bunch of wasted API calls.

@catsby
Contributor

catsby commented Jul 15, 2016

Hey @tecnobrat thanks for the info. I created a beanstalk app and environment to test this. If I manually remove the app from state to just refresh the environment, I found these calls:

$ cat refresh_beanstalk.txt | grep "Request elasticbeanstalk/Describe"
2016/07/15 14:52:36 [DEBUG] plugin: terraform: aws-provider (internal) 2016/07/15 14:52:36 [DEBUG] [aws-sdk-go] DEBUG: Request elasticbeanstalk/DescribeEnvironments Details:
2016/07/15 14:52:37 [DEBUG] plugin: terraform: aws-provider (internal) 2016/07/15 14:52:37 [DEBUG] [aws-sdk-go] DEBUG: Request elasticbeanstalk/DescribeEnvironmentResources Details:
2016/07/15 14:52:38 [DEBUG] plugin: terraform: aws-provider (internal) 2016/07/15 14:52:38 [DEBUG] [aws-sdk-go] DEBUG: Request elasticbeanstalk/DescribeConfigurationSettings Details:

That is what is required for one Environment managed by Terraform.

> Which means for DescribeApplications, DescribeEnvironments and DescribeEnvironmentResources you can do a SINGLE API call and get the entire state of the AWS account returned. This would reduce the number of API calls pretty significantly.

Terraform handles each resource in parallel, as many as it can. In this case, unless one environment depends on another, each environment will be run in parallel (up to a point, after which the others queue and wait their turn).

Each resource being managed has no knowledge of any other resource, so at the time of writing there is no means of batching these kinds of API calls. In this example, the core of Terraform knows that it needs to create n of this thing. Terraform core itself doesn't know what APIs are involved; it simply adds 3 of them to the graph and they in turn get executed in parallel, so each resource is going to make those API calls.
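The behaviour described above can be modeled with a bounded worker pool in which every resource issues its own refresh calls. This is a toy model, not Terraform's graph walker: the function and class names are hypothetical, the parallelism of 10 is an assumed worker limit, and the three operation names are the ones from the debug log earlier in this comment.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# The three requests seen in the debug log for one environment refresh.
REFRESH_CALLS = (
    "DescribeEnvironments",
    "DescribeEnvironmentResources",
    "DescribeConfigurationSettings",
)


class CallLog:
    """Thread-safe record of simulated API calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self.calls = []

    def record(self, environment, operation):
        with self._lock:
            self.calls.append((environment, operation))


def refresh_environment(environment, log):
    # Each resource only knows about itself, so it makes every call itself.
    for operation in REFRESH_CALLS:
        log.record(environment, operation)


def refresh_all(environments, parallelism=10):
    log = CallLog()
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # Resources beyond the worker limit queue and wait their turn.
        list(pool.map(lambda env: refresh_environment(env, log), environments))
    return log.calls


calls = refresh_all([f"env-{i}" for i in range(20)])
print(len(calls))
```

Because no worker shares results with another, the total is always the number of environments times the calls per refresh, regardless of the parallelism setting.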

Aggregating these API calls would be ideal here, but it goes against the parallel and isolated nature of Terraform, and would be a non-trivial architectural change.

I have a PR open (#7523) that will allow you to adjust the polling interval for Elastic Beanstalk. In the future, we'd like to expose this kind of control for all resources, but this is the first step I can take toward helping you out here.

Please let me know if that doesn't address any question you may have. Thanks!

@ghost

ghost commented Apr 10, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 10, 2020