Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS IAM auth method doesn't retry sts:GetCallerIdentity request if it fails #8721

Closed
daveadams opened this issue Apr 10, 2020 · 1 comment · Fixed by #8727
Closed

AWS IAM auth method doesn't retry sts:GetCallerIdentity request if it fails #8721

daveadams opened this issue Apr 10, 2020 · 1 comment · Fixed by #8727
Assignees
Labels
auth/aws bug Used to indicate a potential bug

Comments

@daveadams
Copy link
Contributor

We are seeing failures like this happen on a small percentage of AWS IAM login attempts:

Code: 400. Errors:
  * error making upstream request: error making request:
    Post https://sts.amazonaws.com//:
    net/http: TLS handshake timeout

An immediate retry is successful, and this is an intermittent issue. Obviously the problem is between Vault (which is running in an EC2 instance) and sts.amazonaws.com. I know that behind the scenes, Vault takes a signed sts:GetCallerIdentity request and submits it to STS on behalf of the user in order to validate that the user has the necessary IAM credentials. So generally we would have expected SDK retries to cover any temporal failures with contacting the AWS APIs.

However, looking at the code (https://github.com/hashicorp/vault/blob/master/builtin/credential/aws/path_login.go#L1558), it appears that the SDK is not being used to make that final request. And I don't see any logic in path_login.go nor in the go-cleanhttp library to implement any retries.

Obviously we can implement retry logic on our end, but this seems like something Vault should be handling since it's talking directly to the AWS APIs. Thanks!

  • Vault Server Version: 1.3.4
  • Vault CLI Version: 1.3.4
  • Server Operating System/Architecture: Linux - Ubuntu 18.04
@tyrannosaurus-becks tyrannosaurus-becks added auth/aws bug Used to indicate a potential bug labels Apr 10, 2020
@tyrannosaurus-becks
Copy link
Contributor

Ah yes, thanks for bringing this to our attention, it does seem like it would be a fairly simple fix to put that call inside a retry loop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auth/aws bug Used to indicate a potential bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants