Web Identity Token / EKS IAM Role Service Account (IRSA) support #112

ahmad-hamade · 2020-11-20T02:00:01Z

I was able to configure and run the provider successfully from my local machine, but running the same configuration from the CI server returns health check timeout: no Elasticsearch node available.

I ran a simple Python script from the CI server that connects to ES and uploads a document, and it works just fine, which rules out any issue with the ES IAM role policy or security group rules.

Any idea what could be the issue here?

phillbaker · 2020-11-20T12:30:00Z

Can you please share the provider configuration and resource configuration? If you're using AWS auth, please also share any relevant config files on the machine in question.

ahmad-hamade · 2020-11-20T13:24:15Z

Hi @phillbaker,

I'm using TF0.13.

terraform {
  required_version = ">= 0.13"
  required_providers {
    elasticsearch = {
      source  = "phillbaker/elasticsearch"
      version = ">= 1.5"
    }
  }
}

Provider:

provider "elasticsearch" {
  url = var.es_endpoint
}

resource "elasticsearch_index_template" "template" {
  count = var.index_template != null ? 1 : 0
  name  = var.index_template.name
  body  = var.index_template.body
}

TF_Vars:

es_endpoint = format("https://%s", module.es.outputs.elasticsearch_endpoint)

index_template = {
  name = "logging_template"
  body = templatefile("index.json",
    {
      number_of_replicas = 0
      index_patterns     = format("%s-*", local.environment_name)
  })
}

I don't specify any AWS auth in my terraform AWS provider, as I'm using saml2aws to assume a role and log in to AWS using MFA.

As mentioned previously, everything works just fine on my laptop, but on the CI server I get health check timeout: no Elasticsearch node available.

PS. My CI is running in a Pod inside the EKS cluster and I'm using IRSA for authentication to access AWS resources.

All my other modules are working just fine inside the Pod except the module that has phillbaker/elasticsearch.
I tested connecting to the AWS ES instance from inside the Pod and I was able to upload dummy docs to a default index so my connectivity and IAM roles are not an issue.

phillbaker · 2020-11-20T23:57:40Z

> As mentioned previously, everything works just fine on my laptop but in the CI server getting health check timeout

So it seems like a networking or permissions issue in the CI environment 😀. It may be related to #89.

> I tested connecting to the AWS ES instance from inside the Pod and I was able to upload dummy docs to a default index so my connectivity and IAM roles are not an issue.

How did you do this test? If you used curl, can you share the command?

Did you try setting healthcheck to false in the provider?

ahmad-hamade · 2020-11-21T23:04:14Z

If I set healthcheck to false, I get exactly the behavior described in #89: terraform plan/apply waits forever.

I tested connectivity from inside the CI pod by running the script below (which also returns the document I added), and it works just fine:

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3

host = 'MY_ES_URL'
region = 'eu-west-1'

service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

es = Elasticsearch(
    hosts = [{'host': host, 'port': 443}],
    http_auth = awsauth,
    use_ssl = True,
    verify_certs = True,
    connection_class = RequestsHttpConnection
)

document = {
    "title": "testing_tf"
}

es.index(index="dummy", doc_type="doc", id="2", body=document)

print(es.get(index="dummy", doc_type="doc", id="2"))

So I don't think it's related to network connectivity or permissions.

phillbaker · 2020-11-25T17:35:43Z

In order to narrow down the issue, can you try the following steps:

hardcode the ES url in the provider block, e.g.:

provider "elasticsearch" {
  url = "https://....:9200"
}

explicitly pass the AWS access key, secret key, and token (and the region if necessary) via the provider block: https://registry.terraform.io/providers/phillbaker/elasticsearch/latest/docs#aws_access_key, similar to what was generated in your script
run terraform with debug logs (Terraform writes them to stderr, hence the redirect): TF_LOG=DEBUG terraform apply 2>&1 | grep provider-elasticsearch
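
If grep is too coarse, the same filtering can be sketched in Python. This is a minimal sketch that assumes the debug output has already been captured as text; filter_provider_log is a hypothetical helper name, and keeping [ERROR] lines alongside the provider's own lines is an assumption about what is useful to post here.

```python
def filter_provider_log(log_text, needle="provider-elasticsearch"):
    """Keep lines that mention the provider plugin or carry an [ERROR] level."""
    kept = []
    for line in log_text.splitlines():
        # Provider plugin output carries the plugin binary name in its prefix;
        # Terraform core errors are tagged with [ERROR].
        if needle in line or "[ERROR]" in line:
            kept.append(line)
    return kept
```

For example, capture the run with TF_LOG=DEBUG terraform apply 2> tf.log and pass the contents of tf.log to the function.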

My guess is that there are slight differences in how this provider handles AWS authentication versus the terraform AWS provider.

ahmad-hamade · 2020-11-29T12:17:36Z

Thanks, @phillbaker, for your feedback.

I ran aws sts assume-role --role-arn <MY_CI_ROLE> --role-session-name default, exported the resulting AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables in my pod, and terraform plan worked just fine.

I've noticed you've added a new provider variable aws_assume_role_arn in release 1.5.0 so I tried the following configuration with no luck:

provider "elasticsearch" {
  url = "https://vpc-<MASKED>.eu-west-1.es.amazonaws.com"

  aws_region          = "eu-west-1"
  aws_assume_role_arn = "arn:aws:iam::******:role/*******"
}

Here you can find the terraform logs:

2020-11-29T11:47:29.263Z [INFO]  plugin: configuring client automatic mTLS
2020-11-29T11:47:29.292Z [DEBUG] plugin: starting plugin: path=.terraform/plugins/registry.terraform.io/hashicorp/aws/3.18.0/linux_amd64/terraform-provider-aws_v3.18.0_x5 args=[.terraform/plugins/registry.terraform.io/hashicorp/aws/3.18.0/linux_amd64/terraform-provider-aws_v3.18.0_x5]
2020-11-29T11:47:29.293Z [DEBUG] plugin: plugin started: path=.terraform/plugins/registry.terraform.io/hashicorp/aws/3.18.0/linux_amd64/terraform-provider-aws_v3.18.0_x5 pid=8441
2020-11-29T11:47:29.293Z [DEBUG] plugin: waiting for RPC address: path=.terraform/plugins/registry.terraform.io/hashicorp/aws/3.18.0/linux_amd64/terraform-provider-aws_v3.18.0_x5
2020-11-29T11:47:29.326Z [INFO]  plugin.terraform-provider-aws_v3.18.0_x5: configuring server automatic mTLS: timestamp=2020-11-29T11:47:29.326Z
2020-11-29T11:47:29.359Z [DEBUG] plugin.terraform-provider-aws_v3.18.0_x5: plugin address: address=/tmp/plugin770720847 network=unix timestamp=2020-11-29T11:47:29.359Z
2020-11-29T11:47:29.359Z [DEBUG] plugin: using plugin: version=5
2020-11-29T11:47:29.509Z [WARN]  plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = transport is closing"
2020-11-29T11:47:29.512Z [DEBUG] plugin: plugin process exited: path=.terraform/plugins/registry.terraform.io/hashicorp/aws/3.18.0/linux_amd64/terraform-provider-aws_v3.18.0_x5 pid=8441
2020-11-29T11:47:29.512Z [DEBUG] plugin: plugin exited
2020-11-29T11:47:29.512Z [INFO]  plugin: configuring client automatic mTLS
2020-11-29T11:47:29.541Z [DEBUG] plugin: starting plugin: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0 args=[.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0]
2020-11-29T11:47:29.543Z [DEBUG] plugin: plugin started: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0 pid=8450
2020-11-29T11:47:29.543Z [DEBUG] plugin: waiting for RPC address: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0
2020-11-29T11:47:29.551Z [INFO]  plugin.terraform-provider-elasticsearch_v1.5.0: configuring server automatic mTLS: timestamp=2020-11-29T11:47:29.550Z
2020-11-29T11:47:29.580Z [DEBUG] plugin.terraform-provider-elasticsearch_v1.5.0: plugin address: address=/tmp/plugin657421152 network=unix timestamp=2020-11-29T11:47:29.580Z
2020-11-29T11:47:29.580Z [DEBUG] plugin: using plugin: version=5
2020-11-29T11:47:29.644Z [WARN]  plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unimplemented desc = unknown service plugin.GRPCStdio"
2020-11-29T11:47:29.647Z [DEBUG] plugin: plugin process exited: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0 pid=8450
2020-11-29T11:47:29.647Z [DEBUG] plugin: plugin exited
2020/11/29 11:47:29 [INFO] terraform: building graph: GraphTypeValidate
2020/11/29 11:47:29 [DEBUG] ProviderTransformer: "elasticsearch_index_template.template" (*terraform.NodeValidatableResource) needs provider["registry.terraform.io/phillbaker/elasticsearch"]
2020/11/29 11:47:29 [DEBUG] ProviderTransformer: "elasticsearch_opendistro_ism_policy.ism" (*terraform.NodeValidatableResource) needs provider["registry.terraform.io/phillbaker/elasticsearch"]
2020/11/29 11:47:29 [DEBUG] pruning unused provider["registry.terraform.io/hashicorp/aws"]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "elasticsearch_opendistro_ism_policy.ism" references: [var.ism_template var.ism_template var.ism_template]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "output.es_ism_policy_id (expand)" references: [elasticsearch_opendistro_ism_policy.ism]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "var.es_endpoint" references: []
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "var.ism_template" references: []
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "var.index_template" references: []
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "elasticsearch_index_template.template" references: [var.index_template var.index_template var.index_template]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "output.es_index_template_id (expand)" references: [elasticsearch_index_template.template]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "provider[\"registry.terraform.io/phillbaker/elasticsearch\"]" references: []
2020/11/29 11:47:29 [DEBUG] Starting graph walk: walkValidate
2020-11-29T11:47:29.650Z [INFO]  plugin: configuring client automatic mTLS
2020-11-29T11:47:29.679Z [DEBUG] plugin: starting plugin: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0 args=[.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0]
2020-11-29T11:47:29.680Z [DEBUG] plugin: plugin started: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0 pid=8459
2020-11-29T11:47:29.680Z [DEBUG] plugin: waiting for RPC address: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0
2020-11-29T11:47:29.689Z [INFO]  plugin.terraform-provider-elasticsearch_v1.5.0: configuring server automatic mTLS: timestamp=2020-11-29T11:47:29.689Z
2020-11-29T11:47:29.718Z [DEBUG] plugin.terraform-provider-elasticsearch_v1.5.0: plugin address: address=/tmp/plugin875629567 network=unix timestamp=2020-11-29T11:47:29.718Z
2020-11-29T11:47:29.718Z [DEBUG] plugin: using plugin: version=5
2020-11-29T11:47:29.780Z [WARN]  plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unimplemented desc = unknown service plugin.GRPCStdio"
2020-11-29T11:47:29.785Z [DEBUG] plugin: plugin process exited: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0 pid=8459
2020-11-29T11:47:29.785Z [DEBUG] plugin: plugin exited
2020/11/29 11:47:29 [INFO] backend/local: apply calling Refresh
2020/11/29 11:47:29 [INFO] terraform: building graph: GraphTypeRefresh
2020/11/29 11:47:29 [DEBUG] pruning unused provider["registry.terraform.io/phillbaker/elasticsearch"]
2020/11/29 11:47:29 [DEBUG] pruning unused provider["registry.terraform.io/hashicorp/aws"]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "var.es_endpoint" references: []
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "var.ism_template" references: []
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "var.index_template" references: []
2020/11/29 11:47:29 [WARN] ReferenceTransformer: reference not found: "elasticsearch_index_template.template"
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "output.es_index_template_id (expand)" references: []
2020/11/29 11:47:29 [WARN] ReferenceTransformer: reference not found: "elasticsearch_opendistro_ism_policy.ism"
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "output.es_ism_policy_id (expand)" references: []
2020/11/29 11:47:29 [DEBUG] Starting graph walk: walkRefresh
2020/11/29 11:47:29 [INFO] backend/local: apply calling Plan
2020/11/29 11:47:29 [INFO] terraform: building graph: GraphTypePlan
2020/11/29 11:47:29 [DEBUG] ProviderTransformer: "elasticsearch_opendistro_ism_policy.ism (expand)" (*terraform.nodeExpandPlannableResource) needs provider["registry.terraform.io/phillbaker/elasticsearch"]
2020/11/29 11:47:29 [DEBUG] ProviderTransformer: "elasticsearch_index_template.template (expand)" (*terraform.nodeExpandPlannableResource) needs provider["registry.terraform.io/phillbaker/elasticsearch"]
2020/11/29 11:47:29 [DEBUG] pruning unused provider["registry.terraform.io/hashicorp/aws"]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "output.es_index_template_id (expand)" references: [elasticsearch_index_template.template (expand)]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "output.es_ism_policy_id (expand)" references: [elasticsearch_opendistro_ism_policy.ism (expand)]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "var.index_template" references: []
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "elasticsearch_index_template.template (expand)" references: [var.index_template var.index_template var.index_template]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "elasticsearch_opendistro_ism_policy.ism (expand)" references: [var.ism_template var.ism_template var.ism_template]
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "var.es_endpoint" references: []
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "var.ism_template" references: []
2020/11/29 11:47:29 [DEBUG] ReferenceTransformer: "provider[\"registry.terraform.io/phillbaker/elasticsearch\"]" references: []
2020/11/29 11:47:29 [DEBUG] Starting graph walk: walkPlan
2020-11-29T11:47:29.788Z [INFO]  plugin: configuring client automatic mTLS
2020-11-29T11:47:29.818Z [DEBUG] plugin: starting plugin: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0 args=[.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0]
2020-11-29T11:47:29.818Z [DEBUG] plugin: plugin started: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0 pid=8469
2020-11-29T11:47:29.818Z [DEBUG] plugin: waiting for RPC address: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0
2020-11-29T11:47:29.827Z [INFO]  plugin.terraform-provider-elasticsearch_v1.5.0: configuring server automatic mTLS: timestamp=2020-11-29T11:47:29.827Z
2020-11-29T11:47:29.857Z [DEBUG] plugin.terraform-provider-elasticsearch_v1.5.0: plugin address: address=/tmp/plugin184895428 network=unix timestamp=2020-11-29T11:47:29.857Z
2020-11-29T11:47:29.857Z [DEBUG] plugin: using plugin: version=5
2020-11-29T11:47:29.919Z [WARN]  plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unimplemented desc = unknown service plugin.GRPCStdio"
2020-11-29T11:47:29.921Z [DEBUG] plugin.terraform-provider-elasticsearch_v1.5.0: 2020/11/29 11:47:29 [INFO] Using AWS: eu-west-1
2020/11/29 11:47:35 [ERROR] eval: *terraform.EvalConfigProvider, err: health check timeout: Head "https://<<<MASKED>>>.eu-west-1.es.amazonaws.com": RequestCanceled: request context canceled
caused by: context deadline exceeded: no Elasticsearch node available
2020/11/29 11:47:35 [ERROR] eval: *terraform.EvalSequence, err: health check timeout: Head "https://<<<MASKED>>>.eu-west-1.es.amazonaws.com": RequestCanceled: request context canceled
caused by: context deadline exceeded: no Elasticsearch node available
2020/11/29 11:47:35 [ERROR] eval: *terraform.EvalOpFilter, err: health check timeout: Head "https://<<<MASKED>>>.eu-west-1.es.amazonaws.com": RequestCanceled: request context canceled
caused by: context deadline exceeded: no Elasticsearch node available
2020/11/29 11:47:35 [ERROR] eval: *terraform.EvalSequence, err: health check timeout: Head "https://<<<MASKED>>>.eu-west-1.es.amazonaws.com": RequestCanceled: request context canceled
caused by: context deadline exceeded: no Elasticsearch node available

2020/11/29 11:47:35 [DEBUG] [aws-sdk-go] DEBUG: Request dynamodb/GetItem Details:
---[ REQUEST POST-SIGN ]-----------------------------
POST / HTTP/1.1
Host: dynamodb.eu-west-1.amazonaws.com
User-Agent: aws-sdk-go/1.31.9 (go1.14.7; linux; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.13.5
Content-Length: 241
Accept-Encoding: identity
Authorization: <<<MASKED>>>
Content-Type: application/x-amz-json-1.0
X-Amz-Date: 20201129T114735Z
X-Amz-Security-Token: <<<MASKED>>>
X-Amz-Target: DynamoDB_20120810.GetItem

{"ConsistentRead":true,"Key":{"LockID":{"S":"804335263071-terraform-state/non-prod/dev/base-infra/logging/logging-es-config/es-config-logs/terraform.tfstate"}},"ProjectionExpression":"LockID, Info","TableName":"804335263071-terraform-locks"}
-----------------------------------------------------
Error: health check timeout: Head "https://<<<MASKED>>>.eu-west-1.es.amazonaws.com": RequestCanceled: request context canceled
caused by: context deadline exceeded: no Elasticsearch node available

  on main.tf line 1, in provider "elasticsearch":
   1: provider "elasticsearch" {


2020/11/29 11:47:35 [DEBUG] [aws-sdk-go] DEBUG: Response dynamodb/GetItem Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 200 OK
Connection: close
Content-Length: 494
Content-Type: application/x-amz-json-1.0
Date: Sun, 29 Nov 2020 11:47:35 GMT
Server: Server
X-Amz-Crc32: 1092377990
X-Amzn-Requestid: <<<MASKED>>>


-----------------------------------------------------
2020/11/29 11:47:35 [DEBUG] [aws-sdk-go] {"Item":{"LockID":{"S":"804335263071-terraform-state/non-prod/dev/base-infra/logging/logging-es-config/es-config-logs/terraform.tfstate"},"Info":{"S":"{\"ID\":\"0fcc478c-63bd-e52e-cbd7-ad64b0623ab1\",\"Operation\":\"OperationTypeApply\",\"Info\":\"\",\"Who\":\"runner@runner-infra-kfg25-f62sj\",\"Version\":\"0.13.5\",\"Created\":\"2020-11-29T11:47:29.116770918Z\",\"Path\":\"804335263071-terraform-state/non-prod/dev/base-infra/logging/logging-es-config/es-config-logs/terraform.tfstate\"}"}}}
2020/11/29 11:47:35 [DEBUG] [aws-sdk-go] DEBUG: Request dynamodb/DeleteItem Details:
---[ REQUEST POST-SIGN ]-----------------------------
POST / HTTP/1.1
Host: dynamodb.eu-west-1.amazonaws.com
User-Agent: aws-sdk-go/1.31.9 (go1.14.7; linux; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.13.5
Content-Length: 181
Accept-Encoding: identity
Authorization: <<<MASKED>>>
Content-Type: application/x-amz-json-1.0
X-Amz-Date: 20201129T114735Z
X-Amz-Security-Token: <<<MASKED>>>
X-Amz-Target: DynamoDB_20120810.DeleteItem

{"Key":{"LockID":{"S":"804335263071-terraform-state/non-prod/dev/base-infra/logging/logging-es-config/es-config-logs/terraform.tfstate"}},"TableName":"804335263071-terraform-locks"}
-----------------------------------------------------
2020/11/29 11:47:36 [DEBUG] [aws-sdk-go] DEBUG: Response dynamodb/DeleteItem Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 200 OK
Connection: close
Content-Length: 2
Content-Type: application/x-amz-json-1.0
Date: Sun, 29 Nov 2020 11:47:36 GMT
Server: Server
X-Amz-Crc32: 2745614147
X-Amzn-Requestid: <<<MASKED>>>


-----------------------------------------------------
2020/11/29 11:47:36 [DEBUG] [aws-sdk-go] {}
2020-11-29T11:47:36.022Z [DEBUG] plugin: plugin process exited: path=.terraform/plugins/registry.terraform.io/phillbaker/elasticsearch/1.5.0/linux_amd64/terraform-provider-elasticsearch_v1.5.0 pid=8469
2020-11-29T11:47:36.022Z [DEBUG] plugin: plugin exited

ahmad-hamade · 2020-11-29T12:31:45Z

As a workaround, I have to run the following script before terraform, which resolves the issue:

aws sts assume-role-with-web-identity \
 --role-arn $AWS_ROLE_ARN \
 --role-session-name tmp_es \
 --web-identity-token file://$AWS_WEB_IDENTITY_TOKEN_FILE \
 --duration-seconds 1000 > /tmp/irp-cred.txt
export AWS_ACCESS_KEY_ID="$(cat /tmp/irp-cred.txt | jq -r ".Credentials.AccessKeyId")"
export AWS_SECRET_ACCESS_KEY="$(cat /tmp/irp-cred.txt | jq -r ".Credentials.SecretAccessKey")"
export AWS_SESSION_TOKEN="$(cat /tmp/irp-cred.txt | jq -r ".Credentials.SessionToken")"
rm /tmp/irp-cred.txt

terraform plan
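
The jq steps above can also be done without writing the credentials to a temporary file. A minimal Python sketch, assuming the JSON printed by aws sts assume-role-with-web-identity is available as a string; sts_output_to_exports is a hypothetical helper name, and the field names are the ones the jq expressions above already rely on.

```python
import json
import shlex


def sts_output_to_exports(sts_json):
    """Turn the STS JSON response into shell `export` lines."""
    creds = json.loads(sts_json)["Credentials"]
    mapping = {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }
    # shlex.quote keeps the values safe to paste into or eval in a shell.
    return ["export {}={}".format(name, shlex.quote(value))
            for name, value in mapping.items()]
```

The returned lines can be printed and evaluated by the shell that later runs terraform plan.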

phillbaker · 2020-11-30T03:44:12Z

Looks very similar to the workaround suggested in this comment: hashicorp/terraform#22992 (comment).

That issue was closed by hashicorp/aws-sdk-go-base#33 and hashicorp/terraform#25134 and references aws/aws-sdk-go#3101.

I'll have to review the changes there; for now I would suggest the workaround posted above.

phillbaker · 2020-11-30T04:18:21Z

@ahmad-hamade can you confirm that you've set the following environment variables in your pod:

AWS_WEB_IDENTITY_TOKEN_FILE
AWS_SDK_LOAD_CONFIG=1

ahmad-hamade · 2020-11-30T11:21:54Z

@phillbaker yes, the variables below exist and are added by default since I'm using IAM Roles for Service Accounts (IRSA) in EKS. AWS_SDK_LOAD_CONFIG is the only one that is not set.

AWS_ROLE_ARN=arn:aws:iam::MASKED:role/infra-role
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token

ahmad-hamade · 2020-11-30T11:32:04Z

Do you suggest that I export AWS_SDK_LOAD_CONFIG and try again?

phillbaker · 2020-12-01T01:16:09Z

Yes, please try it!

ahmad-hamade · 2020-12-04T05:22:02Z

Per your suggestion and the reported issue aws/aws-sdk-go#2828, exporting AWS_SDK_LOAD_CONFIG should solve the issue by making awssession.NewSessionWithOptions read from web_identity_provider.

Unfortunately, I tested that in my environment and it didn't help.

Perhaps something is being overridden at the line below, so that reading the web_identity_provider is skipped even after setting AWS_SDK_LOAD_CONFIG to 1?

terraform-provider-elasticsearch/es/provider.go

Line 356 in fb7b335

return awssession.Must(awssession.NewSessionWithOptions(sessOpts))

ahmad-hamade · 2020-12-04T05:43:41Z

The Terraform AWS provider fixed this issue by reading the shared credentials file, which includes the current session token, so maybe we can implement the same?

https://github.com/hashicorp/terraform-provider-aws/blob/bd828729b19030b366a64e4225eeb71e6d5eb0c2/vendor/github.com/hashicorp/aws-sdk-go-base/awsauth.go#L203

phillbaker · 2020-12-05T17:39:04Z

Thanks for the link, I'll take a look at the current session token.

phillbaker · 2020-12-06T22:52:33Z

After reading the code, it looks like one difference in configuration is that we're not sending a SharedCredentialsProvider, but that doesn't seem related to web identity credentials which are handled by the underlying AWS SDK Go.

@ahmad-hamade to clarify, based on the cluster URL you provided, it looks like you're using a VPC cluster. Can you confirm whether your access policies specify IAM users or roles? If so, requests need to be signed with credentials, so the provider needs sign_aws_requests set. Since we have a recent version of the AWS SDK, a configuration like the following should work:

provider "elasticsearch" {
  url = "https://vpc-<MASKED>.eu-west-1.es.amazonaws.com"

  aws_region        = "eu-west-1" # must be set if the `url` is not of the form <region>.es.amazonaws.com
  sign_aws_requests = true
}

In your example script where you generate credentials with aws sts assume-role-with-web-identity before running terraform plan, were you setting sign_aws_requests?
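
On the aws_region comment in the configuration above: the region can be read from the hostname when the URL ends in <region>.es.amazonaws.com. A minimal Python sketch of that idea follows; the pattern and the name region_from_url are assumptions for illustration and are not copied from the provider, whose own expression may differ.

```python
import re
from urllib.parse import urlparse

# Hypothetical pattern: captures the label directly before ".es.amazonaws.com".
AWS_ES_HOST = re.compile(r"\.([a-z0-9-]+)\.es\.amazonaws\.com$")


def region_from_url(url):
    """Return the AWS region embedded in an ES endpoint URL, or None."""
    host = urlparse(url).hostname or ""
    match = AWS_ES_HOST.search(host)
    return match.group(1) if match else None
```

When the function returns None (for example, for a custom domain in front of the cluster), the region cannot be inferred and has to be given explicitly.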

ahmad-hamade · 2020-12-06T23:33:15Z

Thanks @phillbaker for your further investigation.

I'm using a private AWS ES cluster (accessible only from resources within my VPC), and my IAM role is allowed to connect, upload, and access indices.

To summarize what testing I've done so far:

The IAM role used in my Pod has full access to my private AWS ES cluster
I'm using a Web Identity Token in my K8s pod, which runs in EKS (in the same VPC as my ES cluster)
I was able to write a dummy document to ES using a simple Python script (shared above) running inside my pod (and the SDK was able to get the web identity token in boto3.Session().get_credentials()!)
Terraform plan/apply automation runs just fine in my Pod for all other AWS resources
I've tried setting values for aws_region and sign_aws_requests (which is true by default) with no success

The only way I could get the provider to connect to my private ES was to export the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN variables with values obtained from assume-role-with-web-identity. With those set, terraform-provider-elasticsearch works just fine without setting aws_region or sign_aws_requests at all.

phillbaker · 2020-12-06T23:46:20Z

> I've tried to set values to aws_region and sign_aws_requests (which is true by default) with no success

Ah, quite right, I forgot about that.

Can you confirm if there's a ~/.aws/config file on your pod that specifies a web_identity_token_file parameter?

ahmad-hamade · 2020-12-07T00:09:28Z

No, no ~/.aws/config file exists in the pod.

The only environment variables injected by the EKS Pod Identity Webhook are AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE. The latter holds the path of the file that contains the token: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
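
A quick way to report this state from inside a pod is sketched below in Python. It only inspects the variables named in this thread; irsa_env_report is a hypothetical helper name, and the report keys are made up for illustration.

```python
import os


def irsa_env_report(env=None):
    """Summarize which credential-related variables are present."""
    env = os.environ if env is None else env
    token_path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE", "")
    return {
        "role_arn_set": bool(env.get("AWS_ROLE_ARN")),
        "token_file_set": bool(token_path),
        # The token file must also be readable by the user running terraform.
        "token_file_readable": bool(token_path) and os.access(token_path, os.R_OK),
        "static_keys_set": bool(
            env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY")
        ),
        "sdk_load_config": env.get("AWS_SDK_LOAD_CONFIG", ""),
    }
```

Calling irsa_env_report() with no arguments reads the current process environment, so it can be run in the same shell that runs terraform.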

phillbaker · 2020-12-07T02:42:19Z

Well, here's what I see the code doing, which looks correct:

ES provider:

terraform-provider-elasticsearch/es/provider.go

Lines 211 to 213 in 068a12c

    
if m := awsUrlRegexp.FindStringSubmatch(parsedUrl.Hostname()); m != nil && signAWSRequests {
	log.Printf("[INFO] Using AWS: %+v", m[1])
	opts = append(opts, elastic7.SetHttpClient(awsHttpClient(m[1], d)), elastic7.SetSniff(false))

terraform-provider-elasticsearch/es/provider.go

Line 360 in 068a12c

signer := awssigv4.NewSigner(awsSession(region, d).Config.Credentials)

terraform-provider-elasticsearch/es/provider.go

Line 356 in 068a12c

return awssession.Must(awssession.NewSessionWithOptions(sessOpts))

In the SDK, the AWS_WEB_IDENTITY_TOKEN_FILE environment variable is evaluated only in resolveCredentials, which is invoked only from mergeConfigSrcs, which is invoked only from newSession, which in turn is called by NewSessionWithOptions.

https://github.com/aws/aws-sdk-go/blob/38c74caea1398949b67da14dfaa79cabe704a57f/aws/session/session.go#L333

https://github.com/aws/aws-sdk-go/blob/38c74caea1398949b67da14dfaa79cabe704a57f/aws/session/session.go#L459-L461

https://github.com/aws/aws-sdk-go/blob/38c74caea1398949b67da14dfaa79cabe704a57f/aws/session/session.go#L629-L630

https://github.com/aws/aws-sdk-go/blob/b6ab7f8d2ef9cce9ffe55475af0aae9445e4ec98/aws/session/credentials.go#L35-L41

https://github.com/aws/aws-sdk-go/blob/b6ab7f8d2ef9cce9ffe55475af0aae9445e4ec98/aws/session/credentials.go#L71-L78
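
One way to read the call chain above is as a precedence order. The sketch below is a simplified Python model of how the linked resolveCredentials appears to choose a branch. It is not SDK code: the function name and the branch labels are made up for illustration, and the order should be checked against the linked source before relying on it.

```python
def pick_credential_source(session_profile, env):
    """Return the branch a simplified resolveCredentials model would take."""
    if session_profile:
        # A profile passed explicitly in the session options is tried first.
        return "profile"
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        # Static keys exported in the environment.
        return "static-env-keys"
    if env.get("AWS_WEB_IDENTITY_TOKEN_FILE"):
        # Web identity token from the environment; AWS_ROLE_ARN must be set too.
        return "web-identity"
    # Otherwise fall back to the shared config / default chain.
    return "default-chain"
```

Under this model, exported static keys take precedence over the web identity token, which would be consistent with the assume-role-with-web-identity workaround succeeding.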

idallas456 · 2021-09-16T17:02:53Z

I too am seeing this issue with EKS roles. It would be nice to get this addressed so that the workaround mentioned here (#112 (comment)) isn't necessary. That workaround does fix the issue, but it is not an ideal solution.

phillbaker · 2021-09-17T03:02:24Z

I believe this actually is working. I've managed to test this on AWS using the following setup:

an AWS ES cluster (OpenSearch 1.0)
a (self-hosted) Kubernetes cluster using IRSA
terraform v0.15.5
provider version v2.0.0-beta.1

a terraform file of the following:

terraform {
  required_providers {
    elasticsearch = {
      source = "phillbaker/elasticsearch"
      version = "2.0.0-beta.1"
    }
  }
}

provider "elasticsearch" {
  url = "https://vpc-terraform-XXX.us-east-2.es.amazonaws.com:443"
  aws_assume_role_arn = "arn:aws:iam::XXX:role/terraform-elasticsearch"
}

resource "elasticsearch_index" "test" {
  name = "terraform-test"
  number_of_shards = 1
  number_of_replicas = 1
}

The ES cluster has the following IAM access policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::XXX:role/terraform-elasticsearch"
        ]
      },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-east-2:XXX:domain/terraform/*"
    }
  ]
}

with the role having the attached policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "es:*",
            "Resource": "*"
        }
    ]
}

and a trust relationship with the OIDC endpoint, e.g.:

    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::xxx:oidc-provider/oidc-xxxx-xxxx-xxxx-xxxx-xxxxxxxxxx.s3.amazonaws.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity"
    }

Note: one difference between the script in #112 (comment) and this provider is that boto (and the Terraform AWS provider) appear to respect the AWS_ROLE_ARN environment variable, whereas this provider currently does not, so aws_assume_role_arn must be set in the provider config.
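
The behavior boto appears to have can be pictured as a simple fallback. A minimal Python sketch, assuming an explicitly configured ARN should win over the environment; effective_role_arn is a hypothetical name, and this is an illustration rather than what the provider does.

```python
import os


def effective_role_arn(configured_arn, env=None):
    """Prefer the explicitly configured ARN, else fall back to AWS_ROLE_ARN."""
    env = os.environ if env is None else env
    # Empty strings are treated the same as "not set".
    return configured_arn or env.get("AWS_ROLE_ARN") or None
```

With IRSA, the second branch would pick up the role ARN that the webhook injects into the pod.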

I don't have access to an EKS cluster to verify this there, but if this does not work for you, please include:

confirm whether the workaround above addresses the issue
provider version
elasticsearch version (and opendistro version if relevant, including whether fine grained access control is enabled)
redacted version of the terraform provider and resource configuration
terraform provider logs by setting TF_LOG_CORE=INFO TF_LOG_PROVIDER=TRACE
simplified AWS IAM roles and ES cluster access policy

phillbaker · 2021-10-03T22:40:32Z

I believe the issue with environment variables has been fixed in 64f21df; please see some of the discussion in #124 (comment).

I'm going to close this issue for now, please let me know if there are further issues with IRSA.

phillbaker added the question label Nov 20, 2020

phillbaker mentioned this issue Nov 23, 2020

Assume role credentials should use the profile passed in #114

Closed

phillbaker added enhancement and removed question labels Nov 30, 2020

ahmad-hamade changed the title ~~Error: health check timeout: no Elasticsearch node available~~ Web Identity Token / EKS IAM Role Service Account (IRSA) support Dec 4, 2020

phillbaker mentioned this issue Dec 30, 2020

Regression issue on 1.5.1 #124

Closed

phillbaker mentioned this issue Mar 6, 2021

Assume role configuration doesn't seem to work #149

Closed

This was referenced Sep 7, 2021

context deadline exceeded: no Elasticsearch node available #213

Closed

AWS ES Domain with Cognito Pool returns Error 403 (Forbidden) #217

Closed

phillbaker closed this as completed Oct 3, 2021
