Web Identity Token / EKS IAM Role Service Account (IRSA) support #112
Comments
Can you please share the provider configuration and resource configuration? If you're using AWS auth, please share any relevant config files on the machine in question. |
Hi @phillbaker, I'm using TF0.13.
terraform {
required_version = ">= 0.13"
required_providers {
elasticsearch = {
source = "phillbaker/elasticsearch"
version = ">= 1.5"
}
}
}
Provider:
provider "elasticsearch" {
url = var.es_endpoint
}
resource "elasticsearch_index_template" "template" {
count = var.index_template != null ? 1 : 0
name = var.index_template.name
body = var.index_template.body
}
TF_Vars:
es_endpoint = format("https://%s", module.es.outputs.elasticsearch_endpoint)
index_template = {
name = "logging_template"
body = templatefile("index.json",
{
number_of_replicas = 0
index_patterns = format("%s-*", local.environment_name)
})
}
I don't specify any AWS auth in my terraform AWS provider as I'm using As mentioned previously, everything works just fine on my laptop but in the CI server getting PS. My CI is running in a Pod inside the EKS cluster and I'm using IRSA for authentication to access AWS resources. All my other modules are working just fine inside the Pod except the module that has |
So seems like a networking or permissions issue from the CI environment 😀 . May be related to #89
How did you do this test? If you used a curl can you share that? Did you try setting |
If I set The way I tested the connectivity from inside the CI pod is by running the script below (which also returns the document that I added), and it works just fine:
from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3
host = 'MY_ES_URL'
region = 'eu-west-1'
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
es = Elasticsearch(
hosts = [{'host': host, 'port': 443}],
http_auth = awsauth,
use_ssl = True,
verify_certs = True,
connection_class = RequestsHttpConnection
)
document = {
"title": "testing_tf"
}
es.index(index="dummy", doc_type="doc", id="2", body=document)
print(es.get(index="dummy", doc_type="doc", id="2"))
So I don't think it's related to network connectivity or permissions. |
In order to narrow down the issue, can you try the following steps:
My guess is that there are slight differences in how this provider handles AWS authentication versus the terraform AWS provider. |
Thanks, @phillbaker, for your feedback. I've run I've noticed you've added a new provider variable
provider "elasticsearch" {
url = "https://vpc-<MASKED>.eu-west-1.es.amazonaws.com"
aws_region = "eu-west-1"
aws_assume_role_arn = "arn:aws:iam::******:role/*******"
}
Here you can find the terraform logs:
|
As a workaround, I have to run the following script before terraform, which resolves the issue:
aws sts assume-role-with-web-identity \
--role-arn $AWS_ROLE_ARN \
--role-session-name tmp_es \
--web-identity-token file://$AWS_WEB_IDENTITY_TOKEN_FILE \
--duration-seconds 1000 > /tmp/irp-cred.txt
export AWS_ACCESS_KEY_ID="$(cat /tmp/irp-cred.txt | jq -r ".Credentials.AccessKeyId")"
export AWS_SECRET_ACCESS_KEY="$(cat /tmp/irp-cred.txt | jq -r ".Credentials.SecretAccessKey")"
export AWS_SESSION_TOKEN="$(cat /tmp/irp-cred.txt | jq -r ".Credentials.SessionToken")"
rm /tmp/irp-cred.txt
terraform plan |
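For anyone who would rather do the `jq` step of the workaround in Python, here is a minimal sketch. It assumes the JSON printed by `aws sts assume-role-with-web-identity` has the same `Credentials.AccessKeyId`, `Credentials.SecretAccessKey` and `Credentials.SessionToken` fields the script above reads; the function names `credentials_to_env` and `export_lines` are made up for illustration and are not part of the provider or the AWS CLI.

```python
import json
import shlex


def credentials_to_env(sts_response_json: str) -> dict:
    """Map the STS response JSON to the three AWS_* environment variables.

    Assumes the response has a top-level "Credentials" object, as read by
    the jq expressions in the workaround script.
    """
    creds = json.loads(sts_response_json)["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }


def export_lines(env: dict) -> str:
    """Render shell `export` statements, quoting values where needed."""
    return "\n".join(f"export {name}={shlex.quote(value)}" for name, value in env.items())
```

The output of `export_lines` could be passed to `eval` in the CI shell before `terraform plan`, which avoids writing the credentials to a temporary file.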
Looks very similar to the workaround suggested in this comment: hashicorp/terraform#22992 (comment). That issue was closed by hashicorp/aws-sdk-go-base#33 and hashicorp/terraform#25134 and references aws/aws-sdk-go#3101. I'll have to review the changes there, for now I would suggest the workaround posted above. |
@ahmad-hamade can you confirm that you've set the following environment variables in your pod:
|
@phillbaker yes, the variables below exist and are added by default since I'm using IAM Roles for Service Accounts (IRSA) in EKS, except for
|
Do you suggest exporting AWS_SDK_LOAD_CONFIG and trying again? |
Yes, please try it!
…On Mon, Nov 30, 2020 at 6:32 AM Ahmad Hamade ***@***.***> wrote:
Do you suggest me exporting AWS_SDK_LOAD_CONFIG and try again?
|
As per your suggestion and the reported issue aws/aws-sdk-go#2828, exporting Unfortunately, I've tested that in my environment but it didn't help. Perhaps something is being overridden at the line below, which ignores the behavior of reading the
|
So the terraform AWS provider fixed this issue by reading the shared credentials file, which includes the current session token. Maybe we can implement the same? |
Thanks for the link, I'll take a look at the current session token. |
After reading the code, it looks like one difference in configuration is that we're not sending a @ahmad-hamade to clarify, based on the cluster URL you provide, it looks like you're using a VPC cluster. Can you confirm whether your access policies specify IAM users or roles? If so, requests would need to be signed with credentials and so the provider would need
In your example script where you generate credentials with |
Thanks @phillbaker for your further investigation. I'm using AWS private ES cluster (accessible within my VPC resources) and my IAM role is allowed to connect, upload, and access indices. To summarize what testing I've done so far:
The only way to get the provider able to connect to my private ES was by exporting |
Ah, quite right, I forgot about that. Can you confirm if there's a |
No, there is no The only environment variables injected by EKS Pod Identity Webhook are AWS_ROLE_ARN and |
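A quick way to confirm what the pod actually sees is a small check like the sketch below. It assumes the two variables in question are `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`, the names used by the workaround script earlier in this thread; `missing_irsa_vars` is a hypothetical helper, not something the provider or the SDK ships.

```python
import os

# Variable names taken from the workaround script in this thread (assumption:
# these are the ones the web identity credential flow relies on).
IRSA_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_irsa_vars(env=None):
    """Return the IRSA-related variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in IRSA_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_irsa_vars()
    print("missing:", ", ".join(missing) if missing else "none")
```

Running it in the same container and shell that runs terraform matters, since a wrapper script or a different service account could change what is injected.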
Well, here's what I see the code doing, which looks correct: ES provider: terraform-provider-elasticsearch/es/provider.go Lines 211 to 213 in 068a12c
In the SDK, |
I too am seeing this issue with EKS roles. It would be nice to get this addressed so the workaround mentioned in here (#112 (comment)) isn't necessary. That workaround does fix the issue but is not an ideal solution. |
I believe this actually is working. I've managed to test this on AWS using the following setup:
a terraform file of the following:
terraform {
required_providers {
elasticsearch = {
source = "phillbaker/elasticsearch"
version = "2.0.0-beta.1"
}
}
}
provider "elasticsearch" {
url = "https://vpc-terraform-XXX.us-east-2.es.amazonaws.com:443"
aws_assume_role_arn = "arn:aws:iam::XXX:role/terraform-elasticsearch"
}
resource "elasticsearch_index" "test" {
name = "terraform-test"
number_of_shards = 1
number_of_replicas = 1
}
The ES cluster has the following IAM access policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::XXX:role/terraform-elasticsearch"
]
},
"Action": "es:*",
"Resource": "arn:aws:es:us-east-2:XXX:domain/terraform/*"
}
]
}
with the role having the attached policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "es:*",
"Resource": "*"
}
]
}
and a trust relationship to the OIDC endpoint, e.g.:
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::xxx:oidc-provider/oidc-xxxx-xxxx-xxxx-xxxx-xxxxxxxxxx.s3.amazonaws.com"
},
"Action": "sts:AssumeRoleWithWebIdentity"
}
Note - one difference between the script in #112 (comment) and this provider is that boto (and the AWS terraform provider) appear to respect the I don't have access to an EKS cluster to verify this there, but if this does not work for you, please include:
|
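When comparing setups, it can help to sanity-check the role's trust statement before looking at the provider itself. The sketch below is a rough check only: it assumes a statement shaped like the trust relationship example above, matches the action string exactly, and ignores wildcards and any `Condition` block. `allows_web_identity` is a hypothetical helper name.

```python
def allows_web_identity(statement: dict) -> bool:
    """Return True if a trust statement allows sts:AssumeRoleWithWebIdentity
    for a Federated principal. Exact string match only; no wildcard or
    Condition handling."""
    actions = statement.get("Action", [])
    if isinstance(actions, str):
        # IAM accepts either a single string or a list of strings here.
        actions = [actions]
    principal = statement.get("Principal", {})
    return (
        statement.get("Effect") == "Allow"
        and "sts:AssumeRoleWithWebIdentity" in actions
        and isinstance(principal, dict)
        and "Federated" in principal
    )
```

A statement that only allows `sts:AssumeRole` for an `AWS` principal returns False, which would point at the role setup rather than at the provider.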
I believe the issue with environment variables has been fixed in 64f21df, please see some of the discussion in #124 (comment). I'm going to close this issue for now, please let me know if there are further issues with IRSA. |
I was able to configure and run the provider successfully from my local machine, but running the same from the CI server returns
health check timeout: no Elasticsearch node available
I tried running a simple Python script that connects and uploads a document into ES from the CI server and it works just fine, which rules out any issue related to the ES IAM role policy or security group rules.
Any idea what could be the issue here?