-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support for IRSA credentials for the s3 storage mode #7419
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sklarsa for the PR. I like the changes, left few comments.
@@ -77,6 +88,7 @@ type S3Config struct { | |||
SignatureVersion string `yaml:"signature_version"` | |||
SSEConfig bucket_s3.SSEConfig `yaml:"sse"` | |||
BackoffConfig backoff.Config `yaml:"backoff_config"` | |||
UseIrsa bool `yaml:"use_irsa"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to call it UseIRSA
(on the struct field) just to be explicit about the acronym. (use_irsa
is fine for yaml).
production/helm/loki/values.yaml
Outdated
@@ -212,6 +212,7 @@ loki: | |||
s3ForcePathStyle: false | |||
insecure: false | |||
http_config: {} | |||
use_irsa: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would be a breaking change no?. Breaks the existing helm users who uses ACCESSKEY based S3 credentials.
I would prefer to keep false
as default and document this additional configs on the S3Config page (use_irsa + additional ENVs that need to be set)
https://grafana.com/docs/loki/latest/configuration/#s3_storage_config
Thanks for the review! Based on this comment in the issue I raised, I still have hope that this is just a configuration issue and won't require code changes. I'll keep you updated, and make those fixes in the meantime |
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
- loki -0.6% |
It looks like this was just a case of user error and misconfiguration :-( I guess the AWS golang client is smart enough to use the environment's default credential chain if no credentials are explicitly passed to the client config. |
@sklarsa could you show your result config? |
the loki:
storage:
type: s3
bucketNames:
chunks: loki-chunks-bucket
ruler: loki-ruler-bucket
admin: loki-admin-bucket
s3:
region: us-east-2
storage_config:
boltdb_shipper:
cache_ttl: 48h
compactor:
retention_enabled: true |
What this PR does / why we need it:
Per my issue from yesterday and an older issue I found, loki seems to be incompatible with AWS IRSA Roles, or IAM Roles for Service Accounts. These are the AWS-recommended way to access AWS resources running inside EKS. Effectively, a properly-annotated k8s ServiceAccount will run AssumeRoleWithWebIdentity inside a pod to grant the pod specified AWS IAM permissions.
On pod startup, AWS injects the
AWS_ROLE_ARN
andAWS_WEB_IDENTITY_TOKEN_FILE
values into its environment. AWS calls made by the pod use these credentials to assume the IAM role that is annotated on the pod's ServiceAccount.Currently, loki expects an AWS access key and secret key and does not have the ability to use the environment variables that are set inside the pod.
This is tricky to test, since it requires a live EKS cluster to properly test. I have created Docker image based off my forked branch, sklarsa/loki-test:0.0.1. I'm currently running this container in a cluster and it appears to be connecting to S3. I'm getting other errors, but I'm not sure if they're related or not.
I can work on updating tests, but I'd like to get some initial feedback to make sure that I'm on the right track here.
Which issue(s) this PR fixes:
Fixes #7403
Fixes #3069
Special notes for your reviewer:
Checklist
CONTRIBUTING.md
guideCHANGELOG.md
updateddocs/sources/upgrading/_index.md