Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds support for IRSA credentials for the s3 storage mode #7419

Closed
wants to merge 7 commits into from

Conversation

sklarsa
Copy link

@sklarsa sklarsa commented Oct 13, 2022

What this PR does / why we need it:
Per my issue from yesterday and an older issue I found, loki seems to be incompatible with AWS IRSA Roles, or IAM Roles for Service Accounts. These are the AWS-recommended way to access AWS resources running inside EKS. Effectively, a properly-annotated k8s ServiceAccount will run AssumeRoleWithWebIdentity inside a pod to grant the pod specified AWS IAM permissions.

On pod startup, AWS injects the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE values into its environment. AWS calls made by the pod use these credentials to assume the IAM role that is annotated on the pod's ServiceAccount.

Currently, loki expects an AWS access key and secret key and does not have the ability to use the environment variables that are set inside the pod.

This is tricky to test, since it requires a live EKS cluster to properly test. I have created Docker image based off my forked branch, sklarsa/loki-test:0.0.1. I'm currently running this container in a cluster and it appears to be connecting to S3. I'm getting other errors, but I'm not sure if they're related or not.

I can work on updating tests, but I'd like to get some initial feedback to make sure that I'm on the right track here.

Which issue(s) this PR fixes:
Fixes #7403
Fixes #3069

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide
  • Documentation added
  • Tests updated
  • CHANGELOG.md updated
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

@sklarsa sklarsa requested a review from a team as a code owner October 13, 2022 22:08
Copy link
Contributor

@kavirajk kavirajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sklarsa for the PR. I like the changes, left few comments.

@@ -77,6 +88,7 @@ type S3Config struct {
SignatureVersion string `yaml:"signature_version"`
SSEConfig bucket_s3.SSEConfig `yaml:"sse"`
BackoffConfig backoff.Config `yaml:"backoff_config"`
UseIrsa bool `yaml:"use_irsa"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to call it UseIRSA (on the struct field) just to be explicit about the acronym. (use_irsa is fine for yaml).

@@ -212,6 +212,7 @@ loki:
s3ForcePathStyle: false
insecure: false
http_config: {}
use_irsa: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would be a breaking change no?. Breaks the existing helm users who uses ACCESSKEY based S3 credentials.

I would prefer to keep false as default and document this additional configs on the S3Config page (use_irsa + additional ENVs that need to be set)
https://grafana.com/docs/loki/latest/configuration/#s3_storage_config

@sklarsa
Copy link
Author

sklarsa commented Oct 14, 2022

Thanks for the review! Based on this comment in the issue I raised, I still have hope that this is just a configuration issue and won't require code changes. I'll keep you updated, and make those fixes in the meantime

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
-               loki	-0.6%

@sklarsa
Copy link
Author

sklarsa commented Oct 14, 2022

It looks like this was just a case of user error and misconfiguration :-(

I guess the AWS golang client is smart enough to use the environment's default credential chain if no credentials are explicitly passed to the client config.

@sklarsa sklarsa closed this Oct 14, 2022
@sklarsa sklarsa deleted the adds-irsa-credentials branch October 14, 2022 15:15
@okgolove
Copy link

okgolove commented Nov 8, 2022

@sklarsa could you show your result config?

@sklarsa
Copy link
Author

sklarsa commented Nov 8, 2022

the loki: part of my values.yaml...:

loki:
  storage:
    type: s3
    bucketNames:
      chunks: loki-chunks-bucket
      ruler: loki-ruler-bucket
      admin: loki-admin-bucket
    s3:
      region: us-east-2
  storage_config:
    boltdb_shipper:
      cache_ttl: 48h

  compactor:
    retention_enabled: true

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants