Load credentials from file #4037


Open

nurdann wants to merge 35 commits into master

Conversation

nurdann (Member) commented Apr 1, 2025

Problem

We want to allow reading IAM credentials from a root-protected file, so that there is a single point of entry; otherwise, many other roles can fetch an EKS token.

Changes

  • Read the root-protected credentials file
  • Re-run spark-run as sudo when --get-eks-token-via-iam-user is given
  • Mount the /etc/kubernetes directory, which will make switching between configs easier
  • Add a new paasta config key
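
The credential-loading step might look roughly like the following sketch (the function name, file path argument, and the access-key-ID variable name are assumptions for illustration; the file is assumed to be in AWS INI credentials format, and only GET_EKS_TOKEN_AWS_SECRET_ACCESS_KEY appears verbatim in the diff):

```python
import configparser


def load_iam_credentials(path: str, profile: str = "default") -> dict:
    """Parse an AWS-style INI credentials file (expected to be readable only
    by root) and map its keys to the env vars used to fetch an EKS token."""
    config = configparser.ConfigParser()
    config.read(path)
    return {
        # Variable names mirror the GET_EKS_TOKEN_* convention from the diff;
        # the access-key-ID name is an assumption.
        "GET_EKS_TOKEN_AWS_ACCESS_KEY_ID": config[profile]["aws_access_key_id"],
        "GET_EKS_TOKEN_AWS_SECRET_ACCESS_KEY": config[profile]["aws_secret_access_key"],
    }
```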

Mounting /etc/kubernetes

Currently, only a single file is mounted:

paasta/paasta_tools/utils.py

Lines 2736 to 2737 in 2e706a1

def get_spark_kubeconfig(self) -> str:
    return self.config_dict.get("spark_kubeconfig", "/etc/kubernetes/spark.conf")

Changing that requires changes to /etc/paasta/spark-run.json, which is loaded from

PATH_TO_SYSTEM_PAASTA_CONFIG_DIR = os.environ.get(
    "PAASTA_SYSTEM_CONFIG_DIR", "/etc/paasta/"
)

and populated via Puppet: https://sourcegraph.yelpcorp.com/sysgit/puppet@4c6d259bc8e5cf7c48e75358a010741933591191/-/blob/modules/paasta_tools/manifests/public_config.pp?L501-508
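
A simplified, self-contained sketch of the new accessor next to the existing one (the exact key name and default path are assumptions based on the spark2.conf naming in the review threads; the real class lives in paasta_tools/utils.py):

```python
class SystemPaastaConfig:
    """Minimal stand-in for the real SystemPaastaConfig in paasta_tools/utils.py."""

    def __init__(self, config_dict: dict) -> None:
        self.config_dict = config_dict

    def get_spark_kubeconfig(self) -> str:
        # Existing key: the single kubeconfig currently mounted.
        return self.config_dict.get("spark_kubeconfig", "/etc/kubernetes/spark.conf")

    def get_spark2_kubeconfig(self) -> str:
        # New key: falls back to the new kubeconfig when no override is set.
        return self.config_dict.get("spark2_kubeconfig", "/etc/kubernetes/spark2.conf")
```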

Verification (Updated Apr 10)

  • Can spark-submit using /etc/kubernetes/spark.conf; got the expected PATH_NOT_FOUND since the S3 object doesn't exist [cmd]
  • spark-submit with /etc/kubernetes/spark.conf fails to fetch credentials when the EC2 metadata service is disabled, as expected [cmd]
  • Can fetch an EKS token via the loaded env vars; got PATH_NOT_FOUND as well [cmd]
    Config changes are to come from https://github.yelpcorp.com/sysgit/puppet/pull/14234

@nurdann nurdann requested review from nemacysts and Qmando April 1, 2025 20:26
@nurdann nurdann marked this pull request as ready for review April 3, 2025 01:29
@nurdann nurdann requested a review from jfongatyelp April 4, 2025 23:46
Qmando (Member) left a comment:

Looks pretty good overall.

@nurdann nurdann requested a review from Qmando April 9, 2025 00:58
spark_env["GET_EKS_TOKEN_AWS_SECRET_ACCESS_KEY"] = config["default"][
    "aws_secret_access_key"
]

spark_env["KUBECONFIG"] = system_paasta_config.get_spark_kubeconfig()
Contributor:

Shouldn't we use spark2.conf instead of the default spark.conf for --get-eks-token-via-iam-user?

nurdann (Member Author):

For now, you need to export KUBECONFIG=... to change it.

nurdann (Member Author):

That's actually a good point! Otherwise it makes it annoying to tweak the command line.

chi-yelp previously approved these changes Apr 9, 2025
@nurdann nurdann requested a review from a team as a code owner April 10, 2025 21:40
spark_env["GET_EKS_TOKEN_AWS_SECRET_ACCESS_KEY"] = config["default"][
    "aws_secret_access_key"
]

spark_env["KUBECONFIG"] = system_paasta_config.get_spark2_kubeconfig()
Member:

how long do we anticipate the migration to this taking? i don't really love the spark2 naming filewise, but it'd be nice to avoid naming functions with spark2 :p

nurdann (Member Author):

Without a separate config key, the CLI flag cannot switch to the new kubeconfig unless a new KUBECONFIG is exported. Once we no longer see file access to spark.conf, we can make spark2 the default flow.

Member:

when that's done, will we swap the names back to spark.conf / remove all the spark2 naming?

Member:

(i mostly don't want the $FILENAME$N naming to stick around for long since it's not particularly descriptive)

nurdann (Member Author):

We'll remove spark2 and make spark use the new format.

Member:

and just for completeness: how long do we foresee this transition taking?

nurdann (Member Author):

Until we can get a representative sample using the new kubeconfig. I want to be optimistic, but I could see it staying in here until the end of the year.

nemacysts (Member) commented Apr 16, 2025:

oh my, if this is potentially sticking around that long, we probably want some better naming: spark2 (as a name) will be pretty meaningless to folks that aren't us several months into the future (unless they look through this PR/the tech spec/etc.)

nurdann (Member Author):

Renamed the variable.

if args.get_eks_token_via_iam_user and os.getuid() != 0:
    print("Re-executing paasta spark-run with sudo...", file=sys.stderr)
    # argv[0] is treated as the command name, so prepend "sudo"
    os.execvp("sudo", ["sudo"] + sys.argv)
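
For illustration, the re-exec guard can be wrapped as a small self-contained helper (the helper name is hypothetical; note that os.execvp replaces the current process image, so nothing after it runs when the re-exec branch is taken):

```python
import os
import sys


def maybe_reexec_with_sudo(require_root: bool) -> None:
    """Re-execute the current command under sudo when root is required
    but the process is not already running as root."""
    if require_root and os.getuid() != 0:
        print("Re-executing with sudo...", file=sys.stderr)
        # execvp treats argv[0] as the command name, so "sudo" is prepended.
        os.execvp("sudo", ["sudo"] + sys.argv)
    # Otherwise fall through and keep running as the current user.
```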
Member:

do we need the -H that we do in paasta_local_run?

nurdann (Member Author):

HOME is inherited when running locally:

$ sudo env | grep HOME
HOME=/nail/home/me

Member:

(-H changes that behavior :))

nurdann (Member Author):

It doesn't matter for reading a root-owned file, though.

nurdann (Member Author):

Let me check how AWS credentials are fetched.

nurdann (Member Author) commented Apr 11, 2025:

The user can still delete the file if the directory is user-owned. But if we set HOME to root's, then we cannot use the user's profile.

nurdann (Member Author):

So, the way the docker command is constructed is different: no matter whether the current UID is 0 or not, it switches based on DEFAULT_SPARK_DOCKER_REGISTRY: https://sourcegraph.yelpcorp.com/search?q=repo:%5EYelp/paasta%24+file:%5Epaasta_tools/cli/cmds/spark_run%5C.py%24+sudo&patternType=keyword&sm=0

Member:

hmm, we should probably try to make sure that we're not potentially leaving root-owned files around people's homedirs, since that will otherwise generate onpoint load to get these cleaned up.

i'm assuming this is a problem because there are other parts of spark-run that will try to use a profile and create the file if not found?

nurdann (Member Author):

Actually, I'm wrong. Then they would get an error that the profile doesn't exist:

botocore.exceptions.ProfileNotFound: The config profile (devc) could not be found

Contributor:

FYI - spark-run will create a temporary pod template file under /nail/tmp, but I think it's fine to have a root-owned file under that path

@nurdann nurdann requested a review from nemacysts April 14, 2025 17:39
@nurdann nurdann requested review from nemacysts and chi-yelp April 14, 2025 22:00
chi-yelp previously approved these changes Apr 15, 2025
Qmando
Qmando previously approved these changes Apr 15, 2025
@nurdann nurdann dismissed stale reviews from Qmando and chi-yelp via dcf17fe April 16, 2025 17:49
@nurdann nurdann requested a review from nemacysts April 16, 2025 17:49