Load credentials from file #4037
base: master
Conversation
…ds-with-cli-arg # Conflicts: # tests/cli/test_cmds_spark_run.py
Looks pretty good overall.
paasta_tools/cli/cmds/spark_run.py
Outdated
spark_env["GET_EKS_TOKEN_AWS_SECRET_ACCESS_KEY"] = config["default"][
    "aws_secret_access_key"
]

spark_env["KUBECONFIG"] = system_paasta_config.get_spark_kubeconfig()
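The credentials load above reads an INI-style file. A minimal sketch of that pattern with Python's configparser; the function name and the key names under [default] follow the standard AWS credentials format, but the helper itself is illustrative, not the PR's code:

```python
import configparser


def load_iam_credentials(path):
    """Read AWS IAM credentials from an INI-style file.

    The [default] section and key names mirror the standard AWS
    credentials file format; the helper is a sketch for illustration.
    """
    config = configparser.ConfigParser()
    # ConfigParser.read returns the list of files it could read;
    # an empty list means the file was missing or unreadable.
    if not config.read(path):
        raise FileNotFoundError(f"credentials file not readable: {path}")
    section = config["default"]
    return {
        "GET_EKS_TOKEN_AWS_ACCESS_KEY_ID": section["aws_access_key_id"],
        "GET_EKS_TOKEN_AWS_SECRET_ACCESS_KEY": section["aws_secret_access_key"],
    }
```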
Shouldn't we use spark2.conf instead of the default spark.conf for --get-eks-token-via-iam-user?
For now, you need to export KUBECONFIG=... to change it.
That's actually a good point! Otherwise it makes it annoying to tweak from the command line.
paasta_tools/cli/cmds/spark_run.py
Outdated
    "aws_secret_access_key"
]

spark_env["KUBECONFIG"] = system_paasta_config.get_spark2_kubeconfig()
how long do we anticipate the migration to this taking? i don't really love the spark2 naming filewise, but it'd be nice to avoid naming functions with spark2 :p
Without a separate config key, the CLI flag cannot switch to the new kubeconfig unless a new KUBECONFIG is exported. Once we no longer see file access to spark.conf, we can make spark2 the default flow.
when that's done, will we swap the names back to spark.conf / remove all the spark2 naming?
(i mostly don't want the $FILENAME$N naming to stick around for long since it's not particularly descriptive)
We'll remove spark2 and make spark use the new format.
and just for completeness: how long do we foresee this transition taking?
Until we can get a representative sample using the new kubeconfig. I want to be optimistic, but I could see it staying in here until the end of the year.
oh my, if this is potentially sticking around that long we probably want some better naming: spark2 (as a name) will be pretty meaningless to folks that aren't us several months into the future (unless they look through this PR/the tech spec/etc)
Renamed variable
if args.get_eks_token_via_iam_user and os.getuid() != 0:
    print("Re-executing paasta spark-run with sudo..", file=sys.stderr)
    # argv[0] is treated as the command name, so prepend "sudo"
    os.execvp("sudo", ["sudo"] + sys.argv)
do we need the -H that we do in paasta_local_run?
HOME is inherited when running locally:
$ sudo env | grep HOME
HOME=/nail/home/me
(-H changes that behavior :))
It doesn't matter for reading a root-owned file, though.
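The distinction being debated can be made concrete. A small sketch of building the re-exec argv with an optional -H; the helper name is hypothetical, and whether plain sudo preserves the caller's HOME depends on the sudoers policy (env_keep / always_set_home):

```python
def build_sudo_argv(argv, reset_home=False):
    """Prepend sudo to an argv list for re-execution.

    Plain sudo may keep the invoking user's HOME (policy-dependent);
    sudo -H explicitly resets HOME to the target user's home
    directory, which is the behavior paasta_local_run opts into.
    """
    prefix = ["sudo", "-H"] if reset_home else ["sudo"]
    return prefix + list(argv)
```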
Let me check how aws credentials are fetched
The user can still delete the file if the directory is user-owned. But if we set HOME to root's, then we cannot use the user's profile.
So, the way the docker command is constructed is different: no matter whether the current uid is 0 or not, it switches based on DEFAULT_SPARK_DOCKER_REGISTRY:
https://sourcegraph.yelpcorp.com/search?q=repo:%5EYelp/paasta%24+file:%5Epaasta_tools/cli/cmds/spark_run%5C.py%24+sudo&patternType=keyword&sm=0
hmm, we should probably try to make sure that we're not potentially leaving root-owned files around people's homedirs, since that will otherwise generate onpoint load to get these cleaned up. i'm assuming this is a problem because there are other parts of spark-run that will try to use a profile and create the file if not found?
Actually, I'm wrong. Then they would get an error that the profile doesn't exist:
botocore.exceptions.ProfileNotFound: The config profile (devc) could not be found
FYI - spark-run will create a temporary pod template file under /nail/tmp, but I think it's fine to have a root-owned file under that path.
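Creating such a temp file typically looks like the sketch below (not the actual spark-run code; the prefix and default directory are illustrative). Since mkstemp creates the file with the effective uid, running under sudo naturally yields a root-owned file in the shared tmp directory:

```python
import os
import tempfile


def write_pod_template(contents, directory=tempfile.gettempdir()):
    """Write a pod template to a uniquely named temp file.

    When the process runs under sudo the file ends up root-owned,
    which is acceptable in a shared tmp directory but would be a
    nuisance inside a user's homedir.
    """
    fd, path = tempfile.mkstemp(
        prefix="spark-pod-template-", suffix=".yaml", dir=directory
    )
    with os.fdopen(fd, "w") as f:
        f.write(contents)
    return path
```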
Co-authored-by: Luis Pérez <[email protected]>
Problem
We want to allow reading IAM credentials from a root-protected file, so that there's a single point of entry; otherwise many other roles can fetch the EKS token.
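One way to check the "root-protected" property before trusting such a file is to verify ownership and permissions (a sketch under that assumption, not part of this PR; the helper name is made up):

```python
import os
import stat


def is_root_protected(path):
    """Return True if path is owned by root and unreadable by group/others.

    A file like this can only be read after escalating to root,
    giving a single auditable entry point to the IAM credentials.
    """
    st = os.stat(path)
    readable_by_non_owner = bool(st.st_mode & (stat.S_IRGRP | stat.S_IROTH))
    return st.st_uid == 0 and not readable_by_non_owner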
Changes
- spark-run re-executes itself as sudo when given --get-eks-token-via-iam-user
- Mount the /etc/kubernetes directory - will make switching between configs easier

Mounting /etc/kubernetes
Currently, only a single file is mounted:
paasta/paasta_tools/utils.py
Lines 2736 to 2737 in 2e706a1

Changing that requires changes to /etc/paasta/spark-run.json, loaded from paasta/paasta_tools/utils.py (Lines 93 to 94 in 2e706a1) and populated via Puppet: https://sourcegraph.yelpcorp.com/sysgit/puppet@4c6d259bc8e5cf7c48e75358a010741933591191/-/blob/modules/paasta_tools/manifests/public_config.pp?L501-508
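Mounting the whole directory instead of a single file amounts to changing the volume spec passed to docker. Roughly, with an illustrative helper (the exact mount construction lives in paasta_tools/utils.py):

```python
def kubeconfig_volume_args(mount_dir="/etc/kubernetes"):
    """Build docker -v arguments mounting the whole kubeconfig
    directory read-only, so switching between spark.conf and
    spark2.conf only requires changing KUBECONFIG, not the mounts."""
    return ["-v", f"{mount_dir}:{mount_dir}:ro"]
```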
Verification (Updated Apr 10)
- spark-submit using /etc/kubernetes/spark.conf - got the expected PATH_NOT_FOUND since the S3 object doesn't exist [cmd]
- spark-submit with /etc/kubernetes/spark.conf fails to fetch credentials if the EC2 metadata service is disabled, as expected [cmd]

Config changes to come from https://github.yelpcorp.com/sysgit/puppet/pull/14234