
Add support for non-default VPCs #94

Merged
merged 2 commits into from
Mar 23, 2016

Conversation

nchammas
Owner

The discussion in #85 and #92 revealed that Flintrock has thus far assumed that everyone works with a default VPC, and this is not true. If you created your AWS account in or before 2013, you very likely don't have a default VPC.

This PR adds support for non-default VPCs. The relevant reading is here:

To summarize the key points from the above links, as well as the key changes in this PR:

  1. Security groups are scoped to a VPC, and Flintrock clusters are identified by security group. This means that all Flintrock commands that need to identify a cluster now accept a VPC parameter.
  2. If you do not specify the VPC you want, Flintrock will query EC2 for the default VPC in your region. This is how Flintrock worked before this PR, but now that logic has been formalized and Flintrock will give a clean error message if you have no default VPC set in your selected region.
  3. When you specify the VPC explicitly, Flintrock assumes it's a non-default VPC and requires that you also specify a subnet explicitly, since users currently cannot set default subnets on non-default VPCs. Flintrock additionally checks that your non-default VPC and subnet are configured correctly.
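The default-VPC lookup described in point 2 can be sketched roughly as follows. This is a minimal illustration, not Flintrock's actual implementation; the dict shape mirrors the entries in an EC2 DescribeVpcs response, which you would obtain from boto in practice:

```python
def find_default_vpc(vpcs, region):
    """Return the ID of the default VPC in a region, mimicking the lookup
    Flintrock performs when no VPC is specified.

    `vpcs` is a list of dicts shaped like the entries of an EC2
    DescribeVpcs response (hypothetical stand-in for real API output).
    """
    for vpc in vpcs:
        if vpc.get('IsDefault'):
            return vpc['VpcId']
    # No default VPC: fail with a clean, actionable error message.
    raise Exception(
        "Flintrock could not find a default VPC in {r}. "
        "Please explicitly specify a VPC to work with in that region. "
        "Flintrock does not support managing EC2 clusters outside a VPC."
        .format(r=region))
```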

@dhulse @marcuscollins - I've tested this PR thoroughly, but it would be a big help if you tested this in your environments and confirmed that it works for you as expected. The key things to test are the 3 points I just laid out above.

To install Flintrock from this PR's branch, use this command:

pip3 install -U git+https://github.com/nchammas/flintrock@non-default-vpc

This will override any installation of Flintrock you have in your currently-active environment.

Fixes #85.
Fixes #92.

@marcuscollins

@nchammas It launches into the (non-default) VPC and does have the specified IAM role, but still can't access S3 without specifying a key ID and secret key.

@nchammas
Owner Author

Regarding S3 access, can you post the IAM policy as well as the S3 location you are trying to access? At this point I think this is most likely a problem with your IAM policy.

Also, can you confirm that Flintrock errors out cleanly if you try a launch without setting a VPC explicitly (since you don't have a default VPC)?

@marcuscollins

Errors out correctly:

Flintrock could not find a default VPC in us-east-1. Please explicitly specify a VPC to work with in that region. Flintrock does not support managing EC2 clusters outside a VPC.

@marcuscollins

The S3 location is private, so I can't pass it along, sorry. I suspect you are correct though, and I'll look into it.

@nchammas
Owner Author

In case it helps, a common problem with S3 policies is that they don't grant all 3 of the needed rights, namely list access, root access, and content access (ref):

{
  "Statement": [
    {
      "Action": [
        "s3:ListAllMyBuckets"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::*"
    },
    {
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"]
    }
  ]
}

I've made this mistake in the past and it can be hard to catch if you don't know what to look for.

@marcuscollins

Huh, ours is pretty generic:

{
  "Version": "2012-10-17", 
  "Statement": [
    {
      "Action": [
        "ec2:AuthorizeSecurityGroupIngress", 
        "ec2:CancelSpotInstanceRequests", 
        "ec2:CreateSecurityGroup", 
        "ec2:CreateTags", 
        "ec2:Describe*", 
        "ec2:DeleteTags", 
        "ec2:ModifyImageAttribute", 
        "ec2:ModifyInstanceAttribute", 
        "ec2:RequestSpotInstances", 
        "ec2:RunInstances", 
        "ec2:TerminateInstances", 
        "iam:PassRole", 
        "iam:ListRolePolicies", 
        "iam:GetRole", 
        "iam:GetRolePolicy", 
        "iam:ListInstanceProfiles", 
        "s3:Get*", 
        "s3:List*", 
        "s3:CreateBucket", 
        "sdb:BatchPutAttributes", 
        "sdb:Select"
      ], 
      "Resource": "*", 
      "Effect": "Allow"
    }
  ]
}

I'll look up what other S3 operations I might need, and I'll see if I can get permission to create a simple role like that one to test out. I wonder if this is just going to end up being yet another pre-2013-account issue.

@nchammas
Owner Author

Your policy is missing the s3:ListAllMyBuckets action. I wonder if that's the missing permission you need.

@marcuscollins

Isn't that covered under "s3:List*"?

@nchammas
Owner Author

Oh, sorry, you're right. 😕
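For the record, IAM matches Action values with shell-style wildcards, so "s3:List*" does cover s3:ListAllMyBuckets. A quick way to convince yourself, using Python's fnmatch as a stand-in for IAM's wildcard matching (an illustrative analogy, not IAM's actual matcher):

```python
from fnmatch import fnmatchcase

# IAM's '*' in an Action value behaves like a shell glob here.
assert fnmatchcase('s3:ListAllMyBuckets', 's3:List*')
assert fnmatchcase('s3:GetObject', 's3:Get*')

# But a List* grant does not cover Get actions:
assert not fnmatchcase('s3:GetObject', 's3:List*')
```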

@nchammas
Owner Author

Let's move the discussion about IAM roles to #90.

Since the non-default VPC launches work for you @marcuscollins, I'll just wait for @dhulse to also confirm that this PR works for him before merging it in.

@dhulse

dhulse commented Mar 15, 2016

I have attempted to launch a cluster twice using the code from the PR. I no longer run into the error: "Exception: Error authorizing cluster ingress to self". However I still cannot create a cluster successfully.

Here is my output. (I am setting vpc-id and subnet-id in my config file)

flintrock launch dale-tiny2 --num-slaves 2
Requesting 3 spot instances at a max price of $0.2...
0 of 3 instances granted. Waiting...
0 of 3 instances granted. Waiting...
All 3 instances granted.
There was a problem with the launch. Cleaning up...
Do you want to terminate the 3 instances created by this operation? [Y/n]:

I may have something configured incorrectly on my side. I will continue to investigate.

@dhulse

dhulse commented Mar 15, 2016

I have noticed that I cannot ssh into my EC2 instances created by Flintrock as ec2-user. Perhaps I need to do some further configuration on my non-default VPC.

@nchammas
Owner Author

Hmm, is there no explanatory error message when the launch fails, after you answer the prompt?

@dhulse

dhulse commented Mar 15, 2016

No, it just says "There was a problem with the launch. Cleaning up..."

@nchammas
Owner Author

Are you using a vanilla Amazon Linux AMI, or something else?

@nchammas
Owner Author

If you newly created your VPC and subnet, you may need to attach an internet gateway. That's a mistake I've made before.

I'm confused though as to why there is no detail on the launch failure. That's definitely not good. :(
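One way to check for this is to look at the subnet's route table for a default route pointing at an internet gateway. A hedged sketch; the dict shape mirrors an entry in an EC2 DescribeRouteTables response, which you would fetch with boto in practice:

```python
def has_internet_route(route_table):
    """Return True if the route table has a default route (0.0.0.0/0)
    pointing at an internet gateway.

    `route_table` is a dict shaped like an entry in an EC2
    DescribeRouteTables response (hypothetical stand-in for real API output).
    """
    for route in route_table.get('Routes', []):
        if (route.get('DestinationCidrBlock') == '0.0.0.0/0'
                and route.get('GatewayId', '').startswith('igw-')):
            return True
    return False
```

If this returns False for the route table associated with your subnet, instances there have no path to the internet, and SSH from outside the VPC will hang.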

@dhulse

dhulse commented Mar 15, 2016

I have been using ami-60b6c60a, which is different from the default AMI provided by Flintrock. I tried launching a cluster using the default AMI instead and I still have the same issue.

I don't think an internet gateway has been attached to the new VPC I have created. I will look into that.

@nchammas
Owner Author

Just checking in briefly here: @dhulse - Any progress on getting Flintrock working on this branch? Any leads I can help you investigate?

@nchammas
Owner Author

Merging this in since the primary feature (launching into non-default VPCs) seems to be working.

@nchammas nchammas merged commit 87af6a6 into master Mar 23, 2016
@nchammas nchammas deleted the non-default-vpc branch March 23, 2016 00:06
@dhulse

dhulse commented Mar 23, 2016

Hey @nchammas, I have not made any progress on getting Flintrock to work in my VPC. I have been really busy with other priorities these past few days and have not had a chance to look into it. I will investigate why it's not working on my side and will let you know what I find out. Thanks!
