
modules/aws/bootstrap: Pull AWS bootstrap setup into a module #217

Merged: 1 commit into openshift:master on Sep 13, 2018

Conversation

wking commented Sep 6, 2018

Builds on #213; review that first.

This will make it easier to move into the existing infra step.

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Sep 6, 2018

role = "${join("|", aws_iam_role.bootstrap.*.name)}"

#"${var.iam_role == "" ?

Contributor:

Can this be removed?

wking (Member Author):

ah, no, I need to get that working again... Will hopefully have a fix up shortly ;)

wking (Member Author):

> Can this be removed?

Fixed with 1efb239 -> f29ab92.
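
For context, the commented-out interpolation above is the Terraform 0.11 idiom for optionally reusing a caller-supplied IAM role instead of creating one. A rough sketch of the shape it likely takes once restored (the assume_role_policy body and the variable description are illustrative, not quoted from the PR):

variable "iam_role" {
  type        = "string"
  default     = ""
  description = "Pre-existing IAM role to use; leave empty to create one."
}

# Only create the role when the caller did not supply one.
resource "aws_iam_role" "bootstrap" {
  count = "${var.iam_role == "" ? 1 : 0}"
  name  = "${var.tectonic_cluster_name}-bootstrap-role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "ec2.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}
EOF
}

resource "aws_iam_instance_profile" "bootstrap" {
  name = "${var.tectonic_cluster_name}-bootstrap-profile"

  # join() over the splat collapses to "" when count is 0, so the
  # conditional can pick between the generated role and the
  # caller-supplied one.
  role = "${var.iam_role == "" ? join("|", aws_iam_role.bootstrap.*.name) : var.iam_role}"
}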

crawford commented Sep 6, 2018

ok to test

@openshift openshift deleted a comment from wking Sep 6, 2018
volume_tags = "${var.tags}"
}

resource "aws_elb_attachment" "bootstrap" {

abhinavdahiya (Contributor) commented Sep 6, 2018:

https://github.com/openshift/installer/blob/master/modules/aws/master/main.tf#L128-L144

When we move this module to be part of the infra step, we will end up with the same problem: var.elbs will not be countable at plan stage.
cc: @crawford

wking (Member Author):

> we will end up with the same problem: var.elbs will not be countable at plan stage.

This is a reference to hashicorp/terraform#12570, right? I expect we can work around that when we consolidate the stages, but am happy to adjust things here if there's something I can do now.

wking (Member Author):

> we will end up with the same problem: var.elbs will not be countable at plan stage.

> This is a reference to hashicorp/terraform#12570, right?

I did indeed hit this, and pushed b8eec241 to #268, which seems like an only-moderately-hideous workaround ;).
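
The workaround in b8eec241 isn't quoted here, but the usual Terraform 0.11 answer to hashicorp/terraform#12570 is to pass the length alongside the list so that count never depends on a computed value. A hedged sketch with illustrative names (elb_count is not from the PR):

variable "elbs" {
  type = "list"
}

# Supplied explicitly by the caller so count is known at plan time.
variable "elb_count" {
  type = "string"
}

resource "aws_elb_attachment" "bootstrap" {
  count    = "${var.elb_count}"
  elb      = "${element(var.elbs, count.index)}"
  instance = "${aws_instance.bootstrap.id}"
}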


tags = "${merge(map(
"Name", "${var.tectonic_cluster_name}-bootstrap",
"kubernetes.io/cluster/${var.tectonic_cluster_name}", "owned",

Contributor:

why drop this tag?

wking (Member Author):

> why drop this tag?

Because we're eventually going to tear down the bootstrap stuff as part of installation, so we aren't leaving it around for Kubernetes to own.

@wking wking force-pushed the aws-bootstrap-module branch from 1efb239 to f29ab92 Compare September 6, 2018 20:09

wking commented Sep 6, 2018

#213 just landed, so I've rebased this and think it's good to go (unless more review suggestions come in ;).

wking commented Sep 6, 2018

The e2e-aws error was:

1 error(s) occurred:

* module.vpc.data.aws_route_table.worker[4]: data.aws_route_table.worker.4: Your query returned no results. Please change your search criteria and try again.

Probably a flake.

/retest

Also run the smoke tests:

retest this please

crawford commented Sep 6, 2018

/approve

@crawford crawford dismissed their stale review September 6, 2018 20:54

blah blah blah blah blah blah

@wking wking force-pushed the aws-bootstrap-module branch from f29ab92 to 10f717b Compare September 6, 2018 21:40

wking commented Sep 6, 2018

Both e2e-aws and the smoke tests died with:

Error: module.bootstrap.aws_instance.bootstrap: vpc_security_group_ids: should be a list

I've pushed f29ab92 -> 10f717b adding explicit brackets to hopefully fix that. The underlying issue may be Terraform occasionally forgetting type information.
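
For reference, the change amounts to wrapping the interpolation in brackets on the bootstrap instance; the other arguments shown here are illustrative stand-ins:

resource "aws_instance" "bootstrap" {
  ami           = "${var.ami}"
  instance_type = "${var.instance_type}"

  # Without the brackets, Terraform 0.11 sometimes loses the variable's
  # list type and reports "vpc_security_group_ids: should be a list".
  # A list inside [...] is flattened rather than nested, so the brackets
  # only pin the type, matching the master and worker modules.
  vpc_security_group_ids = ["${var.vpc_security_group_ids}"]
}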

crawford commented Sep 7, 2018

/retest

crawford commented Sep 7, 2018

retest this please

wking commented Sep 10, 2018

An earlier e2e-aws run failed with:

Waiting for API at https://ci-op-b4gygz8p-5849d-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Waiting for API at https://ci-op-b4gygz8p-5849d-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Interrupted
2018/09/06 23:45:52 Container setup in pod e2e-aws failed, exit code 1, reason Error

Trying to reproduce locally, I spun up a cluster on Friday.

On the bootstrap node:

$ systemctl --failed | head -n2
  UNIT             LOAD   ACTIVE SUB    DESCRIPTION                   
● bootkube.service loaded failed failed Bootstrap a Kubernetes cluster
$ journalctl -u bootkube.service -n6
-- Logs begin at Fri 2018-09-07 20:26:48 UTC, end at Sat 2018-09-08 04:28:19 UTC. --
Sep 07 22:08:37 ip-10-0-10-162 bash[3486]: https://trking-87205-etcd-0.coreservices.team.coreos.systems:2379 is unhealthy: failed to connect: dial tcp 10.0.5.134:2379: getsockopt: connection refused
Sep 07 22:08:37 ip-10-0-10-162 bash[3486]: Error:  unhealthy cluster
Sep 07 22:08:38 ip-10-0-10-162 bash[3486]: etcdctl failed too many times.
Sep 07 22:08:38 ip-10-0-10-162 systemd[1]: bootkube.service: Main process exited, code=exited, status=1/FAILURE
Sep 07 22:08:38 ip-10-0-10-162 systemd[1]: bootkube.service: Failed with result 'exit-code'.
Sep 07 22:08:38 ip-10-0-10-162 systemd[1]: Failed to start Bootstrap a Kubernetes cluster.
$ docker run --rm --env ETCDCTL_API=3 --volume /opt/tectonic/tls:/opt/tectonic/tls:ro,z quay.io/coreos/etcd:v3.2.14 etcdctl --cacert=/opt/tectonic/tls/etcd-client-ca.crt --cert=/opt/tectonic/tls/etcd-client.crt --key=/opt/tectonic/tls/etcd-client.key --endpoints=https://trking-87205-etcd-0.coreservices.team.coreos.systems:2379 endpoint health
https://trking-87205-etcd-0.coreservices.team.coreos.systems:2379 is unhealthy: failed to connect: dial tcp 10.0.5.134:2379: getsockopt: connection refused
Error:  unhealthy cluster
$ dig trking-87205-etcd-0.coreservices.team.coreos.systems +short
10.0.5.134
$ dig trking-87205-master-0.coreservices.team.coreos.systems +short

Uh. Checking from my dev box:

$ aws ec2 describe-instances --query "Reservations[].Instances[] | [?Tags[? Key == 'Name' && Value == 'trking-87205-master-0']].PublicIpAddress" --output text
18.215.14.161
$ aws ec2 describe-instances --query "Reservations[].Instances[] | [?Tags[? Key == 'Name']]" --output text | grep '^0\|Name\|IPADDRESS\|ASSOCIATION' | cut -b -80
0	x86_64		False	True	xen	ami-00cc4337762ba4a52	i-00c28bd06d8eb77a1	t2.medium	201
PRIVATEIPADDRESSES	True	ip-10-0-133-231.ec2.internal	10.0.133.231
TAGS	Name	trking-87205-worker-0
0	x86_64		False	True	xen	ami-00cc4337762ba4a52	i-072fdf9d1e0beaf3a	t2.medium	201
PRIVATEIPADDRESSES	True	ip-10-0-158-83.ec2.internal	10.0.158.83
TAGS	Name	trking-87205-worker-1
0	x86_64		False	True	xen	ami-00cc4337762ba4a52	i-01cd2e4e6ecaea69e	t2.medium	201
ASSOCIATION	amazon	ec2-18-215-14-161.compute-1.amazonaws.com	18.215.14.161
PRIVATEIPADDRESSES	True	ip-10-0-5-134.ec2.internal	10.0.5.134
ASSOCIATION	amazon	ec2-18-215-14-161.compute-1.amazonaws.com	18.215.14.161
TAGS	Name	trking-87205-master-0
0	x86_64		False	True	xen	ami-00cc4337762ba4a52	i-0011bb9aa98d56684	t2.medium	201
ASSOCIATION	amazon	ec2-34-205-252-140.compute-1.amazonaws.com	34.205.252.140
PRIVATEIPADDRESSES	True	ip-10-0-10-162.ec2.internal	10.0.10.162
ASSOCIATION	amazon	ec2-34-205-252-140.compute-1.amazonaws.com	34.205.252.140
TAGS	Name	trking-87205-bootstrap
0	x86_64		False	True	xen	ami-00cc4337762ba4a52	i-01d196ea977f59b3b	t2.medium	201
PRIVATEIPADDRESSES	True	ip-10-0-165-55.ec2.internal	10.0.165.55
TAGS	Name	trking-87205-worker-2

So indeed master-0's internal IP is 10.0.5.134, and its public IP is 18.215.14.161. I'm not sure why I can't resolve it via DNS from the bootstrap node. Anyhow, back to the bootstrap node:

$ ssh -v [email protected]
OpenSSH_7.6p1, OpenSSL 1.0.2n  7 Dec 2017
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Connecting to 18.215.14.161 [18.215.14.161] port 22.
debug1: connect to address 18.215.14.161 port 22: Connection refused
ssh: connect to host 18.215.14.161 port 22: Connection refused
$ ssh -v [email protected] 
OpenSSH_7.6p1, OpenSSL 1.0.2n  7 Dec 2017
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Connecting to 10.0.5.134 [10.0.5.134] port 22.
debug1: connect to address 10.0.5.134 port 22: Connection refused
ssh: connect to host 10.0.5.134 port 22: Connection refused

And from my dev box:

$ ssh -v [email protected]
OpenSSH_7.4p1, OpenSSL 1.0.2k-fips  26 Jan 2017
debug1: Reading configuration data /home/trking/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 58: Applying options for *
debug1: Connecting to 18.215.14.161 [18.215.14.161] port 22.
debug1: connect to address 18.215.14.161 port 22: Connection refused
ssh: connect to host 18.215.14.161 port 22: Connection refused

What's up with this node?

$ aws ec2 describe-instances --instance-id i-01cd2e4e6ecaea69e --query 'Reservations[].Instances[].State.Name' --output text
running
$ aws ec2 describe-instances --instance-id i-01cd2e4e6ecaea69e --query 'Reservations[].Instances[].SecurityGroups' --output text
sg-0061338459d264b41				 terraform-20180907202140389700000002

Compare that with the bootstrap node:

$ aws ec2 describe-instances --instance-id i-0011bb9aa98d56684 --query 'Reservations[].Instances[].SecurityGroups' --output text
sg-0061338459d264b41				 terraform-20180907202140389700000002

Same values. Actually, let's just diff the states:

$ BOOTSTRAP="$(aws ec2 describe-instances --instance-id i-0011bb9aa98d56684 --query 'Reservations[].Instances[]' --output json)"
$ MASTER="$(aws ec2 describe-instances --instance-id i-01cd2e4e6ecaea69e --query 'Reservations[].Instances[]' --output json)"
$ diff -u <(echo "${BOOTSTRAP}") <(echo "${MASTER}")
--- /dev/fd/63	2018-09-07 21:59:42.091362945 -0700
+++ /dev/fd/62	2018-09-07 21:59:42.091362945 -0700
@@ -3,22 +3,22 @@
         "Monitoring": {
             "State": "disabled"
         }, 
-        "PublicDnsName": "ec2-34-205-252-140.compute-1.amazonaws.com", 
+        "PublicDnsName": "ec2-18-215-14-161.compute-1.amazonaws.com", 
         "State": {
             "Code": 16, 
             "Name": "running"
         }, 
         "EbsOptimized": false, 
-        "LaunchTime": "2018-09-07T20:25:50.000Z", 
-        "PublicIpAddress": "34.205.252.140", 
-        "PrivateIpAddress": "10.0.10.162", 
+        "LaunchTime": "2018-09-07T20:22:56.000Z", 
+        "PublicIpAddress": "18.215.14.161", 
+        "PrivateIpAddress": "10.0.5.134", 
         "ProductCodes": [], 
         "VpcId": "vpc-0b6626eba63c20d20", 
         "StateTransitionReason": "", 
-        "InstanceId": "i-0011bb9aa98d56684", 
+        "InstanceId": "i-01cd2e4e6ecaea69e", 
         "EnaSupport": true, 
         "ImageId": "ami-00cc4337762ba4a52", 
-        "PrivateDnsName": "ip-10-0-10-162.ec2.internal", 
+        "PrivateDnsName": "ip-10-0-5-134.ec2.internal", 
         "SecurityGroups": [
             {
                 "GroupName": "terraform-20180907202140389700000002", 
@@ -31,30 +31,30 @@
         "NetworkInterfaces": [
             {
                 "Status": "in-use", 
-                "MacAddress": "02:f8:bf:18:7c:0a", 
+                "MacAddress": "02:52:a7:4b:43:10", 
                 "SourceDestCheck": true, 
                 "VpcId": "vpc-0b6626eba63c20d20", 
                 "Description": "", 
-                "NetworkInterfaceId": "eni-09d792347f4e050db", 
+                "NetworkInterfaceId": "eni-0f5378c203dea3028", 
                 "PrivateIpAddresses": [
                     {
-                        "PrivateDnsName": "ip-10-0-10-162.ec2.internal", 
-                        "PrivateIpAddress": "10.0.10.162", 
+                        "PrivateDnsName": "ip-10-0-5-134.ec2.internal", 
+                        "PrivateIpAddress": "10.0.5.134", 
                         "Primary": true, 
                         "Association": {
-                            "PublicIp": "34.205.252.140", 
-                            "PublicDnsName": "ec2-34-205-252-140.compute-1.amazonaws.com", 
+                            "PublicIp": "18.215.14.161", 
+                            "PublicDnsName": "ec2-18-215-14-161.compute-1.amazonaws.com", 
                             "IpOwnerId": "amazon"
                         }
                     }
                 ], 
-                "PrivateDnsName": "ip-10-0-10-162.ec2.internal", 
+                "PrivateDnsName": "ip-10-0-5-134.ec2.internal", 
                 "Attachment": {
                     "Status": "attached", 
                     "DeviceIndex": 0, 
                     "DeleteOnTermination": true, 
-                    "AttachmentId": "eni-attach-09624bb294e015616", 
-                    "AttachTime": "2018-09-07T20:25:50.000Z"
+                    "AttachmentId": "eni-attach-00536c4cc9813fb1b", 
+                    "AttachTime": "2018-09-07T20:22:56.000Z"
                 }, 
                 "Groups": [
                     {
@@ -64,11 +64,11 @@
                 ], 
                 "Ipv6Addresses": [], 
                 "OwnerId": "816138690521", 
-                "PrivateIpAddress": "10.0.10.162", 
+                "PrivateIpAddress": "10.0.5.134", 
                 "SubnetId": "subnet-018374f09ef32961c", 
                 "Association": {
-                    "PublicIp": "34.205.252.140", 
-                    "PublicDnsName": "ec2-34-205-252-140.compute-1.amazonaws.com", 
+                    "PublicIp": "18.215.14.161", 
+                    "PublicDnsName": "ec2-18-215-14-161.compute-1.amazonaws.com", 
                     "IpOwnerId": "amazon"
                 }
             }
@@ -86,22 +86,30 @@
                 "Ebs": {
                     "Status": "attached", 
                     "DeleteOnTermination": true, 
-                    "VolumeId": "vol-0514fba36b29b890c", 
-                    "AttachTime": "2018-09-07T20:25:51.000Z"
+                    "VolumeId": "vol-02edf331c988bde6f", 
+                    "AttachTime": "2018-09-07T20:22:57.000Z"
                 }
             }
         ], 
         "Architecture": "x86_64", 
         "RootDeviceType": "ebs", 
         "IamInstanceProfile": {
-            "Id": "AIPAJUQY64SRYUWKBSLAK", 
-            "Arn": "arn:aws:iam::816138690521:instance-profile/trking-87205-bootstrap-profile"
+            "Id": "AIPAIXRRU3YPPZDQJLI3A", 
+            "Arn": "arn:aws:iam::816138690521:instance-profile/trking-87205-master-profile"
         }, 
         "RootDeviceName": "/dev/xvda", 
         "VirtualizationType": "hvm", 
         "Tags": [
             {
-                "Value": "trking-87205-bootstrap", 
+                "Value": "owned", 
+                "Key": "kubernetes.io/cluster/trking-87205"
+            }, 
+            {
+                "Value": "2018-09-08T00:21+0000", 
+                "Key": "expirationDate"
+            }, 
+            {
+                "Value": "trking-87205-master-0", 
                 "Key": "Name"
             }, 
             {
@@ -109,10 +117,6 @@
                 "Key": "tectonicClusterID"
             }, 
             {
-                "Value": "2018-09-08T00:21+0000", 
-                "Key": "expirationDate"
-            }, 
-            {
                 "Value": "Resource does not meet policy: stop@2018/09/10", 
                 "Key": "maid_status"
             }

I don't see any surprising differences, and I have no idea why I can't SSH into the master node. But not being able to SSH into the master makes it hard to figure out why its etcd is broken. Or maybe there's just a networking issue that's behind my inability to connect for both SSH and etcd?

wking commented Sep 10, 2018

/retest

wking commented Sep 10, 2018

/test unit

@wking wking force-pushed the aws-bootstrap-module branch from 10f717b to 6d3370d Compare September 11, 2018 04:27

wking commented Sep 11, 2018

/retest

wking commented Sep 12, 2018

I've spun up a cluster to debug this, and the master is dying in Ignition:

$ aws ec2 describe-instances --query "Reservations[].Instances[] | [?Tags[? Key == 'Name' && Value == 'trking-18d26-master-0']].InstanceId" --output text
i-098c83ac601024a12
$ aws ec2 get-console-output --instance-id i-098c83ac601024a12 --output text | tail -n5
[  170.062650] ignition[738]: INFO     : GET https://trking-18d26-tnc.coreservices.team.coreos.systems:80/config/master?etcd_index=0: attempt #37
[  170.073885] ignition[738]: INFO     : GET https://trking-18d26-tnc.coreserv[  170.076770] ignition[738]: INFO     : GET error: Get https://trking-18d26-tnc.coreservices.team.coreos.systems:80/config/master?etcd_index=0: EOF
ices.team.coreos.systems:80/config/master?etcd_index=0: attempt #37
[  170.087578] ignition[738]: INFO     : GET error: Get https://trking-18d26-tnc.coreservices.team.coreos.systems:80/config/master?etcd_index=0: EOF
	2018-09-12T04:49:15.000Z

Check from the bootstrap node:

$ aws ec2 describe-instances --query "Reservations[].Instances[] | [?Tags[? Key == 'Name' && Value == 'trking-18d26-bootstrap']].PublicIpAddress" --output text
34.205.135.98
$ ssh [email protected]
$ curl -v 'https://trking-18d26-tnc.coreservices.team.coreos.systems:80/config/master?etcd_index=0'
* About to connect() to trking-18d26-tnc.coreservices.team.coreos.systems port 80 (#0)
*   Trying 10.0.1.14...
* Connected to trking-18d26-tnc.coreservices.team.coreos.systems (10.0.1.14) port 80 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* NSS error -5938 (PR_END_OF_FILE_ERROR)
* Encountered end of file
* Closing connection 0
curl: (35) Encountered end of file
$ systemctl status | head -n2
● ip-10-0-6-58
    State: starting
$ systemctl | grep activating
bootkube.service                                                                                                                     loaded activating start        start Bootstrap a Kubernetes cluster
kubelet.service                                                                                                                      loaded activating auto-restart       Kubernetes Kubelet
$ systemctl status kubelet.service
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Wed 2018-09-12 05:05:40 UTC; 2s ago
  Process: 3765 ExecStart=/usr/bin/hyperkube kubelet --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/var/lib/kubelet/kubeconfig --rotate-certificates --cni-conf-dir=/etc/kubernetes/cni/net.d --cni-bin-dir=/var/lib/cni/bin --network-plugin=cni --lock-file=/var/run/lock/kubelet.lock --exit-on-lock-contention --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged --node-labels=node-role.kubernetes.io/bootstrap --minimum-container-ttl-duration=6m0s --cluster-dns=10.3.0.10 --cluster-domain=cluster.local --client-ca-file=/etc/kubernetes/ca.crt --cloud-provider=aws --anonymous-auth=false --cgroup-driver=systemd --register-with-taints=node-role.kubernetes.io/bootstrap=:NoSchedule (code=exited, status=255)
  Process: 3760 ExecStartPre=/usr/bin/bash -c gawk '/certificate-authority-data/ {print $2}' /etc/kubernetes/kubeconfig | base64 --decode > /etc/kubernetes/ca.crt (code=exited, status=0/SUCCESS)
  Process: 3758 ExecStartPre=/bin/mkdir --parents /etc/kubernetes/manifests (code=exited, status=0/SUCCESS)
 Main PID: 3765 (code=exited, status=255)

Sep 12 05:05:40 ip-10-0-6-58 systemd[1]: Unit kubelet.service entered failed state.
Sep 12 05:05:40 ip-10-0-6-58 systemd[1]: kubelet.service failed.
$ systemctl status bootkube.service
● bootkube.service - Bootstrap a Kubernetes cluster
   Loaded: loaded (/etc/systemd/system/bootkube.service; static; vendor preset: disabled)
   Active: activating (start) since Wed 2018-09-12 04:50:50 UTC; 15min ago
 Main PID: 962 (bash)
   Memory: 166.8M
   CGroup: /system.slice/bootkube.service
           ├─ 962 /usr/bin/bash /opt/tectonic/bootkube.sh
           └─3066 /usr/bin/podman run --rm --network host --name etcdctl --env ETCDCTL_API=3 --volume /opt/tectonic/tls:/opt/tectonic/tls:ro,z quay.io/coreos/etcd:v3.2.14 /usr/local/bin/etcdctl --dial-timeout...

Sep 12 04:51:02 ip-10-0-6-58 bash[962]: [31B blob data]
Sep 12 04:51:02 ip-10-0-6-58 bash[962]: Copying blob sha256:c15c14574a0bc94fb65cb906baae5debd103dd02991f3449adaa639441b7dde4
Sep 12 04:51:03 ip-10-0-6-58 bash[962]: [31B blob data]
Sep 12 04:51:03 ip-10-0-6-58 bash[962]: Skipping fetch of repeat blob sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
Sep 12 04:51:03 ip-10-0-6-58 bash[962]: Skipping fetch of repeat blob sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
Sep 12 04:51:03 ip-10-0-6-58 bash[962]: Writing manifest to image destination
Sep 12 04:51:03 ip-10-0-6-58 bash[962]: Storing signatures
Sep 12 05:01:04 ip-10-0-6-58 bash[962]: https://trking-18d26-etcd-0.coreservices.team.coreos.systems:2379 is unhealthy: failed to connect: dial tcp 10.0.9.212:2379: getsockopt: connection refused
Sep 12 05:01:04 ip-10-0-6-58 bash[962]: Error:  unhealthy cluster
Sep 12 05:01:04 ip-10-0-6-58 bash[962]: etcdctl failed. Retrying in 5 seconds...
$ sudo podman ps -a
CONTAINER ID   IMAGE                                                                                           COMMAND                  CREATED          STATUS                              PORTS   NAMES
ab5fe498da75   quay.io/coreos/etcd:v3.2.14                                                                     /usr/local/bin/etcd...   4 minutes ago    Up 4 minutes ago                            etcdctl
83d7fba588d5   quay.io/coreos/kube-etcd-signer-server:678cc8e6841e2121ebfdb6e2db568fce290b67d6                 kube-etcd-signer-se...   15 minutes ago   Up 15 minutes ago                           lucid_tesla
cdbffdb210ea   quay.io/coreos/tectonic-node-controller-operator-dev:0a24db2288f00b10ced358d9643debd601ffd0f1   /app/operator/node-...   15 minutes ago   Exited (0) Less than a second ago           trusting_morse
36af8121636c   quay.io/coreos/kube-core-renderer-dev:0a24db2288f00b10ced358d9643debd601ffd0f1                  /app/operator/kube-...   15 minutes ago   Exited (0) Less than a second ago           friendly_swanson
$ journalctl -n25
-- Logs begin at Wed 2018-09-12 04:49:14 UTC, end at Wed 2018-09-12 05:08:08 UTC. --
Sep 12 05:07:57 ip-10-0-6-58 systemd[1]: kubelet.service failed.
Sep 12 05:08:07 ip-10-0-6-58 systemd[1]: kubelet.service holdoff time over, scheduling restart.
Sep 12 05:08:07 ip-10-0-6-58 systemd[1]: Starting Kubernetes Kubelet...
Sep 12 05:08:07 ip-10-0-6-58 systemd[1]: Started Kubernetes Kubelet.
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: Flag --rotate-certificates has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: Flag --pod-manifest-path has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/do
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: Flag --allow-privileged has been deprecated, will be removed in a future version
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: Flag --minimum-container-ttl-duration has been deprecated, Use --eviction-hard or --eviction-soft instead. Will be removed in a future version.
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: Flag --cluster-dns has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tas
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: Flag --cluster-domain has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: Flag --client-ca-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: Flag --anonymous-auth has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/t
Sep 12 05:08:07 ip-10-0-6-58 systemd[1]: Started Kubernetes systemd probe.
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: I0912 05:08:07.892870    4122 server.go:418] Version: v1.11.0+d4cacc0
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: I0912 05:08:07.892979    4122 server.go:496] acquiring file lock on "/var/run/lock/kubelet.lock"
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: I0912 05:08:07.893006    4122 server.go:501] watching for inotify events for: /var/run/lock/kubelet.lock
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: I0912 05:08:07.893193    4122 aws.go:1032] Building AWS cloudprovider
Sep 12 05:08:07 ip-10-0-6-58 hyperkube[4122]: I0912 05:08:07.893219    4122 aws.go:994] Zone not specified in configuration file; querying AWS metadata service
Sep 12 05:08:07 ip-10-0-6-58 systemd[1]: Starting Kubernetes systemd probe.
Sep 12 05:08:08 ip-10-0-6-58 hyperkube[4122]: E0912 05:08:08.075211    4122 tags.go:94] Tag "KubernetesCluster" nor "kubernetes.io/cluster/..." not found; Kubernetes may behave unexpectedly.
Sep 12 05:08:08 ip-10-0-6-58 hyperkube[4122]: F0912 05:08:08.075258    4122 server.go:262] failed to run Kubelet: could not init cloud provider "aws": AWS cloud failed to find ClusterID
Sep 12 05:08:08 ip-10-0-6-58 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
Sep 12 05:08:08 ip-10-0-6-58 systemd[1]: Unit kubelet.service entered failed state.
Sep 12 05:08:08 ip-10-0-6-58 systemd[1]: kubelet.service failed.

So I'm still not clear on what's going on, but etcd is broken, our ignition-file server seems non-responsive and is keeping master-0 from booting, and the kubelet is thrashing around without an aws cloud provider and with a bunch of deprecated options. I still don't see how any of that is related to the changes in my PR :p.

abhinavdahiya (Contributor) commented:

Tag "KubernetesCluster" nor "kubernetes.io/cluster/..." not found; Kubernetes may behave unexpectedly.

This might be because we dropped a tag, #217 (comment)

@wking wking force-pushed the aws-bootstrap-module branch from 6d3370d to ef35007 Compare September 12, 2018 16:19

wking commented Sep 12, 2018

Tag "KubernetesCluster" nor "kubernetes.io/cluster/..." not found; Kubernetes may behave unexpectedly.

This might be because we dropped a tag, #217 (comment)

Ah, thanks :). I've pushed 6d3370d -> ef35007, rebasing onto master and restoring that tag to the instance (but, as I explain in the commit message, I'm still removing it from the volumes).
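
Concretely, the restored layout is presumably along these lines (trimmed to the tag handling; the other arguments are illustrative):

resource "aws_instance" "bootstrap" {
  ami           = "${var.ami}"
  instance_type = "${var.instance_type}"

  # The AWS cloud provider only looks for the cluster tag on the
  # instance itself...
  tags = "${merge(map(
    "Name", "${var.tectonic_cluster_name}-bootstrap",
    "kubernetes.io/cluster/${var.tectonic_cluster_name}", "owned",
  ), var.tags)}"

  # ...so the volumes carry just the caller-supplied tags.
  volume_tags = "${var.tags}"
}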

wking commented Sep 12, 2018

The smoke error was:

Waiting for API at https://ci-op-hlzw4yd1-3e1a1-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Waiting for API at https://ci-op-hlzw4yd1-3e1a1-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Interrupted
2018/09/12 18:44:23 Container setup in pod e2e-aws-smoke failed, exit code 1, reason Error

But I can't reproduce when I launch a cluster locally, so maybe it's just a flake.

/retest

The commit message for 8a37f72 (modules/aws/bootstrap: Pull AWS bootstrap setup into a module, 2018-09-05):

This will make it easier to move into the existing infra step.

The module source syntax used in the README is documented in [1,2,3],
and means "the modules/aws/ami subdirectory of the
github.com/openshift/installer repository cloned over HTTPS", etc.
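
A minimal sketch of that source form, with an illustrative input:

  module "ami" {
    # "//" separates the repository from the subdirectory: Terraform
    # clones github.com/openshift/installer over HTTPS and then uses
    # its modules/aws/ami directory as the module root.
    source = "github.com/openshift/installer//modules/aws/ami"

    region = "${var.region}"
  }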

I don't think I should need the wrapping brackets in:

  vpc_security_group_ids = ["${var.vpc_security_group_ids}"]

but without it I get [4]:

  Error: module.bootstrap.aws_instance.bootstrap: vpc_security_group_ids: should be a list

The explicit brackets match our approach in the master and worker
modules though, so they shouldn't break anything.  It sounds like
Terraform still has a few problems with remembering type information
[5], and that may be what's going on here.

I've simplified the tagging a bit, keeping the extra tags unification
outside the module.  I tried dropping the kubernetes.io/cluster/ tag
completely, but it led to [6]:

  Sep 12 05:08:08 ip-10-0-6-58 hyperkube[4122]: E0912 05:08:08.075211    4122 tags.go:94] Tag "KubernetesCluster" nor "kubernetes.io/cluster/..." not found; Kubernetes may behave unexpectedly.
  Sep 12 05:08:08 ip-10-0-6-58 hyperkube[4122]: F0912 05:08:08.075258    4122 server.go:262] failed to run Kubelet: could not init cloud provider "aws": AWS cloud failed to find ClusterID

The backing code for that is [7,8,9].  From [9], you can see that only
the tag on the instance matters, so I've dropped
kubernetes.io/cluster/... from volume_tags.  Going forward, we may
move to configuring this directly instead of relying on the tag-based
initialization.

[1]: https://www.terraform.io/docs/configuration/modules.html#source
[2]: https://www.terraform.io/docs/modules/sources.html#github
[3]: https://www.terraform.io/docs/modules/sources.html#modules-in-package-sub-directories
[4]: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/217/pull-ci-openshift-installer-e2e-aws/47/build-log.txt
[5]: hashicorp/terraform#16916 (comment)
[6]: openshift#217
[7]: https://github.com/kubernetes/kubernetes/blob/v1.11.3/pkg/cloudprovider/providers/aws/tags.go#L30-L34
[8]: https://github.com/kubernetes/kubernetes/blob/v1.11.3/pkg/cloudprovider/providers/aws/tags.go#L100-L126
[9]: https://github.com/kubernetes/kubernetes/blob/v1.11.3/pkg/cloudprovider/providers/aws/aws.go#L1126-L1132
@wking wking force-pushed the aws-bootstrap-module branch from ef35007 to 8a37f72 Compare September 12, 2018 22:32

crawford (Contributor):

Try rebasing on master again. #244 should help with the flakes.

crawford (Contributor):

Actually, I guess tide is smart enough to merge this onto master before testing.

/retest

wking commented Sep 13, 2018

/retest

crawford (Contributor):

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 13, 2018

openshift-ci-robot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: crawford, wking

@openshift-merge-robot openshift-merge-robot merged commit dfd9ff9 into openshift:master Sep 13, 2018
@wking wking deleted the aws-bootstrap-module branch September 13, 2018 17:27
wking added a commit to wking/openshift-installer that referenced this pull request Dec 13, 2018
As suggested by Stephen Cuppett, this allows registry <-> S3 transfers
to bypass the (NAT) gateways.  Traffic over the NAT gateways costs
money, so the new endpoint should make S3 access from the cluster
cheaper (and possibly more reliable).  This also allows for additional
security policy flexibility, although I'm not taking advantage of that
in this commit.  Docs for VPC endpoints are in [1,2,3,4].

Endpoints do not currently support cross-region requests [1].  And
based on discussion with Stephen, adding an endpoint may *break*
access to S3 on other regions.  But I can't find docs to back that up,
and [3] has:

  We use the most specific route that matches the traffic to determine
  how to route the traffic (longest prefix match).  If you have an
  existing route in your route table for all internet traffic
  (0.0.0.0/0) that points to an internet gateway, the endpoint route
  takes precedence for all traffic destined for the service, because
  the IP address range for the service is more specific than
  0.0.0.0/0.  All other internet traffic goes to your internet
  gateway, including traffic that's destined for the service in other
  regions.

which suggests that access to S3 on other regions may be unaffected.
In any case, our registry buckets, and likely any other buckets
associated with the cluster, will be living in the same region.

concat is documented in [5].  The wrapping brackets avoid [6]:

  level=error msg="Error: module.vpc.aws_vpc_endpoint.s3: route_table_ids: should be a list"

although I think that's a Terraform bug.  See also 8a37f72
(modules/aws/bootstrap: Pull AWS bootstrap setup into a module,
2018-09-05, openshift#217), which talks about this same issue.
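
A rough sketch of the endpoint resource this adds (the VPC and
route-table resource names are illustrative, not necessarily the ones
in the VPC module):

  resource "aws_vpc_endpoint" "s3" {
    vpc_id       = "${aws_vpc.new_vpc.id}"
    service_name = "com.amazonaws.${var.region}.s3"

    # The brackets around concat() dodge the same "should be a list"
    # type loss described above.
    route_table_ids = ["${concat(aws_route_table.private_routes.*.id, list(aws_route_table.default.id))}"]
  }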

[1]: https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html
[2]: https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints.html
[3]: https://docs.aws.amazon.com/vpc/latest/userguide/vpce-gateway.html
[4]: https://www.terraform.io/docs/providers/aws/r/vpc_endpoint.html
[5]: https://www.terraform.io/docs/configuration/interpolation.html#concat-list1-list2-
[6]: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/745/pull-ci-openshift-installer-master-e2e-aws/1673/build-log.txt
wking added a commit to wking/openshift-installer that referenced this pull request Jan 11, 2019
Centralize extra-tag inclusion on aws/main.tf.  This reduces the
number of places we need to think about what tags should be ;).

Also keep kubernetes.io/cluster/{name} localized in the aws module.
See 8a37f72 (modules/aws/bootstrap: Pull AWS bootstrap setup into a
module, 2018-09-05, openshift#217) for why we need to keep it on the bootstrap
instance.  But the bootstrap resources will be removed after the
bootstrap-complete event comes through, and we don't want Kubernetes
controllers trying to pick them up.

This commit updates the internal Route 53 zone from KubernetesCluster
to kubernetes.io/cluster/{name}: owned, catching it up to
kubernetes/kubernetes@0b5ae539 (AWS: Support shared tag, 2017-02-18,
kubernetes/kubernetes#41695).  That tag originally landed on the zone
back in 75fb49a (platforms/aws: apply tags to internal route53 zone,
2017-05-02, coreos/tectonic-installer#465).
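
So the internal zone presumably ends up shaped roughly like this
(variable and resource names are illustrative):

  resource "aws_route53_zone" "int" {
    name   = "${var.base_domain}"
    vpc_id = "${aws_vpc.new_vpc.id}"

    # Shared-tag format from kubernetes/kubernetes#41695, replacing the
    # legacy KubernetesCluster tag.
    tags = "${merge(map(
      "kubernetes.io/cluster/${var.cluster_name}", "owned",
    ), var.extra_tags)}"
  }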

Only the master instances need the clusterid tag, as described in
6c7a5f0 (Tag master machines for adoption by machine controller,
2018-10-17, openshift#479).

A number of VPC resources have moved from "shared" to "owned".  The
shared values are from 45dfc2b (modules/aws,azure: use the new tag
format for k8s 1.6, 2017-05-04, coreos/tectonic-installer#469).  The
commit message doesn't have much to say for motivation, but Brad Ison
said [1]:

  I'm not really sure if anything in Kubernetes actually uses the
  owned vs. shared values at the moment, but in any case, it might
  make more sense to mark subnets as shared.  That was actually one of
  the main use cases for moving to this style of tagging -- being able
  to share subnets between clusters.

But we aren't sharing these resources; see 6f55e67 (terraform/aws:
remove option to use an existing vpc in aws, 2018-11-11, openshift#654).

[1]: coreos/tectonic-installer#469 (comment)
wking added a commit to wking/openshift-installer that referenced this pull request Feb 28, 2019
…-release:4.0.0-0.6

Clayton pushed 4.0.0-0.nightly-2019-02-27-213933 to
quay.io/openshift-release-dev/ocp-release:4.0.0-0.6.  Extracting the
associated RHCOS build:

  $ oc adm release info --pullspecs quay.io/openshift-release-dev/ocp-release:4.0.0-0.6 | grep machine-os-content
    machine-os-content                            registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-02-27-213933@sha256:1262533e31a427917f94babeef2774c98373409897863ae742ff04120f32f79b
  $ oc image info registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-02-26-125216@sha256:1262533e31a427917f94babeef2774c98373409897863ae742ff04120f32f79b | grep version
              version=47.330

that's the same machine-os-content image referenced from 4.0.0-0.5,
which we used for installer v0.13.0.

Renaming OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE gets us CI testing
of the pinned release despite openshift/release@60007df2 (Use
RELEASE_IMAGE_LATEST for CVO payload, 2018-10-03,
openshift/release#1793).

Also comment out regions which this particular RHCOS build wasn't
pushed to, leaving only:

  $ curl -s https://releases-rhcos.svc.ci.openshift.org/storage/releases/maipo/47.330/meta.json | jq -r '.amis[] | .name'
  ap-northeast-1
  ap-northeast-2
  ap-south-1
  ap-southeast-1
  ap-southeast-2
  ca-central-1
  eu-central-1
  eu-west-1
  eu-west-2
  eu-west-3
  sa-east-1
  us-east-1
  us-east-2
  us-west-1
  us-west-2

I'd initially expected to export the pinning environment variables in
release.sh, but I've put them in build.sh here because our continuous
integration tests use build.sh directly and don't go through
release.sh.

Using the slick, new change-log generator from [1], here's everything
that changed in the update payload:

  $ oc adm release info --changelog ~/.local/lib/go/src --changes-from quay.io/openshift-release-dev/ocp-release:4.0.0-0.5 quay.io/openshift-release-dev/ocp-release:4.0.0-0.6
  # 4.0.0-0.6

  Created: 2019-02-28 20:40:11 +0000 UTC
  Image Digest: `sha256:5ce3d05da3bfa3d0310684f5ac53d98d66a904d25f2e55c2442705b628560962`
  Promoted from registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-02-27-213933

  ## Changes from 4.0.0-0.5

  ### Components

  * Kubernetes 1.12.4

  ### New images

  * [pod](https://github.com/openshift/images) git [2f60da39](openshift/images@2f60da3) `sha256:c0d602467dfe0299ce577ba568a9ef5fb9b0864bac6455604258e7f5986d3509`

  ### Rebuilt images without code change

  * [cloud-credential-operator](https://github.com/openshift/cloud-credential-operator) git [01bbf372](openshift/cloud-credential-operator@01bbf37) `sha256:f87be09923a5cb081722634d2e0c3d0a5633ea2c23da651398d4e915ad9f73b0`
  * [cluster-autoscaler](https://github.com/openshift/kubernetes-autoscaler) git [d8a4a304](openshift/kubernetes-autoscaler@d8a4a30) `sha256:955413b82cf8054ce149bc05c18297a8abe9c59f9d0034989f08086ae6c71fa6`
  * [cluster-autoscaler-operator](https://github.com/openshift/cluster-autoscaler-operator) git [73c46659](openshift/cluster-autoscaler-operator@73c4665) `sha256:756e813fce04841993c8060d08a5684c173cbfb61a090ae67cb1558d76a0336e`
  * [cluster-bootstrap](https://github.com/openshift/cluster-bootstrap) git [05a5c8e6](openshift/cluster-bootstrap@05a5c8e) `sha256:dbdd90da7d256e8d49e4e21cb0bdef618c79d83f539049f89f3e3af5dbc77e0f`
  * [cluster-config-operator](https://github.com/openshift/cluster-config-operator) git [aa1805e7](openshift/cluster-config-operator@aa1805e) `sha256:773d3355e6365237501d4eb70d58cd0633feb541d4b6f23d6a5f7b41fd6ad2f5`
  * [cluster-dns-operator](https://github.com/openshift/cluster-dns-operator) git [ffb04ae9](openshift/cluster-dns-operator@ffb04ae) `sha256:ca15f98cc1f61440f87950773329e1fdf58e73e591638f18c43384ad4f8f84da`
  * [cluster-machine-approver](https://github.com/openshift/cluster-machine-approver) git [2fbc6a6b](openshift/cluster-machine-approver@2fbc6a6) `sha256:a66af3b1f4ae98257ab600d54f8c94f3a4136f85863bbe0fa7c5dba65c5aea46`
  * [cluster-node-tuned](https://github.com/openshift/openshift-tuned) git [278ee72d](openshift/openshift-tuned@278ee72) `sha256:ad71743cc50a6f07eba013b496beab9ec817603b07fd3f5c022fffbf400e4f4b`
  * [cluster-node-tuning-operator](https://github.com/openshift/cluster-node-tuning-operator) git [b5c14deb](openshift/cluster-node-tuning-operator@b5c14de) `sha256:e61d1fdb7ad9f5fed870e917a1bc8fac9ccede6e4426d31678876bcb5896b000`
  * [cluster-openshift-controller-manager-operator](https://github.com/openshift/cluster-openshift-controller-manager-operator) git [3f79b51b](openshift/cluster-openshift-controller-manager-operator@3f79b51) `sha256:8f3b40b4dd29186975c900e41b1a94ce511478eeea653b89a065257a62bf3ae9`
  * [cluster-svcat-apiserver-operator](https://github.com/openshift/cluster-svcat-apiserver-operator) git [547648cb](openshift/cluster-svcat-apiserver-operator@547648c) `sha256:e7c9323b91dbb11e044d5a1277d1e29d106d92627a6c32bd0368616e0bcf631a`
  * [cluster-svcat-controller-manager-operator](https://github.com/openshift/cluster-svcat-controller-manager-operator) git [9261f420](openshift/cluster-svcat-controller-manager-operator@9261f42) `sha256:097a429eda2306fcd49e14e4f5db8ec3a09a90fa29ebdbc98cc519511ab6fb5b`
  * [cluster-version-operator](https://github.com/openshift/cluster-version-operator) git [70c0232e](openshift/cluster-version-operator@70c0232) `sha256:7d59edff68300e13f0b9e56d2f2bc1af7f0051a9fbc76cc208239137ac10f782`
  * [configmap-reloader](https://github.com/openshift/configmap-reload) git [3c2f8572](openshift/configmap-reload@3c2f857) `sha256:32360c79d8d8d54cea03675c24f9d0a69877a2f2e16b949ca1d97440b8f45220`
  * [console-operator](https://github.com/openshift/console-operator) git [32ed7c03](openshift/console-operator@32ed7c0) `sha256:f8c07cb72dc8aa931bbfabca9b4133f3b93bc96da59e95110ceb8c64f3efc755`
  * [container-networking-plugins-supported](https://github.com/openshift/ose-containernetworking-plugins) git [f6a58dce](openshift/ose-containernetworking-plugins@f6a58dc) `sha256:c6434441fa9cc96428385574578c41e9bc833b6db9557df1dd627411d9372bf4`
  * [container-networking-plugins-unsupported](https://github.com/openshift/ose-containernetworking-plugins) git [f6a58dce](openshift/ose-containernetworking-plugins@f6a58dc) `sha256:bb589cf71d4f41977ec329cf808cdb956d5eedfc604e36b98cfd0bacce513ffc`
  * [coredns](https://github.com/openshift/coredns) git [fbcb8252](openshift/coredns@fbcb825) `sha256:2f1812a95e153a40ce607de9b3ace7cae5bee67467a44a64672dac54e47f2a66`
  * [docker-builder](https://github.com/openshift/builder) git [1a77d837](openshift/builder@1a77d83) `sha256:27062ab2c62869e5ffeca234e97863334633241089a5d822a19350f16945fbcb`
  * [etcd](https://github.com/openshift/etcd) git [a0e62b48](openshift/etcd@a0e62b4) `sha256:e4e9677d004f8f93d4f084739b4502c2957c6620d633e1fdb379c33243c684fa`
  * [grafana](https://github.com/openshift/grafana) git [58efe0eb](openshift/grafana@58efe0e) `sha256:548abcc50ccb8bb17e6be2baf050062a60fc5ea0ca5d6c59ebcb8286fc9eb043`
  * [haproxy-router](https://github.com/openshift/router) git [2c33f47f](openshift/router@2c33f47) `sha256:c899b557e4ee2ea7fdbe5c37b5f4f6e9f9748a39119130fa930d9497464bd957`
  * [k8s-prometheus-adapter](https://github.com/openshift/k8s-prometheus-adapter) git [815fa76b](openshift/k8s-prometheus-adapter@815fa76) `sha256:772c1b40b21ccaa9ffcb5556a1228578526a141b230e8ac0afe19f14404fdffc`
  * [kube-rbac-proxy](https://github.com/openshift/kube-rbac-proxy) git [3f271e09](openshift/kube-rbac-proxy@3f271e0) `sha256:b6de05167ecab0472279cdc430105fac4b97fb2c43d854e1c1aa470d20a36572`
  * [kube-state-metrics](https://github.com/openshift/kube-state-metrics) git [2ab51c9f](openshift/kube-state-metrics@2ab51c9) `sha256:611c800c052de692c84d89da504d9f386d3dcab59cbbcaf6a26023756bc863a0`
  * [libvirt-machine-controllers](https://github.com/openshift/cluster-api-provider-libvirt) git [7ff8b08f](openshift/cluster-api-provider-libvirt@7ff8b08) `sha256:6ab8749886ec26d45853c0e7ade3c1faaf6b36e09ba2b8a55f66c6cc25052832`
  * [multus-cni](https://github.com/openshift/ose-multus-cni) git [61f9e088](https://github.com/openshift/ose-multus-cni/commit/61f9e0886370ea5f6093ed61d4cfefc6dadef582) `sha256:e3f87811d22751e7f06863e7a1407652af781e32e614c8535f63d744e923ea5c`
  * [oauth-proxy](https://github.com/openshift/oauth-proxy) git [b771960b](openshift/oauth-proxy@b771960) `sha256:093a2ac687849e91671ce906054685a4c193dfbed27ebb977302f2e09ad856dc`
  * [openstack-machine-controllers](https://github.com/openshift/cluster-api-provider-openstack) git [c2d845ba](openshift/cluster-api-provider-openstack@c2d845b) `sha256:f9c321de068d977d5b4adf8f697c5b15f870ccf24ad3e19989b129e744a352a7`
  * [operator-registry](https://github.com/operator-framework/operator-registry) git [0531400c](operator-framework/operator-registry@0531400) `sha256:730f3b504cccf07e72282caf60dc12f4e7655d7aacf0374d710c3f27125f7008`
  * [prom-label-proxy](https://github.com/openshift/prom-label-proxy) git [46423f9d](openshift/prom-label-proxy@46423f9) `sha256:3235ad5e22b6f560d447266e0ecb2e5655fda7c0ab5c1021d8d3a4202f04d2ca`
  * [prometheus](https://github.com/openshift/prometheus) git [6e5fb5dc](openshift/prometheus@6e5fb5d) `sha256:013455905e4a6313f8c471ba5f99962ec097a9cecee3e22bdff3e87061efad57`
  * [prometheus-alertmanager](https://github.com/openshift/prometheus-alertmanager) git [4617d550](openshift/prometheus-alertmanager@4617d55) `sha256:54512a6cf25cf3baf7fed0b01a1d4786d952d93f662578398cad0d06c9e4e951`
  * [prometheus-config-reloader](https://github.com/openshift/prometheus-operator) git [f8a0aa17](openshift/prometheus-operator@f8a0aa1) `sha256:244fc5f1a4a0aa983067331c762a04a6939407b4396ae0e86a1dd1519e42bb5d`
  * [prometheus-node-exporter](https://github.com/openshift/node_exporter) git [f248b582](openshift/node_exporter@f248b58) `sha256:390e5e1b3f3c401a0fea307d6f9295c7ff7d23b4b27fa0eb8f4017bd86d7252c`
  * [prometheus-operator](https://github.com/openshift/prometheus-operator) git [f8a0aa17](openshift/prometheus-operator@f8a0aa1) `sha256:6e697dcaa19e03bded1edf5770fb19c0d2cd8739885e79723e898824ce3cd8f5`
  * [service-catalog](https://github.com/openshift/service-catalog) git [b24ffd6f](openshift/service-catalog@b24ffd6) `sha256:85ea2924810ced0a66d414adb63445a90d61ab5318808859790b1d4b7decfea6`
  * [service-serving-cert-signer](https://github.com/openshift/service-serving-cert-signer) git [30924216](openshift/service-serving-cert-signer@3092421) `sha256:7f89db559ffbd3bf609489e228f959a032d68dd78ae083be72c9048ef0c35064`
  * [telemeter](https://github.com/openshift/telemeter) git [e12aabe4](openshift/telemeter@e12aabe) `sha256:fd518d2c056d4ab8a89d80888e0a96445be41f747bfc5f93aa51c7177cf92b92`

  ### [aws-machine-controllers](https://github.com/openshift/cluster-api-provider-aws)

  * client: add cluster-api-provider-aws to UserAgent for AWS API calls [openshift#167](openshift/cluster-api-provider-aws#167)
  * Drop the yaml unmarshalling [openshift#155](openshift/cluster-api-provider-aws#155)
  * [Full changelog](openshift/cluster-api-provider-aws@46f4852...c0c3b9e)

  ### [cli, deployer, hyperkube, hypershift, node, tests](https://github.com/openshift/ose)

  * Build OSTree using baked SELinux policy [#22081](https://github.com/openshift/ose/pull/22081)
  * NodeName was being cleared for `oc debug node/X` instead of set [#22086](https://github.com/openshift/ose/pull/22086)
  * UPSTREAM: 73894: Print the involved object in the event table [#22039](https://github.com/openshift/ose/pull/22039)
  * Publish CRD openapi [#22045](https://github.com/openshift/ose/pull/22045)
  * UPSTREAM: 00000: wait for CRD discovery to be successful once before [#22149](https://github.com/openshift/ose/pull/22149)
  * `oc adm release info --changelog` should clone if necessary [#22148](https://github.com/openshift/ose/pull/22148)
  * [Full changelog](openshift/ose@c547bc3...0cbcfc5)

  ### [cluster-authentication-operator](https://github.com/openshift/cluster-authentication-operator)

  * Add redeploy on serving cert and operator pod template change [openshift#75](openshift/cluster-authentication-operator#75)
  * Create the service before waiting for serving certs [openshift#84](openshift/cluster-authentication-operator#84)
  * [Full changelog](openshift/cluster-authentication-operator@78dd53b...35879ec)

  ### [cluster-image-registry-operator](https://github.com/openshift/cluster-image-registry-operator)

  * Enable subresource status [openshift#209](openshift/cluster-image-registry-operator#209)
  * Add ReadOnly flag [openshift#210](openshift/cluster-image-registry-operator#210)
  * do not setup ownerrefs for clusterscoped/cross-namespace objects [openshift#215](openshift/cluster-image-registry-operator#215)
  * s3: include operator version in UserAgent for AWS API calls [openshift#212](openshift/cluster-image-registry-operator#212)
  * [Full changelog](openshift/cluster-image-registry-operator@0780074...8060048)

  ### [cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator)

  * Adds info log msg indicating ns/secret used by DNSManager [openshift#134](openshift/cluster-ingress-operator#134)
  * Introduce certificate controller [openshift#140](openshift/cluster-ingress-operator#140)
  * [Full changelog](openshift/cluster-ingress-operator@1b4fa5a...09d14db)

  ### [cluster-kube-apiserver-operator](https://github.com/openshift/cluster-kube-apiserver-operator)

  * bump(*): fix installer pod shutdown and rolebinding [openshift#307](openshift/cluster-kube-apiserver-operator#307)
  * bump to fix early status [openshift#309](openshift/cluster-kube-apiserver-operator#309)
  * [Full changelog](openshift/cluster-kube-apiserver-operator@4016927...fa75c05)

  ### [cluster-kube-controller-manager-operator](https://github.com/openshift/cluster-kube-controller-manager-operator)

  * bump(*): fix installer pod shutdown and rolebinding [openshift#183](openshift/cluster-kube-controller-manager-operator#183)
  * bump to fix empty status [openshift#184](openshift/cluster-kube-controller-manager-operator#184)
  * [Full changelog](openshift/cluster-kube-controller-manager-operator@95f5f32...53ff6d8)

  ### [cluster-kube-scheduler-operator](https://github.com/openshift/cluster-kube-scheduler-operator)

  * Rotate kubeconfig [openshift#62](openshift/cluster-kube-scheduler-operator#62)
  * Don't pass nil function pointer to NewConfigObserver [openshift#65](openshift/cluster-kube-scheduler-operator#65)
  * [Full changelog](openshift/cluster-kube-scheduler-operator@50848b4...7066c96)

  ### [cluster-monitoring-operator](https://github.com/openshift/cluster-monitoring-operator)

  * *: Clean test invocation and documenation [openshift#267](openshift/cluster-monitoring-operator#267)
  * pkg/operator: fix progressing state of cluster operator [openshift#268](openshift/cluster-monitoring-operator#268)
  * jsonnet/main.jsonnet: Bump Prometheus to v2.7.1 [openshift#246](openshift/cluster-monitoring-operator#246)
  * OWNERS: Remove ironcladlou [openshift#204](openshift/cluster-monitoring-operator#204)
  * test/e2e: Refactor framework setup & wait for query logic [openshift#265](openshift/cluster-monitoring-operator#265)
  * jsonnet: Update dependencies [openshift#269](openshift/cluster-monitoring-operator#269)
  * [Full changelog](openshift/cluster-monitoring-operator@94b701f...3609aea)

  ### [cluster-network-operator](https://github.com/openshift/cluster-network-operator)

  * Update to be able to track both DaemonSets and Deployments [openshift#102](openshift/cluster-network-operator#102)
  * openshift-sdn: more service-catalog netnamespace fixes [openshift#108](openshift/cluster-network-operator#108)
  * [Full changelog](openshift/cluster-network-operator@9db4d03...15204e6)

  ### [cluster-openshift-apiserver-operator](https://github.com/openshift/cluster-openshift-apiserver-operator)

  * bump to fix status reporting [openshift#157](openshift/cluster-openshift-apiserver-operator#157)
  * [Full changelog](openshift/cluster-openshift-apiserver-operator@1ce6ac7...0a65fe4)

  ### [cluster-samples-operator](https://github.com/openshift/cluster-samples-operator)

  * use pumped up rate limiter, shave 30 seconds from startup creates [openshift#113](openshift/cluster-samples-operator#113)
  * [Full changelog](openshift/cluster-samples-operator@4726068...f001324)

  ### [cluster-storage-operator](https://github.com/openshift/cluster-storage-operator)

  * WaitForFirstConsumer in AWS StorageClass [openshift#12](openshift/cluster-storage-operator#12)
  * [Full changelog](openshift/cluster-storage-operator@dc42489...b850242)

  ### [console](https://github.com/openshift/console)

  * Add back OAuth configuration link in kubeadmin notifier [openshift#1202](openshift/console#1202)
  * Normalize display of <ResourceIcon> across browsers, platforms [openshift#1210](openshift/console#1210)
  * Add margin spacing so event info doesn't run together before truncating [openshift#1170](openshift/console#1170)
  * [Full changelog](openshift/console@a0b75bc...d10fb8b)

  ### [docker-registry](https://github.com/openshift/image-registry)

  * Bump k8s and OpenShift, use new docker-distribution branch [openshift#165](openshift/image-registry#165)
  * [Full changelog](openshift/image-registry@75a1fbe...afcc7da)

  ### [installer](https://github.com/openshift/installer)

  * data: route53 A records with SimplePolicy should not use health check [openshift#1308](openshift#1308)
  * bootkube.sh: do not hide problems with render [openshift#1274](openshift#1274)
  * data/bootstrap/files/usr/local/bin/bootkube: etcdctl from release image [openshift#1315](openshift#1315)
  * pkg/types/validation: Drop v1beta1 backwards compat hack [openshift#1251](openshift#1251)
  * pkg/asset/tls: self-sign etcd-client-ca [openshift#1267](openshift#1267)
  * pkg/asset/tls: self-sign aggregator-ca [openshift#1275](openshift#1275)
  * pkg/types/validation/installconfig: Drop nominal v1beta2 support [openshift#1319](openshift#1319)
  * Removing unused/deprecated security groups and ports. Updated AWS doc [openshift#1306](openshift#1306)
  * [Full changelog](openshift/installer@0208204...563f71f)

  ### [jenkins, jenkins-agent-maven, jenkins-agent-nodejs](https://github.com/openshift/jenkins)

  * recover from jenkins deps backleveling workflow-durable-task-step fro… [openshift#806](openshift/jenkins#806)
  * [Full changelog](openshift/jenkins@2485f9a...e4583ca)

  ### [machine-api-operator](https://github.com/openshift/machine-api-operator)

  * Rename labels from sigs.k8s.io to machine.openshift.io [openshift#213](openshift/machine-api-operator#213)
  * Remove clusters.cluster.k8s.io CRD [openshift#225](openshift/machine-api-operator#225)
  * MAO: Stop setting statusProgressing=true when resyincing same version [openshift#217](openshift/machine-api-operator#217)
  * Generate clientset for machine health check API [openshift#223](openshift/machine-api-operator#223)
  * [Full changelog](openshift/machine-api-operator@bf95d7d...34c3424)

  ### [machine-config-controller, machine-config-daemon, machine-config-operator, machine-config-server, setup-etcd-environment](https://github.com/openshift/machine-config-operator)

  * daemon: Only print status if os == RHCOS [openshift#495](openshift/machine-config-operator#495)
  * Add pod image to image-references [openshift#500](openshift/machine-config-operator#500)
  * pkg/daemon: stash the node object [openshift#464](openshift/machine-config-operator#464)
  * Eliminate use of cpu limits [openshift#503](openshift/machine-config-operator#503)
  * MCD: add ign validation check for mc.ignconfig [openshift#481](openshift/machine-config-operator#481)
  * [Full changelog](openshift/machine-config-operator@875f25e...f0b87fc)

  ### [operator-lifecycle-manager](https://github.com/operator-framework/operator-lifecycle-manager)

  * fix(owners): remove cross-namespace and cluster->namespace ownerrefs [openshift#729](operator-framework/operator-lifecycle-manager#729)
  * [Full changelog](operator-framework/operator-lifecycle-manager@1ac9ace...9186781)

  ### [operator-marketplace](https://github.com/operator-framework/operator-marketplace)

  * [opsrc] Do not delete csc during purge [openshift#117](operator-framework/operator-marketplace#117)
  * Remove Dependency on Owner References [openshift#118](operator-framework/operator-marketplace#118)
  * [Full changelog](operator-framework/operator-marketplace@7b53305...fedd694)

[1]: openshift/origin#22030