This repository has been archived by the owner on Mar 5, 2024. It is now read-only.

Server requests credentials, but they are not assumed by pod. #127

Closed
coryodaniel opened this issue Jul 24, 2018 · 25 comments

Comments


coryodaniel commented Jul 24, 2018

I have kiam configured and I can see that the server is requesting new credentials when a pod is launched, but the pod still appears to have the role of the k8s node.

Logs from the kiam server:

{"level":"info","msg":"starting server","time":"2018-07-24T16:59:35Z"}
{"level":"info","msg":"started prometheus metric listener 0.0.0.0:9620","time":"2018-07-24T16:59:35Z"}
{"level":"info","msg":"detecting arn prefix","time":"2018-07-24T16:59:35Z"}
{"level":"info","msg":"using detected prefix: arn:aws:iam::ACCOUNT_HERE:role/","time":"2018-07-24T16:59:35Z"}
{"level":"info","msg":"will serve on 0.0.0.0:443","time":"2018-07-24T16:59:35Z"}
{"level":"info","msg":"starting credential manager process 0","time":"2018-07-24T16:59:35Z"}
{"level":"info","msg":"started cache controller","time":"2018-07-24T16:59:35Z"}
{"level":"info","msg":"started namespace cache controller","time":"2018-07-24T16:59:36Z"}
{"credentials.access.key":"KEY_HERE","credentials.expiration":"2018-07-24T17:14:38Z","credentials.role":"kiam-test","level":"info","msg":"requested new credentials","time":"2018-07-24T16:59:38Z"}
{"credentials.access.key":"KEY_HERE","credentials.expiration":"2018-07-24T17:14:38Z","credentials.role":"kiam-test","generation.metadata":0,"level":"info","msg":"fetched credentials","pod.iam.role":"kiam-test","pod.name":"kiam-verifier-31879","pod.namespace":"playground","pod.status.ip":"","pod.status.phase":"Pending","resource.version":"7868839","time":"2018-07-24T16:59:38Z"}

I am launching a pod that has a single container which has the aws cli tools installed.

export KIAM_POD=kiam-verifier-$RANDOM
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: ${KIAM_POD}
  namespace: playground
  annotations:
    iam.amazonaws.com/role: kiam-test
spec:
  containers:
  - name: kiam-verifier
    image: mesosphere/aws-cli
    command: ["sleep", "10000"]
  restartPolicy: Never
EOF

Then, from the pod, if I run aws iam get-user or aws sts get-caller-identity, it appears that it is still running with the node's role.

~ # aws iam get-user

An error occurred (AccessDenied) when calling the GetUser operation: User: arn:aws:sts::ACCOUNT_HERE:assumed-role/nodes.us1.k8s.local/i-ID_HERE is not authorized to perform: iam:GetUser on resource: user ID_HERE
~ # aws sts get-caller-identity
{
    "Account": "ACOUNT_HERE", 
    "UserId": "KEY_HERE:i-ID_HERE", 
    "Arn": "arn:aws:sts::ACOUNT_HERE:assumed-role/nodes.us1.k8s.local/i-ID_HERE"
}
~ # 

The namespace I am running in is annotated with:

kind: Namespace
apiVersion: v1
metadata:
  name: playground
  annotations:
    iam.amazonaws.com/permitted: ".*"
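
For reference, a quick way to confirm both annotations actually landed (the names here just match the manifests above):

kubectl get namespace playground -o jsonpath='{.metadata.annotations}'
# should include iam.amazonaws.com/permitted: .*
kubectl -n playground get pod ${KIAM_POD} -o jsonpath='{.metadata.annotations}'
# should include iam.amazonaws.com/role: kiam-test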

I don't see any error messages in the agent or server logs.

@coryodaniel coryodaniel changed the title Credentials are being requested by server, but not being assumed by application How to confirm that a role is being assumed? Jul 24, 2018

coryodaniel commented Jul 24, 2018

The reason I started investigating the above was that I had a simple policy for accessing an S3 bucket and requests were being denied.

With kube2iam I was able to query the metadata API to verify the permissions:

wget -qO- http://169.254.169.254/latest/meta-data/iam/security-credentials/my-role

But with Kiam it seems to return 500 and 308 messages depending on the particular API call.
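
For reference, the two paths kiam's agent should be serving look like this (the role name comes from the pod annotation above; comparing with and without the trailing slash is my guess at what produces the 308):

# list the role kiam advertises for this pod
wget -qO- http://169.254.169.254/latest/meta-data/iam/security-credentials/
# fetch the temporary credentials for that role
wget -qO- http://169.254.169.254/latest/meta-data/iam/security-credentials/kiam-test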

@coryodaniel coryodaniel changed the title How to confirm that a role is being assumed? Server requests credentials, but they are not assumed by pod. Jul 24, 2018
@coryodaniel

Actually upon making calls to the metadata API I am seeing error messages:

error processing request: error fetching credentials: rpc error: code = Canceled desc = context canceled


pingles commented Jul 30, 2018

@coryodaniel This looks like a transport error- I suspect there are warnings that are being dropped by your log level?

#94 (comment) shows an example of adding some environment variables to cause the gRPC lib to output more information.

My guess is that it'll be a TLS handshaking error (mismatching hosts/alternate names etc.).
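
A rough way to check is to compare the host in the agent's --server-address flag against the names in the server certificate. A sketch, assuming the cert lives in a secret named kiam-server-tls with a tls.crt key (adjust to however your certs are stored):

kubectl get secret kiam-server-tls -o jsonpath='{.data.tls\.crt}' \
  | base64 -d \
  | openssl x509 -noout -text \
  | grep -A1 'Subject Alternative Name'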

@jamiebuxxx

I'm experiencing the same issue. I have enabled debug logging and added the gRPC environment variables.

GRPC_GO_LOG_SEVERITY_LEVEL=info GRPC_GO_LOG_VERBOSITY_LEVEL=8

I'm seeing no errors in either the server or agent logs, but I am seeing the following warnings.

WARNING: 2018/09/14 02:52:14 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:41890: read: connection reset by peer
WARNING: 2018/09/14 02:52:24 grpc: Server.Serve failed to complete security handshake from "[::1]:33950": EOF
WARNING: 2018/09/14 02:52:44 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:42378: read: connection reset by peer
WARNING: 2018/09/14 02:52:55 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:42596: read: connection reset by peer
WARNING: 2018/09/14 02:53:15 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:42954: read: connection reset by peer
WARNING: 2018/09/14 02:53:35 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:43272: read: connection reset by peer
WARNING: 2018/09/14 02:53:54 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:43584: read: connection reset by peer
WARNING: 2018/09/14 02:54:04 transport: http2Server.HandleStreams failed to read frame: read tcp [::1]:443->[::1]:35646: read: connection reset by peer
WARNING: 2018/09/14 02:54:04 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:43748: read: connection reset by peer
WARNING: 2018/09/14 02:54:05 transport: http2Server.HandleStreams failed to read frame: read tcp [::1]:443->[::1]:35672: read: connection reset by peer

And like @coryodaniel mentioned, I can confirm the server is getting the proper information:

{"credentials.access.key":"ACCESS_KEY","credentials.expiration":"2018-09-14T02:55:43Z","credentials.role":"kiam-s3-bucket-test","generation.metadata":0,"level":"info","msg":"fetched credentials","pod.iam.role":"kiam-s3-bucket-test","pod.name":"kiam-test","pod.namespace":"sandbox","pod.status.ip":"10.42.125.162","pod.status.phase":"Running","resource.version":"32537983","time":"2018-09-14T02:40:43Z"},
{"generation.metadata":0,"level":"debug","msg":"updated pod","pod.iam.role":"kiam-s3-bucket-test","pod.name":"kiam-test","pod.namespace":"sandbox","pod.status.ip":"10.42.125.162","pod.status.phase":"Running","resource.version":"32537983","time":"2018-09-14T02:47:43Z"}

It's just not being sent to the agent and pod. I did think it was odd that the agent was not showing any logs other than the initial gRPC output and health checks.

Are there any ideas for troubleshooting the http2Server.HandleStreams errors?


pingles commented Sep 14, 2018

I've not seen those connection errors before- could you describe more about the networking you're running please?

@jamiebuxxx

Our K8s cluster is configured via Rancher using EC2 instances. We are running Rancher v1.6, which has a specific CNI configuration.

https://rancher.com/docs/rancher/v1.6/en/rancher-services/networking/

When setting the network interface for the agent, I wasn't sure which interface to use. Based on what is in the README, I chose the actual network interface of the host (ens5).


roffe commented Sep 15, 2018

@jamiebuxxx Most CNI solutions create their own bridge interface that the pods are bound to, https://github.com/uswitch/kiam#typical-cni-interface-names

ifconfig or brctl show might help you find out what the interface is called and whether there are multiple.
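
For example, on the node itself:

# list all interfaces and any Linux bridges; CNI bridges are often named cni0, cbr0, docker0, weave, etc.
ifconfig -a
brctl show
ip link show type bridge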


roffe commented Sep 15, 2018

https://rancher.com/docs/rancher/v1.6/en/rancher-services/networking/

BRIDGE
Specify the bridge name to be used by the CNI plugin. This is a generic CNI bridge plugin option.

For the “Rancher IPsec” plugin, the default is docker0

I'd give docker0 a shot if you have that on your system.

@jamiebuxxx

@roffe Yeah, I thought about that yesterday and changed the interface to docker0 before heading out. I'm still seeing the same http2Server.HandleStreams warnings and errors from the kiam agent pods on the nodes my test instance is running on.

Here's the manifest of that kiam agent pod:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: 2018-09-14T21:06:08Z
  generateName: kiam-agent-
  labels:
    app: kiam
    component: agent
    controller-revision-hash: "3891618869"
    pod-template-generation: "1"
    release: kiam
  name: kiam-agent-dvd79
  namespace: default
spec:
  containers:
  - args:
    - --host-interface=docker0
    - --json-log
    - --level=info
    - --port=8181
    - --cert=/etc/kiam/tls/cert
    - --key=/etc/kiam/tls/key
    - --ca=/etc/kiam/tls/ca
    - --server-address=kiam-server:443
    - --prometheus-listen-addr=0.0.0.0:9620
    - --prometheus-sync-interval=5s
    - --gateway-timeout-creation=50ms
    command:
    - /agent
    env:
    - name: HOST_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.podIP
    - name: GRPC_GO_LOG_SEVERITY_LEVEL
      value: info
    - name: GRPC_GO_LOG_VERBOSITY_LEVEL
      value: "8"
    image: quay.io/uswitch/kiam:v2.8
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /ping
        port: 8181
        scheme: HTTP
      initialDelaySeconds: 3
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    name: kiam-agent
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/kiam/tls
      name: tls
    - mountPath: /var/run/xtables.lock
      name: xtables
    - mountPath: /etc/ssl/certs
      name: ssl-certs
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kiam-agent-token-f6qtg
      readOnly: true
  dnsPolicy: ClusterFirstWithHostNet
  hostNetwork: true
  nodeName: ip-10-16-136-179.us-west-2.compute.internal
  nodeSelector:
    compute: "true"
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: kiam-agent
  serviceAccountName: kiam-agent
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/disk-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  volumes:
  - name: tls
    secret:
      defaultMode: 420
      secretName: kiam-agent
  - hostPath:
      path: /run/xtables.lock
      type: ""
    name: xtables
  - hostPath:
      path: /etc/ssl/certs
      type: ""
    name: ssl-certs
  - name: kiam-agent-token-f6qtg
    secret:
      defaultMode: 420
      secretName: kiam-agent-token-f6qtg

I will look on the host and see if there are any other bridge interfaces available.
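
Another quick check worth trying (a sketch; it assumes the agent's iptables rule redirects pod metadata traffic to the same listener that serves the /ping path used by the liveness probe above):

kubectl run metadata-check --rm -it --restart=Never --image=busybox -- \
  wget -qO- http://169.254.169.254/ping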

@jamiebuxxx

Just to be transparent, I'm using the following Helm chart for the deployment. https://github.com/helm/charts/tree/master/stable/kiam.

I re-created the server and agent certs as mentioned in the docs, just to be sure I wasn't missing something.

Everything looks as expected, but I'm still receiving the http2Server.HandleStreams failed errors and warnings from the server.

Possibly unrelated, but worth mentioning: we are unable to deploy the latest version of the kiam image (quay.io/uswitch/kiam:master). Every time we try to deploy that image, we get a RunContainerError/CrashLoopBackOff status with the following message:

container_linux.go:247: starting container process caused "exec: \"/server\": stat /server: no such file or directory"

But v2.8 works totally fine. Why is kiam so mad at me!? I just want to be friends. 😄


pingles commented Oct 8, 2018

@jamiebuxxx sorry it's taken me a while to notice this. master will always track the repository and will pick up breaking changes (like the one above) as all changes are merged into master ahead of release. The change for the above was that we merged commands into a single binary to reduce the size of the docker image (there wasn't really a need to have them separated).

I'd suggest using either v2.7 (which is what I suspect the Helm etc. packages track) or v3-rc1 which represents the next release we'll make- both of those will avoid pulling in breaking command changes etc.
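
For example, pinning the DaemonSets to a tag (a sketch; the DaemonSet/container names and the namespace here are assumptions based on the manifests earlier in this thread, so adjust for your install):

kubectl -n default set image daemonset/kiam-agent kiam-agent=quay.io/uswitch/kiam:v2.7
kubectl -n default set image daemonset/kiam-server kiam-server=quay.io/uswitch/kiam:v2.7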

@herikwebb

@coryodaniel @jamiebuxxx were you able to resolve this? I have the exact same issue but I'm using the v3.0 release. The server requests the credentials but they never make it to the pod.

INFO: 2019/01/17 20:20:14 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
WARNING: 2019/01/17 20:20:19 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:50844: read: connection reset by peer
INFO: 2019/01/17 20:20:19 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
WARNING: 2019/01/17 20:20:24 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:50892: read: connection reset by peer
INFO: 2019/01/17 20:20:24 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2019/01/17 20:20:29 transport: loopyWriter.run returning. connection error: desc = "transport is closing"

@jamiebuxxx

@herikwebb I haven't found a solution to this yet and had put it on the back burner. Hopefully @coryodaniel was able to get this working. To me, it seems like the TLS cert could be causing issues, but I haven't been able to prove that yet.


pingles commented Jan 18, 2019

@herikwebb I'd check the server and agent logs. The server will request credentials irrespective of an agent requesting them- they're prefetched when the server identifies a pod is running that's annotated with a role.

I'd check your agent logs- that will confirm that your applications attempting to access http://169.254.169.254/... are being intercepted correctly and what's happening. Likewise, there's a heap of Prometheus metrics exported that'll help understand what's happening.
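
For example (a sketch; the label selector and metrics port follow the agent manifest earlier in this thread, the /metrics path is the usual Prometheus default, and <agent-pod> / <node-ip> are placeholders):

# find the agent pod on the node running your workload, then tail it
kubectl get pods -l app=kiam,component=agent -o wide
kubectl logs <agent-pod> -f

# scrape the agent's Prometheus endpoint (the manifest above listens on 0.0.0.0:9620)
curl -s http://<node-ip>:9620/metrics | grep kiam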


coryodaniel commented Jan 18, 2019

I never got it working. I only tried for about a day or so. We had another initiative at the time to manage credentials for other services with Vault, so I ended up having an init container fetch creds from Vault using the assume_role method.

It’s def more work, but was more in line with our “security posture” and wrangling AWS service accounts.

The one “issue” we ran into with this approach was that if a pod didn't have the init container it would assume the node's role. We ended up removing all node permissions and using the same approach above to give permissions to auxiliary services like external-dns, etc.
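
Roughly, the init container shares an emptyDir with the app container and runs something like this (a sketch, not our exact setup; it assumes the container is already authenticated to Vault, aws/creds/my-app-role is a placeholder, and whether you read aws/creds or aws/sts depends on your Vault version and role type):

# generate STS credentials via Vault's AWS secrets engine and drop them for the app to source
vault read -format=json aws/creds/my-app-role > /secrets/aws.json
cat > /secrets/aws.env <<EOF
export AWS_ACCESS_KEY_ID=$(jq -r .data.access_key /secrets/aws.json)
export AWS_SECRET_ACCESS_KEY=$(jq -r .data.secret_key /secrets/aws.json)
export AWS_SESSION_TOKEN=$(jq -r .data.security_token /secrets/aws.json)
EOF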

@sudermanjr

I'm having the exact same issue with v3.0. I did some searching, and it seems like that log warning might be a red herring. This issue grpc/grpc-go#1062 states that it is just log spam.

I am continuing to try and get this working if anyone has more insight.

@sudermanjr

I seem to have made it work by choosing the correct interface name. In the chart this is set by agent.host.interface which I have set to cni0.

I am using a Kubernetes 1.11 cluster with Flannel that was created using Kops.
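
For anyone else on the chart, the change amounts to (a sketch; the release name my-kiam is a placeholder):

helm upgrade my-kiam stable/kiam --reuse-values --set agent.host.interface=cni0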

@rlangfordBV

I'm having this exact same issue with v3.0 as well, on Kubernetes v1.11. I can see the credentials being retrieved by the server from AWS, but nothing appears to make it back to the container. The agent logs are empty except for what appear to be healthcheck or liveness/readiness probes, but I see the credentials in the server logs. My logs basically look like @jamiebuxxx's logs:

{"generation.metadata":0,"level":"debug","msg":"added pod","pod.iam.role":"arn:aws:iam::<ACCOUNT ID REDACTED>:role/bosun/kiam-demo","pod.name":"kiam-demo-ml7bs","pod.namespace":"default","pod.status.ip":"","pod.status.phase":"Pending","resource.version":"1150198","time":"2019-03-27T18:24:01Z"}
{"generation.metadata":0,"level":"debug","msg":"announced pod","pod.iam.role":"arn:aws:iam::<ACCOUNT ID REDACTED>:role/bosun/kiam-demo","pod.name":"kiam-demo-ml7bs","pod.namespace":"default","pod.status.ip":"","pod.status.phase":"Pending","resource.version":"1150198","time":"2019-03-27T18:24:01Z"}
{"credentials.access.key":"<REDACTED>","credentials.expiration":"2019-03-27T18:36:29Z","credentials.role":"arn:aws:iam::<ACCOUNT ID REDACTED>:role/bosun/kiam-demo","generation.metadata":0,"level":"info","msg":"fetched credentials","pod.iam.role":"arn:aws:iam::<ACCOUNT ID REDACTED>:role/bosun/kiam-demo","pod.name":"kiam-demo-ml7bs","pod.namespace":"default","pod.status.ip":"","pod.status.phase":"Pending","resource.version":"1150198","time":"2019-03-27T18:24:01Z"}
{"generation.metadata":0,"level":"debug","msg":"updated pod","pod.iam.role":"arn:aws:iam::<ACCOUNT ID REDACTED>:role/bosun/kiam-demo","pod.name":"kiam-demo-ml7bs","pod.namespace":"default","pod.status.ip":"","pod.status.phase":"Pending","resource.version":"1150199","time":"2019-03-27T18:24:01Z"}
{"generation.metadata":0,"level":"debug","msg":"updated pod","pod.iam.role":"arn:aws:iam::<ACCOUNT ID REDACTED>:role/bosun/kiam-demo","pod.name":"kiam-demo-ml7bs","pod.namespace":"default","pod.status.ip":"","pod.status.phase":"Pending","resource.version":"1150201","time":"2019-03-27T18:24:01Z"}
INFO: 2019/03/27 18:24:01 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
WARNING: 2019/03/27 18:24:01 grpc: Server.Serve failed to complete security handshake from "[::1]:41878": EOF
WARNING: 2019/03/27 18:24:01 grpc: Server.Serve failed to complete security handshake from "127.0.0.1:49706": EOF
WARNING: 2019/03/27 18:24:01 grpc: Server.Serve failed to complete security handshake from "[::1]:41888": read tcp [::1]:443->[::1]:41888: read: connection reset by peer
{"generation.metadata":0,"level":"debug","msg":"updated pod","pod.iam.role":"arn:aws:iam::<ACCOUNT ID REDACTED>:role/bosun/kiam-demo","pod.name":"kiam-demo-ml7bs","pod.namespace":"default","pod.status.ip":"100.108.0.17","pod.status.phase":"Running","resource.version":"1150205","time":"2019-03-27T18:24:02Z"}
{"generation.metadata":0,"level":"debug","msg":"updated pod","pod.iam.role":"arn:aws:iam::<ACCOUNT ID REDACTED>:role/bosun/kiam-demo","pod.name":"kiam-demo-ml7bs","pod.namespace":"default","pod.status.ip":"100.108.0.17","pod.status.phase":"Failed","resource.version":"1150206","time":"2019-03-27T18:24:03Z"}

If I remove the annotation on the default namespace and run the demo app, the app will complete, but it will be using the node's role, just like @coryodaniel noted as well.


pingles commented Mar 27, 2019 via email

@rlangfordBV

If you don't see any accesses in the agent log I'd guess that the iptables config hasn't loaded correctly- I'd check whether you're setting the interface name correctly to match your cluster's chosen CNI.

Thanks @pingles. The interface is set to weave, but we shall check iptables as well!
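
For the iptables check, something like this on the node should show whether the agent's metadata redirect is installed (a sketch; kiam's rule normally sits in the nat table's PREROUTING chain, though the exact match and target can vary by version):

sudo iptables -t nat -L PREROUTING -n -v --line-numbers | grep 169.254.169.254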


pingles commented Mar 29, 2019 via email


legendjaks commented May 23, 2019

I was stuck on the same issue. I am using kops 1.12 with kubenet as the CNI. I changed --host-interface=cbr0, and after that it started working.

@andreivmaksimov

I deployed an EKS 1.12 cluster with eksctl and also cannot get credentials in the pod because of the following errors on the agent:

{"addr":"10.110.43.140:33840","level":"error","method":"GET","msg":"error processing request: error fetching credentials: rpc error: code = Unknown desc = AccessDenied: Access denied\n\tstatus code: 403, request id: ac7ce2c3-9454-11e9-9bb6-0570c095a29f","path":"/latest/meta-data/iam/security-credentials/kubernetes-buildbot-worker","status":500,"time":"2019-06-21T18:45:00Z"}
{"addr":"10.110.43.140:33840","duration":1001,"headers":{"Content-Type":["text/plain; charset=utf-8"],"X-Content-Type-Options":["nosniff"]},"level":"info","method":"GET","msg":"processed request","path":"/latest/meta-data/iam/security-credentials/kubernetes-buildbot-worker","status":500,"time":"2019-06-21T18:45:00Z"}

on the server:

{"generation.metadata":0,"level":"debug","msg":"found 1/1 pods for ip 10.110.43.140","pod.iam.role":"kubernetes-buildbot-worker","pod.name":"aws-iam-tester-7b6b5b7976-77ttc","pod.namespace":"buildbot","pod.status.ip":"10.110.43.140","pod.status.phase":"Running","resource.version":"15723","time":"2019-06-21T18:45:00Z"}
{"generation.metadata":0,"level":"debug","msg":"found 1/1 pods for ip 10.110.43.140","pod.iam.role":"kubernetes-buildbot-worker","pod.name":"aws-iam-tester-7b6b5b7976-77ttc","pod.namespace":"buildbot","pod.status.ip":"10.110.43.140","pod.status.phase":"Running","resource.version":"15723","time":"2019-06-21T18:45:00Z"}
{"generation.metadata":0,"level":"debug","msg":"found 1/1 pods for ip 10.110.43.140","pod.iam.role":"kubernetes-buildbot-worker","pod.name":"aws-iam-tester-7b6b5b7976-77ttc","pod.namespace":"buildbot","pod.status.ip":"10.110.43.140","pod.status.phase":"Running","resource.version":"15723","time":"2019-06-21T18:45:00Z"}
{"level":"error","msg":"error requesting credentials: AccessDenied: Access denied\n\tstatus code: 403, request id: ac7ce2c3-9454-11e9-9bb6-0570c095a29f","pod.iam.role":"kubernetes-buildbot-worker","time":"2019-06-21T18:45:00Z"}
{"level":"debug","msg":"evicted credentials future had error: AccessDenied: Access denied\n\tstatus code: 403, request id: ac7ce2c3-9454-11e9-9bb6-0570c095a29f","pod.iam.role":"kubernetes-buildbot-worker","time":"2019-06-21T18:45:00Z"}
{"generation.metadata":0,"level":"error","msg":"error retrieving credentials: AccessDenied: Access denied\n\tstatus code: 403, request id: ac7ce2c3-9454-11e9-9bb6-0570c095a29f","pod.iam.requestedRole":"kubernetes-buildbot-worker","pod.iam.role":"kubernetes-buildbot-worker","pod.name":"aws-iam-tester-7b6b5b7976-77ttc","pod.namespace":"buildbot","pod.status.ip":"10.110.43.140","pod.status.phase":"Running","resource.version":"15723","time":"2019-06-21T18:45:00Z"}

When curling security-credentials from a test pod, I can see my role:

curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
kubernetes-buildbot-worker/

But I cannot access it:

curl http://169.254.169.254/latest/meta-data/iam/security-credentials/kubernetes-buildbot-worker/
error fetching credentials: rpc error: code = Unknown desc = forbidden by policy

At the same time, on the kiam agent:

{"addr":"10.110.43.140:34390","level":"error","method":"GET","msg":"error processing request: error fetching credentials: rpc error: code = Unknown desc = forbidden by policy","path":"/latest/meta-data/iam/security-credentials/kubernetes-buildbot-worker/","status":500,"time":"2019-06-21T18:49:10Z"}
{"addr":"10.110.43.140:34390","duration":5000,"headers":{"Content-Type":["text/plain; charset=utf-8"],"X-Content-Type-Options":["nosniff"]},"level":"info","method":"GET","msg":"processed request","path":"/latest/meta-data/iam/security-credentials/kubernetes-buildbot-worker/","status":500,"time":"2019-06-21T18:49:10Z"}

kiam-server configuration:

---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: kiam-server
spec:
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: kiam
        role: server
    spec:
      serviceAccountName: kiam-server
      nodeSelector:
        kiam: server     
      volumes:
        - name: ssl-certs
          hostPath:
            # for AWS linux or RHEL distros
            path: /etc/pki/ca-trust/extracted/pem/
        - name: tls
          secret:
            secretName: kiam-server-tls
      containers:
        - name: kiam
          image: quay.io/uswitch/kiam:v3.2
          imagePullPolicy: Always
          env:
            - name: GRPC_GO_LOG_SEVERITY_LEVEL
              value: "info"
            - name: GRPC_GO_LOG_VERBOSITY_LEVEL
              value: "8"
          command:
            - /kiam
          args:
            - server
            - --json-log
            - --level=debug
            - --bind=0.0.0.0:443
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt
            - --role-base-arn-autodetect
            - --assume-role-arn=arn:aws:iam::408272790494:role/kiam-server
            - --sync=1m
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
          livenessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=1s
              - --timeout=5s
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 10
          readinessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=1s
              - --timeout=5s
            initialDelaySeconds: 3
            periodSeconds: 10
            timeoutSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: kiam-server
  namespace: kube-system
spec:
  clusterIP: None
  selector:
    app: kiam
    role: server
  ports:
  - name: grpclb
    port: 443
    targetPort: 443
    protocol: TCP

kiam-agent configuration:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: kiam-agent
spec:
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: kiam
        role: agent
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      nodeSelector:
        kiam: agent
      tolerations:
        - key: "kiam"
          operator: "Equal"
          value: "server"
          effect: "NoSchedule"
      volumes:
        - name: ssl-certs
          hostPath:
            # for AWS linux or RHEL distros
            path: /etc/pki/ca-trust/extracted/pem/
        - name: tls
          secret:
            secretName: kiam-agent-tls
        - name: xtables
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
      containers:
        - name: kiam
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]
          image: quay.io/uswitch/kiam:v3.2
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - agent
            - --iptables
            - --host-interface=!eth0
            - --json-log
            - --port=8181
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt
            - --server-address=kiam-server:443
            - --gateway-timeout-creation=1s
          env:
            - name: GRPC_GO_LOG_SEVERITY_LEVEL
              value: "info"
            - name: GRPC_GO_LOG_VERBOSITY_LEVEL
              value: "8"
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
            - mountPath: /var/run/xtables.lock
              name: xtables
          livenessProbe:
            httpGet:
              path: /ping
              port: 8181
            initialDelaySeconds: 3
            periodSeconds: 3

@ghost ghost mentioned this issue Mar 9, 2020

pkazi commented Jan 12, 2021

I was facing this error:

error fetching credentials: rpc error: code = Unknown desc = forbidden by policy

This worked for me - cloudposse/docs#195


pingles commented Jan 12, 2021

I think this issue has covered a lot of disparate things. To avoid confusion I'm going to close it; people can re-open or ask questions on Slack or elsewhere when they have problems.

Thanks!

@pingles pingles closed this as completed Jan 12, 2021