unable to assign pod the iam role #46
Strange, killing the kiam agent/server pods and letting the new ones come up fixed the issue. Any ideas on it?
Not off the top of my head. Do you have the log data from the server processes? From the errors you forwarded before it sounds most likely to be an issue inside the server process.
Ah too bad, I didn't get the logs from the server agent before deleting them. :/
Yeah sorry, without the server logs it's difficult to know what the problem is. I'm going to close this for now, but please reopen if it happens again, with as much log data as you can capture. Thanks!
Hey thanks @pingles, will post here again if I face the issue again. Thanks for your time.
No problem, thanks for reporting an issue.
Hey @pingles, whilst upgrading one of our clusters we faced the above issue again. Logs from one of the client agents:
Logs from one of the server agents:
Please let me know if you need anything else from the logs. Thanks
Interesting, definitely looks like something's wrong there. I've reopened.
So kiam definitely assumes that the deltas delivered by the k8s client only ever contain pod objects. Had a quick search for the error and this seems to be the same issue: kubernetes/kubernetes@1c65d1d. Should be relatively easy to fix. I'll try and do it asap unless someone else beats me to it!
And more relevant docs: https://github.com/kubernetes/client-go/blob/master/tools/cache/delta_fifo.go#L656
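The failure mode discussed above can be sketched in Go. Note this is a simplified illustration, not kiam's actual code: `Pod` and `DeletedFinalStateUnknown` here are local stand-ins for the real client-go types. The point is that a delete delta can carry a `cache.DeletedFinalStateUnknown` tombstone rather than the pod itself, so a handler must unwrap it instead of blindly type-asserting:

```go
package main

import "fmt"

// Simplified stand-ins for the client-go types involved. In client-go,
// cache.DeletedFinalStateUnknown wraps the last-known state of an object
// whose final deletion event the watcher missed.
type Pod struct{ Name string }

type DeletedFinalStateUnknown struct {
	Key string
	Obj interface{}
}

// podName extracts the pod's name from a delta object, unwrapping the
// tombstone case instead of assuming the object is always a *Pod (the
// assumption behind the original error).
func podName(obj interface{}) (string, error) {
	switch v := obj.(type) {
	case *Pod:
		return v.Name, nil
	case DeletedFinalStateUnknown:
		pod, ok := v.Obj.(*Pod)
		if !ok {
			return "", fmt.Errorf("tombstone contained unexpected object: %T", v.Obj)
		}
		return pod.Name, nil
	default:
		return "", fmt.Errorf("unexpected delta object type: %T", obj)
	}
}

func main() {
	direct, _ := podName(&Pod{Name: "web-0"})
	tombstoned, _ := podName(DeletedFinalStateUnknown{Key: "ns/web-1", Obj: &Pod{Name: "web-1"}})
	fmt.Println(direct, tombstoned)
}
```

Handlers that only cover the first case fail as soon as a deletion is delivered via the tombstone path, which matches the linked kubernetes/kubernetes commit.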
Thanks for your time, appreciate it :)
This helps to simplify the implementation of the pod and namespace caches, as well as better handle errors from `cache.DeletedFinalStateUnknown` identified in #46 and more.
I committed a fix earlier for this but I've also just changed it again to remove some of the pod cache internals. The latest PR that addresses this (#51) also changes the server boot process so that the pod and namespace caches must sync before the server starts handling requests. @tasdikrahman I'm going to close this issue for now (relating to the type error). If after updating Kiam (sorry, you'll need to use latest or the SHA for now) you see the erroneous behaviour again (pod not found errors) please re-open.
No problem at all. Will update on this issue if I see the error again. Thanks a lot! Just for my sanity, I was curious if a release is scheduled after 2.6 😀
Yep- we’ll probably do a release soon. I’d like to get better Prometheus
metrics exported first (which should be quite quick) then do a release so
perhaps within a few days/week?
Sounds good, thanks for the help! :)
I’ll try and do a release today- that’ll pull in the error handling fix.
I’ll push the Prometheus changes to the next release.
Checked the kiam-agent logs on the node where the pod (which was to be assigned the iam role) was scheduled, which look like:
Running `uswitch/kiam:v2.4` on both the agent and the server. The namespace where the pod is scheduled has the annotation as stated in the docs, and the trust relationship on the iam-role includes the node where the pods are being scheduled.
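For reference, the setup being described relies on kiam's documented annotations: the pod names the role it wants via `iam.amazonaws.com/role`, and its namespace must permit that role via the `iam.amazonaws.com/permitted` regex. A minimal sketch (the namespace, pod, and role names here are hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  annotations:
    # regex of role names that pods in this namespace may assume
    iam.amazonaws.com/permitted: ".*"
---
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-app
  annotations:
    # role the pod should be assigned
    iam.amazonaws.com/role: my-app-role
```

On the AWS side, the target role's trust policy must allow the node's instance role to call `sts:AssumeRole`, which is the "trust relationship" mentioned above.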
Not sure if it's related, but we recently upgraded k8s from 1.8.4 to 1.8.9, but I guess that shouldn't be the problem.