Can't get keys from leader #1
Comments
Thanks for raising this, @ramoncisternas … nothing immediately comes to mind, but it could very well be a bug in my code. Will have a look ASAP.
Hi Michael, I'm getting almost the same error as reported above: "Can't get keys from leader due to Get http://mehdb-0.default:9876/keys: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)". I think this is a DNS issue, or a problem with the way you construct the leader's URL. The leader can't be resolved as "mehdb-0.default", but it can be resolved as "mehdb-0.mehdb.default", so I think the right way to construct its URL is pod_name.service.namespace. I would change this line of code, url := "http://" + leaderShard + "." + ns + ":" + port + "/keys", to url := "http://" + leaderShard + "." + "THE SERVICE NAME FROM YAML" + "." + ns + ":" + port + "/keys". Could you please verify the code at your end? Thanks for the work you have done so far; it helped me demo the StatefulSet behaviour (apart from the followers not being able to get keys from the leader ;-) ).
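To make the suggested change concrete, here is a minimal sketch of the URL construction in Go. It is not the project's actual code: the helper name `leaderKeysURL` and the `svc` parameter are hypothetical, and it assumes the headless service name (here "mehdb") is known to the pod, for example from configuration.

```go
package main

import (
	"fmt"
	"net"
)

// leaderKeysURL builds the leader's /keys endpoint using the
// pod_name.service.namespace form discussed in this thread.
// Hypothetical helper: svc is an assumed extra parameter holding the
// headless service name; the original line only used leaderShard and ns.
func leaderKeysURL(leaderShard, svc, ns, port string) string {
	host := leaderShard + "." + svc + "." + ns
	// net.JoinHostPort joins host and port with a colon.
	return "http://" + net.JoinHostPort(host, port) + "/keys"
}

func main() {
	// Example values taken from the error message in this thread.
	fmt.Println(leaderKeysURL("mehdb-0", "mehdb", "default", "9876"))
}
```

With the values from the error message, this yields a host of the form mehdb-0.mehdb.default, which is the name reported as resolvable above.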
I have the aforementioned fix in place, ie. in
However I'm hitting another issue (
This looks like a permission issue:
Thanks @jeffhoek! I'm a little unsure what you want me to do. I can't reproduce it.
I was able to get it running on OpenShift 3.11, with the following steps:
then in app.yaml add the following to the
Hello Michael,
I have just followed this example of yours to learn about StatefulSets in OpenShift, and I found that the follower pods can't get keys from the leader. I think the cause is that the hostname is constructed in a form DNS cannot resolve: mehdb-0.axa-partners-chatbot-hogar-preprod-axa-services-es instead of mehdb-0.mehdb.axa-partners-chatbot-hogar-preprod-axa-services-es.svc.cluster.local.
Here is the output of my test:
ramon@bionic-beaver:~/OpenShift $ oc get sts
NAME DESIRED CURRENT AGE
mehdb 3 3 34m
ramon@bionic-beaver:~/OpenShift $ oc scale sts mehdb --replicas=4
statefulset "mehdb" scaled
ramon@bionic-beaver:~/OpenShift $ oc get pods
NAME READY STATUS RESTARTS AGE
mehdb-0 1/1 Running 0 36m
mehdb-1 1/1 Running 0 35m
mehdb-2 1/1 Running 0 33m
mehdb-3 1/1 Running 0 1m
ramon@bionic-beaver:~/OpenShift $ oc logs mehdb-3
2019/01/29 11:49:52 mehdb serving from mehdb-3:9876 using /mehdbdata as the data directory
2019/01/29 11:49:52 I am a follower shard, accepting READS
2019/01/29 11:50:02 Checking for new data from leader
2019/01/29 11:50:02 Can't get keys from leader due to Get http://mehdb-0.axa-partners-chatbot-hogar-preprod-axa-services-es:9876/keys: dial tcp: lookup mehdb-0.axa-partners-chatbot-hogar-preprod-axa-services-es on 10.64.9.9:53: no such host
2019/01/29 11:50:12 Checking for new data from leader
2019/01/29 11:50:12 Can't get keys from leader due to Get http://mehdb-0.axa-partners-chatbot-hogar-preprod-axa-services-es:9876/keys: dial tcp: lookup mehdb-0.axa-partners-chatbot-hogar-preprod-axa-services-es on 10.64.9.9:53: no such host
ramon@bionic-beaver:~/OpenShift $ oc run -i -t --rm dnscheck --restart=Never --image=quay.io/mhausenblas/jump:0.2 -- nslookup mehdb
If you don't see a command prompt, try pressing enter.
Name: mehdb
Address 1: 10.94.107.130 mehdb-2.mehdb.axa-partners-chatbot-hogar-preprod-axa-services-es.svc.cluster.local
Address 2: 10.94.112.4 mehdb-1.mehdb.axa-partners-chatbot-hogar-preprod-axa-services-es.svc.cluster.local
Address 3: 10.94.21.232 mehdb-0.mehdb.axa-partners-chatbot-hogar-preprod-axa-services-es.svc.cluster.local
Address 4: 10.94.9.228 mehdb-3.mehdb.axa-partners-chatbot-hogar-preprod-axa-services-es.svc.cluster.local
Could you explain the root cause of this error and suggest how it can be fixed?
Thank you in advance,
Ramon Cisternas