
Added leader election. #530

Merged 1 commit into operator-framework:master on Oct 11, 2018

Conversation

mhrivnak (Member):

See doc/proposals/leader-for-life.md

@mhrivnak mhrivnak added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 25, 2018
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 25, 2018

Both the Leader For Life and lease-based approaches to leader election are
built on the concept that each candidate will attempt to create a resource with
the same GVK, namespace and name. Whichever candidate succeeds becomes the
Contributor:

the same GVK, namespace and name -> the same GVK, namespace, and name? I believe we need an extra comma here to separate items in a list.

Member Author:

Grammar-wise, it's optional. I'm happy to add it if that's the preferred style here.

Contributor:

I'm happy to add it if that's the preferred style here.

If you can do that, that would be great.
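For reference, the create-to-win pattern the quoted proposal text describes boils down to a single conditional create. A minimal sketch, assuming a controller-runtime client and a ConfigMap as the lock resource (names here are illustrative, not the PR's exact code):

err := client.Create(ctx, &corev1.ConfigMap{
	ObjectMeta: metav1.ObjectMeta{Name: lockName, Namespace: ns},
})
switch {
case err == nil:
	// This candidate created the lock first and is the leader.
case apierrors.IsAlreadyExists(err):
	// Another candidate already owns the lock; back off and retry.
default:
	// Unexpected API error.
}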

"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/client-go/rest"
crclient "sigs.k8s.io/controller-runtime/pkg/client"

Contributor:

No need for a new line here.

Usually, we group imports with the following guideline (see the sketch below):

  1. group all Go internal (standard library) imports first.
  2. group all current pkg imports second.
  3. group all external imports last.
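A minimal sketch of that grouping for this file; the project-internal import path is illustrative of the convention, not necessarily what the PR uses:

import (
	// 1. Go standard library
	"context"
	"os"
	"time"

	// 2. current project packages
	"github.com/operator-framework/operator-sdk/pkg/k8sutil"

	// 3. external dependencies
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/rest"
	crclient "sigs.k8s.io/controller-runtime/pkg/client"
)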

func myOwnerRef(ctx context.Context, client crclient.Client, ns string) (metav1.OwnerReference, error) {
	hostname, err := os.Hostname()
	if err != nil {
		return metav1.OwnerReference{}, err
Contributor:

return nil, err?

Member Author:

cannot use nil as type "github.com/operator-framework/operator-sdk/vendor/k8s.io/apimachinery/pkg/apis/meta/v1".OwnerReference in return argument

Contributor:

I see, the return arg is of type metav1.OwnerReference, not *metav1.OwnerReference? Maybe change the return type to *metav1.OwnerReference?

Member Author:

I think this is a normal pattern either way, and there are examples of returning an error with both nil-pointer and zero-value in the standard library. I'm not sure if you had any reasoning in mind beyond aesthetics? In any case, since metav1 itself only ever returns a pointer to an OwnerReference, I'll go ahead and change this to match.

@fanminshi (Contributor), Oct 3, 2018:

@mhrivnak from my past experience with Go, almost all the code I have seen uses the return nil, err pattern, which is what we use in the operator-sdk codebase. I guess this is just a style nit to stay consistent with the rest of the codebase. Sorry, I should have been clearer.
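For clarity, the resolution discussed above gives the function roughly the following shape. This is a sketch, with the OwnerReference fields abbreviated to the obvious ones:

func myOwnerRef(ctx context.Context, client crclient.Client, ns string) (*metav1.OwnerReference, error) {
	hostname, err := os.Hostname()
	if err != nil {
		// With a pointer return type, the conventional nil, err form works.
		return nil, err
	}
	myPod := &corev1.Pod{}
	key := crclient.ObjectKey{Namespace: ns, Name: hostname}
	if err := client.Get(ctx, key, myPod); err != nil {
		return nil, err
	}
	return &metav1.OwnerReference{
		APIVersion: "v1",
		Kind:       "Pod",
		Name:       myPod.ObjectMeta.Name,
		UID:        myPod.ObjectMeta.UID,
	}, nil
}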

	err = client.Get(ctx, key, myPod)
	if err != nil {
		logrus.Error("failed to get pod")
		return metav1.OwnerReference{}, err
Contributor:

return nil, err?

Member Author:

cannot use nil as type "github.com/operator-framework/operator-sdk/vendor/k8s.io/apimachinery/pkg/apis/meta/v1".OwnerReference in return argument

// case where a namespace cannot be found for the current pod. This is useful
// for a service that might run outside the cluster, for example an operator
// being started with `operator-sdk up local`.
func TryBecome(ctx context.Context, name string) error {
Contributor:

It may be more intuitive if we rename name to either lockKey or leaderKey.

Member Author:

I was thinking name because the value will literally go into the name field on the ConfigMap. Maybe lockName would be better?

I worry about using the term "key" in a scenario that involves a "lock", since we don't want anyone to think that it is the kind of key that could "unlock" something.

Thoughts on that? I'm open to suggestions.

@fanminshi (Contributor), Oct 2, 2018:

lockName sounds better. The reason I think having the keyword lock is useful is that the user now expects, or has an idea, that multiple instances with the same lockName are going to compete and one will win.

		return err
	}

	cm := &corev1.ConfigMap{
Contributor:

maybe move cm to line 121 so it is closer to its usage.

logrus.Info("Not the leader. Waiting.")
select {
case <-time.After(wait.Jitter(time.Second*time.Duration(seconds), .2)):
if seconds < 16 {
Contributor:

Maybe change 16 to a named variable, e.g. maxBackoffInterval, and also use backoff:

if backoff < maxBackoffInterval {
   backoff *= 2
}

	}

	// try to create a lock
	seconds := 1
Contributor:

Can we have backoff := time.Second instead of seconds?

	case apierrors.IsAlreadyExists(err):
		logrus.Info("Not the leader. Waiting.")
		select {
		case <-time.After(wait.Jitter(time.Second*time.Duration(seconds), .2)):
Contributor:

time.After(wait.Jitter(time.Second*time.Duration(seconds), .2)) ->
time.After(wait.Jitter(backoff, .2))
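Putting these suggestions together, the retry loop could look roughly like this. A sketch, assuming cm is the lock ConfigMap built earlier in Become and maxBackoffInterval is the constant discussed further down:

	backoff := time.Second
	for {
		err := client.Create(ctx, cm)
		switch {
		case err == nil:
			logrus.Info("Became the leader.")
			return nil
		case apierrors.IsAlreadyExists(err):
			logrus.Info("Not the leader. Waiting.")
			select {
			case <-time.After(wait.Jitter(backoff, .2)):
				// Double the wait each round, capped at maxBackoffInterval.
				if backoff < maxBackoffInterval {
					backoff *= 2
				}
			case <-ctx.Done():
				return ctx.Err()
			}
		default:
			logrus.Errorf("unknown error creating configmap: %v", err)
			return err
		}
	}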

@fanminshi (Contributor):

lgtm

// case where a namespace cannot be found for the current pod. This is useful
// for a service that might run outside the cluster, for example an operator
// being started with `operator-sdk up local`.
func TryBecome(ctx context.Context, lockName string) error {
Contributor:

For the operator-sdk up local case all we want to do is detect if the operator is not running in a pod and log that we're skipping leader election. We should do that in Become(). Otherwise users will have to keep changing their code between Become() and TryBecome() depending on how they run the operator.

So let's just remove TryBecome() and change the following in Become():

ns, err := myNS()
if err != nil {
  if err == ErrNoNS {
    logrus.Info("Skipping leader election; not running in cluster")
    return nil
  }
  return err
}

Member Author:

Good point. I'm not quite sure what you mean by needing to "keep changing their code...", but I did write this for two use cases, and you're right that the SDK will probably only care about one of them. Just for the sake of discussion, they are:

  1. My service might run off-cluster, so I want failure to detect the current namespace to be accepted as normal.
  2. My service always runs in-cluster, so failure to detect the current namespace should result in hard failure.

For our purposes, I'll go ahead with the change you suggested. Thanks!
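With that change, an operator's main can call Become unconditionally, whether it runs in-cluster or via operator-sdk up local. A usage sketch; the lock name is illustrative:

func main() {
	// Blocks until this instance holds the lock, or returns nil immediately
	// when no namespace can be detected (running outside a cluster).
	if err := leader.Become(context.TODO(), "memcached-operator-lock"); err != nil {
		logrus.Fatalf("failed to become leader: %v", err)
	}
	// ... set up the manager and start controllers as usual.
}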

			return ctx.Err()
		}
	default:
		logrus.Error("unknown error creating configmap")
Contributor:

Log the full error:

logrus.Errorf("unknown error creating configmap: %v", err)

return "", err
}
ns := strings.TrimSpace(string(nsBytes))
logrus.Infof("found namespace: %s", ns)
Contributor:

Probably no need to log this. It's not saying much as this is normal behavior and users already know what namespace they're running the operator in.

Member Author:

Good point. I'll change it to Debug level since it could be useful for troubleshooting.
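For context, the namespace lookup under discussion reads the pod's service account namespace file. A sketch of that helper, assuming ErrNoNS is the sentinel error checked in Become above:

var ErrNoNS = errors.New("namespace not found for current environment")

func myNS() (string, error) {
	nsBytes, err := ioutil.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/namespace")
	if err != nil {
		if os.IsNotExist(err) {
			// Not running in a pod; callers can treat this as "skip election".
			return "", ErrNoNS
		}
		return "", err
	}
	ns := strings.TrimSpace(string(nsBytes))
	logrus.Debugf("found namespace: %s", ns)
	return ns, nil
}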

	if err != nil {
		return nil, err
	}
	logrus.Infof("found hostname: %s", hostname)
Contributor:

Same for this log. We can remove this.

	key := crclient.ObjectKey{Namespace: ns, Name: hostname}
	err = client.Get(ctx, key, myPod)
	if err != nil {
		logrus.Error("failed to get pod")
Contributor:

Log the complete error.

logrus.Errorf("failed to get self pod: %v", err)


// maxBackoffInterval defines the maximum amount of time to wait between
// attempts to become the leader.
const maxBackoffInterval = time.Second * 16
Contributor:

Any thoughts on making this configurable via an option?

Contributor:

@hasbro17 I'd suggest we keep it as a default value first. If users complain, then we make it configurable.

Contributor:

Yeah, I don't have any strong opinions on making it configurable in this PR. Just wanted to bring it up for discussion as something to do in the future, as it might be a use case.

Member Author:

I agree this could be valuable as a configuration item in the future. I concluded it wasn't a priority right now, because it's hard to come up with a compelling use case in which changing it has appreciable value. But if someone brings us a good user story, it would be easy enough to add.
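If it does become configurable later, a functional option is one possible shape. This is purely hypothetical and not part of this PR; the variadic Become signature below does not exist in the codebase:

// BecomeOption is a hypothetical knob for Become.
type BecomeOption func(*becomeConfig)

type becomeConfig struct {
	maxBackoff time.Duration
}

// WithMaxBackoff would override the default maxBackoffInterval.
func WithMaxBackoff(d time.Duration) BecomeOption {
	return func(c *becomeConfig) { c.maxBackoff = d }
}

func Become(ctx context.Context, lockName string, opts ...BecomeOption) error {
	cfg := &becomeConfig{maxBackoff: maxBackoffInterval}
	for _, opt := range opts {
		opt(cfg)
	}
	// ... same create/retry loop as above, using cfg.maxBackoff as the cap.
	return nil
}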

See doc/proposals/leader-for-life.md
@shawn-hurley (Member):

LGTM

@hasbro17 (Contributor):

LGTM

@mhrivnak Do you have any thoughts on how we can add an e2e test for this as a follow-up?
It's a little tricky since we can't call leader.Become() from a test locally.
I'm thinking in the test we would need to build and run a Deployment (of size 3) that does the following (see the sketch below):

  • Call leader.Become()
  • Set the readiness probe to true once the lock is acquired.

And ensure that only 1 of the 3 pods becomes Ready (and is the leader).
And also ensure that it can step down after we kill that pod.
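A minimal sketch of what the test pod's binary could look like, assuming readiness is exposed through an HTTP endpoint that the Deployment's readinessProbe points at (port, path, and lock name are illustrative):

func main() {
	readyCh := make(chan struct{})

	// Readiness endpoint for the Deployment's readinessProbe.
	http.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-readyCh:
			w.WriteHeader(http.StatusOK)
		default:
			w.WriteHeader(http.StatusServiceUnavailable)
		}
	})
	go http.ListenAndServe(":8080", nil)

	// Block until this replica wins the lock; only one of the 3 will.
	if err := leader.Become(context.TODO(), "e2e-leader-test-lock"); err != nil {
		logrus.Fatalf("leader election failed: %v", err)
	}
	close(readyCh)

	// Stay alive so the test can observe readiness and then delete the leader pod.
	select {}
}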

@mhrivnak (Member Author):

I'll open a new issue to track creating an e2e test.

@mhrivnak mhrivnak merged commit 70b4502 into operator-framework:master Oct 11, 2018
@mhrivnak mhrivnak deleted the leaderelection branch October 11, 2018 15:32