Skip to content
This repository was archived by the owner on May 6, 2022. It is now read-only.

Add ObservedGeneration and Provisioned into ServiceInstanceStatus #1748

Merged
merged 22 commits into from
Mar 3, 2018
Merged

Add ObservedGeneration and Provisioned into ServiceInstanceStatus #1748

merged 22 commits into from
Mar 3, 2018

Conversation

nilebox
Copy link
Contributor

@nilebox nilebox commented Feb 20, 2018

See #1747

  • Add ObservedGeneration and Provisioned fields to ServiceInstanceStatus
  • Switch to using ObservedGeneration and Provisioned on the read path (while keeping write paths for both ReconciledGeneration and ObservedGeneration for backward compatibility)
  • Initialize ObservedGeneration and Provisioned fields if they are not set while ReconciledGeneration is set (for API migration between Service Catalog versions)

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 20, 2018
@nilebox nilebox changed the title WIP: Add ObservedGeneration and Provisioned into ServiceInstanceStatus Add ObservedGeneration and Provisioned into ServiceInstanceStatus Feb 21, 2018
// successfully provisioned (i.e. it exists on the broker's side and can be
// updated or deprovisioned). The flag should be reset after successful
// deprovisioning.
Provisioned bool
Copy link
Contributor

@kibbles-n-bytes kibbles-n-bytes Feb 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provisioned seems like something appropriate for a condition as opposed to bool. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibbles-n-bytes Yeah I agree, that might be a good idea.
Not sure if we'll need to distinguish the initial "unprovisioned" state from the "deprovisioned" (after deletion or orphan mitigation) there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nilebox @kibbles-n-bytes I disagree that provisioned should be a condition as opposed to a bool. The move in kube is to use simple fields instead of conditions. See kubernetes/kubernetes#7856 (comment).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should avoid using bool fields. How about ProvisionedStatus for the field and an enum of {NotProvisioned,Provisioned} for the values it can take on, similar to the Deprovision status field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@staebler I don't really have a preference, but the quorum seems to advocate for ProvisionedStatus...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nilebox That's fine. My issue was using a condition (in the traditional kube sense) as opposed to a field, where kube is moving towards preferring a field. I have no reservation with using ProvisionedStatus instead of a bool.

c.recorder.Event(instance, corev1.EventTypeWarning, readyCond.Reason, readyCond.Message)
currentReconciledGeneration := instance.Status.ReconciledGeneration

setServiceInstanceCondition(instance, v1beta1.ServiceInstanceConditionReady, readyCond.Status, readyCond.Reason, readyCond.Message)
clearServiceInstanceCurrentOperation(instance)
setServiceInstanceCondition(instance, v1beta1.ServiceInstanceConditionFailed, failedCond.Status, failedCond.Reason, failedCond.Message)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bugfix: once I switched from ReconciledGeneration to ObservedGeneration + Failed:True, I found that some integration tests were failing on waiting for this condition.
For provisioning we always set the Failed:True condition for terminal errors, and for updates we never were.
I think this is a bug, and fixed it for consistency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely in favor of this, though we should make sure this keeps the current behavior of spec changes to a failed update triggering a new update.

From a cursory glance, it seems like that'd require 1) removing the L383 block on the Failed condition, and 2) removing the Failed condition when recording the operation start at L422.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that it is a bug. The semantics of Failed:True are that the instance is not usable. If an update fails, the instance is still usable, in its previous state. With that being said, I am on board with changing the semantics of Failed:True to mean that the previous operation failed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. It's not really a bug, more just an unfortunate design choice that can be easily rectified.

Copy link
Contributor Author

@nilebox nilebox Feb 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@staebler @kibbles-n-bytes by "bug" I mean that this method is not doing what the comment says:

 // processUpdateServiceInstanceFailure handles the logging and updating of a
 // ServiceInstance that hit a terminal failure during update reconciliation.

The "terminal failure" here means that Instance should be marked with Failed:True, but it is not.
So I wanted to temporarily "fix" this to make it work as it was initially (even though poorly) designed, and then separately fix it according to the new design.

Copy link
Contributor Author

@nilebox nilebox Feb 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. removing the L383 block on the Failed condition, and 2) removing the Failed condition when recording the operation start at L422.

@kibbles-n-bytes I believe the other PRs are already addressing this: #1751 and #1765

@nilebox
Copy link
Contributor Author

nilebox commented Feb 22, 2018

Fixed all tests, please review.

@MHBauer
Copy link
Contributor

MHBauer commented Feb 24, 2018

as far as beta api changnes go, do we have to be conncerned with changes in nstatus?

@nilebox
Copy link
Contributor Author

nilebox commented Feb 24, 2018

@MHBauer FWIW, it's not just about breaking API changes, but also about being able to pick up the existing ServiceInstance resources which have ReconciledGeneration set and don't have ObservedGeneration, i.e. data migration.

@@ -1441,12 +1480,13 @@ func (c *controller) processServiceInstanceOperationError(instance *v1beta1.Serv

// processProvisionSuccess handles the logging and updating of a
// ServiceInstance that has successfully been provisioned at the broker.
func (c *controller) processProvisionSuccess(instance *v1beta1.ServiceInstance, dashboardURL *string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ever update the ObservedGeneration in these processXY functions? The way I understood it, ObservedGeneration should be set when the controller starts processing a resource at a specific generation. Thus it would have been set correctly earlier on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my understanding as well. I am not keen on us relying on a side effect of clearing the current operation as the mechanism by which we are setting the ObservedGeneration. I would like the ObservedGeneration to be set explicitly in the reconcileXXXAdd and reconcileXXXUpdate functions.

One thing that is missing with the current approach is setting the ObservedGeneration when there is a change to the spec while an operation is in progress. I understand that is not allowed currently by the API validation, but I think we can accommodate that now to cover us in case we do allow for changes in the future.

Copy link
Contributor

@kibbles-n-bytes kibbles-n-bytes Feb 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not keen on us relying on a side effect of clearing the current operation as the mechanism by which we are setting the ObservedGeneration

I agree. IMO clearServiceInstanceCurrentOperation should no longer do any generation updating. We started to hack around it once we introduced aborting orphan mitigation in favor of deprovisioning, and we only made it worse with the ReconciledGeneration hack when deleting polling instances. (I can fix the RG problems in a separate PR, though).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibbles-n-bytes currently the ObservedGeneration is updated when the synchronous provisioning has succeeded, but you're right that recordStartOfServiceInstanceOperation is probably a better place to do this.

Copy link
Contributor Author

@nilebox nilebox Feb 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that is missing with the current approach is setting the ObservedGeneration when there is a change to the spec while an operation is in progress.

@staebler do you have suggestions how to address this? the problem is that with async operation in progress ObservedGeneration + Ready:True is a sign that the operation has succeeded. So if you update the ObservedGeneration while the previous generation is still being processed by the broker, once it succeeds the status will be incorrect (i.e. pointing that a newer generation has been successfully reconciled).
Which makes me think that we might still need ReconciledGeneration to cover this... But then the ReconciledGeneration should be updated in both success and failure scenarios, I think (i.e. mean that "the processing of the spec has finished" but not necessarily succeeded)

@kibbles-n-bytes WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that is missing with the current approach is setting the ObservedGeneration when there is a change to the spec while an operation is in progress.

@nilebox I was not as clear as I could have been in describing what I meant by "while an operation is in progress". I meant that there were pending changes that still needed to be reconciled with the broker, not that there was an async operation in progress.

For example, let's say that the user updates the instance to change the parameters, changing the generation of the instance to 2. The controller observes the new generation and sets the observed generation to 2. The controller sets CurrentOperation to Update. There is some retryable error that prevents the instance from being updated successfully on the broker. Then, the user makes another update to the instance, changing the generation of the instance to 3. Subsequent requests sent to the broker will use the spec from generation 3. However, the controller will not change the observed generation to 2 because CurrentOperation was and remains Update. Later, a request to the broker is finally successful. If the request was sync, the controller sets the observed generation to 3. If the request was async, the controller does not set the observed generation: it stays 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense. The comment above was written before we discussed the meaning of ObservedGeneration vs LastAttemptedGeneration in Slack.

@@ -2039,6 +2072,8 @@ func assertServiceInstanceErrorBeforeRequest(t *testing.T, obj runtime.Object, r
assertServiceInstanceCurrentOperationClear(t, obj)
assertServiceInstanceOperationStartTimeSet(t, obj, false)
assertServiceInstanceReconciledGeneration(t, obj, originalInstance.Status.ReconciledGeneration)
assertServiceInstanceObservedGeneration(t, obj, originalInstance.Status.ObservedGeneration)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect that the controller would set ObservedGeneration even if there was an error prior to sending the request to the broker. For example, if there were an error getting the service class, the controller sets the condition for the instance to Ready:False. I would expect that the controller would also set the ObservedGeneration to the current generation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good point, but as I mentioned above it breaks the condition to check whether the instance has been reconciled or not. Probably keeping both ObservedGeneration and ReconciledGeneration will help there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@staebler I checked the controller_instance.go - currently we invoke recordStartOfServiceInstanceOperation() method only after we have checked the service class/plan and prepared an OSB request.
Do you suggest that we need to move this invocation higher (right after checking if we need to process at all)?
@kibbles-n-bytes do you agree?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that recordStartOfServiceInstanceOperation() method sets the instance.Status.InProgressProperties field that requires service class/plan.
Thus, we should either split recordStartOfServiceInstanceOperation() into two methods (the first updating ObservedGeneration and the second populating the operation details), or leave it as it is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wait until after the class/plan are checked, then the ObservedGeneration is inconsistent in the presence of an errors with the check. Let's say that there was an error getting the service class. The controller will change the instance to Ready:False. If the controller does not also set the ObservedGeneration, then it will appear as though there was an error with an earlier generation as opposed to there being an error reconciling the current generation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good point, but as I mentioned above it breaks the condition to check whether the instance has been reconciled or not.

@nilebox I don't understand this. Why does it break the condition? I would argue that it breaks the condition if we do not set the observed generation.

// needed for the instance based on ObservedGeneration
func isServiceInstanceProcessedAlready(instance *v1beta1.ServiceInstance) bool {
return instance.Status.ObservedGeneration >= instance.Generation &&
(isServiceInstanceReady(instance) || isServiceInstanceFailed(instance))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you feel about also adding a check for !instance.Status.OrphanMitigationInProgress? It doesn't change the behavior of the controller either way, because isServiceInstanceProcessedAlready is never called with orphan mitigation is in progress. However, I think it covers more completely the criteria that determine when a generation has been completely processed.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 27, 2018
)
resetServiceInstanceCondition(
toUpdate,
v1beta1.ServiceInstanceConditionFailed,
)
return c.updateServiceInstanceStatus(toUpdate)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I didn't want to sent an update request there (and instead send it with the first status update), but:

  1. The updateServiceInstanceReferences() updates only references but re-reads the entire instance from the server, overwriting the ObservedGeneration.
    Doesn't it lead to re-reading the spec too, thus potentially reading the newer spec than the recorded ObservedGeneration? @kibbles-n-bytes @staebler see controller_instance.go#L1030
    If it does, then we can't set ObservedGeneration separately from storing InProgressProperties, i.e. ObservedGeneration should be only updated in recordStartOfServiceInstanceOperation() and not earlier...
  2. if the status update will fail later, we won't update ObservedGeneration until the next iteration, thus the status will be obsolete for a while

Copy link
Contributor Author

@nilebox nilebox Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibbles-n-bytes Ok, I think I will just need to extract InProgressProperties from the prepareProvisionRequest() method (which also requires class/plan that might be invalid), then I will be able to just move the recordStartOfServiceInstanceOperation() higher to the top of processing (before retrieving class and plan) instead of splitting it into two methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually still not possible, because:

rh.inProgressProperties = &v1beta1.ServiceInstancePropertiesState{
  ClusterServicePlanExternalName: servicePlan.Spec.ExternalName,
  ClusterServicePlanExternalID:   servicePlan.Spec.ExternalID,
  ...

so annoying...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nilebox

re-reads the entire instance from the server, overwriting the ObservedGeneration

What do you mean be overwriting the ObservedGeneration? If that field has been changed in the instance held by the controller, then it will be written to the server. My reading of what you wrote is that you think that the change to the ObservedGeneration made by the controller will be overwritten by whatever is currently held in storage prior to the update. So, for example, if the ObservedGeneration in storage is 2, the controller set it to 3, then updateServiceInstanceReferences will end up setting it back to 2. My understanding is different. I would expect updateServiceInstanceReferences to cause the ObservedGeneration in storage to change to 3. The ObservedGeneration of the instance held by the controller will remain 3.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't it lead to re-reading the spec too, thus potentially reading the newer spec than the recorded ObservedGeneration?

@nilebox It will not lead to reading in a newer spec. If the spec is newer in storage, then the update will fail with a conflict. The reconciliation will end there with no changes affected. The instance will be re-queued, at which time the controller will attempt to reconcile the newer spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@staebler thanks, I completely forgot that update will fail with conflict, so there is no possibility of re-reading a newer spec.

I would expect updateServiceInstanceReferences to cause the ObservedGeneration in storage to change to 3.

No, this method is a subresource IIRC, and it will ignore the status (won't update ObservedGeneration) and will overwrite any changes to the status when instance is re-read. That's why we need to send a status update with updated ObservedGeneration separately before updating references. Please correct me if I'm missing something again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we can just remember the local changes to status and overwrite it again on the controller side. Then we don't have to make a separate call for updating ObservedGeneration - it will be updated with the first status update (i.e. in case of any error or when we record the operation)

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 27, 2018
updatedInstance, err := c.serviceCatalogClient.ServiceInstances(toUpdate.Namespace).UpdateReferences(toUpdate)
if err != nil {
glog.Errorf(pcb.Messagef("Failed to update references: %v", err))
}
// The UpdateReferences method ignores status changes.
// Restore status that might have changed locally to be able to update it later.
updatedInstance.Status = status
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of moving the ObservedGeneration change until after updateServiceInstanceReferences() method is invoked, I just "fixed" the updateServiceInstanceReferences method so that ordering doesn't matter anymore.
I think it's better than having to remember that the ordering matters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, seems reasonable, and with the no-multi-update changes coming it probably won't end up mattering soon anyway. 😃

@@ -468,6 +514,10 @@ func (c *controller) reconcileServiceInstanceDelete(instance *v1beta1.ServiceIns
glog.V(4).Info(pcb.Message("Processing deleting event"))

instance = instance.DeepCopy()
// Any status updates from this point should have an updated observed generation
if instance.Status.ObservedGeneration != instance.Generation {
c.prepareObservedGeneration(instance)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to set ObservedGeneration when doing orphan mitigation. That could cause the latest generation to be skipped. There's an argument to be made that we don't need to bother setting ObservedGeneration for a pure delete either, given that there will never be another provision or update for the object.

Here is an example of skipping a generation.

  1. User creates the instance.
  2. Controller sends a failing provision request to the broker.
  3. Prior to the controller starting the orphan mitigation process, the user changes the spec to generation 2.
  4. Controller does an orphan-mitigation reconcile with generation 2 of the instance. Controller sets ObservedGeneration to 2.
  5. Deprovision request completes successfully. Controller sets OrphanMitigationInProgress to false.
  6. Controller does an instance-add reconcile with generation 2 of the instance. ObservedGeneration is 2. Controller stops reconcile early because isServiceInstanceProcessedAlready returns true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree that we might just stop updating ObservedGeneration for both orphan mitigation and actual deletion. The latter is the one that is triggered by the user though, so it might be still better to report that we have started processing the deletion of instance.
Will add a condition on orphan mitigation there (ugly but should work).

// to "false" with empty reason and message
func resetServiceInstanceCondition(toUpdate *v1beta1.ServiceInstance,
conditionType v1beta1.ServiceInstanceConditionType) {
setServiceInstanceConditionInternal(toUpdate, conditionType, v1beta1.ConditionFalse, "", "", metav1.Now())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there precedence for having blank reason and message in conditions? My inclination is to use a reason like "ObservedNewGeneration" and a message like "Observed a new generation of the ServiceInstance". I am also inclined to call setServiceInstanceCondition directly in prepareObservedGeneration rather than having the resetServiceInstanceCondition function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will add some meaningful reason and message there.

@@ -950,6 +1009,13 @@ func newServiceInstanceFailedCondition(status v1beta1.ConditionStatus, reason, m
}
}

// resetServiceInstanceCondition sets a single condition on Instnace's status
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Instnace/Instance

@@ -2148,7 +2193,7 @@ func assertServiceInstanceRequestFailingError(t *testing.T, obj runtime.Object,
case v1beta1.ServiceInstanceOperationProvision, v1beta1.ServiceInstanceOperationDeprovision:
assertServiceInstanceCondition(t, obj, v1beta1.ServiceInstanceConditionFailed, v1beta1.ConditionTrue, failureReason)
case v1beta1.ServiceInstanceOperationUpdate:
assertServiceInstanceConditionsCount(t, obj, 1)
assertServiceInstanceCondition(t, obj, v1beta1.ServiceInstanceConditionFailed, v1beta1.ConditionTrue, failureReason)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The switch is not needed at all any more. The assert is the same for all operations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch!

@@ -216,7 +216,8 @@ func WaitForInstanceReconciledGeneration(client v1beta1servicecatalog.Servicecat
return false, fmt.Errorf("error getting Instance %v/%v: %v", namespace, name, err)
}

if instance.Status.ReconciledGeneration == reconciledGeneration {
if instance.Status.ObservedGeneration >= reconciledGeneration &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should change the name of the WaitForInstanceReconciledGeneration function and its reconciledGeneration parameter since it is now waiting for the generation to be processed rather than waiting for a particular reconciled generation.

@@ -1518,6 +1525,13 @@ func GetOpenAPIDefinitions(ref common.ReferenceCallback) map[string]common.OpenA
Ref: ref("github.com/kubernetes-incubator/service-catalog/pkg/apis/servicecatalog/v1beta1.ServiceInstancePropertiesState"),
},
},
"provisioned": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has not been updated with the json name change.

I wonder if this is supposed to be caught by the portion of the CI that verifies that the generated files are up-to-date.

// status true.
func isServiceInstanceFailed(instance *v1beta1.ServiceInstance) bool {
for _, condition := range instance.Status.Conditions {
if condition.Type == v1beta1.ServiceInstanceConditionFailed && condition.Status == v1beta1.ConditionTrue {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that isServiceInstanceReady and isServiceInstanceFailed are functions copied from elsewhere. But it makes me sad to see them so close together, doing the same thing, and yet differing slightly. Can we have each one call a common isServiceInstanceConditionTrue(*v1beta1.ServiceInstance, v1beta1.ServiceInstanceCondition) function? Barring that, can we be consistent about whether we short-circuit when the condition is found or whether we only short-circuit when the condition is found with a True status?

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 28, 2018
@nilebox
Copy link
Contributor Author

nilebox commented Feb 28, 2018

@staebler I believe all your recent comments were addressed, PTAL.

Copy link
Contributor

@staebler staebler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

@staebler staebler added the LGTM1 label Mar 1, 2018
@@ -537,8 +537,14 @@ type ServiceInstanceStatus struct {
// ReconciledGeneration is the 'Generation' of the serviceInstanceSpec that
// was last processed by the controller. The reconciled generation is updated
// even if the controller failed to process the spec.
// Deprecated: use ObservedGeneration instead
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be helpful to specify what 'use' pertains to here (totally fine to do this in a follow-up)

ReconciledGeneration int64 `json:"reconciledGeneration"`

// ObservedGeneration is the 'Generation' of the serviceInstanceSpec that
// was last processed by the controller. The observed generation is updated
// whenever the status is updated regardless of operation result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, this should end with a period.

if err != nil {
return err
}
if handled {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for returning early here 👍

// isServiceInstanceProcessedAlready returns true if there is no further processing
// needed for the instance based on ObservedGeneration
func isServiceInstanceProcessedAlready(instance *v1beta1.ServiceInstance) bool {
return instance.Status.ObservedGeneration >= instance.Generation &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really appreciate a comment here that explains this logic.

deprovisionStatus v1beta1.ServiceInstanceDeprovisionStatus
serviceBinding *v1beta1.ServiceBinding
generation int64
reconciledGeneration int64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, lol

@pmorie
Copy link
Contributor

pmorie commented Mar 1, 2018

I'll leave LGTM2 to @kibbles-n-bytes, but this has my stamp of approval, thanks a lot for driving this to completion.

@kibbles-n-bytes
Copy link
Contributor

sorry for the delay; diving into this now.

Copy link
Contributor

@kibbles-n-bytes kibbles-n-bytes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! If you want to address the Failed condition stuff later I'm not opposed; it's more applicable for the retries/reattempt situations anyway.

toUpdate.Status.ObservedGeneration = toUpdate.Generation
reason := observedNewGenerationReason
message := observedNewGenerationMessage
setServiceInstanceCondition(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition will never actually surface to the user as far as I'm aware. Whenever we update the status of our resource, we also update the Ready condition, so it'll always be overwritten.

setServiceInstanceCondition(
toUpdate,
v1beta1.ServiceInstanceConditionFailed,
v1beta1.ConditionFalse,
Copy link
Contributor

@kibbles-n-bytes kibbles-n-bytes Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that instead of setting the Failed condition to false, we should be removing the condition entirely. That seems the more Kube-way™ to me. For example, when transient errors occur on Deployments during pod creation, they are tagged with the ReplicaFailure condition, but once they're resolved that condition is removed as opposed to unset. And I think it makes the user experience more straightforward during reattempts after spec edits, as the conditions a user would see wouldn't be different depending on whether they experienced a failure in the past, and they wouldn't have to walk through the logic of "oh, I have a Failed condition; wait, but it's false, so we're not actually failed".

updatedInstance, err := c.serviceCatalogClient.ServiceInstances(toUpdate.Namespace).UpdateReferences(toUpdate)
if err != nil {
glog.Errorf(pcb.Messagef("Failed to update references: %v", err))
}
// The UpdateReferences method ignores status changes.
// Restore status that might have changed locally to be able to update it later.
updatedInstance.Status = status
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, seems reasonable, and with the no-multi-update changes coming it probably won't end up mattering soon anyway. 😃

@nilebox
Copy link
Contributor Author

nilebox commented Mar 3, 2018

I'll address remaining comments in other PRs.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. LGTM1 LGTM2 non-happy-path size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants