Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Clusterctl alpha rollout history #7994

Conversation

hiromu-a5a
Copy link
Contributor

@hiromu-a5a hiromu-a5a commented Jan 25, 2023

What this PR does / why we need it:

Adds command and client for the clusterctl alpha rollout history.

Although the rollout undo command is already implemented, it's hard to determine a revision number that should be passed to --to-revision option due to the lack of the history command.
The basic behavior of this history command is similar to the kubectl rollout history. A few differences are: i) only the machinedeployment is supported; ii) clusterctl alpha rollout history --revision=<revision> describes all attributes of a machinset corresponding to that revision in yaml format.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Tracking Issue:
#3439

Depends-on:
#7988

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 25, 2023
@k8s-ci-robot
Copy link
Contributor

Hi @hiromu-a5a. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 25, 2023
@hiromu-a5a
Copy link
Contributor Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 25, 2023
@hiromu-a5a
Copy link
Contributor Author

unhold after #7988 is merged.

@fabriziopandini fabriziopandini added the area/clusterctl Issues or PRs related to clusterctl label Jan 25, 2023
@fabriziopandini
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 25, 2023
// Create a revisionToChangeCause map
historyInfo := make(map[int64]string)
for _, ms := range msList {
v, err := mdutil.Revision(ms)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just stumbled into machinedeployment.clusters.x-k8s.io/revision-history and I'm wondering if we should keep this as well into account while building the revision history

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your comments.
Maybe, it's a good time to discuss the deprecation of revision-history. I personally can't find any use cases for this annotation but we need to confirm that, and I think it should be handled by another issue.

FYI, I found that Kubernetes also has this annotation, but it's not used in the actual logic.

Copy link
Member

@fabriziopandini fabriziopandini Jan 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In CAPI it is used in https://github.com/kubernetes-sigs/cluster-api/blob/29618402670e19c77833edbede6d9e[…]7b4d1036a/internal/controllers/machinedeployment/mdutil/util.go

TL;DR each machineset keeps track of the revision it currently is, and if “reused”, it keeps also track of the revision it was…

Maybe, it's a good time to discuss the deprecation of revision-history.

I would prefer to get this PR merged with support for revision-history and then eventually open the discussion deprecation vs keeping this PR around on hold till the disussion is completed and revision-history is actually deprecated

Copy link
Contributor

@ykakarap ykakarap Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not showing the revision numbers that are listed under revision history annotation, at least as separate entries in the output.
Since the idea behind the history command is to list the revisions the user can rollback to, it would be best to not show the revision numbers in the revision history annotation as it is not possible to rollback using those revision numbers.

If we want really want to show those numbers in the output for the sake of having a full history then how about adding a third column called "matching revisions"/"duplicate reivisions" or some better name where we list these revision numbers against target revision numbers?

Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we rollback to revision number in revision history?

Copy link
Contributor Author

@hiromu-a5a hiromu-a5a Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I basically agree with @ykakarap 's comment.

Why can't we rollback to revision number in revision history?

Do you have any ideas for this use case? I thought, maybe, users want to store a specific revision number and want to rollback there always, but such needs can be satisfied by storing manifests for that revision. I think rollback is just a rollback for temporal error handling, and thus there's no need to rollback to a revision number in the history.

Copy link
Member

@sbueringer sbueringer Feb 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up from the discussion in the office hours.

I think it's very user unfriendly if we do not support revision-history (as long as Cluster API is using it). Today Cluster API only tracks MachineSets that are actually different via separate MachineSet objects. As soon a MachineSet is the same as another one in the history revision-history is used automatically.

If we don't support revision-history from a user point-of-view there will be arbritrary holes in the history (e.g. if 1, 4, 5, 7 are the same the history shown by clusterctl would be: 2, 3, 6, 7). I think this is very hard to understand for users.

I would suggest:

  • in the history command: list all revisions including the one from the revision-history annotation
  • for the rollback command: make it possible to also rollback to revisions from the revision history (should be as easy as also looking at the revision-history annotation instead of only at the revision annotation to find the MachineSet for a specific revision)

Copy link
Contributor Author

@hiromu-a5a hiromu-a5a Feb 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for following up and my apologies for the delayed reply.

I changed PR following your suggestion.

I'll deal with your second suggestion in a different PR as it should be an enhancement of undo, but not the implementation of history.

should be as easy as also looking at the revision-history annotation instead of only at the revision annotation to find the MachineSet for a specific revision

This makes sense.


BTW, I made output of the history command like this. If I am a user, I want to group revisions associated to the same MachineSet with the same change cause.

REVISIONS       CHANGE-CAUSE
        2             <none>  
    4,6,8,10      add label
   1,3,5,7,9,11   update to the latest k8s version

The above is more useful than this when you are looking for a revision that you want to rollback.

REVISION        CHANGE-CAUSE
        1             <none>  
        2             <none>  
        3             <none>  
        4             <none>  
        5             <none>  
        6             <none>  
        7             <none>  
        8             <none>  
        9             <none>  
       10            add label  
       11            update to the latest k8s version

As an alternative, we can extend the change cause to record its history but I think that is too much.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 30, 2023
@hiromu-a5a hiromu-a5a force-pushed the clusterctl-alpha-rollout-history branch from 74bf7b1 to 291382b Compare January 31, 2023 16:06
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 31, 2023
@hiromu-a5a
Copy link
Contributor Author

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 31, 2023
@hiromu-a5a
Copy link
Contributor Author

I'll update the note v1.3 to v1.4 later.

cmd/clusterctl/cmd/rollout/history.go Outdated Show resolved Hide resolved
cmd/clusterctl/cmd/rollout/history.go Outdated Show resolved Hide resolved

// NewCmdRolloutHistory returns a Command instance for 'rollout history' sub command.
func NewCmdRolloutHistory(cfgFile string) *cobra.Command {
historyOpt := client.RolloutHistoryOptions{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curios, doesn't this need to be a top-level variable? Similar to how it is defined in the other commands?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a result of trial-and-error during creating #7988. I felt there are no reasons to put it as a global variable and changed it, then I forgot to revert it, sorry.
Also, I thought of using client.Rollout*Options as the only structure for handling options rather than using separated structures for Options in cmd and client. In the end, I think it's better to keep the same style as the existing codes in our new codes, fixed it.

return err
}
default:
return errors.Errorf("invalid resource type %q, valid values are %v", ref.Kind, validResourceTypes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please introduce a new validHistroyResourceTypes and use that here? Because validResourceTypes also include KCP and KCP does not have any history feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

sort.Slice(revisions, func(i, j int) bool { return revisions[i] < revisions[j] })

// Output the revisionToChangeCause map
writer := new(tabwriter.Writer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: We are already using github.com/olekukonko/tablewriter in other parts of the clusterctl codebase to print and format out tables. How about the using the same here as well?

Examples:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for sharing. It makes sense for me. I changed this part with tablewriter.


// ChangeCauseAnnotation is the annotation set on MachineSets by users to identify the cause of revision changes.
// This annotation will be shown when users run `clusterctl alpha rollout history`.
ChangeCauseAnnotation = "cluster.x-k8s.io/change-cause"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This annotation is never updated, right? So the cause is always "". Is there a follow-up planned to update this annotation?
If so, FYI, that work could potentially overlap with the in-place propagation work (#7731).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the plan to update this specific annotation directly from the codes. I assume that this annotation is set by users, like Kubernetes.

Also, I'm okay with propagating this annotation set in a machinedeployment to machinsets within the scope of the label propagation as Kubernetes does a similar behavior (as described below), but I guess it shouldn't be specific for this annotation.

CHANGE-CAUSE is copied from the Deployment annotation kubernetes.io/change-cause to its revisions upon creation. You can specify theCHANGE-CAUSE message by:

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 2, 2023
@hiromu-a5a hiromu-a5a force-pushed the clusterctl-alpha-rollout-history branch 2 times, most recently from 7023c88 to 10a01a5 Compare February 6, 2023 13:07
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 6, 2023
Copy link
Contributor Author

@hiromu-a5a hiromu-a5a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review.
Please kindly review the updates.

sort.Slice(revisions, func(i, j int) bool { return revisions[i] < revisions[j] })

// Output the revisionToChangeCause map
writer := new(tabwriter.Writer)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for sharing. It makes sense for me. I changed this part with tablewriter.

return err
}
default:
return errors.Errorf("invalid resource type %q, valid values are %v", ref.Kind, validResourceTypes)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


// NewCmdRolloutHistory returns a Command instance for 'rollout history' sub command.
func NewCmdRolloutHistory(cfgFile string) *cobra.Command {
historyOpt := client.RolloutHistoryOptions{}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a result of trial-and-error during creating #7988. I felt there are no reasons to put it as a global variable and changed it, then I forgot to revert it, sorry.
Also, I thought of using client.Rollout*Options as the only structure for handling options rather than using separated structures for Options in cmd and client. In the end, I think it's better to keep the same style as the existing codes in our new codes, fixed it.

@hiromu-a5a hiromu-a5a force-pushed the clusterctl-alpha-rollout-history branch from 2449370 to 06ad075 Compare April 7, 2023 12:55
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 7, 2023
@hiromu-a5a
Copy link
Contributor Author

I've applied all comments. Please kindly review it again when you have time :)

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 18, 2023
@hiromu-a5a hiromu-a5a force-pushed the clusterctl-alpha-rollout-history branch from 06ad075 to d2e1572 Compare July 10, 2023 13:05
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 10, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sbueringer for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 19, 2023
@Jont828
Copy link
Contributor

Jont828 commented Sep 7, 2023

@hiromu-a5a Looks like you need to rebase. Do you need any help with this PR?

@hiromu-a5a hiromu-a5a force-pushed the clusterctl-alpha-rollout-history branch from d2e1572 to 2a000d4 Compare September 16, 2023 12:08
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 16, 2023
@k8s-ci-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hiromu-a5a hiromu-a5a force-pushed the clusterctl-alpha-rollout-history branch from 2a000d4 to d60c14d Compare December 14, 2023 04:59
@hiromu-a5a hiromu-a5a force-pushed the clusterctl-alpha-rollout-history branch from d60c14d to b630cbb Compare December 14, 2023 07:35
@hiromu-a5a
Copy link
Contributor Author

@Jont828 Sorry for the late reply. I think this PR is now ready for review again. Could you check it?

@k8s-ci-robot
Copy link
Contributor

@hiromu-a5a: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-e2e-informing-main d2e1572 link false /test pull-cluster-api-e2e-informing-main
pull-cluster-api-apidiff-main b630cbb link false /test pull-cluster-api-apidiff-main
pull-cluster-api-e2e-blocking-main b630cbb link true /test pull-cluster-api-e2e-blocking-main

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 25, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 25, 2024
@Jont828
Copy link
Contributor

Jont828 commented Jun 11, 2024

@hiromu-a5a Are you still working on this? If not we might want to close this PR as it's been without activity for almost 6 months like the other two.

@sbueringer
Copy link
Member

At this point it's very very likely that we will deprecate and remove the revision management, see #10479

@hiromu-a5a
Copy link
Contributor Author

I am really sorry for the late reply. We are shifting to GitOps and lost motivation to clusterctl.

@hiromu-a5a hiromu-a5a closed this Jun 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/clusterctl Issues or PRs related to clusterctl cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants