Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complete BatchingTopicController refactoring: handler classes #10248

Closed
wants to merge 14 commits into from

Conversation

fvaleri
Copy link
Contributor

@fvaleri fvaleri commented Jun 20, 2024

Type of change

  • Refactoring + fix

Description

This change completes the BatchingTopicController refactoring by introduicing a set of handlers, one for each external system, that host the request logic. With that, the BatchingTopicController class is only focused on the reconciliation logic, resulting in 40% less code and better maintenance.

Tested with all Topic Operator unit and integration tests, TopicST, and TopicReplicasChangeST. I also ran the scalability test comparing this branch with 0.41.0, and the performance doesn't change.

EDIT: This PR now includes a fix for the TO-CC race issue.

Checklist

  • Make sure all tests pass
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally

@fvaleri fvaleri requested a review from tombentley June 20, 2024 17:17
@fvaleri fvaleri added this to the 0.42.0 milestone Jun 20, 2024
@fvaleri fvaleri requested a review from ppatierno June 20, 2024 17:18
@fvaleri
Copy link
Contributor Author

fvaleri commented Jun 20, 2024

Scalability comparison with the same custom perf test.

Screenshot from 2024-06-20 19-10-57

@fvaleri
Copy link
Contributor Author

fvaleri commented Jun 21, 2024

Failed test is unrelated and working when running locally.

Copy link
Member

@tombentley tombentley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have a KafkaHandler and a KubernetesHandler shouldn't be be able to have unit tests specifically for those?

@fvaleri fvaleri force-pushed the to-btc-ref-handlers branch 6 times, most recently from cad9cde to ad53e76 Compare July 1, 2024 12:11
@scholzj scholzj modified the milestones: 0.42.0, 0.43.0 Jul 1, 2024
@fvaleri fvaleri force-pushed the to-btc-ref-handlers branch from ad53e76 to 85743a6 Compare July 2, 2024 16:51
@fvaleri
Copy link
Contributor Author

fvaleri commented Jul 2, 2024

@tombentley this took more than I anticipated, as there were some test utils improvements to do. Mainly around recovery from interrupted tests, and namespace cleanup. I also improved the test log4j configuration. No existing test logic has been changed.

We now have tests for the two new handlers (Kube and Kafka), and I ran locally the full integration test suite multiple times, including TopicControllerIT, and a couple of system tests: TopicST, TopicReplicasChangeST. Finally, I also ran the Topic Operator manually and checked basic functionality.

Screenshot from 2024-07-02 17-35-34

I would suggests to run the regression tests suite (as I can't do it), and review this PR after #10241, so that I can take care of the rebase after the bugifx merge.

@fvaleri fvaleri force-pushed the to-btc-ref-handlers branch from 85743a6 to 9dc73e0 Compare July 2, 2024 18:11
@fvaleri
Copy link
Contributor Author

fvaleri commented Jul 3, 2024

@ppatierno maybe you can start the regression tests. Thanks.

@ppatierno
Copy link
Member

/azp run regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

fvaleri added 3 commits July 6, 2024 18:08
This change completes the BatchingTopicController refactoring by introduicing a set of handlers, one for each external system, that host the request logic.
With that, the BatchingTopicController class is only focused on the reconciliation logic, resulting in 40% less code and better maintenance.

Tested with all Topic Operator unit and integration tests, TopicST, and TopicReplicasChangeST.
I also ran the scalability test comparing this branch with 0.41.0, and the performance doesn't change.

Signed-off-by: Federico Valeri <[email protected]>
Signed-off-by: Federico Valeri <[email protected]>
Signed-off-by: Federico Valeri <[email protected]>
@fvaleri fvaleri force-pushed the to-btc-ref-handlers branch from 9dc73e0 to 854b295 Compare July 6, 2024 16:56
fvaleri added 2 commits July 8, 2024 17:15
Signed-off-by: Federico Valeri <[email protected]>
Signed-off-by: Federico Valeri <[email protected]>
@tombentley
Copy link
Member

/azp run regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Federico Valeri <[email protected]>
@fvaleri fvaleri force-pushed the to-btc-ref-handlers branch from 9c4d490 to c804447 Compare July 9, 2024 15:30
@fvaleri
Copy link
Contributor Author

fvaleri commented Jul 9, 2024

@tombentley for now I created the addConfigurationWarnings to add back the config warning conditions, and enable related tests. We also need to refactor the CruiseControlHandler to avoid the KT status updates (side effect).

I'm a bit confused on how to avoid side effects in both cases with in-place recording. That's probably due to my lack of experience with functional programming. I know you suggested to use the new Results class, but how?

@fvaleri
Copy link
Contributor Author

fvaleri commented Jul 10, 2024

Completed the alter config filtering, now working on the replicas changes.

@fvaleri fvaleri marked this pull request as draft July 10, 2024 17:26
Signed-off-by: Federico Valeri <[email protected]>
@fvaleri
Copy link
Contributor Author

fvaleri commented Jul 11, 2024

@tombentley following your suggestion, I removed side effects from the reconciliation logic, and added a note in BatchingTopicController.onUpdate. It is not perfect, but I think it's a big improvement, and should make maintenance easier. Feel free to iterate on that if you have time. I ran all tests mentioned above, plus manual and performance tests. It's all good. I'm going to ask to trigger regression tests one more time.

fvaleri added 2 commits July 11, 2024 15:47
Signed-off-by: Federico Valeri <[email protected]>
Signed-off-by: Federico Valeri <[email protected]>
@fvaleri fvaleri marked this pull request as ready for review July 11, 2024 21:19
@fvaleri fvaleri requested a review from tombentley July 11, 2024 21:19
Copy link
Member

@tombentley tombentley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now at the point of being too big to review properly. It was big before, but it now seems to combine what were two separate strands of work: One to fix the race condition and one to do a refactoring.

In many ways I 'd rather not merge something which it too big to review. But I know this already represents a significant effort, so I don't want to insist that you break it down into smaller PRs. But please let's avoid doing this again in the future!

@fvaleri
Copy link
Contributor Author

fvaleri commented Jul 12, 2024

@tombentley having separate PRs was the original plan, but then things started to pile up with time, and I didn't know the full solution anymore, so I needed all the pieces to make progress.

Even if I think the code is fine, I don't want to merge without your review. Would the following split work for you?

  1. Test refactoring and cleanup.
  2. Reconciliation refactoring.
  3. TO-CC race condition issue fix.

@tombentley
Copy link
Member

@fvaleri it would. If you're willing for break it up I'd appreciate it!

@fvaleri
Copy link
Contributor Author

fvaleri commented Jul 12, 2024

@Frawless can you please launch regression tests on this? Thanks.

@Frawless
Copy link
Member

/azp run regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@Frawless
Copy link
Member

/packit test --labels regression

@fvaleri
Copy link
Contributor Author

fvaleri commented Jul 12, 2024

Regression tests looks good. We only have one failure in testing-farm from a flaky test, which is already reported in #10295.

Signed-off-by: Federico Valeri <[email protected]>
fvaleri added 4 commits July 22, 2024 10:01
Signed-off-by: Federico Valeri <[email protected]>
Signed-off-by: Federico Valeri <[email protected]>
Signed-off-by: Federico Valeri <[email protected]>
Signed-off-by: Federico Valeri <[email protected]>
@fvaleri
Copy link
Contributor Author

fvaleri commented Aug 2, 2024

Split into two new PRs.

@fvaleri fvaleri closed this Aug 2, 2024
@fvaleri fvaleri deleted the to-btc-ref-handlers branch August 2, 2024 12:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants