
test: prometheus scale down e2e test #448

Merged

Conversation


@lihongyan1 lihongyan1 commented Mar 28, 2024

This should fix test failure COO-79:
monitoring_stack_controller_test.go:585: assertion failed: 0 (prom.Status.Replicas int32) != 1 (int32)
It also fixes COO-84 ([QE] Case Alertmanager_runs_in_HA_mode is not stable):
monitoring_stack_controller_test.go:99: statefulsets.apps "alertmanager-alerting" not found

@lihongyan1 lihongyan1 requested a review from a team as a code owner March 28, 2024 05:47
@lihongyan1 lihongyan1 requested review from danielmellado and JoaoBraveCoding and removed request for a team March 28, 2024 05:47
@openshift-ci openshift-ci bot requested a review from simonpasquier March 28, 2024 05:47
@lihongyan1 lihongyan1 force-pushed the fix-test-sacledown-prometheus branch 5 times, most recently from 1494d03 to 953b8cb Compare April 2, 2024 05:21
@lihongyan1 (Contributor Author)

@simonpasquier @jan--f Please help review when convenient

@lihongyan1 lihongyan1 force-pushed the fix-test-sacledown-prometheus branch from 953b8cb to f0d678a Compare April 9, 2024 08:21
f.GetResourceWithRetry(t, ms.Name, ms.Namespace, &prom)

assert.Equal(t, prom.Status.Replicas, int32(1))
if err = wait.PollUntilContextTimeout(context.Background(), 5*time.Second, framework.DefaultTestTimeout, true, func(ctx context.Context) (bool, error) {
Collaborator:

GetResourceWithRetry already polls on getting the Prometheus resource.

If we need to retry when checking Status.Replicas, I'd propose adding an AssertReplicaStatus(name, namespace, resources, expectedReplicas) helper to framework.go.
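
A minimal sketch of what such a helper could look like in framework.go, assuming the Framework struct exposes a controller-runtime K8sClient and the DefaultTestTimeout referenced above, and using the upstream monitoring/v1 types (field names, import paths, the closure-style signature, and the 5-second poll interval are assumptions, not the merged implementation):

package framework

import (
	"context"
	"fmt"
	"testing"
	"time"

	monv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/apimachinery/pkg/util/wait"
)

// AssertPrometheusReplicaStatus polls until the Prometheus status reports the
// expected replica count, failing the test on timeout.
func (f *Framework) AssertPrometheusReplicaStatus(name, namespace string, expectedReplicas int32) func(t *testing.T) {
	return func(t *testing.T) {
		prom := monv1.Prometheus{}
		err := wait.PollUntilContextTimeout(context.Background(), 5*time.Second, DefaultTestTimeout, true,
			func(ctx context.Context) (bool, error) {
				key := types.NamespacedName{Name: name, Namespace: namespace}
				if err := f.K8sClient.Get(ctx, key, &prom); err != nil {
					return false, nil // keep polling through transient get errors
				}
				return prom.Status.Replicas == expectedReplicas, nil
			})
		if wait.Interrupted(err) {
			t.Fatal(fmt.Errorf("Prometheus %s/%s has unexpected number of replicas, got %d, expected %d",
				namespace, name, prom.Status.Replicas, expectedReplicas))
		}
	}
}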

@openshift-ci openshift-ci bot removed the approved label Apr 11, 2024
@lihongyan1 lihongyan1 force-pushed the fix-test-sacledown-prometheus branch from 93e585a to a35d0d5 Compare April 11, 2024 05:48
@lihongyan1 lihongyan1 force-pushed the fix-test-sacledown-prometheus branch from a35d0d5 to 7968633 Compare April 11, 2024 06:05
@lihongyan1 lihongyan1 force-pushed the fix-test-sacledown-prometheus branch from 7968633 to 5e34e30 Compare April 11, 2024 06:19
if wait.Interrupted(err) {
t.Fatal(fmt.Errorf("Prometheus was not scaled down"))
}
f.AssertPrometheusReplicaStatus(ms.Name, ms.Namespace, numOfRep)
Collaborator:
Suggested change
f.AssertPrometheusReplicaStatus(ms.Name, ms.Namespace, numOfRep)
f.AssertPrometheusReplicaStatus(ms.Name, ms.Namespace, 0)

Should this test for scale down to 0?

Contributor Author:

Sorry, that was a typo. Thanks for your sharp eyes.
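
For reference, a rough sketch of the corrected scale-down step in the test, assuming the MonitoringStack spec exposes PrometheusConfig.Replicas and the helper follows the closure-style sketch above (variable names are illustrative, not the merged code):

// Scale the MonitoringStack's Prometheus down to zero replicas.
zero := int32(0)
ms.Spec.PrometheusConfig.Replicas = &zero
if err := f.K8sClient.Update(context.Background(), ms); err != nil {
	t.Fatal(fmt.Errorf("failed to scale down MonitoringStack %s/%s: %w", ms.Namespace, ms.Name, err))
}
// Assert the Prometheus status eventually reports zero replicas.
f.AssertPrometheusReplicaStatus(ms.Name, ms.Namespace, 0)(t)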

return true, nil

}); wait.Interrupted(err) {
t.Fatal(fmt.Errorf("Prometheus %s/%s was not scaled down to %d", namespace, name, expectedReplicas))
Collaborator:

Suggested change
t.Fatal(fmt.Errorf("Prometheus %s/%s was not scaled down to %d", namespace, name, expectedReplicas))
t.Fatal(fmt.Errorf("Prometheus %s/%s has unexpected number of replicas, got %d, expected %d", namespace, name, prom.Status.Replicas, expectedReplicas))

@lihongyan1 (Contributor Author) commented Apr 15, 2024

The fix from PR #454 does not work: case Alertmanager_runs_in_HA_mode still failed in https://github.com/rhobs/observability-operator/actions/runs/8660157870/job/23747502573?pr=448. I will provide a new update for the Alertmanager_runs_in_HA_mode case.

@lihongyan1 lihongyan1 force-pushed the fix-test-sacledown-prometheus branch from f2b0a3c to bd97bc3 Compare April 15, 2024 06:07
@lihongyan1 lihongyan1 force-pushed the fix-test-sacledown-prometheus branch from bd97bc3 to b8c9740 Compare April 15, 2024 06:11
@lihongyan1 (Contributor Author)

/retest

@jan--f jan--f (Collaborator) left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Apr 22, 2024

openshift-ci bot commented Apr 22, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jan--f, lihongyan1


@lihongyan1 (Contributor Author)

/retest

@openshift-merge-bot openshift-merge-bot bot merged commit 4cb05b3 into rhobs:main Apr 23, 2024
11 checks passed