Support multi-datacenter clusters in test runner and script #1881

rzetelskik · 2024-04-04T11:36:57Z

Description of your changes: This PR adds support for multi-datacenter clusters in our test runner and scripts.

Which issue is resolved by this Pull Request:
Prerequisite for #1632.

/kind feature
/priority important-longterm
/cc

scylla-operator-bot · 2024-04-04T11:37:00Z

@rzetelskik: GitHub didn't allow me to request PR reviews from the following users: rzetelskik.

Note that only scylladb members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

Description of your changes: WIP

Which issue is resolved by this Pull Request:
Resolves #

/kind feature
/priority important-longterm
/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

rzetelskik · 2024-04-04T12:55:45Z

/cc zimnx tnozicka

rzetelskik · 2024-04-11T13:50:55Z

#1525 (comment)

#1694 (comment)

I don't see how these changes could contribute to the flakiness, since they only touch test machinery, so I'm not investigating further.

/test images
/retest

rzetelskik · 2024-04-14T10:52:51Z

@rzetelskik: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gke-parallel | 9ff84af | link | true | /test e2e-gke-parallel |
Full PR test history. Your PR dashboard.

Cluster provisioning failed.
/test images
/retest

rzetelskik · 2024-04-14T12:11:09Z

@rzetelskik: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gke-serial | 61326f2 | link | true | /test e2e-gke-serial |
Full PR test history. Your PR dashboard.

Cluster provisioning failed.
/retest

rzetelskik · 2024-04-18T10:04:28Z

@rzetelskik: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gke-parallel-clusterip | d3b46ef | link | true | /test e2e-gke-parallel-clusterip |
Full PR test history. Your PR dashboard.

Known Scylla Manager flake. Already working on it, this PR is unrelated.

/retest

hack/.ci/run-e2e-gke.sh

pkg/genericclioptions/genericclioptions.go

test/e2e/utils/exec.go

pkg/genericclioptions/genericclioptions.go

rzetelskik · 2024-04-29T10:48:49Z

@zimnx thanks for the review, I've replied to all of your comments.

test/e2e/set/nodeconfig/nodeconfig_disksetup.go

pkg/genericclioptions/genericclioptions.go

tnozicka

Thanks for the updates!

I think this is a good start and it bumps into multiple design issues at once. It brings a lot of value and I think we can iterate on the scripts or framework interfaces in the future, so it's not set in stone. It will be easier to address each of those nits individually, if needed.

/approve
lgtm, but you need to fix the CI failure

rzetelskik · 2024-05-31T16:12:09Z

> lgtm, but you need to fix the CI failure

I sent https://github.com/scylladb/scylla-operator-release/pull/206 to fix the typo on the CI side.

tnozicka · 2024-06-03T07:18:41Z

https://github.com/scylladb/scylla-operator-release/pull/206 landed
/retest

tnozicka · 2024-06-03T07:19:03Z

/lgtm

rzetelskik · 2024-06-03T08:30:14Z

@rzetelskik: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gke-release-script-latest | d03fb8e | link | unknown | /test e2e-gke-release-script-latest |
Full PR test history. Your PR dashboard.

Waiting for cluster to be provisioned...
Cluster provisioning failed. Exiting.
Missing kubeconfigs.
Usage: /usr/bin/bash kubeconfig [kubeconfig ...]

@tnozicka should I fix this in scylla-operator-release (pass kubeconfig to funcs) or make them discover kubeconfigs here?

tnozicka · 2024-06-03T08:42:09Z

I'd use an env var, KUBECONFIGS, to match how everything else handles KUBECONFIG. If you get KUBECONFIGS, it wins over KUBECONFIG; if you only get KUBECONFIG, translate it to KUBECONFIGS[0] so you can use KUBECONFIGS consistently, if needed.
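
A minimal sketch of the precedence described in this comment, written for bash. It is not code from this PR: the function name `normalize_kubeconfigs` is hypothetical, and `KUBECONFIGS` is assumed to be a shell array local to the script rather than an exported variable. The sketch also treats `KUBECONFIG` as a single path, although kubectl accepts a colon-delimited list there.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): make KUBECONFIGS the single source
# of truth, falling back to KUBECONFIG when no array was provided.
normalize_kubeconfigs() {
  # A non-empty KUBECONFIGS array wins over KUBECONFIG.
  if [[ -n "${KUBECONFIGS+x}" && "${#KUBECONFIGS[@]}" -gt 0 ]]; then
    return 0
  fi

  # Otherwise translate a single KUBECONFIG into KUBECONFIGS[0].
  if [[ -n "${KUBECONFIG:-}" ]]; then
    KUBECONFIGS=( "${KUBECONFIG}" )
    return 0
  fi

  # Neither variable is usable.
  echo "Missing kubeconfigs." >&2
  return 1
}
```

After calling the helper, the rest of a script can iterate over `"${KUBECONFIGS[@]}"` regardless of which variable the caller set.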

rzetelskik · 2024-06-03T11:36:39Z

> I'd use an env var, KUBECONFIGS, to match how everything else handles KUBECONFIG. If you get KUBECONFIGS, it wins over KUBECONFIG; if you only get KUBECONFIG, translate it to KUBECONFIGS[0] so you can use KUBECONFIGS consistently, if needed.

You can't really pass/test arrays as env vars that way, so best I could do here is to do this for KUBECONFIG_DIR on sourcing e2e lib. Unless you know a reasonable workaround for this.
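
For context, bash cannot export array variables, so a child process only ever sees flat strings in its environment. One way around that is to pass a single directory path and rebuild the array on the receiving side. The sketch below is an illustration under assumptions, not the code that landed: the function name `discover_kubeconfigs` and the `*.kubeconfig` file naming are hypothetical, and only the `KUBECONFIG_DIR` variable name comes from this thread.

```shell
#!/usr/bin/env bash

# Hypothetical sketch: populate the KUBECONFIGS array from files found in a
# directory. The "*.kubeconfig" naming is an assumption for illustration.
discover_kubeconfigs() {
  # Use the first argument, or fall back to KUBECONFIG_DIR.
  local dir="${1:-${KUBECONFIG_DIR:-}}"
  KUBECONFIGS=()

  # Fail when no directory was given or it does not exist.
  [[ -n "${dir}" && -d "${dir}" ]] || return 1

  local f
  for f in "${dir}"/*.kubeconfig; do
    # Skip the unexpanded pattern when nothing matches.
    [[ -e "${f}" ]] || continue
    KUBECONFIGS+=( "${f}" )
  done

  # Succeed only if at least one kubeconfig was found.
  [[ "${#KUBECONFIGS[@]}" -gt 0 ]]
}
```

Because `KUBECONFIG_DIR` is a plain string, it survives `export` and can be checked like any other environment variable, while the array stays local to the script that sources the helper.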

tnozicka · 2024-06-03T13:03:18Z

> You can't really pass/test arrays as env vars that way, so best I could do here is to do this for KUBECONFIG_DIR on sourcing e2e lib.

sounds good

rzetelskik · 2024-06-03T13:35:10Z

> > You can't really pass/test arrays as env vars that way, so best I could do here is to do this for KUBECONFIG_DIR on sourcing e2e lib.
>
> sounds good

ok, done

@tnozicka I realised I haven't passed kubeconfigs to the e2e pod in this PR. Should we land this as a starting point regardless? As I'm trying to run a multi-dc e2e test I'll probably bump into some other issues, but I think the baseline for framework etc is solid.

tnozicka · 2024-06-03T14:01:58Z

I am fine with followups

/approve
/lgtm

scylla-operator-bot · 2024-06-03T14:02:06Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rzetelskik, tnozicka, zimnx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [tnozicka,zimnx]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

rzetelskik · 2024-06-03T14:06:21Z

/hold cancel

rzetelskik mentioned this pull request Apr 4, 2024

[DUMMY] Add an E2E test for multi-datacenter clusters - test #1632

Closed

rzetelskik force-pushed the multi-region-support branch from 10d5a2a to 9ff84af Compare April 4, 2024 11:45

rzetelskik changed the title ~~[WIP] Support multi-dc clusters in test runner and script~~ Support multi-dc clusters in test runner and script Apr 4, 2024

scylla-operator-bot bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Apr 4, 2024

scylla-operator-bot bot requested review from tnozicka and zimnx April 4, 2024 12:55

rzetelskik changed the title ~~Support multi-dc clusters in test runner and script~~ Support multi-datacenter clusters in test runner and script Apr 14, 2024

rzetelskik force-pushed the multi-region-support branch 2 times, most recently from 325c6b6 to 61326f2 Compare April 14, 2024 10:54

scylla-operator-bot bot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Apr 18, 2024

rzetelskik force-pushed the multi-region-support branch from 61326f2 to d3b46ef Compare April 18, 2024 08:15

scylla-operator-bot bot removed the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Apr 18, 2024

zimnx reviewed Apr 26, 2024

rzetelskik force-pushed the multi-region-support branch 2 times, most recently from 255f6d4 to 9f0d333 Compare April 29, 2024 10:46

rzetelskik requested a review from zimnx April 29, 2024 10:48

zimnx reviewed Apr 29, 2024

test/e2e/set/nodeconfig/nodeconfig_disksetup.go (outdated)

zimnx reviewed Apr 29, 2024

pkg/genericclioptions/genericclioptions.go (outdated)

rzetelskik force-pushed the multi-region-support branch from 9f0d333 to 66167d5 Compare April 29, 2024 11:21

rzetelskik changed the title ~~[WIP] Support multi-datacenter clusters in test runner and script~~ Support multi-datacenter clusters in test runner and script May 31, 2024

scylla-operator-bot bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) May 31, 2024

rzetelskik requested a review from tnozicka May 31, 2024 13:04

tnozicka reviewed May 31, 2024

scylla-operator-bot bot added the lgtm label (indicates that a PR is ready to be merged) Jun 3, 2024

rzetelskik force-pushed the multi-region-support branch from d03fb8e to 2d65286 Compare June 3, 2024 08:37

scylla-operator-bot bot removed the lgtm label (indicates that a PR is ready to be merged) Jun 3, 2024

rzetelskik force-pushed the multi-region-support branch 5 times, most recently from 3e5efa0 to 55cdb76 Compare June 3, 2024 11:36

rzetelskik requested a review from tnozicka June 3, 2024 11:36

Support multi-datacenter clusters in test runner and script

4d4cf5c

rzetelskik force-pushed the multi-region-support branch from 55cdb76 to 4d4cf5c Compare June 3, 2024 12:14

scylla-operator-bot bot added the lgtm label (indicates that a PR is ready to be merged) Jun 3, 2024

scylla-operator-bot bot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Jun 3, 2024

scylla-operator-bot bot merged commit a6cb614 into scylladb:master Jun 3, 2024
12 of 13 checks passed

rzetelskik mentioned this pull request Jun 4, 2024

Propagate kubeconfigs to e2e Pod #1951

Merged
