Flake - random tests are failing due to timeout on write #1996

zimnx · 2024-06-28T17:57:27Z

This happened on the -clusterip job, which runs on slower, network-attached persistent SSDs. We might want to reevaluate whether we still want to use them.

Link to the job that flaked.

https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/1991/pull-scylla-operator-master-e2e-gke-parallel-clusterip/1806721597633990656

Snippet of what failed.

   [FAILED] Unexpected error:
      <*fmt.wrapError | 0xc0006967a0>: 
      can't insert data: Operation timed out for 58qkpsqt.test - received only 1 responses from 2 CL=ALL.
      {
          msg: "can't insert data: Operation timed out for 58qkpsqt.test - received only 1 responses from 2 CL=ALL.",
          err: <*gocql.RequestErrWriteTimeout | 0xc0004f4380>{
              errorFrame: {
                  frameHeader: {version: 132, flags: 0, stream: 576, op: 0, length: 104, warnings: nil},
                  code: 4352,
                  message: "Operation timed out for 58qkpsqt.test - received only 1 responses from 2 CL=ALL.",
              },
              Consistency: 5,
              Received: 1,
              BlockFor: 2,
              WriteType: "SIMPLE",
          },
      }
  occurred
  In [It] at: github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/verify.go:312 @ 06/28/24 16:30:32.464
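
For anyone reading the dump above, here is a minimal sketch that decodes the numeric fields, assuming the standard CQL native protocol numbering (error code 0x1100 is a write timeout, consistency 5 is ALL). The function `describeWriteTimeout` and the `consistencyNames` table are hypothetical helpers written for illustration; they are not part of gocql or of the operator's test code.

```go
package main

import "fmt"

// consistencyNames mirrors the consistency codes of the CQL native protocol.
// Hypothetical helper table for illustration only.
var consistencyNames = map[uint16]string{
	0: "ANY", 1: "ONE", 2: "TWO", 3: "THREE", 4: "QUORUM", 5: "ALL",
	6: "LOCAL_QUORUM", 7: "EACH_QUORUM", 8: "SERIAL", 9: "LOCAL_SERIAL", 10: "LOCAL_ONE",
}

// describeWriteTimeout renders the numeric fields of a write-timeout error
// frame in a human-readable form.
func describeWriteTimeout(code int, consistency uint16, received, blockFor int) string {
	name, ok := consistencyNames[consistency]
	if !ok {
		name = "UNKNOWN"
	}
	return fmt.Sprintf("code=0x%04X consistency=%s received=%d/%d", code, name, received, blockFor)
}

func main() {
	// Values copied from the failure snippet: code 4352, Consistency 5,
	// Received 1, BlockFor 2.
	fmt.Println(describeWriteTimeout(4352, 5, 1, 2))
}
```

In other words, the write needed acknowledgements from both replicas (CL=ALL, BlockFor 2) and only one arrived before the coordinator's write timeout expired.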


zimnx · 2024-06-28T19:34:04Z

https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/1991/pull-scylla-operator-master-e2e-gke-parallel-clusterip/1806749115925663744

tnozicka · 2024-07-01T17:41:11Z

We don't have a choice on some platforms. Preferably, we'd adjust the timeout or the concurrency based on what exactly is so slow.
/priority important-longterm
/triage accepted

tnozicka · 2024-07-02T10:09:05Z

https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/1871/pull-scylla-operator-master-e2e-gke-parallel-clusterip/1808031823570145280#1:test-build-log.txt%3A1205

tnozicka · 2024-07-09T08:08:31Z

https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/1971/pull-scylla-operator-master-e2e-gke-parallel-clusterip/1810558454176157696#1:test-build-log.txt%3A962

tnozicka · 2024-07-09T09:21:55Z

https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/1971/pull-scylla-operator-master-e2e-gke-parallel-clusterip/1810586982485594112#1:test-build-log.txt%3A750

tnozicka · 2024-07-09T10:18:21Z

https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/1971/pull-scylla-operator-master-e2e-gke-parallel-clusterip/1810605342397042688#1:test-build-log.txt%3A1258

zimnx added the kind/flake label (categorizes issue or PR as related to a flaky test) Jun 28, 2024

scylla-operator-bot bot added the needs-priority label (indicates a PR lacks a `priority/foo` label and requires one) Jun 28, 2024

zimnx mentioned this issue Jun 28, 2024

Bump default ScyllaDB version used in E2E's to 6.0.1 #1991

Merged

tnozicka mentioned this issue Jul 2, 2024

Update release procedures #1871

Merged

tnozicka self-assigned this Jul 9, 2024

tnozicka added the priority/important-soon label (must be staffed and worked on either currently, or very soon, ideally in time for the next release) and removed the priority/important-longterm label (important over the long term, but may not be staffed and/or may need multiple releases to complete) Jul 9, 2024

tnozicka mentioned this issue Jul 9, 2024

Add dedicated ServiceAccount for perftune jobs #1971

Merged


scylla-operator-bot bot closed this as completed Jul 9, 2024
