Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

data race between test and product code in partition compaction #24040

Closed
davidby-influx opened this issue Jan 13, 2023 · 1 comment
Closed

Comments

@davidby-influx
Copy link
Contributor

There is an occasional data race between a test and the product code. The test needs to lock the partition for writing before modifying the MaxLogFileSize.

Failed
=== RUN   TestPartition_Compact_Write_Fail
=== RUN   TestPartition_Compact_Write_Fail/write_MANIFEST
==================
WARNING: DATA RACE
Write at 0x00c0001593e0 by goroutine 7789:
  github.com/influxdata/influxdb/tsdb/index/tsi1_test.TestPartition_Compact_Write_Fail.func1()
      /root/project/tsdb/index/tsi1/partition_test.go:135 +0x14a
  testing.tRunner()
      /go/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /go/src/testing/testing.go:1493 +0x47

Previous read at 0x00c0001593e0 by goroutine 7790:
  github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).needsLogCompaction()
      /root/project/tsdb/index/tsi1/partition.go:1217 +0x9e
  github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).compact()
      /root/project/tsdb/index/tsi1/partition.go:1013 +0x19d
  github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).Compact()
      /root/project/tsdb/index/tsi1/partition.go:915 +0x72
  github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).runPeriodicCompaction()
      /root/project/tsdb/index/tsi1/partition.go:952 +0x47
  github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).Open.func3()
      /root/project/tsdb/index/tsi1/partition.go:254 +0x39

Goroutine 7789 (running) created at:
  testing.(*T).Run()
      /go/src/testing/testing.go:1493 +0x75d
  github.com/influxdata/influxdb/tsdb/index/tsi1_test.TestPartition_Compact_Write_Fail()
      /root/project/tsdb/index/tsi1/partition_test.go:125 +0x44
  testing.tRunner()
      /go/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /go/src/testing/testing.go:1493 +0x47

Goroutine 7790 (running) created at:
  github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).Open()
      /root/project/tsdb/index/tsi1/partition.go:254 +0xd2b
  github.com/influxdata/influxdb/tsdb/index/tsi1_test.MustOpenPartition()
      /root/project/tsdb/index/tsi1/partition_test.go:163 +0x49
  github.com/influxdata/influxdb/tsdb/index/tsi1_test.TestPartition_Compact_Write_Fail.func1()
      /root/project/tsdb/index/tsi1/partition_test.go:129 +0xb4
  testing.tRunner()
      /go/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /go/src/testing/testing.go:1493 +0x47
==================
@davidby-influx davidby-influx self-assigned this Jan 13, 2023
@davidby-influx davidby-influx changed the title data race between test an dproduct code in parition compaction data race between test and product code in partition compaction Jan 13, 2023
philjb added a commit that referenced this issue Jan 30, 2024
TestPartition_Compact_Write_Fail test was not locking the partition
before changing the value of MaxLogFileSize. This PR exports the mutex
of the partition to allow the test to access it and lock. Alternatives
require more changes such as a Setter method if we need to hide the
mutex.

* fixes #24042, for #24040
philjb added a commit that referenced this issue Jan 31, 2024
* fix(tsi1/partition/test): fix data race in test code

TestPartition_Compact_Write_Fail test was not locking the partition
before changing the value of MaxLogFileSize. This PR exports the mutex
of the partition to allow the test to access it and lock. Alternatives
require more changes such as a Setter method if we need to hide the
mutex.

* fixes #24042, for #24040

* chore: complete renaming of mutex in file and fix flux test

The flux test is another failing test because it was using a relative
time range.
bnpfeife pushed a commit that referenced this issue Jan 31, 2024
* fix(tsi1/partition/test): fix data race in test code

TestPartition_Compact_Write_Fail test was not locking the partition
before changing the value of MaxLogFileSize. This PR exports the mutex
of the partition to allow the test to access it and lock. Alternatives
require more changes such as a Setter method if we need to hide the
mutex.

* fixes #24042, for #24040

* chore: complete renaming of mutex in file and fix flux test

The flux test is another failing test because it was using a relative
time range.
bnpfeife pushed a commit that referenced this issue Jan 31, 2024
* fix(tsi1/partition/test): fix data race in test code

TestPartition_Compact_Write_Fail test was not locking the partition
before changing the value of MaxLogFileSize. This PR exports the mutex
of the partition to allow the test to access it and lock. Alternatives
require more changes such as a Setter method if we need to hide the
mutex.

* fixes #24042, for #24040

* chore: complete renaming of mutex in file and fix flux test

The flux test is another failing test because it was using a relative
time range.
bnpfeife added a commit that referenced this issue Jan 31, 2024
* fix: update broken flux and perf tests (main-2.x) (#24617)

* chore: download repository key to file

* fix: broken perf tests

Some perf tests had to be temporarily disabled. Work is
needed in the pref_tests repositories to make them work
again.

* fix(tsi1/partition/test): fix data race in test code (#24613)

* fix(tsi1/partition/test): fix data race in test code

TestPartition_Compact_Write_Fail test was not locking the partition
before changing the value of MaxLogFileSize. This PR exports the mutex
of the partition to allow the test to access it and lock. Alternatives
require more changes such as a Setter method if we need to hide the
mutex.

* fixes #24042, for #24040

* chore: complete renaming of mutex in file and fix flux test

The flux test is another failing test because it was using a relative
time range.

---------

Co-authored-by: Phil Bracikowski <[email protected]>
bnpfeife added a commit that referenced this issue Jan 31, 2024
* fix: update broken flux and perf tests (main-2.x) (#24617)

* chore: download repository key to file

* fix: broken perf tests

Some perf tests had to be temporarily disabled. Work is
needed in the pref_tests repositories to make them work
again.

* fix(tsi1/partition/test): fix data race in test code (#24613)

* fix(tsi1/partition/test): fix data race in test code

TestPartition_Compact_Write_Fail test was not locking the partition
before changing the value of MaxLogFileSize. This PR exports the mutex
of the partition to allow the test to access it and lock. Alternatives
require more changes such as a Setter method if we need to hide the
mutex.

* fixes #24042, for #24040

* chore: complete renaming of mutex in file and fix flux test

The flux test is another failing test because it was using a relative
time range.

---------

Co-authored-by: Phil Bracikowski <[email protected]>
@devanbenz
Copy link

Closing this! The backports are in all upstream branches

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants