PD data races report #6069

Closed
hnes opened this issue Mar 1, 2023 · 2 comments · Fixed by #6080
Labels
affects-6.5 This bug affects the 6.5.x(LTS) versions. severity/moderate type/bug The issue is confirmed as a bug.

Comments

@hnes
Contributor

hnes commented Mar 1, 2023

Bug Report

Building PD with the race detector enabled and running a benchmark reports several data races.

What did you do?

$ WITH_RACE=1 make build
# and run benchmark

What did you expect to see?

No data race warning.

What did you see instead?

Git Commit Hash: f999ad5 https://github.com/tikv/pd/tree/master

WARNING: DATA RACE
Write at 0x00c0007f44a0 by main goroutine:
  github.com/tikv/pd/server.(*Server).startServer()
      /x/pd/server/server.go:337 +0xcd
  github.com/tikv/pd/server.(*Server).Run()
      /x/pd/server/server.go:480 +0x145
  main.createServerWrapper()
      /x/pd/cmd/pd-server/main.go:201 +0xd92
  github.com/spf13/cobra.(*Command).execute()
      /x/go/pkg/mod/github.com/spf13/[email protected]/command.go:846 +0xb53
  github.com/spf13/cobra.(*Command).ExecuteC()
      /x/go/pkg/mod/github.com/spf13/[email protected]/command.go:950 +0x5da
  github.com/spf13/cobra.(*Command).Execute()
      /x/go/pkg/mod/github.com/spf13/[email protected]/command.go:887 +0x633
  main.main()
      /x/pd/cmd/pd-server/main.go:73 +0x634

Previous read at 0x00c0007f44a0 by goroutine 417:
  github.com/tikv/pd/server.(*GrpcServer).errorHeader()
      /x/pd/server/grpc_service.go:1520 +0x1c8
  github.com/tikv/pd/server.(*GrpcServer).wrapErrorToHeader()
      /x/pd/server/grpc_service.go:92 +0x7f
  github.com/tikv/pd/server.(*GrpcServer).GetMembers()
      /x/pd/server/grpc_service.go:105 +0x69
  github.com/pingcap/kvproto/pkg/pdpb._PD_GetMembers_Handler.func1()
      /x/go/pkg/mod/github.com/pingcap/[email protected]/pkg/pdpb/pdpb.pb.go:8178 +0x88
  github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1()
      /x/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/chain.go:31 +0x238
  github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).UnaryServerInterceptor.func1()
      /x/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/server_metrics.go:107 +0xca
  github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1()
      /x/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/chain.go:34 +0x18c
  go.etcd.io/etcd/etcdserver/api/v3rpc.newUnaryInterceptor.func1()
      /x/go/pkg/mod/go.etcd.io/[email protected]/etcdserver/api/v3rpc/interceptor.go:70 +0x47a
  github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1()
      /x/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/chain.go:34 +0x18c
  go.etcd.io/etcd/etcdserver/api/v3rpc.newLogUnaryInterceptor.func1()
      /x/go/pkg/mod/go.etcd.io/[email protected]/etcdserver/api/v3rpc/interceptor.go:77 +0xc7
  github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1()
      /x/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/chain.go:39 +0x375
  github.com/pingcap/kvproto/pkg/pdpb._PD_GetMembers_Handler()
      /x/go/pkg/mod/github.com/pingcap/[email protected]/pkg/pdpb/pdpb.pb.go:8180 +0x1dd
  google.golang.org/grpc.(*Server).processUnaryRPC()
      /x/go/pkg/mod/google.golang.org/[email protected]/server.go:1024 +0x14ba
  google.golang.org/grpc.(*Server).handleStream()
      /x/go/pkg/mod/google.golang.org/[email protected]/server.go:1313 +0xfb3
  google.golang.org/grpc.(*Server).serveStreams.func1.1()
      /x/go/pkg/mod/google.golang.org/[email protected]/server.go:722 +0xec
-----
WARNING: DATA RACE
Read at 0x00c0065fbda8 by goroutine 529:
  github.com/tikv/pd/pkg/movingaverage.(*MedianFilter).Get()
      /x/pd/pkg/movingaverage/median_filter.go:52 +0x3e
  github.com/tikv/pd/pkg/movingaverage.(*TimeMedian).Get()
      /x/pd/pkg/movingaverage/time_median.go:38 +0xbc
  github.com/tikv/pd/server/statistics.(*dimStat).Get()
      /x/pd/server/statistics/hot_peer.go:79 +0x7c
  github.com/tikv/pd/server/statistics.(*HotPeerStat).GetLoad()
      /x/pd/server/statistics/hot_peer.go:164 +0x9d
  github.com/tikv/pd/server/statistics.CollectHotPeerInfos.func1()
      /x/pd/server/statistics/store_hot_peers_infos.go:52 +0x638
  github.com/tikv/pd/server/statistics.CollectHotPeerInfos()
      /x/pd/server/statistics/store_hot_peers_infos.go:65 +0xe6
  github.com/tikv/pd/server/cluster.collectHotMetrics()
      /x/pd/server/cluster/coordinator.go:575 +0xe4
  github.com/tikv/pd/server/cluster.(*coordinator).collectHotSpotMetrics()
      /x/pd/server/cluster/coordinator.go:556 +0xa6
  github.com/tikv/pd/server/cluster.(*RaftCluster).collectMetrics()
      /x/pd/server/cluster/cluster.go:1992 +0x258
  github.com/tikv/pd/server/cluster.(*RaftCluster).runMetricsCollectionJob()
      /x/pd/server/cluster/cluster.go:485 +0x166
  github.com/tikv/pd/server/cluster.(*RaftCluster).Start.func3()
      /x/pd/server/cluster/cluster.go:295 +0x39

Previous write at 0x00c0065fbda8 by goroutine 505:
  github.com/tikv/pd/pkg/movingaverage.(*MedianFilter).Get()
      /x/pd/pkg/movingaverage/median_filter.go:63 +0x15e
  github.com/tikv/pd/pkg/movingaverage.(*TimeMedian).Get()
      /x/pd/pkg/movingaverage/time_median.go:38 +0xbc
  github.com/tikv/pd/server/statistics.(*dimStat).Get()
      /x/pd/server/statistics/hot_peer.go:79 +0x7c
  github.com/tikv/pd/server/statistics.(*HotPeerStat).GetLoad()
      /x/pd/server/statistics/hot_peer.go:164 +0x9d
  github.com/tikv/pd/server/statistics.(*HotPeerStat).Less()
      /x/pd/server/statistics/hot_peer.go:126 +0x49
  github.com/tikv/pd/server/statistics.(*indexedHeap).Less()
      /x/pd/server/statistics/topn.go:249 +0x229
  container/heap.up()
      /x/sdk/go1.20.1/src/container/heap/heap.go:92 +0x8a
  container/heap.Fix()
      /x/sdk/go1.20.1/src/container/heap/heap.go:85 +0x6b
  github.com/tikv/pd/server/statistics.(*indexedHeap).Put()
      /x/pd/server/statistics/topn.go:308 +0x184
  github.com/tikv/pd/server/statistics.(*singleTopN).Put()
      /x/pd/server/statistics/topn.go:174 +0x13c
  github.com/tikv/pd/server/statistics.(*TopN).Put()
      /x/pd/server/statistics/topn.go:97 +0x13e
  github.com/tikv/pd/server/statistics.(*hotPeerCache).putItem()
      /x/pd/server/statistics/hot_peer_cache.go:536 +0x1af
  github.com/tikv/pd/server/statistics.(*hotPeerCache).updateStat()
      /x/pd/server/statis
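
The first report above (a write in `startServer()` against a read in `errorHeader()`) has the shape of a field that is set during startup while gRPC handlers are already reading it. The trace does not name the field, so the following is only a minimal sketch of one common way to make such an access race-free, using `sync/atomic`. The `server` type, its `id` field, and the `start`/`header` methods are hypothetical stand-ins and are not PD's code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// server is a hypothetical stand-in, not PD's Server type.
type server struct {
	// id is written once during startup and read by request handlers.
	// atomic.Uint64 requires Go 1.19 or later.
	id atomic.Uint64
}

// start publishes the id with an atomic store.
func (s *server) start(id uint64) { s.id.Store(id) }

// header reads the id with an atomic load, so it is safe to call
// while start is running in another goroutine.
func (s *server) header() uint64 { return s.id.Load() }

func main() {
	s := &server{}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		_ = s.header() // may observe the zero value or 42, but does not race
	}()

	s.start(42)
	wg.Wait()
	fmt.Println(s.header())
}
```

Running this with `go run -race` should report no race. Note that an early reader can still see the zero value; if that matters, the handler would also need to wait until startup has finished.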

Git Commit Hash: 88357d3 from https://github.com/rleungx/pd/tree/api-service
(The median_filter.go part is unchanged compared with f999ad5.)

WARNING: DATA RACE 
Read at 0x00c00068f430 by goroutine 441:
  github.com/tikv/pd/pkg/movingaverage.(*MedianFilter).Get()
      /x/pd/pkg/movingaverage/median_filter.go:53 +0x1b4
  github.com/tikv/pd/server/statistics.(*RollingStoreStats).GetLoad()
      /x/pd/server/statistics/store.go:251 +0x137
  github.com/tikv/pd/server/statistics.(*StoresStats).GetStoresLoads()
      /x/pd/server/statistics/store.go:111 +0x2c4
  github.com/tikv/pd/server/cluster.(*RaftCluster).GetStoresLoads()
      /x/pd/server/cluster/cluster.go:2179 +0x144
  github.com/tikv/pd/server/cluster.(*coordinator).getHotRegionsByType()
      /x/pd/server/cluster/coordinator.go:491 +0xd9
  github.com/tikv/pd/server/cluster.(*RaftCluster).GetHotReadRegions()
      /x/pd/server/cluster/cluster.go:2241 +0x6b
  github.com/tikv/pd/server.(*Handler).GetHotReadRegions()
      /x/pd/server/handler.go:197 +0x50
  github.com/tikv/pd/server.(*Handler).PackHistoryHotReadRegions()
      /x/pd/server/handler.go:1013 +0x30
  github.com/tikv/pd/pkg/storage.(*HotRegionStorage).pullHotRegionInfo()
      /x/pd/pkg/storage/hot_region_storage.go:257 +0x51
  github.com/tikv/pd/pkg/storage.(*HotRegionStorage).backgroundFlush()
      /x/pd/pkg/storage/hot_region_storage.go:217 +0x227
  github.com/tikv/pd/pkg/storage.NewHotRegionsStorage.func1()
      /x/pd/pkg/storage/hot_region_storage.go:158 +0x39

Previous write at 0x00c00068f430 by goroutine 503:
  github.com/tikv/pd/pkg/movingaverage.(*MedianFilter).Get()
      /x/pd/pkg/movingaverage/median_filter.go:62 +0x144
  github.com/tikv/pd/server/statistics.(*RollingStoreStats).GetLoad()
      /x/pd/server/statistics/store.go:251 +0x137
  github.com/tikv/pd/server/statistics.(*storeStatistics).Observe()
      /x/pd/server/statistics/store_collection.go:149 +0x3396
  github.com/tikv/pd/server/statistics.(*storeStatisticsMap).Observe()
      /x/pd/server/statistics/store_collection.go:282 +0x1d2
  github.com/tikv/pd/server/cluster.(*RaftCluster).collectMetrics()
      /x/pd/server/cluster/cluster.go:1987 +0x16e
  github.com/tikv/pd/server/cluster.(*RaftCluster).runMetricsCollectionJob()
      /x/pd/server/cluster/cluster.go:485 +0x166
  github.com/tikv/pd/server/cluster.(*RaftCluster).Start.func3()
      /x/pd/server/cluster/cluster.go:295 +0x39

What version of PD are you using (pd-server -V)?

Git Commit Hash: f999ad5
Git Commit Hash: 88357d3

@hnes hnes added the type/bug The issue is confirmed as a bug. label Mar 1, 2023
@hnes
Contributor Author

hnes commented Mar 1, 2023

While investigating #6045, I found these data races by chance.

@rleungx rleungx added affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. and removed affects-6.1 This bug affects the 6.1.x(LTS) versions. labels Mar 2, 2023
@lhy1024
Contributor

lhy1024 commented Mar 2, 2023

I will fix the median filter part.

ti-chi-bot added a commit that referenced this issue Mar 3, 2023
ref #5798, close #6069

Signed-off-by: lhy1024 <[email protected]>

Co-authored-by: Ti Chi Robot <[email protected]>
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue Mar 3, 2023
ti-chi-bot added a commit that referenced this issue Mar 3, 2023
ref #5798, close #6069, ref #6080

Signed-off-by: ti-chi-bot <[email protected]>
Signed-off-by: lhy1024 <[email protected]>

Co-authored-by: lhy1024 <[email protected]>
ti-chi-bot added a commit that referenced this issue Mar 6, 2023
ref #5310, ref #6069

Signed-off-by: Ryan Leung <[email protected]>

Co-authored-by: Ti Chi Robot <[email protected]>
ti-chi-bot added a commit that referenced this issue Mar 6, 2023
ref #5310, ref #6069, ref #6070

Signed-off-by: Ryan Leung <[email protected]>

Co-authored-by: Ryan Leung <[email protected]>
3 participants