Use atomic.Load to access fields used in /varz and /subsz requests. #445

Merged
merged 2 commits into master from fix-monitoring-data-races
Mar 3, 2017

Conversation

ColinSullivan1
Member

@ColinSullivan1 ColinSullivan1 commented Mar 1, 2017

  • Includes a unit test that checks all endpoints for data races.
  • Link to issue, e.g. Resolves #NNN (N/A, included test demonstrates the issue)
  • Documentation added (if applicable) (N/A)
  • Tests added
  • Branch rebased on top of current master (git pull --rebase origin master)
  • Changes squashed to a single commit (described here) I suggest we squash/rebase with the merge... we can take this offline.
  • Build is green in Travis CI
  • You have certified that the contribution is your original work and that you license the work to the project under the MIT license

/cc @nats-io/core

* Includes a unit test that checks all endpoints for data races.
@ColinSullivan1
Member Author

The data race behind this PR:

==================
WARNING: DATA RACE
Read at 0x00c42017a298 by goroutine 49:
  github.com/nats-io/gnatsd/server.(*Server).HandleVarz()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/monitor.go:479 +0x64d
  github.com/nats-io/gnatsd/server.(*Server).HandleVarz-fm()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/server.go:504 +0x5f
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/Cellar/go/1.8/libexec/src/net/http/server.go:1942 +0x51
  net/http.(*ServeMux).ServeHTTP()
      /usr/local/Cellar/go/1.8/libexec/src/net/http/server.go:2238 +0xa2
  net/http.serverHandler.ServeHTTP()
      /usr/local/Cellar/go/1.8/libexec/src/net/http/server.go:2568 +0xbc
  net/http.(*conn).serve()
      /usr/local/Cellar/go/1.8/libexec/src/net/http/server.go:1825 +0x71a

Previous write at 0x00c42017a298 by goroutine 69:
  sync/atomic.AddInt64()
      /usr/local/Cellar/go/1.8/libexec/src/runtime/race_amd64.s:276 +0xb
  github.com/nats-io/gnatsd/server.(*client).deliverMsg()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/client.go:944 +0x251
  github.com/nats-io/gnatsd/server.(*client).processMsg()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/client.go:1164 +0xd98
  github.com/nats-io/gnatsd/server.(*client).parse()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/parser.go:226 +0x20e1
  github.com/nats-io/gnatsd/server.(*client).readLoop()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/client.go:283 +0x296
  github.com/nats-io/gnatsd/server.(*Server).createClient.func2()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/server.go:627 +0x41

Goroutine 49 (running) created at:
  net/http.(*Server).Serve()
      /usr/local/Cellar/go/1.8/libexec/src/net/http/server.go:2668 +0x35f
  github.com/nats-io/gnatsd/server.(*Server).startMonitoring.func1()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/server.go:525 +0x6e

Goroutine 69 (running) created at:
  github.com/nats-io/gnatsd/server.(*Server).startGoRoutine()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/server.go:870 +0xba
  github.com/nats-io/gnatsd/server.(*Server).createClient()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/server.go:627 +0x68d
  github.com/nats-io/gnatsd/server.(*Server).AcceptLoop.func2()
      /Users/colinsullivan/go/src/github.com/nats-io/gnatsd/server/server.go:426 +0x58
==================
--- FAIL: TestEndpointDataRaces (4.50s)
	testing.go:610: race detected during execution of test
FAIL
exit status 1
FAIL	github.com/nats-io/gnatsd/test	4.523s

After the fix here:

=== RUN   TestEndpointDataRaces
--- PASS: TestEndpointDataRaces (4.46s)
PASS
ok  	github.com/nats-io/gnatsd/test	5.490s
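
The trace above shows a counter written with atomic.AddInt64 in deliverMsg and read with a plain load in HandleVarz. A minimal sketch of that pattern and of the matching atomic read follows. The `stats` type and its methods are hypothetical stand-ins, not the gnatsd code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// stats is a hypothetical stand-in for the server's message counters.
type stats struct {
	outMsgs int64 // written with atomic.AddInt64 on the delivery path
}

// deliver mimics the writer side: each delivered message bumps the
// counter atomically.
func (s *stats) deliver() {
	atomic.AddInt64(&s.outMsgs, 1)
}

// snapshot mimics the monitoring side. A plain `return s.outMsgs` here is
// the kind of read the race detector reports; atomic.LoadInt64 pairs
// with the atomic write.
func (s *stats) snapshot() int64 {
	return atomic.LoadInt64(&s.outMsgs)
}

func main() {
	var s stats
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				s.deliver()
				_ = s.snapshot() // concurrent read, clean under -race
			}
		}()
	}
	wg.Wait()
	fmt.Println(s.snapshot()) // prints 4000
}
```

Running this with `go run -race` should report nothing. Swapping the body of `snapshot` for a plain field read is expected to reproduce a warning of the same shape as the one above.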

@ColinSullivan1 ColinSullivan1 self-assigned this Mar 1, 2017
Member

@kozlovic kozlovic left a comment

Small comments, otherwise looks good.
For some reason, the Coveralls result is not displayed in the PR?!

st.NumMatches = s.matches
if s.matches > 0 {
st.CacheHitRate = float64(s.cacheHits) / float64(s.matches)
st.NumInserts = atomic.LoadUint64(&s.inserts)
Member

Looks like s.inserts and s.removes are updated under sublist's lock, so I don't think you need atomic for those 2.

Member Author

Thanks, will fix.
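
To illustrate the point about the lock: fields that are only written and read while the same mutex is held do not need atomic access. The sketch below assumes the reader also takes that lock. The `sublist` type here is a hypothetical, heavily reduced stand-in, not the real one.

```go
package main

import (
	"fmt"
	"sync"
)

// sublist is a hypothetical, reduced stand-in for the real type.
type sublist struct {
	mu      sync.RWMutex
	inserts uint64 // only touched while mu is held
	removes uint64 // only touched while mu is held
}

// insert updates the counter under the write lock.
func (s *sublist) insert() {
	s.mu.Lock()
	s.inserts++
	s.mu.Unlock()
}

// remove updates the counter under the write lock.
func (s *sublist) remove() {
	s.mu.Lock()
	s.removes++
	s.mu.Unlock()
}

// counts reads both fields under the read lock, so plain loads are safe
// and sync/atomic is not needed.
func (s *sublist) counts() (inserts, removes uint64) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.inserts, s.removes
}

func main() {
	var s sublist
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.insert()
			s.counts() // concurrent reader, guarded by the same lock
		}()
	}
	wg.Wait()
	s.remove()
	ins, rem := s.counts()
	fmt.Println(ins, rem) // prints 100 1
}
```

Mixing the two styles is where trouble starts: a field written under a lock but read without it, or written atomically but read plainly, is still a data race.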

opts2.Routes = server.RoutesFromStr("nats-route://127.0.0.1:10223")

s2 := RunServer(&opts2)

Member

You may want to call checkClusterFormed(t, s1, s2) here to ensure cluster is formed, instead of using time.Sleep(2 * time.Second) in the caller. You would need to pass t *testing.T as parameter to runMonitorServerClusteredPair().

Member Author

@ColinSullivan1 ColinSullivan1 Mar 2, 2017

Sounds good here.

Member Author

Done.

defer s2.Shutdown()

// give some time for a route to form
time.Sleep(2 * time.Second)
Member

See comment in runMonitorServerClusteredPair()

@coveralls

Coverage Status

Changes Unknown when pulling efbd423 on fix-monitoring-data-races into master.

@kozlovic
Member

kozlovic commented Mar 3, 2017

Sorry for the delay, LGTM.

@petemiron
Contributor

LGTM. Looks like the Coveralls error is a known, unresolved issue.

@petemiron petemiron merged commit 4fa3b94 into master Mar 3, 2017
@petemiron petemiron deleted the fix-monitoring-data-races branch March 3, 2017 19:44