-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thread-safety improvements. Add tests with multiple threads. CI improvements. #122
Conversation
The test failures are coming from replacing I think it's best to be conservative with this update so I'm going to revert that commit. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #122 +/- ##
==========================================
- Coverage 79.29% 79.19% -0.10%
==========================================
Files 10 10
Lines 1922 1923 +1
==========================================
- Hits 1524 1523 -1
- Misses 398 400 +2 ☔ View full report in Codecov by Sentry. |
2be833c
to
877be4f
Compare
As discussed offline, added back the commits replacing |
Again discussed on slack. We decided to revert to |
This should fix an exception seen in CI from the lingering timeout task: ``` Test Summary: | Pass Total Time Deserialization error recovery and include() | 11 11 3.9s From worker 4: Unhandled Task ERROR: EOFError: read end of file From worker 4: Stacktrace: From worker 4: [1] wait From worker 4: @ .\asyncevent.jl:159 [inlined] From worker 4: [2] sleep(sec::Float64) From worker 4: @ Base .\asyncevent.jl:265 From worker 4: [3] (::DistributedNext.var"#34#37"{DistributedNext.Worker, Float64})() From worker 4: @ DistributedNext D:\a\DistributedNext.jl\DistributedNext.jl\src\cluster.jl:213 ```
a866f6b
to
c09b034
Compare
f6f1d38
to
190a6b6
Compare
e7212a9
to
dcd89eb
Compare
ddb8ad4
to
836bed1
Compare
3d6f794
to
54575e2
Compare
54575e2
to
e28698f
Compare
@@ -274,7 +274,7 @@ end | |||
const any_gc_flag = Threads.Condition() | |||
function start_gc_msgs_task() | |||
errormonitor( | |||
Threads.@spawn begin | |||
@async begin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change fixed the race (#124). It seems to have always been a @spawn
hence why I didn't initially convert it to @async
.
e28698f
to
ca2bd3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving to help the process along given review is required for merge
This cuts the running time down from ~1m to ~10s. It shouldn't be necessary to re-create the workers for every iteration of the inner loop.
f189b1c
to
8494bbd
Compare
Co-authored by @IanButterworth
@spawn
to@async
to get tests passing. Moving back to@spawn
will take more work, but it should be low priority because the main thread, where Distributed usually runs, will already besticky
anyway, so the gain is minor.fbf8c12
(#122) is necessary to avoid each worker in those tests re-precompiling any stdlibs they use. Speeds up macos runs by ~1 minute97fc237
(#122) adds a sigusr1/siginfo profile before killing a worker that won't shut down, to aid debuggingFixes #124
TODO
@async
)CC @jpsamaroo