A problem in distributed training #6975
Comments
Deleting "send_buf.WaitToRead();" on line 217 of 'src/kvstore/kvstore_dist.h' solves the problem.
I don't know why the following chunk of code is not moved inside
If it is moved inside the lambda, the
Could you advise me on how to run distributed training on two machines? Thank you very much. I followed the official documentation, but it did not work on two machines.
@idealboy There's an example of running distributed training here: https://mxnet.incubator.apache.org/how_to/multi_devices.html. The original issue should be resolved now with #7489, so I'm closing it. For further discussions and questions, we're moving to https://discuss.mxnet.io/
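For anyone landing here with the same two-machine question, below is a rough sketch of how the launch command is usually assembled. It only builds and prints the command; it does not run anything. The host IPs, the `hosts` file name, the `tools/launch.py` path and `train_mnist.py` are placeholders, and the `-n`, `-H` and `--launcher` flags are written as I recall them from the launcher script, so please check them against `python tools/launch.py --help` in your checkout.

```python
# Sketch only: assembles the command line for MXNet's launcher script.
# All paths, IPs and script names below are hypothetical placeholders.

def hostfile_text(hosts):
    """Return the contents of a hosts file: one machine address per line."""
    return "\n".join(hosts) + "\n"


def build_launch_command(num_workers, hostfile, train_cmd,
                         launcher="ssh", launch_script="tools/launch.py"):
    """Build the argv list that starts `train_cmd` on `num_workers` workers.

    Assumed flags (verify with --help): -n is the worker count,
    -H the hosts file, --launcher the job submission method.
    """
    return [
        "python", launch_script,
        "-n", str(num_workers),
        "--launcher", launcher,
        "-H", hostfile,
    ] + list(train_cmd)


hosts = ["192.168.0.1", "192.168.0.2"]  # placeholder addresses
cmd = build_launch_command(
    num_workers=len(hosts),
    hostfile="hosts",
    # The training script must create a distributed kvstore, e.g. dist_sync.
    train_cmd=["python", "train_mnist.py", "--kv-store", "dist_sync"],
)

print(hostfile_text(hosts), end="")
print(" ".join(cmd))
```

Both machines need the same MXNet build and passwordless ssh access between them for the ssh launcher to work; if the job hangs, checking that each host in the file is reachable from the launching machine is a reasonable first step.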
Environment info
Operating System: Ubuntu 14.04
Package used (Python/R/Scala/Julia): Python
MXNet version: 0.10.1
Or if installed from source: installed from source
If you are using the Python package, please provide
Python version and distribution: Python 2.7
Problem Message:
Why do I run into this problem, and how can I resolve it?