-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Writing on Win10 occasionally fails (and loses data) #781
Comments
Oh nice bug report! |
There are a few changes that write needs. This is one of them. The other is it calls the callback when there is still data to be written, and end up calling it multiple times. I'm going to fix the later bug first and get good integration tests on it, then work on this one. If you want to land a patch in the meantime I can help with the automated testing. |
I've written the tests that writes 5k at 9600 baud. This was enough to force multiple writes on unix but doesn't seem to cause this issue on our CI server. We should probably test on an actual computer, but it should exercise your proposed option 1. |
The branch is write-bindings where I'm doing the work. |
@giseburt I've done a lot more research and I've got a PR #808 out for testing. On our CI (Windows Server 2012 R2 ) a 5k write at 9600 baud doesn't cause the issue. I don't have a windows 10 box around for testing. I'm curious to why we're getting If the If the You can test this branch by changing your package.json to include Do you have the full text of your error? I'm glad you're back helping improve serialport! =) |
Also it looks like we don't lose data, just repeat data if it does get written which I can't seem to find any documentation to confirm. |
@giseburt I'd love to get your feedback on this, I'm unable to reproduce it but I'm happy to add the correct logic if you can help explain it further. |
I'm just now starting to work on this again. We have customers waiting on it, so I'll have some more info later today. |
@reconbot Some more background on the problem, how we're using it, and what I think/hope will fix this. Note that this issue doesn't exist on Linux or OS X. In both of those OSs, the write it truly blocking -- which is fine, it's in it's own thread. Background: I work at @synthetos on the TinyG and G2 Core (the next generation of TinyG) projects. In both cases, we use either a USB->serial port or a true serial port (from a BeagleBone, C.H.I.P, or Pi, for example) to talk to the board. In G2 Core, we are doing high-speed control of machines like 3D printers. We queue up movements that come in from the serial port, and when our queue or serial port buffers are full, we stop reading from the serial port. To prevent the data from getting lost to Problem Description: When the host OS is told to stop writing, it continues to buffer data before it starts to block writing to the file descriptor. On OS X and Linux we tell it to use block without a timeout, and that works perfectly. On Windows, we configure the timeouts to non-zero (see docs), and then when it does timeout we throw away the "packet" that partially sent and then move to the next packet. Probable Solution: Simply set the write timeouts to zero. Then the thread would completely block for the write to complete, or the port to close. There is one other concern. When you compare the linux write call to the windows write call you see that the windows version writes from I'll now attempt to work out these fixes and see if I can find a clean test to prove not-working before and working after. ** UPDATED ** - I found that we ARE setting the timeouts. |
I sent some data to ESP8266's NodeMCU. Same data loss happened. When I set baudrate to 9600 it works perfectly. @giseburt, I tried to apply your solution but I haven't succeded yet. The same data loss happened again on Windows 10. I tried to set "write timeouts" to 0. The same issue happened. I also fixed the data->bufferData + data->offset issue. But i think that this is not the case. I believe that it doesn't iterate. So that one has no effect at all. |
Should be fixed in master, I'll have a beta out shortly |
Thoughts on the read timeouts? It doesn't seem like we'd want them either |
I thought about that too. I wasn't sure enough to change it without proof Logically, it seems that there should not be timeouts for reads either. I |
Lets do it and give it a good manual testing. |
Sounds good. 👍🏻 |
- [all] fixed a crash when pausing while reading thanks to @bminer and @baffo32 and others to debug and fix this - [all] ReadLineParser has been renamed to ReadlineParser to match worldwide convention - [windows] Fix flush behavior using PurgeComm fixing #962 via @samisaham - [windows] Remove read and write timeouts solving #781 via @giseburt - [docs] Electron build docs #965 via @chalkers - [docs] Spelling fixes via @Awk34 - [docs] Update parser docs to be correct #970 via @jacobq - [notes] buffer-indexof is now completely compliant with latest node 6 buffer.indexOf because it's nice to have and we now use it on older nodes - [notes] dropped node 5 support, should still work but you shouldn't use it
I gave a try to version 5.0.0-beta2. I use rs-485, no flow control. |
How does v4.0.5 work for you? |
According to https://msdn.microsoft.com/en-us/library/windows/desktop/aa363190(v=vs.85).aspx and other sources referring to that same:
So I believe what we want, what other platforms call "non-blocking read," is to set (I emailed this and it's not showing up so hopefully it doesn't duplicate.) |
With [email protected] everything is ok. |
I've updated the timeout again, sorry about taking so long. I'll see about releasing another beta. |
Operating System and Hardware: Windows 10
NodeJS Version: electron 0.37.2
Serialport version: 2.1.2
Python Version: (unknown)
Expected behavior
Writing to the serialport would occasionally block in a thread, as it does on Linux and OS X. The WriteQueue for that fd will continue to be added to for further writes. (See https://github.com/voodootikigod/node-serialport/blame/master/src/serialport.cpp#L298 , which is code I helped add. 😄 ).
Alternatively, if blocking in a thread isn't acceptable for Windows, the timeout should NOT pop that write from the queue but instead modify it to prevent double-sending of sent portion of the data, and ensure that the send will be attempted again very shortly. (We don't want to cause a performance issue here.)
It's vital that we don't add semantics that pay attention to the flow control of the device. This is regularly abused and causes all sorts of troubles. (Flow control is currently broken in Python for us for this very reason.)
Actual behavior
An error (generated from https://github.com/voodootikigod/node-serialport/blob/master/src/serialport_win.cpp#L374 ) with error code 121 ("Semphore Timout": https://msdn.microsoft.com/en-us/library/windows/desktop/ms681382(v=vs.85).aspx#ERROR_SEM_TIMEOUT ).
Steps and/or code to reproduce the issue
Basically, you want to coerce the device to not read what we write, forcing Windows to block while writing and eventually timing out, losing data in the process.
Option 1 (explicit flow control):
write()
data to the device until the device's buffer fills and it signals that flow control should stop. NOTE: This should NOT be explicitly handled by serialport. The OS handles this properly and that should be left as-is.write()
more data to the serialport. The error will happen and the data written will be lost instead of queued.Option 2 (USB-based flow control):
3.
write()
data to device until you see the error.The text was updated successfully, but these errors were encountered: