Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to handle Writable | Hup? #449

Closed
SerhoLiu opened this issue Jun 25, 2018 · 26 comments
Closed

How to handle Writable | Hup? #449

SerhoLiu opened this issue Jun 25, 2018 · 26 comments
Assignees
Labels
C-bug Category: This is a bug.

Comments

@SerhoLiu
Copy link
Contributor

Related questions: tokio-rs/tokio-socks5#13

When socket is Writable | Hup, https://github.com/tokio-rs/tokio/blob/master/tokio-reactor/src/poll_evented.rs#L520 poll_read_ready return Ok(Ready(Hup)), but self.get_ref().read(buf) return WouldBlock, how to handle with this situation?

any help?

@carllerche
Copy link
Member

If this is true, then it is a bug. Receiving HUP should make the socket readable and the next read should return 0.

@carllerche carllerche added C-bug Category: This is a bug. E-help-wanted Call for participation: Help is requested to fix this issue. labels Jul 20, 2018
@SerhoLiu
Copy link
Contributor Author

Hi, macOS version 10.13.6. Follow the steps below to reproduce the problem:

  1. add debug log to https://github.com/tokio-rs/tokio/blob/master/tokio-reactor/src/poll_evented.rs#L390
let x = self.poll_read_ready(mio::Ready::readable());
println!("poll read ready {:?}", x);
if let Async::NotReady = x? {
    return Err(io::ErrorKind::WouldBlock.into())
}

let r = self.get_mut().read(buf);
println!("poll read {:?}", r);
if is_wouldblock(&r) {
    self.clear_read_ready(mio::Ready::readable())?;
}
  1. open Mac Terminal, run cargo run --example proxy and cargo run --example echo

  2. run telnet 127.0.0.1 8081, input some data

  3. kill -9 [telnet process]

  4. proxy log:

poll read ready Ok(Ready(Readable | Hup))
poll read Ok(0)
poll read ready Ok(NotReady)

# ......
poll read ready Ok(Ready(Hup))
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup))
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup))
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup))
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup))
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup))
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup))
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup))
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Readable | Hup))
poll read Ok(0)
client wrote 6 bytes and received 6 bytes

@hawkw
Copy link
Member

hawkw commented Aug 21, 2018

Hi @SerhoLiu, can I get you to let me know which version of tokio-reactor you were using when you observed this behaviour?

I've been trying to reproduce this by following the steps you posted above (on macOS 10.12.6). I don't see the behaviour you described on master, but I do see it with tokio-reactor v0.1.2 (which I believe was the current version when you opened this issue) and v0.1.1.

It's possible this was fixed between versions?

Proxy logs from `master`
Listening on: 127.0.0.1:8081
Proxying to: 127.0.0.1:8080
poll read ready Ok(NotReady);
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable));
poll read Ok(8)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(NotReady);
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable));
poll read Ok(8)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Readable));
poll read Ok(9)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(NotReady);
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable));
poll read Ok(9)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Readable | Hup));
poll read Ok(0)
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable | Hup));
poll read Ok(0)
client wrote 17 bytes and received 17 bytes
Proxy logs from `tokio-reactor` v0.1.2
Listening on: 127.0.0.1:8081
Proxying to: 127.0.0.1:8080
poll read ready Ok(NotReady);
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable));
poll read Ok(6)

// ...

poll read ready Ok(Ready(Readable | Hup));
poll read Ok(0)
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Ok(0)
client wrote 19 bytes and received 19 bytes
Proxy logs from `tokio-reactor` v0.1.1
Listening on: 127.0.0.1:8081
Proxying to: 127.0.0.1:8080
poll read ready Ok(NotReady);
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable));
poll read Ok(8)

// ...

poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Ok(0)
client wrote 21 bytes and received 21 bytes

@hawkw
Copy link
Member

hawkw commented Aug 21, 2018

It looks like the issue also exists in the current release version of tokio-reactor (v0.1.3), so we ought to cut a new release. @SerhoLiu, can I get you to confirm that the issue is fixed for you with the current master, in order to make sure that a new release would resolve your issues?

Proxy logs from `tokio-reactor` v0.1.3
Listening on: 127.0.0.1:8081
Proxying to: 127.0.0.1:8080
poll read ready Ok(NotReady);
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable));
poll read Ok(5)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(NotReady);
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable));
poll read Ok(5)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Readable));
poll read Ok(5)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(NotReady);
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable));
poll read Ok(5)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Readable));
poll read Ok(5)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(NotReady);
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Readable));
poll read Ok(5)
poll read ready Ok(Ready(Readable));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Readable | Hup));
poll read Ok(0)
poll read ready Ok(NotReady);
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Err(Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" })
poll read ready Ok(Ready(Hup));
poll read Ok(0)
client wrote 15 bytes and received 15 bytes

@hawkw
Copy link
Member

hawkw commented Aug 21, 2018

I did a git bisect and it appears that this issue was fixed in commit 989262f (and manually verified it again by checking that commit and the prior commit).

It looks like the only change in that commit that could actually effect behaviour was the version bump for crossbeam-deque, from 0.6.0 to 0.6.1 (the other change only effects CI). According to the crossbeam-deque changelog, the only change in that version is "Change a few Relaxed orderings to Release in order to fix false positives by tsan." (crossbeam-rs/crossbeam-deque@64b174b). Perhaps it's possible that the errors tsan reported were not actually false positives?

@stjepang, do you have any insight into this?

@ghost
Copy link

ghost commented Aug 21, 2018

@hawkw Are you testing that on an x86-64 machine?

Assuming the answer is "yes", I'm surprised because Release fences compile down to noop and Release stores are equivalent to Relaxed stores. In other words, there should be absolutely no difference between crossbeam-deque v0.6.0 and v0.6.1 on x86-64 machines. 🤔

(and manually verified it again by checking that commit and the prior commit)

Just double-checking: was the prior commit d91c775?

Also, I wonder whether the bug persists if we use the latest master branch, but change 0.6.1 to =0.6.0 in Cargo.toml (would try for myself, but I don't have a Mac).

@hawkw
Copy link
Member

hawkw commented Aug 21, 2018

@stjepang I changed the crossbeam-deque dependency to =0.6.0 on the latest commit to master, and the bug persists. So, it seems the crossbeam-deque change did effect something --- I'm curious about what we'd see if we compared the assembly generated between those two commits.

Also, to answer your other questions: yes, I'm testing on x86_64 (Core i7 7920HQ); and yes, the prior commit I tested was d91c775.

@SerhoLiu
Copy link
Contributor Author

SerhoLiu commented Aug 22, 2018

@hawkw I use https://github.com/tokio-rs/tokio/tree/e964c4136c91c535fb2dfee0a3327dc5a6599903 produce the problem #449 (comment), this problem also exists on FreeBSD 11.2.

On my computer, current master also has this problem 😅. So I guess it might be related to the behavior of kqueue, maybe EV_EOF will only be processed when EVFILT_READ, like go and this. But kqueue man page say EV_EOF may also appear in EVFILT_WRITE, really confused.

@hawkw
Copy link
Member

hawkw commented Aug 22, 2018

@SerhoLiu interesting! Thanks for the additional information. I'll keep looking into it.

@kpp
Copy link
Contributor

kpp commented Aug 22, 2018

Tests must be written to catch this bug in the future.

@carllerche
Copy link
Member

@stjepang It is always possible as Relaxed vs. Release impacts compiler optimizations. So, the code that gets run could be different.

@hawkw
Copy link
Member

hawkw commented Aug 22, 2018

@SerhoLiu

On my computer, current master also has this problem 😅

I ran the reproduction against the current master a few more times, and it looks like (on my machine, at least) the issue does still exist, but it occurs intermittently. That would explain why git bisect was giving me such confusing results.

@stjepang, it seems like the crossbeam-deque dependency bump was a red herring, sorry to bother you about that.

@jerryz920
Copy link

I ran into this problem recently with tokio-0.1.16. We have two middle boxes that proxy traffic between each other. In this scenario, it is important for us to maintain half-closed state. However, futures are polled excessively when it is actually in half-closed state.

The key trigger logic is inside PollEvented2's write method.

Suppose someone has a simple future that only does "try_ready!(Socket.write(buf))". Then when this future starts to return NotReady, and the remote socket has shutdown the write channel, following will happen:

  1. calls poll_write_ready, and it returns Ready(HUP).
  2. treats this as a write ready event, and proceeds to TCP write
  3. TCP write returns EAGAIN since write is actually not ready
  4. check result is EAGAIN, and call clear_write_ready()
  5. in clear_write_ready() it calls poll_write_ready() again to test on its is_ready method
  6. it is ready (still returning Ready(HUP)), notify current task to be scheduled again.
  7. tokio threadpool worker will insert the task to local queue and soon repeats everything from step 1.

This will continue until a threshold (run local task for 32 times) is reached, so the worker will turn and call reactor(epoll). But even epoll won't break the loop unless the underlying write buffer has been cleared. HUP makes it work like a spinlock.

To fix this, we may need to distinguish RDHUP and HUP event, so that write is not affected by read shutdown on the other side. However, this also has something to do with mio, which currently does not distinguish the RDHUP and HUP (https://github.com/carllerche/mio/blob/master/src/sys/unix/epoll.rs#L231). I have no idea how to deal with this without breaking current users.

This becomes a serious problem for us. It easily freezes a multi-core machine with only a handful of connections. Wondering if we can discuss about the solution. I am more than happy to contribute to it, too.

@carllerche
Copy link
Member

Are you able to provide a failing test case?

@jerryz920
Copy link

jerryz920 commented Apr 23, 2019

yes. I attached a ready to build tarball and a README inside. Let me know if anything is unclear.
issue-91.tar.gz

You may find similar tarball in quininer/tokio-rustls#33, which is quite related I believe. That one uses TLS, so it's more complicated. But the condition to trigger the problem is the same: client shuts down write side, and the server side PollEvented.write returns EAGAIN due to write pressure or packet loss.

@jerryz920
Copy link

@carllerche do you have any plan to look into this issue (again) recently? It's being a showstopper for us right now so I hope it can be dealt with as soon as possible. The root cause of the problem is pretty clear to us at the moment, would you like me to add more details here or in a new ticket, such as approaches that we have tried to fix the problem?

@carllerche
Copy link
Member

Could you open a PR w/ a failing test case illustrating the problem? I think I understand, but it would be most helpful to have this as a minimized test case that can be committed to the repo.

@carllerche
Copy link
Member

The problem is if a HUP is not treated as writable, then in some cases, there is no signal to the write tasks to shutdown.

One strategy would be to treat a HUP as a write event only one time (to avoid the loop).

@jerryz920
Copy link

Got it. I took "failing test case" as "some external program". I will create a PR for this. Thanks

behrat added a commit to behrat/mio that referenced this issue May 1, 2019
Currently, EPOLLRDHUP sets UnixReady::hup(). This is incorrect behavior
because higher-level libraries like tokio (correctly) assume that
UnixReady::hup() is unclearable since it signals that both the read and
write halfs are shutdown. However, EPOLLRDHUP only means that the TCP
stream has been half-closed and, in reality, such a half-closed stream
can still be written to.

This will fix a current issue with tokio, which is that tokio infinitely
attempts to write to a half-closed socket that's returning WouldBlock
when it's write buffer is full, an issue which manifests with excessive
CPU usage. I think this may help some of the issues discussed in
tokio-rs/tokio#449

After this change, EOF will still be propogated correctly, because
read-hangups also trigger read-readiness via EPOLLIN. However, if
handling of EPOLLRDHUP is desired to be retained, I believe it should be
implemented as another readiness kind on UnixReady, perhaps
UnixReady::read_hup().

Possible concern of a breaking change: Since it's not currently possible
for a user of mio to differentiate between EPOLLHUP and EPOLLRDHUP, it
must be that no users of mio currently are. There _may_ be applications
that test the "health" of a socket by checking for UnixRead::hup(),
which would previously trigger on EPOLLRDHUP but will no longer with
this change. This will change such applications from considering a
half-closed connection as closed to considering it open. However, I
still beleive this change is a correction of the semantics of HUP and
the desired behavior such applications was already ambiguous.

Note: if this is an agreed upon change for epoll, I think a similar
change is in order for kqueue. I _think_ this would be to only set
UnixReady::hup() if (e.filter == libc::EVFILT_READ && e.flags &
libc::EV_EOF != 0), but I will leave that change to someone more
knowledgeable with kqueue.
behrat added a commit to behrat/mio that referenced this issue May 1, 2019
Currently, EPOLLRDHUP sets UnixReady::hup(). This is incorrect behavior
because higher-level libraries like tokio (correctly) assume that
UnixReady::hup() is unclearable since it signals that both the read and
write halfs are shutdown. However, EPOLLRDHUP only means that the TCP
stream has been half-closed and, in reality, such a half-closed stream
can still be written to.

This will fix a current issue with tokio, which is that tokio infinitely
attempts to write to a half-closed socket that's returning WouldBlock
when it's write buffer is full, an issue which manifests with excessive
CPU usage. I think this may help some of the issues discussed in
tokio-rs/tokio#449

After this change, EOF will still be propagated correctly, because
read-hangups also trigger read-readiness via EPOLLIN. However, if
handling of EPOLLRDHUP is desired to be retained, I believe it should be
implemented as another readiness kind on UnixReady, perhaps
UnixReady::read_hup().

Possible concern of a breaking change: Since it's not currently possible
for a user of mio to differentiate between EPOLLHUP and EPOLLRDHUP, it
must be that no users of mio currently are. There _may_ be applications
that test the "health" of a socket by checking for UnixRead::hup(),
which would previously trigger on EPOLLRDHUP but will no longer with
this change. This will change such applications from considering a
half-closed connection as closed to considering it open. However, I
still beleive this change is a correction of the semantics of HUP and
the desired behavior such applications was already ambiguous.

Note: if this is an agreed upon change for epoll, I think a similar
change is in order for kqueue. I _think_ this would be to only set
UnixReady::hup() if (e.filter == libc::EVFILT_READ && e.flags &
libc::EV_EOF != 0), but I will leave that change to someone more
knowledgeable with kqueue.
behrat added a commit to behrat/mio that referenced this issue May 1, 2019
Currently, EPOLLRDHUP sets UnixReady::hup(). This is incorrect behavior
because higher-level libraries like tokio (correctly) assume that
UnixReady::hup() is unclearable since it signals that both the read and
write halfs are shutdown. In reality, EPOLLRDHUP only means that the TCP
stream has been half-closed and such a half-closed stream can still be
written to.

This will fix a current issue with tokio, which is that tokio infinitely
attempts to write to a half-closed socket that's returning WouldBlock
when it's write buffer is full, an issue which manifests with excessive
CPU usage. I think this may help some of the issues discussed in
tokio-rs/tokio#449

After this change, EOF will still be propagated correctly, because
read-hangups also trigger read-readiness via EPOLLIN. However, if
handling of EPOLLRDHUP is desired to be retained, I believe it should be
implemented as another readiness kind on UnixReady, perhaps
UnixReady::read_hup().

Possible concern of a breaking change: Since it's not currently possible
for a user of mio to differentiate between EPOLLHUP and EPOLLRDHUP, it
must be that no users of mio currently are. There _may_ be applications
that test the "health" of a socket by checking for UnixRead::hup(),
which would previously trigger on EPOLLRDHUP but will no longer with
this change. This will change such applications from considering a
half-closed connection as closed to considering it open. However, I
still beleive this change is a correction of the semantics of HUP and
the desired behavior such applications was already ambiguous.

Note: if this is an agreed upon change for epoll, I think a similar
change is in order for kqueue. I _think_ this would be to only set
UnixReady::hup() if (e.filter == libc::EVFILT_READ && e.flags &
libc::EV_EOF != 0), but I will leave that change to someone more
knowledgeable with kqueue.
carllerche pushed a commit to tokio-rs/mio that referenced this issue May 14, 2019
Currently, EPOLLRDHUP sets UnixReady::hup(). This is incorrect behavior
because higher-level libraries like tokio (correctly) assume that
UnixReady::hup() is unclearable since it signals that both the read and
write halfs are shutdown. In reality, EPOLLRDHUP only means that the TCP
stream has been half-closed and such a half-closed stream can still be
written to.

This will fix a current issue with tokio, which is that tokio infinitely
attempts to write to a half-closed socket that's returning WouldBlock
when it's write buffer is full, an issue which manifests with excessive
CPU usage. I think this may help some of the issues discussed in
tokio-rs/tokio#449

After this change, EOF will still be propagated correctly, because
read-hangups also trigger read-readiness via EPOLLIN. However, if
handling of EPOLLRDHUP is desired to be retained, I believe it should be
implemented as another readiness kind on UnixReady, perhaps
UnixReady::read_hup().

Possible concern of a breaking change: Since it's not currently possible
for a user of mio to differentiate between EPOLLHUP and EPOLLRDHUP, it
must be that no users of mio currently are. There _may_ be applications
that test the "health" of a socket by checking for UnixRead::hup(),
which would previously trigger on EPOLLRDHUP but will no longer with
this change. This will change such applications from considering a
half-closed connection as closed to considering it open. However, I
still beleive this change is a correction of the semantics of HUP and
the desired behavior such applications was already ambiguous.

Note: if this is an agreed upon change for epoll, I think a similar
change is in order for kqueue. I _think_ this would be to only set
UnixReady::hup() if (e.filter == libc::EVFILT_READ && e.flags &
libc::EV_EOF != 0), but I will leave that change to someone more
knowledgeable with kqueue.
carllerche added a commit to tokio-rs/mio that referenced this issue May 15, 2019
Currently, EPOLLRDHUP sets UnixReady::hup(). This is incorrect behavior
because higher-level libraries like tokio (correctly) assume that
UnixReady::hup() is unclearable since it signals that both the read and
write halfs are shutdown. In reality, EPOLLRDHUP only means that the TCP
stream has been half-closed and such a half-closed stream can still be
written to.

This will fix a current issue with tokio, which is that tokio infinitely
attempts to write to a half-closed socket that's returning WouldBlock
when it's write buffer is full, an issue which manifests with excessive
CPU usage. I think this may help some of the issues discussed in
tokio-rs/tokio#449

After this change, EOF will still be propagated correctly, because
read-hangups also trigger read-readiness via EPOLLIN. However, if
handling of EPOLLRDHUP is desired to be retained, I believe it should be
implemented as another readiness kind on UnixReady, perhaps
UnixReady::read_hup().

Possible concern of a breaking change: Since it's not currently possible
for a user of mio to differentiate between EPOLLHUP and EPOLLRDHUP, it
must be that no users of mio currently are. There _may_ be applications
that test the "health" of a socket by checking for UnixRead::hup(),
which would previously trigger on EPOLLRDHUP but will no longer with
this change. This will change such applications from considering a
half-closed connection as closed to considering it open. However, I
still beleive this change is a correction of the semantics of HUP and
the desired behavior such applications was already ambiguous.

This also fixes a similar issue with kqueue.
@carllerche
Copy link
Member

This should be resolved by the upcoming release of mio: tokio-rs/mio#942.

@SerhoLiu
Copy link
Contributor Author

On my macOS version 10.14.4, update mio to v0.6.17 also has this problem. I can't write some minimized test case, but can follow the steps below to reproduce the problem:

  1. add some assert to mio kqueue
diff --git a/src/sys/unix/kqueue.rs b/src/sys/unix/kqueue.rs
index 371e0cf..7179d95 100644
--- a/src/sys/unix/kqueue.rs
+++ b/src/sys/unix/kqueue.rs
@@ -319,6 +319,13 @@ impl Events {
             }
         }
 
+        for event in &self.events {
+            if event.readiness().is_hup() {
+                assert!(event.readiness().is_writable(), "EV_EOF and EVFILT_WRITE");
+                assert!(event.readiness().is_readable(), "EV_EOF with EVFILT_WRITE not read ready");
+            }
+        }
+
         ret
     }
  1. open Mac Terminal, run cargo run --example proxy, cd tokio && cargo run --example echo

  2. run telnet 127.0.0.1 8081, input some data

  3. kill -9 [telnet process]

  4. proxy process log

thread 'tokio-runtime-worker-1' panicked at 'EV_EOF with EVFILT_WRITE not read ready', .../registry/src/github.jparrowsec.cn-1ecc6299db9ec823/mio-0.6.17/src/sys/unix/kqueue.rs:324:17
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

As I said, EV_EOF with EVFILT_WRITE does not indicate EVFILT_READ.

@jerryz920
Copy link

This should be resolved by the upcoming release of mio: tokio-rs/mio#942.

Yep. Braden and I have talked about it so epoll is good now.

@carllerche
Copy link
Member

@SerhoLiu Just because you don't get readable in the same call to poll doesn't mean the socket isn't readable. I'm not exactly sure what specific behavior you would like to see changed.

@SerhoLiu
Copy link
Contributor Author

@carllerche ok, but #449 (comment) problem still exists, maybe it’s not related to tokio-rs/mio#942 ...

@carllerche
Copy link
Member

The latest release of mio fixes this on kqueue platforms.

@Darksonn Darksonn removed the E-help-wanted Call for participation: Help is requested to fix this issue. label Sep 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants
@carllerche @kpp @jerryz920 @Darksonn @SerhoLiu @hawkw and others