TCP SYN backlog change likely has concurrent global var access issues #729

Closed
pfalcon opened this issue Jul 9, 2017 · 14 comments
Labels: area: Networking · bug · priority: low


pfalcon commented Jul 9, 2017

ab -n1000 http://192.0.2.1:8080/ run against http_server.py (see #728), with MicroPython built against Zephyr tag v1.8.0:

Time taken for tests:   21.512 seconds
Complete requests:      1000
Failed requests:        910
   (Connect: 0, Receive: 0, Length: 910, Exceptions: 0)
Total transferred:      45920 bytes
HTML transferred:       28920 bytes
Requests per second:    46.49 [#/sec] (mean)
Time per request:       21.512 [ms] (mean)
Time per request:       21.512 [ms] (mean, across all concurrent requests)
Transfer rate:          2.08 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        5    9   9.8      7     116
Processing:     8   12   9.2     11     118
Waiting:        8   11   8.8     10     118
Total:         14   21  14.9     18     158

ab -n10 http://192.0.2.1:8080/ (note: only 10 requests!) run against the same server, with MicroPython built against Zephyr master:

Time taken for tests:   17.383 seconds
Complete requests:      10
Failed requests:        1
   (Connect: 0, Receive: 0, Length: 1, Exceptions: 0)
Total transferred:      441 bytes
HTML transferred:       271 bytes
Requests per second:    0.58 [#/sec] (mean)
Time per request:       1738.346 [ms] (mean)
Time per request:       1738.346 [ms] (mean, across all concurrent requests)
Transfer rate:          0.02 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        6 1723 1170.2   1030    3051
Processing:    10   16  10.0     12      41
Waiting:        8   15  10.5     10      41
Total:         19 1738 1173.0   1042    3092

pfalcon commented Jul 9, 2017

Bisecting...


pfalcon commented Jul 9, 2017

Unsurprisingly, it's #672. Specifically, the first broken commit is 117771e.


pfalcon commented Jul 9, 2017

@pfl: FYI ^^


pfalcon commented Jul 9, 2017

D'oh. So, by default there's 1 entry in the backlog table. When we send the SYN/ACK, we take that entry and set up a worker timer on it. When we receive the ACK, we're done with that entry, but we can't remove it, because the worker timer is still counting down on it. At that point new connections arrive and get dropped, many times over, until the long ACK delay expires and the worker handler clears the entry.

That's how the patch developed to allow multiple concurrent connections regressed multiple sequential connections. Using any other arbitrary size for the backlog queue isn't going to help, of course; the culprit is the usage of the worker API and its deficiency (#682) in Zephyr.


pfalcon commented Jul 9, 2017

Actually, I guess there's confusion about how k_delayed_work_cancel() works. Consider this specific example, which uses an ACK timeout of 1000ms. That's a big timeout, and for all of those 1000ms the delayed work just counts down and can be cancelled perfectly well. Only at the end of the 1000ms does it get added to the scheduler queue, where it can't be cancelled. It spends maybe 100us there (depending on the well-behavedness of other delayed work handlers, and perhaps other components of the system; e.g. #672 calls send_reset(), and I don't know how fast that finishes). So the following patch works as expected:

+       if (k_delayed_work_cancel(&tcp_backlog[r].ack_timer) < 0) {
+               /* Too late to cancel - we may be preempted on the next
+                * instruction. But with the few instructions we have
+                * left, we stay positive and send a final hello to the
+                * impending worker, asking it not to destroy our
+                * connection. I don't know why we're doing that though -
+                * too late is too late. Also, there can be a race
+                * condition with us patching the entry after it was
+                * cleared by the worker.
+                */
+               tcp_backlog[r].cancelled = true;
+       } else {
+               memset(&tcp_backlog[r], 0, sizeof(struct tcp_backlog_entry));
+       }
+#if 0
        if (k_delayed_work_remaining_get(&tcp_backlog[r].ack_timer) > 0) {
                tcp_backlog[r].cancelled = true;
        } else {
                k_delayed_work_cancel(&tcp_backlog[r].ack_timer);
                memset(&tcp_backlog[r], 0, sizeof(struct tcp_backlog_entry));
        }
+#endif

As can be seen, I was skeptical about the need for the "cancelled" flag at all, given the 1000ms window when the work can be cancelled vs. the ~0.1ms window when it can't. But here's another issue: I'm not sure from which context this function (tcp_backlog_ack()) is called, so the conservative assumption is that it may be preempted by the work handler. But tcp_backlog_ack() both reads and writes tcp_backlog[r], and the work handler frees it. So there must be synchronization when accessing tcp_backlog[r], and since the work handler runs under the thread scheduler, I guess the (only) reasonable way is to use irq_lock. But then there's also contention over who frees tcp_backlog[r]: if tcp_backlog_ack() can't cancel the worker, it must not free the entry, but it learns that only after the call to k_delayed_work_cancel(). That means the critical section would be quite big - literally from the start of tcp_backlog_ack() until it calls k_delayed_work_cancel(), 95% of the function body.

Given these concerns on how to synchronize access to tcp_backlog, I don't propose a patch, but instead would appreciate comments from @jukkar , @pfl, @Vudentz , and other folks.

@pfalcon pfalcon changed the title IP stack: Some commit since 1.8.0 adversely affected connection acceptance, leading to 1s+ delays TCP SYN backlog patch adversely affected connection acceptance, leading to 1s+ delays Jul 10, 2017
@pfalcon pfalcon added priority: medium Medium impact/importance bug and removed priority:high labels Jul 12, 2017
@pfalcon pfalcon changed the title TCP SYN backlog patch adversely affected connection acceptance, leading to 1s+ delays TCP SYN backlog change likely has concurrent access issues Jul 13, 2017
@pfalcon pfalcon changed the title TCP SYN backlog change likely has concurrent access issues TCP SYN backlog change likely has concurrent global var access issues Jul 13, 2017

Vudentz commented Jul 17, 2017

I would appreciate any comments on #777.


pfalcon commented Sep 18, 2017

@pfl : Ping about concerns with concurrent access to tcp_backlog as described above.


pfl commented Sep 20, 2017

The TCP backlog code will not have any issues with IRQs. It is part of the TCP stack, and thus shielded from others because it is surrounded by FIFOs, both from applications and driver interrupts. So there should not be any synchronization issues - if that was the question. Now with #777 merged, there is no longer any need for the cancelled flag either.

But I'm not sure I'm able to follow the discussion to 100% in this issue anymore, as the Zephyr code has evolved since the issue was opened.


pfalcon commented Sep 20, 2017

@pfl : Thanks for the response.

The TCP backlog code will not have any issues with IRQs

Well, the question is not just about IRQs, but about thread-safety in general.

It is part of the TCP stack, and thus shielded from others because it is surrounded by FIFOs, both from applications and driver interrupts.

Yeah, but your code, running inside the TCP stack, accesses the same global variable without any obvious attempt to synchronize that access. So this is a very generic concern, one which applies to any multithreaded (in the sense of "multiple execution contexts") system. Let's have a look at https://en.wikipedia.org/wiki/Thread_safety#Implementation_approaches , second subheading there:

The second class of approaches are synchronization-related, and are used in situations where shared state cannot be avoided:

This is exactly our case - there's a global array, and we can't easily drop it.

Mutual exclusion
Atomic operations

So, Wikipedia (and the common knowledge of anyone who has worked with multithreading) says that the above measures should be used. Your code doesn't have them. That's the concern.

If you say "there should not be any synchronization issues", then maybe that's so, and I should learn why the worst traits of preemptive multithreading, which apply to any such system out there, suddenly don't apply to Zephyr. Alternatively, it's still a chance to consider whether we have a problem here. Thanks.


pfl commented Sep 21, 2017

AFAIK the IP and TCP stacks were not preemptible last time I checked - a few years ago. The initial design was to access the TCP/IP stack through FIFOs, and I think that still holds. I don't see any reason to have more than one thread handling TCP/IP packets; it would just use more memory for minimal speed gains, unless numbers prove me wrong.

I must assume that @jukkar knew what he was doing when applying the initial implementation. If the pre-emption assumption turns out to be wrong now, please open a new issue with @jukkar instead of beating on an already closed issue that fixed the usage of k_delayed_work_cancel(). And I do know what nice fireworks are to be seen when a global array is accessed and the thread gets pre-empted...


nashif commented Feb 20, 2018

is this still an issue? any updates? @pfl, @pfalcon

@pfalcon pfalcon added priority: low Low impact/importance bug and removed priority: medium Medium impact/importance bug labels Feb 20, 2018

pfalcon commented Feb 20, 2018

@nashif : Currently there are known issues with TCP support in Zephyr, and this one is on my list for extra-detailed checking. (@pfl reviewed it and responded that everything should be OK, nor do I have any specific issue to point at, other than the lack of explicitly synchronized access.) This is low priority, assigned directly to me to hopefully avoid confusion.


pfl commented Feb 21, 2018

If the IP stack is made re-entrant, this issue needs looking into.


jukkar commented Mar 9, 2019

This is quite old and we have recently added locking to net_context access -> closing it.

@jukkar jukkar closed this as completed Mar 9, 2019