Increase ThreadPool.MinThreads limit #3845
Conversation
Roslyn-like solution, 74 projects, mix of net472 and net6.0:
- small threadpool, batch size 10: 47 s
- big threadpool, batch size 10: 45 s
// The correct fix would be to re-visit all code that offloads work to threadpool and avoid blocking any thread,
// and also use async await when we need to await a completion of an action. But that is a far away goal, so this
// is a "temporary" measure to remove the threadpool contention.
var additionalThreadsCount = Environment.ProcessorCount * 4;
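For context, a minimal sketch of the SetMinThreads pattern the quoted lines presumably sit in; only the comment and the multiplier line above are from the diff, the surrounding calls are an assumption:

using System;
using System.Threading;

internal static class ThreadPoolTuning
{
    // Sketch, not the exact PR code: raise the pool's minimum so it starts new threads
    // immediately instead of injecting roughly one thread per second once starved.
    public static void IncreaseMinThreads()
    {
        var additionalThreadsCount = Environment.ProcessorCount * 4;
        ThreadPool.GetMinThreads(out var workerThreads, out var completionPortThreads);
        ThreadPool.SetMinThreads(
            workerThreads + additionalThreadsCount,
            completionPortThreads + additionalThreadsCount);
    }
}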
I understand the concept of pushing the thread pool limit higher, but I'm not sure I get why we decided to raise the limit to exactly 4 times the processor count. Is "4" a constant determined empirically that yielded the best throughput overall, or is there a more formal reason for choosing it?
I used 2*, then 5*, then 10*. I know that the path blocks 2-3 threads, and that double that number of threads is blocked when you have a datacollector. So it is part empirical, part guessing.
So I ended up with 1* (the existing workerThreads) + 4*, to land on 5*.
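(For illustration, assuming the default worker minimum equals the core count: on an 8-core machine that is 8 + 8 * 4 = 40 minimum worker threads, i.e. 5 * ProcessorCount.)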
I would write that down as a comment in the code for later.
Have you tried changing only the worker limit? I don't know whether the I/O one makes a difference here; we're not async at all, so I suppose we're not using it. And let's hope we don't generate too much contention/memory consumption (if I recall correctly, .NET commits about 1 MB per thread stack) on huge test servers 🤞 A constant heuristic is... meh...
Maybe if it's only for VS responsiveness we could recognize that case (design mode) and increase the limit only then, but that's not mandatory.
I did try changing only one of them, but if the I/O pool is not exhausted then raising its limit has no impact; the pool simply never gets past that limit anyway. In both cases we are just telling the threadpool to start more threads without delay when it needs to. We will eventually stop blocking them, and the threadpool can re-balance itself. We are only paying a small price, say 1 MB per thread, and we know we will cap out at a thread count that is x times the processor count. So if a server has 100 cores, it probably also has 1 GB of RAM to spare, if it ever gets to the point where it allocates that many threads.
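(Rough worst-case arithmetic for that example: 100 cores * 5 = 500 minimum worker threads, at about 1 MB of stack each that is roughly 500 MB, and only if the pool ever actually spins up that many threads.)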
Added the comment.
@nohwnd FYI this is better starting from .NET 6.0: there is a cooperative blocking mechanism, and on every Task.Wait/Task.Result the ThreadPool injects a new thread without waiting. Also, completion ports are no longer used like in the past; I/O callbacks are "batched" and run on normal worker threads. cc: @Evangelink
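(Illustrative snippet, not part of the PR: on .NET 6+ you can watch this cooperative blocking behavior by blocking pool threads with Task.Wait and observing ThreadPool.ThreadCount before and after; exact numbers will vary by machine.)

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

internal static class CooperativeBlockingDemo
{
    private static void Main()
    {
        Console.WriteLine($"Pool threads before: {ThreadPool.ThreadCount}");

        // Each callback blocks its pool thread on Task.Wait; on .NET 6+ the cooperative
        // blocking heuristic notices this and injects extra worker threads quickly.
        var blockers = Enumerable.Range(0, Environment.ProcessorCount * 4)
            .Select(_ => Task.Run(() => Task.Delay(2000).Wait()))
            .ToArray();

        Task.WaitAll(blockers);
        Console.WriteLine($"Pool threads after: {ThreadPool.ThreadCount}");
    }
}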
Description
Raise the limit of how many threads the ThreadPool can start without waiting. By default the limit is based on the number of logical cores. But vstest.console blocks many threads because it uses non-async locks to wait for actions to complete, so when we start many testhosts in parallel, each testhost wants at least 1 thread from the threadpool and the limit is quickly reached. The threadpool then adds new threads slowly (about 1 per second), which makes the testhost callbacks take a long time to complete. This can be observed by comparing the time when the testhost reports that it connected to the server with the time when we actually see that the client was accepted.
Or by putting this piece of code into TestRequestSender:
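(The referenced snippet did not survive extraction; below is a standalone sketch of the kind of timing it describes, measuring the gap between the moment the testhost reports it connected and the moment the client is actually accepted. All names are placeholders, not vstest members.)

using System;
using System.Diagnostics;
using System.Threading;

internal static class ConnectionTimingSketch
{
    private static void Main()
    {
        using var clientAccepted = new ManualResetEventSlim();

        // Start timing at the point where the testhost would report "connected".
        var stopwatch = Stopwatch.StartNew();

        // Placeholder for the accept callback that would normally run on a threadpool thread;
        // when the pool is starved, this callback is what gets delayed by seconds.
        ThreadPool.QueueUserWorkItem(_ => { Thread.Sleep(250); clientAccepted.Set(); });

        clientAccepted.Wait();
        stopwatch.Stop();
        Console.WriteLine($"Client accepted after {stopwatch.ElapsedMilliseconds} ms.");
    }
}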
Before the change this elapsed time ranges from 250 ms (which is roughly the time the testhost spends setting up and looking up extensions) up to 3-4 seconds when many testhosts are running in parallel and the threadpool is saturated and does not have enough non-blocked threads to process the incoming callbacks.
Raising the limit allows the ThreadPool to start a new thread without waiting and process the callback as soon as it arrives, so the measurement above stays much closer to 250 ms.
Raising the limit also lets us see that SkipDefaultAdapters has some impact on the execution time, because before this change its impact (~200 ms) was greatly offset by the time we spend waiting to connect (~2000 ms). We can also see that the single channel back to the IDE (SocketCommunicationManager.WriteAndFlushToChannel) is a contention point, because many testhosts are trying to push as much data as they can through this single channel. And so we can also see that BatchSize has a significant impact on speed, where moving it up from 10 to 100 and then 1000 makes the discovery time progressively shorter.
Fix AB#1563539