asio on linux with multiple threads io_service->run : performance? #886

Open
timprepscius opened this issue Aug 12, 2021 · 2 comments

@timprepscius

Hey there,

I realize this is the vaguest of questions, however I will ask in any case.
I am developing yet another com library, which uses asio as its backend.

In the code I use:

  • either 1 thread or multiple threads calling io_service->run()
  • receives and sends are performed using async_receive_from and async_send/async_send_to

There is just one socket A sending to socket B

On Linux, with only 1 thread, the packet rate is ~100,000.

With multiple threads it drops to 10,000.

Of course the numbers are system dependent, and there are acks, retries when necessary, etc. However, there is definitely a very large reduction when multiple threads are running on Linux.

On OS X, multiple threads increase performance considerably.

Has anyone else seen this?

I'll set up a simple tester sometime next week, but I thought I would ask today.
All the code for the asio portion of the com library is here:
https://github.com/timprepscius/MrUDP/tree/main/mrudp/imp

@mabrarov

Hi @timprepscius,

IMHO, Asio performs better on Linux when the sockets are spread between multiple instances of io_service (the number of instances can be tied to the number of available CPUs) and just one thread runs each instance, so that each io_service has a dedicated thread accessing it exclusively.

Note that the constructor of io_service (io_context) takes a concurrency hint (https://www.boost.org/doc/libs/1_76_0/doc/html/boost_asio/overview/core/concurrency_hint.html), which can be used to reduce the number of locks if the "strategy" described above is used.

Note that spreading sockets between multiple instances of io_service (i.e. between multiple epoll instances) has some drawbacks. Refer to https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/.

I do believe that there was some performance analysis (and there were some performance enhancements in Asio implemented later) for Linux somewhere at https://think-async.com/, but I can't find that page anymore.

I think that most of the performance issues of Asio on Linux are caused by the nature of epoll.

On Windows, I guess, it's more performant to have a single instance of io_service with multiple threads running on it (the number of threads tied to the number of available CPUs), because Windows IOCP provides some performance hints (e.g. LIFO scheduling) for the threads requesting completion packets on a single completion port. This is described in the Microsoft documentation for IOCP.

None of the information I shared here is confirmed yet. I have https://github.com/mabrarov/asio_samples, which contains the ma_echo_server application. It supports both of the approaches described above (and uses Asio concurrency hints and custom memory allocation to reduce the number of locks and other things which can slow down the application), and it lets you choose the "strategy" at startup via the demux-per-work-thread configuration option (with different defaults for Linux and Windows).

I wanted to run performance tests (refer to ma_asio_performance_test_client, which is a slightly modified performance test from Asio) and to profile ma_echo_server to understand which strategy works better (on Linux, on Windows) and where the bottlenecks are. Unfortunately, I still haven't found:

  1. hardware with 8+ CPUs to run the test client, and the same hardware to run ma_echo_server, with sufficient network bandwidth between these hosts
  2. somebody willing to review the code and to help with design of test approach (what cases to test? how to run them? how to profile?)
  3. somebody willing to help with execution of tests and analysis of the results (I don't know Linux, just a newbie)
  4. time to work on these things.

BTW, please note that Asio integration with OpenSSL has additional pitfalls (locks becoming bottlenecks). I was looking at http://konradzemek.com/2015/08/16/asio-ssl-and-scalability/, but it's 6 years old, and there have definitely been changes in both the Asio-OpenSSL integration and OpenSSL itself since that investigation.

@timprepscius

Thank you for all of this information.
This is really helpful.
