-
Notifications
You must be signed in to change notification settings - Fork 911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
struct ndpi_detection_module_struct
and multi-thread
#1344
Comments
) There are no real reasons to embed `struct ndpi_packet_struct` (i.e. "packet") in `struct ndpi_flow_struct` (i.e. "flow"). In other words, we can avoid saving dissection information of "current packet" into the "flow" state, i.e. in the flow management table. The nDPI detection module processes only one packet at the time, so it is safe to save packet dissection information in `struct ndpi_detection_module_struct`, reusing always the same "packet" instance and saving a huge amount of memory. Bottom line: we need only one copy of "packet" (for detection module), not one for each "flow". It is not clear how/why "packet" ended up in "flow" in the first place. It has been there since the beginning of the GIT history, but in the original OpenDPI code `struct ipoque_packet_struct` was embedded in `struct ipoque_detection_module_struct`, i.e. there was the same exact situation this commit wants to achieve. Most of the changes in this PR are some boilerplate to update something like "flow->packet" into something like "module->packet" throughout the code. Some attention has been paid to update `ndpi_init_packet()` since we need to reset some "packet" fields before starting to process another packet. There has been one important change, though, in ndpi_detection_giveup(). Nothing changed for the applications/users, but this function can't access "packet" anymore. The reason is that this function can be called "asynchronously" with respect to the data processing, i.e in context where there is no valid notion of "current packet"; for example ndpiReader calls it after having processed all the traffic, iterating the entire session table. Mining LRU stuff seems a bit odd (even before this patch): probably we need to rethink it, as a follow-up.
It makes sense to discuss. Fully initializing ndpi_detection_module_struct in every thread is very memory intensive. This is not only the ndpi_detection_module_struct structure itself(228 kB), but also all the data in it. ahocorasick, patricia, ndpi_default_ports_tree_node_t also take up memory. At this point, we know for sure that there are two elements in struct ndpi_detection_module_struct that are not multithread safe. These are libcache and LRU. I think that, first of all, we need a test case that could demonstrate two things: working in the mode of one thread gives and in the mode of several threads gives the same result, and the execution speed should decrease in proportion to the number of threads. I think an ideal candidate or starting point for this is examples/ndpiSimpleIntegration.c If successful, you can add constraints to the documentation. The implementation of multithreading does not require additional resources as long as we have an insignificant amount of common modifiable data. Before commit 730c236, one limitation was significant: one flow is simultaneously processed by one thread, since all the data we change is in the ndpi_flow structure. After commit 730c236, it is quite easy to restore multithreading:
I haven't come across very often implementations of blocking in userspace. You need to understand which locking mechanism to use, provided that there are no lock-free options. So far, I think mutex is the only portable solution. |
I agree that "sharing" a ndpi_struct.packet could save memory, but this if and only if all threads are supposed to do the same job (for example you use RSS to share the load across multiple threads). In this case you have a lot of traffic so a few KB extra isn't a problem if this guarantees you consistency. |
I started testing the variant with asynchronous work in ndpiReader. |
Closing: |
I am trying to summarize here the various discussions on multi threads support.
What is the current situation?
*)
struct ndpi_detection_module_struct
is not thread-safe and it has never been.*) This limitation is not well documented
*) ndpiReader example configures one
struct ndpi_detection_module_struct
for each thread*) any implementations sharing the same
struct ndpi_detection_module_struct
across multiple threads is buggy*)
struct ndpi_detection_module_struct
is not thread-safe because of the LRU caches (at least)*) commit 730c236 made
struct ndpi_detection_module_struct
even less thread-safe*) it seems possible to make
struct ndpi_detection_module_struct
thread-safe (even after 730c236)Now, what options do we have?
Do nothing (apart from improving documentation). Keep
struct ndpi_detection_module_struct
not thread safe; keep using one instance of it for each thread/coreWe acknowledge that having some shared structures/data among the thread/core might be useful but we don't want to change
struct ndpi_detection_module_struct
. We might implement two different "contexts": one "global context" (one for application; thread safe; to be defined) and we keepstruct ndpi_detection_module_struct
as a "local context" (one for thread/core). Nothing new here: this is a common pattern.We make
struct ndpi_detection_module_struct
full thread safe (does that imply a penalty cost when using one instance per thread?)...
Let me know if I got something wrong and let's discuss!
The text was updated successfully, but these errors were encountered: