Commit
There are no real reasons to embed `struct ndpi_packet_struct` (i.e. "packet") in `struct ndpi_flow_struct` (i.e. "flow"). In other words, we can avoid saving dissection information of the "current packet" into the "flow" state, i.e. in the flow management table.

The nDPI detection module processes only one packet at a time, so it is safe to save packet dissection information in `struct ndpi_detection_module_struct`, always reusing the same "packet" instance and saving a huge amount of memory. Bottom line: we need only one copy of "packet" (for the detection module), not one for each "flow".

It is not clear how/why "packet" ended up in "flow" in the first place. It has been there since the beginning of the Git history, but in the original OpenDPI code `struct ipoque_packet_struct` was embedded in `struct ipoque_detection_module_struct`, i.e. exactly the situation this commit wants to achieve.

Most of the changes in this PR are boilerplate that turns something like "flow->packet" into something like "module->packet" throughout the code. Some attention has been paid to updating `ndpi_init_packet()`, since we need to reset some "packet" fields before starting to process another packet.

There has been one important change, though, in `ndpi_detection_giveup()`. Nothing changes for applications/users, but this function can't access "packet" anymore. The reason is that this function can be called "asynchronously" with respect to the data processing, i.e. in a context where there is no valid notion of "current packet"; for example, ndpiReader calls it after having processed all the traffic, iterating over the entire session table. Mining LRU stuff seems a bit odd (even before this patch): probably we need to rethink it, as a follow-up.
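To make the memory argument concrete, here is a minimal sketch of the two layouts. All structure and function names below (`packet_info`, `flow_before`, `flow_after`, `module_after`, `init_packet`, `bytes_saved_per_flow`) are hypothetical, heavily simplified stand-ins and not the real nDPI definitions, which have many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-packet dissection state. */
struct packet_info {
    const uint8_t *payload;      /* points into the packet being processed */
    uint16_t payload_len;
    uint8_t scratch[1024];       /* placeholder for parsed-line bookkeeping */
};

/* Layout before the change: every flow carries its own packet state. */
struct flow_before {
    uint16_t detected_protocol;
    struct packet_info packet;
};

/* Layout after the change: the flow keeps only long-lived state... */
struct flow_after {
    uint16_t detected_protocol;
};

/* ...and the single packet instance lives in the detection module. */
struct module_after {
    struct packet_info packet;
};

/* Reset the per-packet fields before dissecting the next packet. */
static void init_packet(struct module_after *m, const uint8_t *data, uint16_t len)
{
    assert(m != NULL);
    m->packet.payload = data;
    m->packet.payload_len = len;
}

/* Bytes no longer stored in each entry of the flow table. */
static size_t bytes_saved_per_flow(void)
{
    return sizeof(struct flow_before) - sizeof(struct flow_after);
}
```

With these toy sizes, each flow entry shrinks by at least the size of the scratch buffer, while the module grows by one `packet_info` in total, independent of the number of flows.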
730c236
This commit completely kills the nDPI multithreading feature that ndpi-netfilter uses successfully.
"struct ndpi_packet_struct" cannot be static for all threads!
The number of such structures must be no less than the number of cores / threads.
I totally agree that "struct ndpi_packet_struct" is not needed for every flow. But making it static is the worst possible option!
This commit needs to be rewritten.
@vel21ripn, I don't know ndpi-netfilter (sorry) but I think you got it wrong.
This change does NOT make `struct ndpi_packet_struct` static for all threads. It does make it static for each `struct ndpi_detection_module_struct`.
Since you can't (AFAIK) safely use the same detection module across multiple threads, in a multi-thread/multi-core scenario you need multiple detection modules anyway, regardless of this specific change.
Am I missing something obvious?
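
The "one detection module per thread" arrangement mentioned above can be sketched as follows. This is only an illustration under assumptions: `detection_module`, `worker`, `process_packet` and `run_workers` are hypothetical names, and the module here is a toy counter rather than a real nDPI detection module.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define N_WORKERS 4

/* Hypothetical stand-in for a detection module with mutable state. */
struct detection_module {
    unsigned long packets_seen;
};

struct worker {
    pthread_t tid;
    struct detection_module module;  /* private to this thread: no locking */
    unsigned long n_packets;         /* packets this worker will handle */
};

/* Mutates module state, which is why a module must not be shared. */
static void process_packet(struct detection_module *m)
{
    m->packets_seen++;
}

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    for (unsigned long i = 0; i < w->n_packets; i++)
        process_packet(&w->module);
    return NULL;
}

/* Start N_WORKERS threads, each with its own module; return the total
 * number of packets processed across all of them. */
static unsigned long run_workers(unsigned long packets_per_worker)
{
    struct worker workers[N_WORKERS];
    unsigned long total = 0;

    for (int i = 0; i < N_WORKERS; i++) {
        workers[i].module.packets_seen = 0;
        workers[i].n_packets = packets_per_worker;
        int rc = pthread_create(&workers[i].tid, NULL, worker_main, &workers[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < N_WORKERS; i++) {
        pthread_join(workers[i].tid, NULL);
        total += workers[i].module.packets_seen;
    }
    return total;
}
```

Because each thread touches only its own module, no synchronization is needed on the per-packet path; the cost is that module-level state such as caches is no longer shared between threads.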
@vel21ripn @IvanNardi nDPI detection module is not thread-safe as stated here.
The fact that you are using 1 detection module shared across threads is specific to your implementation.
Is there something that prevents you from using one detection module per thread as done within ndpiReader and ndpiSimpleIntegration?
Zied
The following scheme has worked so far: initialize the detection module with ndpi_init_detection_module(), specify the list of protocols to detect with ndpi_set_protocol_detection_bitmask2(), then call ndpi_finalize_initialization().
After that, several threads could work in parallel on different flows.
I did not rewrite ahocorasick in vain, since it had shared data. I deliberately moved everything that is modified during packet processing into a separate structure, AC_MATCH_t.
Can you please tell me which fields of ndpi_detection_module_struct are modified while ndpi_detection_process_packet() runs?
`ndpi_detection_module_struct` is not meant to be thread-safe. It is as simple as that.
Even if it worked for you before 730c236, invalid API usage of a library can have unforeseen consequences at any time.
The real issue I see here is that there is no documentation available that states what is allowed in MT environments and what not.
All the LRU caches are updated at runtime
@IvanNardi
> All the LRU caches are updated at runtime

Thanks.
That is not a problem: the LRU code is very simple, and it is easy to make it reentrant.
Is there any other shared data besides the LRU caches and libcache?
Ndpi-netfilter uses libcache with locks.
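
For illustration, one way to protect a small shared cache with a lock could look like the sketch below. It assumes a fixed-size, direct-mapped cache; `locked_cache`, `cache_init`, `cache_add` and `cache_find` are hypothetical names and do not reproduce the nDPI LRU code or the libcache API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_SLOTS 1024

struct cache_entry {
    uint32_t key;
    uint16_t value;
    bool used;
};

/* Hypothetical cache shared by several threads. */
struct locked_cache {
    pthread_mutex_t lock;
    struct cache_entry slots[CACHE_SLOTS];
};

static void cache_init(struct locked_cache *c)
{
    assert(c != NULL);
    pthread_mutex_init(&c->lock, NULL);
    for (int i = 0; i < CACHE_SLOTS; i++)
        c->slots[i].used = false;
}

/* Store key/value, overwriting whatever occupied the slot before. */
static void cache_add(struct locked_cache *c, uint32_t key, uint16_t value)
{
    pthread_mutex_lock(&c->lock);
    struct cache_entry *e = &c->slots[key % CACHE_SLOTS];
    e->key = key;
    e->value = value;
    e->used = true;
    pthread_mutex_unlock(&c->lock);
}

/* Look up key; on a hit copy the value out while still holding the lock. */
static bool cache_find(struct locked_cache *c, uint32_t key, uint16_t *value)
{
    bool found = false;
    pthread_mutex_lock(&c->lock);
    const struct cache_entry *e = &c->slots[key % CACHE_SLOTS];
    if (e->used && e->key == key) {
        *value = e->value;
        found = true;
    }
    pthread_mutex_unlock(&c->lock);
    return found;
}
```

A single mutex keeps the sketch short; a real implementation would likely weigh per-slot locks or lock-free updates against contention on the packet path.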
If we replace "& ndpi_struct.packet" with an inline function, then there is no problem with re-entering the ndpi_detection_process_packet() code.
I can prepare such a PR. I also remember about the need to fix the LRU cache code.
I just now noticed example/ndpiSimpleIntegration.c. This is a very good starting point for testing multi-threaded use of libndpi.
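
One possible reading of the inline-function idea is sketched below: callers obtain the packet through an accessor, and the accessor hands out a per-thread instance. This is an assumption about the proposal, not the code of the eventual PR; `packet_info`, `detection_module`, `module_packet` and `process_packet` are hypothetical names, and C11 `_Thread_local` is just one way to get per-thread storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-packet state. */
struct packet_info {
    const uint8_t *payload;
    uint16_t payload_len;
};

/* Hypothetical module holding only shared, read-mostly configuration. */
struct detection_module {
    int placeholder;
};

/* One packet instance per thread (C11 thread-local storage). */
static _Thread_local struct packet_info tls_packet;

/* Accessor used everywhere instead of taking the address of a packet
 * member embedded in the module. It could equally index a per-thread
 * array stored in the module. */
static inline struct packet_info *module_packet(struct detection_module *m)
{
    (void)m;
    return &tls_packet;
}

/* Toy processing step: record the current packet and return its length. */
static uint16_t process_packet(struct detection_module *m,
                               const uint8_t *data, uint16_t len)
{
    struct packet_info *packet = module_packet(m);
    assert(packet != NULL);
    packet->payload = data;
    packet->payload_len = len;
    return packet->payload_len;
}
```

With such an accessor, two threads calling `process_packet()` on the same module would not overwrite each other's packet state, although any other mutable module data (such as the caches discussed above) would still need separate handling.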
Please, continue the discussion on #1344