Avoid synchronization on GetQueues #116

Draft
wants to merge 4 commits into
base: master

klockla
Collaborator

@klockla klockla commented Jan 22, 2025

The purpose of this PR is to avoid the synchronization on the internal queue map.

The synchronized(getQueues()) block causes contention during long queue traversals, which are very likely to occur during calls to countURLs and ListURLs.

This change is based on the ConcurrentOrdereredMap class, which uses StampedLock to provide fine-grained locking over the concurrent map.

(An alternative, ConcurrentLinkedHashMap, is based on ReentrantLock; the two need to be compared under stress tests.)
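
To illustrate the general idea, here is a minimal sketch of an insertion-ordered map guarded by a StampedLock. It is not the ConcurrentOrdereredMap code from this PR: the class name StampedOrderedMap and its methods (put, get, remove, size, snapshot) are hypothetical and chosen only for illustration. It assumes that traversals can work on a copy of the entries, so the per-entry work happens outside any lock.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.StampedLock;

/** Hypothetical sketch only; not the ConcurrentOrdereredMap class from this PR. */
class StampedOrderedMap<K, V> {
    private final StampedLock lock = new StampedLock();
    // Insertion-ordered backing map; only touched while a read or write lock is held.
    private final LinkedHashMap<K, V> map = new LinkedHashMap<>();
    // Mirror of map.size(), updated under the write lock so it can be read optimistically.
    private int size;

    public V put(K key, V value) {
        long stamp = lock.writeLock();
        try {
            V previous = map.put(key, value);
            size = map.size();
            return previous;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public V remove(K key) {
        long stamp = lock.writeLock();
        try {
            V previous = map.remove(key);
            size = map.size();
            return previous;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public V get(K key) {
        // Shared read lock: readers do not block each other, only writers.
        long stamp = lock.readLock();
        try {
            return map.get(key);
        } finally {
            lock.unlockRead(stamp);
        }
    }

    public int size() {
        // Optimistic read of a plain field; fall back to a read lock if a write intervened.
        long stamp = lock.tryOptimisticRead();
        int current = size;
        if (!lock.validate(stamp)) {
            stamp = lock.readLock();
            try {
                current = size;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return current;
    }

    /** Copies the entries in insertion order; callers iterate the copy without holding a lock. */
    public List<Map.Entry<K, V>> snapshot() {
        long stamp = lock.readLock();
        try {
            List<Map.Entry<K, V>> copy = new ArrayList<>(map.size());
            for (Map.Entry<K, V> e : map.entrySet()) {
                copy.add(new AbstractMap.SimpleImmutableEntry<>(e));
            }
            return copy;
        } finally {
            lock.unlockRead(stamp);
        }
    }
}
```

In this sketch a long traversal would call snapshot() once and then walk the returned list, so writers are blocked only while the copy is made and not for the whole traversal. The copy is still linear in the number of entries, which is one of the costs a stress test would need to measure.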

Signed-off-by: Laurent Klock [email protected]

Signed-off-by: Laurent Klock <[email protected]>
Finalized AbstractFrontierService modification (removal of synchronizations on queues)

Signed-off-by: Laurent Klock <[email protected]>
@klockla klockla self-assigned this Jan 22, 2025
@jnioche
Collaborator

jnioche commented Jan 22, 2025

Thanks @klockla, sounds good
It would be great to have a test or benchmark comparing the existing approach with ConcurrentOrdereredMap and ConcurrentLinkedHashMap.
Given that the same approach is used in key parts of StormCrawler, it would be great to have a better one.
Is the code for ConcurrentOrdereredMap and ConcurrentLinkedHashMap from another project, or did you write it? We would need a strong set of unit tests to make sure they are bulletproof.
