Add locking functionality to the priority queue #293

Merged
merged 9 commits into from
Feb 28, 2023

Conversation

@jpbruinsslot jpbruinsslot commented Feb 21, 2023

Solve race conditions of multiple workers popping off the same task from the queue
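
As a rough illustration of the kind of race this addresses, here is a minimal in-memory sketch, not the scheduler's actual code. The class `LockedPriorityQueue` and the helper `drain` are hypothetical names; the real queue is database-backed. The emptiness check and the removal happen under one lock, so two workers cannot take the same item.

```python
import heapq
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedPriorityQueue:
    """Hypothetical in-memory stand-in for the scheduler's priority queue."""

    def __init__(self):
        self._heap = []
        self._lock = threading.Lock()

    def push(self, priority, task_id):
        with self._lock:
            heapq.heappush(self._heap, (priority, task_id))

    def pop(self):
        # The emptiness check and the removal share one critical section,
        # so no two workers can observe and take the same head item.
        with self._lock:
            if not self._heap:
                return None
            return heapq.heappop(self._heap)[1]


def drain(pq, workers=8):
    """Let several worker threads pop until the queue is empty."""

    def worker():
        taken = []
        while (task := pq.pop()) is not None:
            taken.append(task)
        return taken

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker) for _ in range(workers)]
        return [task for future in futures for task in future.result()]
```

With the lock in place, every pushed task should come back from `drain` exactly once, regardless of how the threads interleave.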

@jpbruinsslot jpbruinsslot added the mula Issues related to the scheduler label Feb 21, 2023
@jpbruinsslot jpbruinsslot self-assigned this Feb 21, 2023
@jpbruinsslot jpbruinsslot linked an issue Feb 21, 2023 that may be closed by this pull request
@dekkers

dekkers commented Feb 22, 2023

The for_update isn't necessary when we always hold the lock when calling pop.

I also think we will keep running into issues like this if we keep using threading, because threading is just very hard to get right. I also can't think of a reason why we need threading, so I think it would be a good idea to switch to asyncio in the future and get rid of most concurrency issues that way. FastAPI is built around asyncio, so it shouldn't be hard to do, see https://fastapi.tiangolo.com/async/

With asyncio we can then just use for_update and don't need to do any locking in mula itself. In my opinion we should formalize that we don't support any database other than PostgreSQL, because we really want to be able to use all the power of PostgreSQL and not be limited to the features that other databases also support.
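
To illustrate why in-process locking becomes unnecessary under asyncio, here is a minimal sketch using an in-memory heap rather than a database; the names `worker` and `drain` are hypothetical and not taken from mula. Tasks on a single event loop only switch at `await` points, so a check-and-pop with no `await` in between cannot be interleaved by another task. Coordination between separate processes would still need something like for_update at the database level.

```python
import asyncio
import heapq


async def worker(heap, taken):
    while heap:
        # There is no await between the emptiness check and the pop, so no
        # other task on this event loop can run in between the two steps.
        _, task_id = heapq.heappop(heap)
        taken.append(task_id)
        # Yield control, standing in for I/O done while handling the task.
        await asyncio.sleep(0)


async def drain(n_tasks=100, n_workers=5):
    heap = [(i, f"task-{i}") for i in range(n_tasks)]
    heapq.heapify(heap)
    taken = []
    await asyncio.gather(*(worker(heap, taken) for _ in range(n_workers)))
    return taken
```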

@jpbruinsslot

The for_update isn't necessary when we always hold the lock when calling pop.

I also think we will keep running into issues like this if we keep using threading, because threading is just very hard to get right. I also can't think of a reason why we need threading, so I think it would be a good idea to switch to asyncio in the future and get rid of most concurrency issues that way. FastAPI is built around asyncio, so it shouldn't be hard to do, see https://fastapi.tiangolo.com/async/

With asyncio we can then just use for_update and don't need to do any locking in mula itself. In my opinion we should formalize that we don't support any database other than PostgreSQL, because we really want to be able to use all the power of PostgreSQL and not be limited to the features that other databases also support.

Yes, I was about to remove it; it is an artifact of debugging. I'll look into the implications of using asyncio. And I agree on formalizing PostgreSQL as the only supported database.

@jpbruinsslot jpbruinsslot marked this pull request as ready for review February 22, 2023 11:15
@jpbruinsslot jpbruinsslot requested a review from a team as a code owner February 22, 2023 11:15

@ammar92 ammar92 left a comment

Looks good! Although I'm not sure yet whether using queue.Queue in the tests is the way to go, since that queue implementation is itself already thread-safe.
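
A small sketch of that concern, using only the standard library and hypothetical helper names (`pop_all`, `run`): the `queue` module's classes do their own internal locking, so several threads can drain a `queue.PriorityQueue` with no external lock and still never receive the same item twice. A test built on such a queue could therefore pass even if the scheduler's own lock were missing.

```python
import queue
import threading


def pop_all(q, out):
    """Pop items until the queue reports empty; no external lock is used."""
    while True:
        try:
            out.append(q.get_nowait())
        except queue.Empty:
            return


def run(n_items=500, n_threads=8):
    q = queue.PriorityQueue()
    for i in range(n_items):
        q.put((i, f"task-{i}"))

    # One result list per thread, so the sketch does not depend on any
    # shared-list behaviour.
    results = [[] for _ in range(n_threads)]
    threads = [
        threading.Thread(target=pop_all, args=(q, results[i]))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [item for chunk in results for item in chunk]
```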

@github-actions

File Coverage
All files 67%
bits/definitions.py 64%
bits/runner.py 56%
bits/https_availability/https_availability.py 93%
bits/oois_in_headers/oois_in_headers.py 72%
bits/spf_discovery/internetnl_spf_parser.py 55%
bits/spf_discovery/spf_discovery.py 77%
octopoes/api/api.py 89%
octopoes/api/models.py 75%
octopoes/api/router.py 56%
octopoes/core/app.py 69%
octopoes/core/service.py 58%
octopoes/events/events.py 96%
octopoes/events/manager.py 65%
octopoes/models/__init__.py 80%
octopoes/models/datetime.py 66%
octopoes/models/exception.py 83%
octopoes/models/origin.py 70%
octopoes/models/path.py 99%
octopoes/models/types.py 95%
octopoes/models/ooi/certificate.py 95%
octopoes/models/ooi/email_security.py 95%
octopoes/models/ooi/findings.py 94%
octopoes/models/ooi/network.py 97%
octopoes/models/ooi/service.py 91%
octopoes/models/ooi/software.py 71%
octopoes/models/ooi/web.py 81%
octopoes/models/ooi/dns/records.py 95%
octopoes/models/ooi/dns/zone.py 82%
octopoes/repositories/ooi_repository.py 40%
octopoes/repositories/origin_parameter_repository.py 52%
octopoes/repositories/origin_repository.py 52%
octopoes/repositories/scan_profile_repository.py 45%
octopoes/xtdb/client.py 39%
octopoes/xtdb/query_builder.py 69%
octopoes/xtdb/related_field_generator.py 73%
tests/conftest.py 91%

Minimum allowed coverage is 75%

Generated by 🐒 cobertura-action against 251bc56

@ammar92 ammar92 merged commit 20be308 into main Feb 28, 2023
@ammar92 ammar92 deleted the fix/pq-locking branch February 28, 2023 15:22
Labels
mula Issues related to the scheduler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Mula] Parallel workers pick up the same task
5 participants