Multiprocessing in AWS Lambda #1219

Open
raulsperoni opened this issue Feb 6, 2025 · 0 comments
Hello, I'm hitting an error when running Dedupe in an AWS Lambda function, even with num_cores=0:

[ERROR] OSError: [Errno 38] Function not implemented

Traceback:

    clustered_dupes = deduper.partition(data_d, 0.5)
  File "/var/lang/lib/python3.11/site-packages/dedupe/api.py", line 190, in partition
    pair_scores = self.score(pairs)
  File "/var/lang/lib/python3.11/site-packages/dedupe/api.py", line 115, in score
    matches = core.scoreDuplicates(
  File "/var/lang/lib/python3.11/site-packages/dedupe/core.py", line 129, in scoreDuplicates
    offset = multiprocessing.Value("Q", 0, lock=RLock())
  File "/var/lang/lib/python3.11/multiprocessing/context.py", line 73, in RLock
    return RLock(ctx=self.get_context())
  File "/var/lang/lib/python3.11/multiprocessing/synchronize.py", line 194, in __init__
    SemLock.__init__(self, RECURSIVE_MUTEX, 1, 1, ctx=ctx)
  File "/var/lang/lib/python3.11/multiprocessing/synchronize.py", line 57, in __init__
    sl = self._semlock = _multiprocessing.SemLock(

These lines in dedupe/core.py (scoreDuplicates) are not inside the num_cores condition, so the lock is created even when multiprocessing is disabled:

# explicitly defining the lock from the "spawn context" seems to
# be necessary for python 3.7 on mac os.
offset = multiprocessing.Value("Q", 0, lock=RLock())

Can anyone help?
Thanks!
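
For illustration, here is a minimal sketch of how the lock creation could be guarded. It is not Dedupe's actual code: `semlock_available` and `make_offset` are hypothetical helper names, and the sketch assumes that num_cores=0 means "run in a single process" and that a plain in-process counter is acceptable in that case. The probe relies on the fact that creating a `multiprocessing.RLock` raises `OSError` (or `ImportError` on some platforms) where the OS does not support shared semaphores.

```python
import multiprocessing
from types import SimpleNamespace


def semlock_available() -> bool:
    """Return True if this platform can create multiprocessing locks."""
    try:
        multiprocessing.RLock()
    except (OSError, ImportError):
        # e.g. "[Errno 38] Function not implemented" as in the traceback above
        return False
    return True


def make_offset(num_cores: int):
    """Hypothetical helper: build the shared counter only when it is needed.

    Assumption: with num_cores == 0 everything runs in one process, so a
    plain object with a `.value` attribute can stand in for the shared Value.
    """
    if num_cores == 0 or not semlock_available():
        return SimpleNamespace(value=0)
    return multiprocessing.Value("Q", 0, lock=multiprocessing.RLock())


if __name__ == "__main__":
    offset = make_offset(num_cores=0)
    offset.value += 1
    print(offset.value)
```

With num_cores=0 this path never touches `_multiprocessing.SemLock`, so it should avoid the failing call shown in the traceback; whether the rest of the scoring code can work with a plain counter is something the maintainers would need to confirm.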
