
Is Python multiprocessing supported? #232

Open
charles-firth opened this issue Jul 8, 2020 · 4 comments
@charles-firth

Hi,

I've been trying to get AWS X-Ray working in my Python application where I spin up separate Python processes via Pool() and apply_async. I'm passing in the segment from .get_trace_entity and setting it in the child process, but I'm still getting exceptions that seem to suggest the segment isn't getting passed properly, e.g.:

[2020-07-08 15:57:54,703] {taskinstance.py:1088} ERROR - 'Segment' object has no attribute 'sampled'
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.7/site-packages/xxxx-0.1-py3.7.egg/xxxx/elt/extract/kinesis_handler.py", line 52, in push_to_kinesis
    xray_recorder.begin_subsegment('push_to_kinesis')
  File "/usr/local/lib/python3.7/site-packages/aws_xray_sdk/core/recorder.py", line 297, in begin_subsegment
    if not segment.sampled:
AttributeError: 'Segment' object has no attribute 'sampled'
"""

Does X-Ray support tracking segments through separate processes, or am I just doing something wrong? I noticed in the documentation it only mentions ThreadPoolExecutor, and I know sharing state with separate processes can be a bit fiddly due to pickling.

Thanks in advance!

@bhautikpip
Contributor

Hi @charles-firth ,

I believe X-Ray does not support tracking segments through separate processes. As you mentioned, we do support async execution across different threads (ThreadPoolExecutor), and the GitHub README provides an example of that: https://github.com/aws/aws-xray-sdk-python#trace-threadpoolexecutor. I would also like to learn more about your use case, so that we can decide whether to support this in the long run.
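The distinction can be shown without any X-Ray code at all: threads share the parent's in-memory context, while a process pool pickles every argument, so a worker only ever mutates a detached copy. A minimal sketch in plain Python (the `context` dict here is a stand-in for the SDK's trace context, not its actual structure):

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

context = {'segment': 'root', 'subsegments': []}

def thread_task(name):
    # Threads share the parent's memory: this append is visible everywhere.
    context['subsegments'].append(name)

with ThreadPoolExecutor(max_workers=2) as ex:
    list(ex.map(thread_task, ['a', 'b']))

# A process pool instead pickles each argument, so the worker mutates a copy.
child_copy = pickle.loads(pickle.dumps(context))
child_copy['subsegments'].append('c')

len(context['subsegments'])    # 2 -- the thread writes landed in the parent
'c' in context['subsegments']  # False -- the process-style copy diverged
```

This is why the thread-based example in the README works while the same pattern across processes does not: there is no shared recorder context on the other side of a pickle boundary.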

@srprash
Contributor

srprash commented Jul 16, 2020

Hi @charles-firth
Are you by any chance deep-copying the segment over to the child process (or does spinning up new processes through Pool() deep-copy the context)? If so, there is a possibility that the sampled field (among others) is being removed, because deepcopy also uses the __getstate__ method (which we intend only for serializing the Segment object).
It's mostly a guess, but it would be helpful if you could verify this. Thanks
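For anyone hitting this later: the behavior described here is standard Python. copy.deepcopy falls back on __getstate__ when a class defines one, so any field that method drops for serialization also vanishes from deep copies. A minimal sketch with a stand-in class (this is not the SDK's actual Segment, just an illustration of the mechanism):

```python
import copy

class Segment:
    """Stand-in class whose __getstate__ drops a field, like a serialization hook."""
    def __init__(self):
        self.sampled = True
        self.name = 'demo'

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['sampled']  # intended only for wire serialization...
        return state

seg = Segment()
clone = copy.deepcopy(seg)  # ...but deepcopy calls __getstate__ too

hasattr(seg, 'sampled')    # True
hasattr(clone, 'sampled')  # False -- the attribute silently disappeared
```

Accessing `clone.sampled` then raises exactly the kind of AttributeError shown in the original traceback.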

@stale

stale bot commented Jan 8, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs in the next 7 days. Thank you for your contributions.

@stale stale bot added the stale label Jan 8, 2022
@hb2638

hb2638 commented Dec 28, 2023

It would be great if you could support this. I created an implementation of concurrent.futures.Executor that uses multiprocessing so I can do multiprocessing in AWS Lambdas, but I use the hack below to propagate X-Ray segments across processes.

The executor uses _get_xray_trace_entity to capture the trace entity and send it to the worker process. The worker process then uses _set_xray_trace_entity to get X-Ray working there.

... The atomic counters were breaking the serialization.

    # Requires (at module level):
    #   import pickle
    #   import aws_xray_sdk.core.models.subsegment
    #   import aws_xray_sdk.core.utils.atomic_counter
    #   from aws_xray_sdk.core import xray_recorder

    @staticmethod
    def _set_xray_trace_entity(trace_entity):
        # Worker side: deserialize the entity, then rebuild the AtomicCounter
        # fields that were stripped before pickling.
        trace_entity = pickle.loads(trace_entity)
        if isinstance(trace_entity, aws_xray_sdk.core.models.subsegment.Subsegment):
            trace_entity.parent_segment.ref_counter = aws_xray_sdk.core.utils.atomic_counter.AtomicCounter()
            trace_entity.parent_segment._subsegments_counter = aws_xray_sdk.core.utils.atomic_counter.AtomicCounter()
        else:
            trace_entity.ref_counter = aws_xray_sdk.core.utils.atomic_counter.AtomicCounter()
            trace_entity._subsegments_counter = aws_xray_sdk.core.utils.atomic_counter.AtomicCounter()
        xray_recorder.set_trace_entity(trace_entity)


    def _get_xray_trace_entity(self):
        # Parent side: detach the counters (they are not picklable), serialize
        # the entity, then reattach the counters so the parent keeps working.
        te = xray_recorder.get_trace_entity()
        parent_ref_counter = None
        parent_subsegments_counter = None
        ref_counter = None
        subsegments_counter = None

        if isinstance(te, aws_xray_sdk.core.models.subsegment.Subsegment):
            parent_ref_counter = te.parent_segment.ref_counter
            parent_subsegments_counter = te.parent_segment._subsegments_counter
            te.parent_segment._subsegments_counter = None
            te.parent_segment.ref_counter = None
        else:
            ref_counter = te.ref_counter
            subsegments_counter = te._subsegments_counter
            te._subsegments_counter = None
            te.ref_counter = None

        te_copy = pickle.dumps(te)

        # Restore whichever counters were detached above.
        if parent_ref_counter is not None:
            te.parent_segment.ref_counter = parent_ref_counter
        if parent_subsegments_counter is not None:
            te.parent_segment._subsegments_counter = parent_subsegments_counter
        if ref_counter is not None:
            te.ref_counter = ref_counter
        if subsegments_counter is not None:
            te._subsegments_counter = subsegments_counter

        return te_copy
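The detach/restore dance above generalizes: any attribute holding a lock (as an atomic counter typically does) makes pickle.dumps raise, so it has to be stripped before serialization and rebuilt on both sides. A stand-in sketch of the same pattern without the SDK (AtomicCounter and Entity here are illustrative, not the real SDK classes):

```python
import pickle
import threading

class AtomicCounter:
    """Stand-in counter: it holds a lock, which pickle rejects."""
    def __init__(self, start=0):
        self._lock = threading.Lock()
        self._value = start

class Entity:
    """Stand-in for a trace entity carrying a counter."""
    def __init__(self):
        self.name = 'demo'
        self.ref_counter = AtomicCounter()

entity = Entity()

# Pickling with the counter attached fails (the lock is not picklable).
try:
    pickle.dumps(entity)
    counter_pickled = True
except TypeError:
    counter_pickled = False

# The hack: detach the counter, serialize, then restore it in the parent.
counter = entity.ref_counter
entity.ref_counter = None
payload = pickle.dumps(entity)
entity.ref_counter = counter

# The worker side unpickles and attaches a fresh counter of its own.
restored = pickle.loads(payload)
restored.ref_counter = AtomicCounter()
```

The fresh counter on the worker side means ref counts are per-process rather than shared, which is an accepted trade-off of this workaround.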

@stale stale bot removed the stale label Dec 28, 2023