
bytearray is not memory safe in free-threaded Python #127472

Closed
robsdedude opened this issue Dec 1, 2024 · 2 comments
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), topic-free-threading, type-bug (An unexpected behavior, bug, or error)

Comments


robsdedude commented Dec 1, 2024

Bug report

Bug description:

While working on improving the compatibility of PyO3's bytearray wrapper with free-threaded Python (3.13t), I looked at the bytearray implementation and couldn't find any critical sections or other synchronization mechanisms. This led me to believe that concurrent use of bytearray objects (even from pure Python code) might not be memory safe. I managed to write a reproducer:

import copy
import sys
import threading

SIZE = 1_000_000_000  # 1GB


print("Allocating initial arrays", flush=True)
_original = bytearray(42 for _ in range(SIZE))
_garbage = bytearray(13 for _ in range(SIZE // 4))


array = copy.copy(_original)


def new_array():
    return copy.copy(_original)


def worker1():
    global array
    while True:
        print("Extending array", flush=True)
        array.extend(array)
        print("Recreating array", flush=True)
        array = new_array()


def worker2():
    while True:
        expected = {0, 42}
        # Arguably, we shouldn't even see 0, but let's be lenient and assume
        # it might be zeroed memory not yet set to the actual value. In
        # reality, seeing 0 very likely indicates reading uninitialized memory.
        # When changing the program to also fail on 0, we can see a failure
        # much faster.
        for i in (0, SIZE - 1, -SIZE, -1):
            value = array[i]
            if value not in expected:
                print(
                    f"Array corrupted (observed array[{i}] = {value})",
                    file=sys.stderr,
                    flush=True,
                )
                return


def worker3():
    print("Putting other stuff into the memory", flush=True)
    while True:
        foo = [copy.copy(_garbage) for _ in range(5)]
        for f in foo:
            f.extend(f)
        del foo


t1 = threading.Thread(target=worker1, daemon=True)
t2 = threading.Thread(target=worker2, daemon=True)
t3 = threading.Thread(target=worker3, daemon=True)

t1.start()
t2.start()
t3.start()

t2.join()

Obviously, this program is full of data races and its results are not well defined. But worker2 should never observe worker3's garbage data. Yet it does.
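As a stopgap until the interpreter serializes bytearray mutations itself, the racy pattern above can be made safe at the application level by funneling every access through a single threading.Lock. A hedged sketch of that workaround (the names safe_extend and safe_read are mine, not part of any API):

```python
import threading

_lock = threading.Lock()
data = bytearray(b"\x2a" * 1024)  # 1024 bytes, all 42


def safe_extend():
    # Hold the lock across the whole extend() so no reader can observe
    # the bytearray while its buffer is being reallocated.
    with _lock:
        data.extend(data)


def safe_read(i):
    # Index under the same lock, so reads never race a resize.
    with _lock:
        return data[i]


safe_extend()
assert safe_read(0) == 42 and safe_read(-1) == 42
```

This serializes all bytearray access, so it costs throughput, but it guarantees no reader ever sees a half-reallocated buffer.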

The output I got:
Allocating initial arrays
Extending array
Putting other stuff into the memory
Recreating array
Extending array
Recreating array
Extending array
Recreating array
Extending array
Recreating array
Extending array
Recreating array
Extending array
Recreating array
Extending array
Recreating array
Extending array
Recreating array
Extending array
Recreating array
Extending array
Array corrupted (observed array[-1000000000] = 13)

CPython versions tested on:

3.13

Operating systems tested on:

Linux
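For anyone trying to reproduce this: the race only manifests on the free-threaded build, not on a regular GIL build of 3.13. A hedged way to check which build you are running, assuming CPython 3.13's sysconfig build flag:

```python
import sys
import sysconfig

# Py_GIL_DISABLED is 1 on free-threaded (3.13t) builds, 0/None otherwise.
free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
print(f"{sys.version=} free_threaded={free_threaded}")
```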

@robsdedude robsdedude added the type-bug An unexpected behavior, bug, or error label Dec 1, 2024
@picnixz picnixz added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Dec 1, 2024
@sergey-miryanov
Contributor

This should be fixed by gh-129107.

@kumaraditya303
Contributor

Closing in favor of #129107
