Free Python GIL on blocking operations to allow multi-threading runtime usage without deadlocks #387

jonmmease · 2023-09-06T19:32:57Z

Previously, we forced the use of a single-threaded tokio runtime whenever a Python Datasource was in use. This was done to work around deadlocks, but it has the unfortunate side effect of disabling multi-threaded parallelization of queries in these situations. This PR applies the technique discussed in PyO3/pyo3#2182: it uses the PyO3 allow_threads construct to release the Python GIL before performing blocking operations that may themselves need to acquire the GIL in separate threads.
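For readers unfamiliar with this failure mode, here is a minimal sketch of the pattern in plain Python. It is an analogy only, not the code in this PR: a `threading.Lock` named `gil` stands in for the real GIL, and `worker`, `blocking_call_holding_lock`, and `blocking_call_releasing_lock` are hypothetical names. The first function blocks on a thread while holding the lock, so the worker stalls. The second releases the lock around the blocking wait, which is the role allow_threads plays.

```python
import threading

# Stand-in for the Python GIL. A plain lock is an assumption made for illustration.
gil = threading.Lock()


def worker(results):
    # The worker needs the "GIL" to do its job, like a runtime thread calling into Python.
    with gil:
        results.append("worker ran")


def blocking_call_holding_lock(timeout=0.2):
    """Block on the worker while still holding the lock (the problematic pattern)."""
    results = []
    with gil:
        t = threading.Thread(target=worker, args=(results,), daemon=True)
        t.start()
        # The worker cannot make progress while we hold the lock, so this times out.
        t.join(timeout=timeout)
        stalled = t.is_alive()
    # The lock is released here, so the worker can now finish.
    t.join()
    return stalled, results


def blocking_call_releasing_lock():
    """Release the lock around the blocking wait, then take it back."""
    results = []
    gil.acquire()  # the caller enters holding the lock
    try:
        t = threading.Thread(target=worker, args=(results,))
        t.start()
        gil.release()  # let other threads take the lock while we block
        try:
            t.join(timeout=5)
        finally:
            gil.acquire()  # restore the lock before continuing
        finished = not t.is_alive()
    finally:
        gil.release()
    return finished, results
```

Without the timeout in the first function, the join would never return, which is the deadlock the single-threaded runtime was working around.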

It turns out that the DuckDbDatasource still requires running on the main thread in order to access the kernel's top-level DataFrames, so I made the main-thread behavior configurable at the per-datasource level. The default maintains the prior behavior of running on the main thread.
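The per-datasource switch can be pictured roughly as follows. This is a sketch under stated assumptions, not the actual VegaFusion interface: `SketchDatasource`, `WorkerFriendlyDatasource`, `requires_main_thread`, `fetch`, and `run_fetch` are all hypothetical names, and the calling thread stands in for the main thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SketchDatasource:
    # Hypothetical flag. The default keeps the prior behavior of running on the main thread.
    requires_main_thread = True

    def fetch(self):
        # Report which thread did the work so the dispatch is observable.
        return threading.current_thread().name


class WorkerFriendlyDatasource(SketchDatasource):
    # A datasource that opts out and may run on runtime worker threads.
    requires_main_thread = False


def run_fetch(source, pool):
    """Run fetch on the calling thread or on the pool, depending on the flag."""
    if source.requires_main_thread:
        return source.fetch()
    return pool.submit(source.fetch).result()
```

A datasource that needs access to state owned by the main thread, as the DuckDB one does, would leave the flag at its default, while others could opt in to worker threads.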

I started thinking about this again as a result of the discussion in #386. With these changes, it should be possible to write a __dataframe__ protocol-based VegaFusion Datasource that implements a custom DataFusion datasource without requiring everything to run on the main thread.

This avoids deadlocks when using the multithreaded runtime with Python data sources. The Python datasource implementation now has control over whether it must be run on the main thread (which duckdb requires).

free Python GIL on blocking operations

1faf66f


jonmmease changed the title ~~free Python GIL on blocking operations to allow multi-threading runtime usage without deadlocks~~ Free Python GIL on blocking operations to allow multi-threading runtime usage without deadlocks Sep 6, 2023

jonmmease merged commit 7127d04 into main Sep 6, 2023
