[BUG] Memory error when cudf is used in python multiprocessing Pool #5515
Comments
Looks related to the new memory resource bindings; will investigate. |
Thanks for reporting. This is likely due to the call to

    import cudf
    import pandas as pd
    from multiprocessing import get_context

    def get_df(idx):
        pdf = pd.DataFrame({
            'a': [1, 2],
            'b': [3, 4]
        })
        return cudf.from_pandas(pdf)

    if __name__ == "__main__":
        ctx = get_context("spawn")
        # Parallelize the method calls
        with ctx.Pool(2) as pool:
            print(pool.map(get_df, [1, 2]))

Does that help with your problem? |
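For anyone comparing start methods while debugging this, here is a minimal sketch using only the standard library (no cuDF needed). It reports which start methods the platform offers and checks that `get_context("spawn")` returns a spawn context without touching the process-wide default. The helper name `start_method_report` is hypothetical and is not part of cuDF or of the snippet above.

```python
import multiprocessing as mp


def start_method_report():
    """Summarize the multiprocessing start methods on this platform."""
    # get_context returns a context object bound to one start method;
    # it does not change the global default used by mp.Pool.
    spawn_ctx = mp.get_context("spawn")
    return {
        "available": mp.get_all_start_methods(),
        # allow_none=True returns None if no default has been fixed yet.
        "default": mp.get_start_method(allow_none=True),
        "context": spawn_ctx.get_start_method(),
    }


if __name__ == "__main__":
    print(start_method_report())
```

Using a context object rather than `multiprocessing.set_start_method` keeps the choice local to the pool that needs it, which is what the suggestion above does.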
No. The error message is AttributeError: Can't get attribute 'get_df' on <module '__main__' (built-in)> |
Hmm, how are you running this test? Interactively with IPython/Jupyter or invoking it as a script? |
Looks like it was run in IPython.
Looks like the known limitation (the 2nd gray box in the link) of Python's multiprocessing. When launched as a script, @shwina's suggestion shouldn't see the AttributeError. But not sure about the original error message below, is it related to usage of
|
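A small sketch of why the `AttributeError` above shows up in an interactive session, assuming the usual pickle behaviour: a `Pool` sends the worker function to its child processes by pickling it, and pickle stores a function only as a module name plus a qualified name. The child then has to import that module and look the name up. A function typed into IPython lives in a `__main__` that the child cannot re-import, so the lookup fails. The helper `pickled_reference` is hypothetical and uses `json.dumps` purely as an example of an importable function.

```python
import json
import pickle


def pickled_reference(func):
    """Return the (module, qualified name) pair pickle records for func."""
    payload = pickle.dumps(func)
    # The payload carries only these two names, not the function body.
    assert func.__module__.encode() in payload
    assert func.__qualname__.encode() in payload
    return func.__module__, func.__qualname__


if __name__ == "__main__":
    # json.dumps lives in an importable module, so any process can resolve it.
    print(pickled_reference(json.dumps))
    # Unpickling resolves the reference back to the very same function object.
    print(pickle.loads(pickle.dumps(json.dumps)) is json.dumps)
```

Moving the worker function into a module file and importing it, or running the code as a script, gives the child process something it can import.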
Verified that this works when run as a script. With the default |
@shwina your suggestion works fine when run as a script. Thanks. |
Thanks for letting us know! |
cuDF 0.14 raises an error when used in a Python multiprocessing Pool, although the same code works in version 0.12. Here is the code to reproduce.
Error is
cudf was installed using Anaconda on bare metal. I am attaching the outputs of
cudf/print_env.sh
print_env_12.txt
print_env_14.txt