[BUG] Memory error when cudf is used in python multiprocessing Pool #5515

Closed
achinta opened this issue Jun 19, 2020 · 8 comments
Labels: bug (Something isn't working), Python (Affects Python cuDF API)

Comments

@achinta

achinta commented Jun 19, 2020

cuDF 0.14 raises an error when used in a Python multiprocessing Pool. The same code works in version 0.12. Here is the code to reproduce it.

import cudf
import pandas as pd
from multiprocessing import Pool

def get_df(idx):
    pdf = pd.DataFrame({
        'a':[1,2],
        'b':[3,4]
    })
    return cudf.from_pandas(pdf)

# Parallelize the method calls
with Pool(2) as pool:
    pool.map(get_df, [1,2])

The error is:

MemoryError                               Traceback (most recent call last)
<ipython-input-1-757e5676c563> in <module>
     11 
     12 with Pool(2) as pool:
---> 13     pool.map(get_df, [1,2])

~/miniconda3/envs/gpu/lib/python3.6/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    264         in a list that is returned.
    265         '''
--> 266         return self._map_async(func, iterable, mapstar, chunksize).get()
    267 
    268     def starmap(self, func, iterable, chunksize=None):

~/miniconda3/envs/gpu/lib/python3.6/multiprocessing/pool.py in get(self, timeout)
    642             return self._value
    643         else:
--> 644             raise self._value
    645 
    646     def _set(self, i, obj):

MemoryError: std::bad_alloc: CUDA error at: /conda/conda-bld/librmm_1591196551527/work/include/rmm/mr/device/cuda_memory_resource.hpp66: cudaErrorInitializationError initialization error

cuDF was installed using Anaconda on bare metal. I am attaching the outputs of cudf/print_env.sh:
print_env_12.txt
print_env_14.txt

achinta added the Needs Triage and bug labels on Jun 19, 2020
@shwina
Contributor

shwina commented Jun 19, 2020

Looks related to the new memory resource bindings; will investigate.

shwina self-assigned this on Jun 19, 2020
kkraus14 added the Python label and removed the Needs Triage label on Jun 19, 2020
@shwina
Contributor

shwina commented Jun 19, 2020

Thanks for reporting. This is likely due to the fork start method that Pool uses by default on Linux: the forked workers attempt to share the CUDA context created in the parent process. One fix is to use the spawn start method instead:

import cudf
import pandas as pd
from multiprocessing import get_context

def get_df(idx):
    pdf = pd.DataFrame({
        'a':[1,2],
        'b':[3,4]
    })
    return cudf.from_pandas(pdf)

if __name__ == "__main__":
    ctx = get_context("spawn")

    # Parallelize the method calls
    with ctx.Pool(2) as pool:
        print(pool.map(get_df, [1,2]))

Does that help with your problem?

@achinta
Author

achinta commented Jun 20, 2020

No. The error message is

AttributeError: Can't get attribute 'get_df' on <module '__main__' (built-in)>

@shwina
Contributor

shwina commented Jun 22, 2020

Hmm, how are you running this test? Interactively with IPython/Jupyter or invoking it as a script?

@philtrade
Contributor

philtrade commented Jun 23, 2020

MemoryError                               Traceback (most recent call last)
<ipython-input-1-757e5676c563> in <module>

Looks like it was run in IPython.

No. The error message is

AttributeError: Can't get attribute 'get_df' on <module '__main__' (built-in)>

This looks like the known limitation of Python's multiprocessing when used interactively with IPython (the 2nd gray box in the link).

When launched as a script, @shwina's suggestion shouldn't hit the AttributeError, should it?

But I'm not sure whether the original error message below is related to the use of fork vs. spawn or to something else.

MemoryError: std::bad_alloc: CUDA error at: /conda/conda-bld/librmm_1591196551527/work/include/rmm/mr/device/cuda_memory_resource.hpp66: cudaErrorInitializationError initialization error
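
The interactive limitation mentioned above comes from how worker functions are pickled: only the qualified name is sent, and the worker process must be able to import it. A minimal sketch of one common workaround, assuming the worker is moved into a module file on sys.path. The module name gpu_workers and the CPU-only body of get_df are hypothetical stand-ins so the sketch runs without a GPU; in real use the body would call cudf.from_pandas as in the original report.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical module name and contents (assumptions, not from the thread).
# The worker body is a CPU-only stand-in for the cuDF call.
MODULE_NAME = "gpu_workers"
MODULE_SOURCE = '''
def get_df(idx):
    return {"a": [1, 2], "b": [3, 4], "idx": idx}
'''


def load_worker_module(directory):
    """Write the worker module into `directory` and import it by name."""
    path = Path(directory) / f"{MODULE_NAME}.py"
    path.write_text(MODULE_SOURCE)
    if str(directory) not in sys.path:
        sys.path.insert(0, str(directory))
    importlib.invalidate_caches()
    return importlib.import_module(MODULE_NAME)


workers = load_worker_module(tempfile.mkdtemp())

# Pickle records only the reference "gpu_workers.get_df", which any process
# that can import the module is able to resolve. A function typed into a
# notebook cell lives in __main__ and cannot be resolved this way.
payload = pickle.dumps(workers.get_df)
assert pickle.loads(payload) is workers.get_df
```

With the function importable like this, passing workers.get_df to a spawn-context pool from an interactive session should avoid the AttributeError, provided the module's directory is on sys.path.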

@philtrade
Contributor

... This is likely due to the call to fork(), which will attempt to share the CUDA context created in the parent process. One fix is to use spawn() instead:

import cudf
import pandas as pd
from multiprocessing import get_context

def get_df(idx):
    pdf = pd.DataFrame({
        'a':[1,2],
        'b':[3,4]
    })
    return cudf.from_pandas(pdf)

if __name__ == "__main__":
    ctx = get_context("spawn")

    # Parallelize the method calls
    with ctx.Pool(2) as pool:
        print(pool.map(get_df, [1,2]))

Verified that this works when run as a script. With the default fork start method, it hits the initialization error.
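
To check which start method a given environment would use, a small sketch using only the standard library may help. The helper names describe_start_methods and spawn_context are illustrative, not part of cuDF or of the thread.

```python
import multiprocessing as mp


def describe_start_methods():
    """Return (default_method, available_methods) for this platform."""
    # Note: get_start_method() fixes the global default as a side effect.
    return mp.get_start_method(), mp.get_all_start_methods()


def spawn_context():
    """Return a context whose pools start workers in fresh interpreters."""
    # A context is independent of the global default, so it is safe to
    # request "spawn" here even if the default is "fork".
    return mp.get_context("spawn")


if __name__ == "__main__":
    default, available = describe_start_methods()
    print("default:", default, "available:", available)
    print("requested:", spawn_context().get_start_method())
```

If the reported default is fork, any CUDA state initialized in the parent before the pool is created is inherited by the workers, which is the situation described in this thread.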

@achinta
Author

achinta commented Jun 24, 2020

@shwina your suggestion works fine when run as a script. Thanks.

@shwina
Contributor

shwina commented Jun 24, 2020

Thanks for letting us know!
