
Fix doc style #575

Merged
jakirkham merged 3 commits into rapidsai:branch-0.16 from fix_doc_sty on Aug 14, 2020

Conversation

jakirkham (Member)

Fixes a linting error in the docs that was observed in PR ( #574 ).
@jakirkham jakirkham changed the base branch from branch-0.15 to branch-0.16 August 11, 2020 22:56
@jakirkham (Member Author)

Seeing the test failure from Distributed below (and some variants of it) on CI. This is a consequence of the change in PR ( rapidsai/rmm#477 ), which causes RMM's DeviceBuffer to have strides set to None in the header when serialized. Always setting RMM's DeviceBuffer's strides in the header, as is done in PR ( dask/distributed#4039 ), should fix the issue.

_______________________ test_serialize_numba_from_rmm[0] _______________________

size = 0

    @pytest.mark.parametrize("size", [0, 3, 10])
    def test_serialize_numba_from_rmm(size):
        np = pytest.importorskip("numpy")
        rmm = pytest.importorskip("rmm")
    
        if not cuda.is_available():
            pytest.skip("CUDA is not available")
    
        x_np = np.arange(size, dtype="u1")
    
        x_np_desc = x_np.__array_interface__
        (x_np_ptr, _) = x_np_desc["data"]
        (x_np_size,) = x_np_desc["shape"]
        x = rmm.DeviceBuffer(ptr=x_np_ptr, size=x_np_size)
    
        header, frames = serialize(x, serializers=("cuda", "dask", "pickle"))
        header["type-serialized"] = pickle.dumps(cuda.devicearray.DeviceNDArray)
    
>       y = deserialize(header, frames, deserializers=("cuda", "dask", "pickle", "error"))

/opt/conda/envs/gdf/lib/python3.7/site-packages/distributed/protocol/tests/test_numba.py:53: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/protocol/serialize.py:335: in deserialize
    return loads(header, frames)
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/protocol/cuda.py:28: in cuda_loads
    return loads(header, frames)
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/protocol/numba.py:46: in cuda_deserialize_numba_ndarray
    gpu_data=numba.cuda.as_cuda_array(frame).gpu_data,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <numba.cuda.cudadrv.devicearray.DeviceNDArray object at 0x7fc3376ace50>
shape = (0,), strides = None, dtype = dtype('uint8'), stream = 0
writeback = None
gpu_data = <numba.cuda.cudadrv.driver.MemoryPointer object at 0x7fc3376acd90>

    def __init__(self, shape, strides, dtype, stream=0, writeback=None,
                 gpu_data=None):
        """
        Args
        ----
    
        shape
            array shape.
        strides
            array strides.
        dtype
            data type as np.dtype coercible object.
        stream
            cuda stream.
        writeback
            Deprecated.
        gpu_data
            user provided device memory for the ndarray data buffer
        """
        if isinstance(shape, int):
            shape = (shape,)
        if isinstance(strides, int):
            strides = (strides,)
        dtype = np.dtype(dtype)
        self.ndim = len(shape)
>       if len(strides) != self.ndim:
E       TypeError: object of type 'NoneType' has no len()

/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/cudadrv/devicearray.py:90: TypeError
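
For reference, a minimal sketch of the header-side fix, i.e. always filling in strides when they would otherwise be None. This is only an illustration assuming a NumPy-style C-contiguous default; it is not the actual code from dask/distributed#4039 or RMM, and default_strides/build_header are hypothetical helpers:

    import numpy as np

    def default_strides(shape, dtype):
        # C-contiguous strides, e.g. shape (2, 3) with uint8 -> (3, 1).
        itemsize = np.dtype(dtype).itemsize
        strides = []
        acc = itemsize
        for dim in reversed(shape):
            strides.append(acc)
            acc *= dim
        return tuple(reversed(strides))

    def build_header(shape, dtype, strides=None):
        # Always populate strides so a consumer that calls len(strides),
        # like the DeviceNDArray constructor above, never sees None.
        if strides is None:
            strides = default_strides(shape, dtype)
        return {"shape": shape, "typestr": np.dtype(dtype).str, "strides": strides}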

@jakirkham (Member Author)

Interestingly, CuPy does not run into this error; it simply interprets strides=None as C-contiguous. This seems like desirable behavior to have in Numba as well, so I submitted PR ( numba/numba#6122 ) to make that change in Numba.

@jakirkham (Member Author)

rerun tests

@raydouglass (Member) left a comment

@jakirkham (Member Author) commented Aug 12, 2020

Why is that? We still have version defined. It's just extracted from release instead.

Edit: Oh, do you mean we should delete those lines?
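
For context, this is roughly what "extracted from release" looks like, as a sketch following the common Sphinx conf.py pattern; the release string below is a placeholder, not necessarily what this repo sets:

    # docs/source/conf.py (sketch; the release string is a placeholder)
    release = "0.16.0"
    # Short X.Y version derived from the full release string, so release
    # automation only needs to touch `release`.
    version = ".".join(release.split(".")[:2])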

@raydouglass (Member)

That script is run when a new branch and release are cut, so your change in this PR will be reverted when branch-0.17 is created unless you update the script to no longer overwrite the value.

These get autogenerated based on the git tag. So we don't need to use
`sed` to update them. Hence we drop these version update lines.
@jakirkham jakirkham requested a review from a team as a code owner August 12, 2020 01:42
@jakirkham (Member Author)

Yep, that makes sense. Thanks for the explanation, Ray! 😄

Does this look right?

@jakirkham (Member Author)

rerun tests

@jakirkham (Member Author) commented Aug 12, 2020

Looks like we are seeing a new CI failure. Wasn't able to reproduce the issue locally until upgrading rmm. Running the cuDF benchmark shown below (same as on CI) reproduces the issue for me. The CuPy benchmark runs without issues.

python benchmarks/cudf-merge.py --chunks-per-dev 4 --chunk-size 10000 --rmm-init-pool-size 100

My guess is it relates to PR ( rapidsai/rmm#466 ). What I'm less sure about is whether other things (like cudf) need to be rebuilt to include that RMM change. That said, this particular benchmark errors inside a CuPy allocation (routed through the RMM allocator), which suggests the issue is more fundamental than cuDF itself.

Process SpawnProcess-1:
Traceback (most recent call last):
  File "/opt/conda/envs/gdf/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/opt/conda/envs/gdf/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/var/lib/jenkins/workspace/rapidsai/gpuci/ucx-py/prb/ucx-py-gpu-build_2/ucp/utils.py", line 163, in _worker_process
    ret = loop.run_until_complete(run())
  File "/opt/conda/envs/gdf/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "/var/lib/jenkins/workspace/rapidsai/gpuci/ucx-py/prb/ucx-py-gpu-build_2/ucp/utils.py", line 158, in run
    return await func(rank, eps, args)
  File "/var/lib/jenkins/workspace/rapidsai/gpuci/ucx-py/prb/ucx-py-gpu-build_2/benchmarks/cudf-merge.py", line 169, in worker
    df1 = generate_chunk(rank, args.chunk_size, args.n_chunks, "build", args.frac_match)
  File "/var/lib/jenkins/workspace/rapidsai/gpuci/ucx-py/prb/ucx-py-gpu-build_2/benchmarks/cudf-merge.py", line 114, in generate_chunk
    "key": cupy.arange(start, stop=stop, dtype="int64"),
  File "/opt/conda/envs/gdf/lib/python3.7/site-packages/cupy/creation/ranges.py", line 55, in arange
    ret = cupy.empty((size,), dtype=dtype)
  File "/opt/conda/envs/gdf/lib/python3.7/site-packages/cupy/creation/basic.py", line 22, in empty
    return cupy.ndarray(shape, dtype, order=order)
  File "cupy/core/core.pyx", line 134, in cupy.core.core.ndarray.__init__
  File "cupy/cuda/memory.pyx", line 544, in cupy.cuda.memory.alloc
  File "/opt/conda/envs/gdf/lib/python3.7/site-packages/rmm/rmm.py", line 270, in rmm_cupy_allocator
    buf = librmm.device_buffer.DeviceBuffer(size=nbytes)
  File "rmm/_lib/device_buffer.pyx", line 70, in rmm._lib.device_buffer.DeviceBuffer.__cinit__
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp68: cudaErrorMemoryAllocation out of memory

@jakirkham (Member Author) commented Aug 12, 2020

Am also able to reproduce this with the array benchmark. Just need to configure it to use RMM like so (on CI we use -o cupy instead of -o rmm). This means it is just an issue with RMM and UCX (not cuDF). Also the fact that the CuPy case (without RMM) does work rules out the possibility that something has changed on the UCX side.

python benchmarks/local-send-recv.py -o rmm --server-dev 0 --client-dev 0 --reuse-alloc
Server Running at 10.33.225.165:45633
Client connecting to server at 10.33.225.165:45633
Process SpawnProcess-2:
Traceback (most recent call last):
  File "/datasets/jkirkham/miniconda/envs/rapids15dev/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/datasets/jkirkham/miniconda/envs/rapids15dev/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/datasets/jkirkham/devel/ucx-py/benchmarks/local-send-recv.py", line 147, in client
    loop.run_until_complete(run())
  File "/datasets/jkirkham/miniconda/envs/rapids15dev/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/datasets/jkirkham/devel/ucx-py/benchmarks/local-send-recv.py", line 126, in run
    t1 = np.arange(args.n_bytes, dtype="u1")
  File "/datasets/jkirkham/miniconda/envs/rapids15dev/lib/python3.8/site-packages/cupy/creation/ranges.py", line 55, in arange
    ret = cupy.empty((size,), dtype=dtype)
  File "/datasets/jkirkham/miniconda/envs/rapids15dev/lib/python3.8/site-packages/cupy/creation/basic.py", line 22, in empty
    return cupy.ndarray(shape, dtype, order=order)
  File "cupy/core/core.pyx", line 134, in cupy.core.core.ndarray.__init__
  File "cupy/cuda/memory.pyx", line 544, in cupy.cuda.memory.alloc
  File "/datasets/jkirkham/miniconda/envs/rapids15dev/lib/python3.8/site-packages/rmm/rmm.py", line 270, in rmm_cupy_allocator
    buf = librmm.device_buffer.DeviceBuffer(size=nbytes)
  File "rmm/_lib/device_buffer.pyx", line 70, in rmm._lib.device_buffer.DeviceBuffer.__cinit__
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp68: cudaErrorMemoryAllocation out of memory

@jakirkham (Member Author)

Trying to address these issues in PR ( #577 ) in combination with upstream changes to RMM ( rapidsai/rmm#490 ).

@jakirkham (Member Author)

rerun tests

Trying to pick up the new nightlies from PR ( rapidsai/rmm#493 ).

@jakirkham (Member Author) commented Aug 13, 2020

@jakirkham (Member Author)

Am beginning to think we are just choosing a problematic default for pool size.
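
For example, a more conservative setup could pin the pool to an explicit, modest size rather than relying on the default. The sketch below uses RMM's reinitialize and the CuPy allocator hook; the 1 GiB figure is an assumed value for illustration, not what CI or the benchmarks actually use:

    import cupy
    import rmm

    # Use a fixed, modest pool so two processes sharing one GPU (as in the
    # --server-dev 0 --client-dev 0 benchmark above) don't both try to grab
    # most of the device memory up front.
    rmm.reinitialize(
        pool_allocator=True,
        initial_pool_size=2**30,  # 1 GiB; assumed value for illustration
    )

    # Route CuPy allocations through RMM, matching the rmm_cupy_allocator
    # frames in the tracebacks above.
    cupy.cuda.set_allocator(rmm.rmm_cupy_allocator)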

@jakirkham (Member Author)

rerun tests

@quasiben (Member)

rerun tests

1 similar comment
@jakirkham (Member Author)

rerun tests

@jakirkham jakirkham merged commit 51c84fd into rapidsai:branch-0.16 Aug 14, 2020
@jakirkham jakirkham deleted the fix_doc_sty branch August 14, 2020 17:01