Exception: ValueError('buffer source array is read-only') #3943
Comments
Is it possible to reproduce this without the Taxi data? Would some randomly generated data work as well?
Perhaps, although it will take me more time to narrow down the example. I managed to reproduce the problem with these URLs (stripped out some files with mixed dtypes) and without the intermediate
# coding: utf-8
from dask import dataframe as dd
from distributed import Client
client = Client()
df = dd.read_csv("../data/yellow_tripdata_2019-*.csv", parse_dates=["tpep_pickup_datetime", "tpep_dropoff_datetime"])
op = df.groupby("payment_type")["tip_amount"].mean()
client.compute(op)
However, removing one file resulted in the operation executing successfully. I'll try to see if this can be simplified further.
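On the question of randomly generated data: here is a minimal sketch of a stand-in for the Taxi files, using only the standard library. The column names are taken from the snippet above; the helper name `write_fake_trips`, the file count, the row count, and the value ranges are all made up for illustration. I have not checked whether data this small triggers the error, and the real files are far larger, so treat it as a starting point for narrowing things down.

```python
import csv
import random
import tempfile
from datetime import datetime, timedelta
from pathlib import Path

# Only the columns the snippet above actually touches.
COLUMNS = [
    "tpep_pickup_datetime",
    "tpep_dropoff_datetime",
    "payment_type",
    "tip_amount",
]


def write_fake_trips(directory, n_files=3, rows_per_file=1000, seed=0):
    """Write small CSV files shaped like the Taxi data (hypothetical helper)."""
    rng = random.Random(seed)
    directory = Path(directory)
    paths = []
    for month in range(1, n_files + 1):  # one file per month, so n_files <= 12
        path = directory / f"yellow_tripdata_2019-{month:02d}.csv"
        with path.open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(COLUMNS)
            for _ in range(rows_per_file):
                pickup = datetime(2019, month, 1) + timedelta(minutes=rng.randrange(40000))
                dropoff = pickup + timedelta(minutes=rng.randrange(1, 60))
                writer.writerow([
                    pickup.isoformat(sep=" "),
                    dropoff.isoformat(sep=" "),
                    rng.randint(1, 4),              # made-up payment type codes
                    round(rng.uniform(0, 20), 2),   # made-up tip amounts
                ])
        paths.append(path)
    return paths


out_dir = tempfile.mkdtemp()
files = write_fake_trips(out_dir)
print(out_dir)
```

Pointing the `dd.read_csv` call from the snippet at the printed directory (keeping the `yellow_tripdata_2019-*.csv` glob) should then exercise the same groupby. Raising `rows_per_file` is probably needed to get anywhere near the memory pressure of the real data.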
Thanks Juan! 😀
When I ran this I got the mismatched dtype error:
FWIW I tried to ensure that we always create
I confirm that #3918 fixed the issue with the same data:
Thanks Juan! 😀 Will look at cleaning it up and adding some more tests.
Adding a test for a pandas Series as well in PR #3995.
(Comes from #1978 (comment))
What happened:
What you expected to happen: The operation finishes without error.
Minimal Complete Verifiable Example:
Data:
Anything else we need to know?: I managed to avoid this error by reducing the number of files, but then it hit me again at a later point. I expect this behavior to be dependent on the available RAM.
Environment:
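For anyone landing here from the error message in the title, a small sketch of what a read-only array looks like in NumPy. This is an assumption about the kind of array involved, not something the thread confirms: an array built directly on an immutable `bytes` object is not writable. As far as I know, the exact wording "buffer source array is read-only" is raised by compiled code that requests a writable buffer from such an array, while NumPy's own message for an in-place write is different, as shown below.

```python
import numpy as np

# Build an array on top of an immutable bytes object. NumPy shares the
# memory instead of copying it, so the resulting array is read-only.
raw = np.arange(4, dtype="float64").tobytes()
view = np.frombuffer(raw, dtype="float64")
print(view.flags.writeable)

# Writing in place fails with a ValueError.
try:
    view[0] = 1.0
except ValueError as exc:
    message = str(exc)
    print(message)

# A copy owns its own memory and is writable again.
writable = view.copy()
writable[0] = 1.0
```

Checking `arr.flags.writeable` on the arrays that reach the failing call is a quick way to see whether this is what is going on. Copying works around it at the cost of extra memory.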