adbc_driver_manager.OperationalError: UNKNOWN: [Snowflake] arrow/ipc: unknown error while reading: cannot allocate memory #1283
Just to clarify, I am using
I've just experienced the same issue. I'm running the code snippet below in a Docker container based on the python:3.12 image. The package versions are:
Here's the code snippet (not the full function's code):
The function above fails after fetching a few tables. I've tried wrapping it in a for loop and also in a multiprocessing Pool; neither succeeds.
About how large is the dataset you're trying to fetch? The ADBC driver tries to buffer parts of the dataset concurrently. You could try setting the options that limit the queue size and concurrency to cut down on memory usage. (We could/should probably also limit the overall buffer size based on memory usage, I suspect.) https://arrow.apache.org/adbc/current/driver/snowflake.html#performance
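Those options can be set per statement. A minimal sketch, assuming the option keys documented for the Snowflake driver (`adbc.rpc.result_queue_size` and `adbc.snowflake.rpc.prefetch_concurrency`); the URI and table name are placeholders, and `throttled_options`/`fetch_table` are hypothetical helpers, not part of the driver:

```python
# Statement option keys documented for the ADBC Snowflake driver.
# Option values must be passed as strings.
QUEUE_SIZE_OPT = "adbc.rpc.result_queue_size"
PREFETCH_OPT = "adbc.snowflake.rpc.prefetch_concurrency"


def throttled_options(queue_size: int = 2, concurrency: int = 1) -> dict:
    """Build statement options that cap the driver's prefetch buffering."""
    return {QUEUE_SIZE_OPT: str(queue_size), PREFETCH_OPT: str(concurrency)}


def fetch_table(uri: str, table: str):
    """Fetch one table with reduced buffering (needs a live Snowflake account)."""
    # Deferred import: requires the adbc-driver-snowflake wheel.
    import adbc_driver_snowflake.dbapi

    with adbc_driver_snowflake.dbapi.connect(uri) as conn:
        with conn.cursor() as cur:
            # The DB-API cursor exposes the underlying ADBC statement,
            # where driver-specific options can be set before execution.
            cur.adbc_statement.set_options(**throttled_options())
            cur.execute(f"SELECT * FROM {table}")  # table name assumed trusted
            return cur.fetch_arrow_table()
```

Lower values trade throughput for a smaller peak memory footprint, which is the relevant knob when the symptom is an allocation failure.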
"fails after fetching a few tables" The other thing I would suspect here is if we somehow are keeping a reference to the dataset when we should not be... CC @zeroshade for any ideas too |
10 million rows, 166 MB (in Snowflake), 19 columns.
Actually, it has just failed on the 1st or 2nd execution of the pool, and I can see which table it probably was this time. The order of tables that are read differs slightly on each run, so it might really be associated with the table size. I'll try setting the queue size and concurrency and will keep you posted. Thanks. Edit: here's the Go output, in addition to the exception above:
OK. The Go traceback indicates an actual bug that needs fixing. The dataset described doesn't sound big enough to cause memory issues, either (unless it's very highly compressed while in Snowflake). So something seems off.
@zeroshade, from looking at the code, it seems a post-0.8.0 refactor should already have fixed this condition; do you agree?
Sorry for my delay in taking a look at this...
Looking at the code, I think I'd agree with you. @aab200, @bascheibler, are either of you able to reproduce this failure if you build from master?
I've built from master (it installed
or (with more info printed from
The tables that failed have 2.2M, 98K and 13.3K rows with 48.0MB, 41.9MB and 229.5KB, respectively. Python packages:
Please let me know if there's anything else I could do to help debug this. I'd be happy to contribute.
@bascheibler, could you provide any kind of equivalent dataset via parquet/csv/otherwise that I could easily load into my own Snowflake instance to attempt to reproduce this?
The refactored code is detailed below. I had to use lower-level code to be able to set the statement options.
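The lower-level path for setting statement options might look roughly like the following sketch, using `adbc_driver_manager` directly rather than the DB-API layer. The class and method names follow the driver manager's low-level Python API as I understand it; the URI, query, option values, and the `fetch_lowlevel` helper itself are placeholders:

```python
def fetch_lowlevel(uri: str, query: str):
    """Execute a query with explicit statement options via the low-level API."""
    # Deferred imports: require adbc-driver-manager, adbc-driver-snowflake,
    # and pyarrow to be installed, plus a live Snowflake account to run.
    import adbc_driver_manager
    import adbc_driver_snowflake
    import pyarrow

    db = adbc_driver_snowflake.connect(uri)
    conn = adbc_driver_manager.AdbcConnection(db)
    stmt = adbc_driver_manager.AdbcStatement(conn)
    try:
        # Options must be set before execution; values are strings.
        stmt.set_options(**{
            "adbc.rpc.result_queue_size": "2",
            "adbc.snowflake.rpc.prefetch_concurrency": "1",
        })
        stmt.set_sql_query(query)
        stream, _rowcount = stmt.execute_query()
        # Import the C stream into pyarrow and materialize the result.
        reader = pyarrow.RecordBatchReader._import_from_c(stream.address)
        return reader.read_all()
    finally:
        stmt.close()
        conn.close()
        db.close()
```

The DB-API wrapper does all of this internally; dropping down a level just exposes the statement so its options can be set explicitly.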
Hmm, so increasing the concurrency/aggressiveness is what reduces it...? That doesn't sound like an out-of-memory issue, then.
Maybe timeouts or something on the Snowflake side? Yeah, this would be easiest if we could have a dataset that reproduces it. I'll try tomorrow to see if I can generate one, unless @bascheibler is able to provide something roughly equivalent to what he's using (I assume the actual data you're using likely can't be released publicly) that triggers the errors.
Here's a repo with both the dataset and the scripts to reproduce the issue: https://github.com/bascheibler/arrow-adbc-issue-1283 I've created only 3 test samples with hashed data off of some of the tables that failed. Unfortunately, I wasn't able to test this new dataset myself. Please let me know if this data is enough to reproduce the issue. We may need to force a longer loop with repeated tables in order to get an error; I'm not sure how it behaves with a shorter list of tables.
We are also seeing this issue sporadically when executing a
By setting the prefetch concurrency option to "1", I was able to reproduce the
*Sigh.* As with any complex system, it came down to a race condition. @shollington-rbi @bascheibler @aab200, could any of you try pulling down the branch from the linked PR and verifying that it fixes the problem for you?
I believe this branch hasn't totally fixed the issue yet. It is still returning the
I'll try to use the updated repo scripts and see if I can reproduce it myself and figure out what I'm missing.
I don't think your Dockerfile is picking up the updated version of the lib from my branch when it runs. I was able to reproduce the
OK, thanks for checking that. I've added that. FYI: in the meantime, I've switched to
It is based on nanoarrow, but is unrelated to anything in ADBC. |
So after getting it to run with the updated build, I now got the failure-to-allocate-memory error that was originally filed. I haven't been able to reproduce that with pure Go yet, so I'm not calling this solved just yet until I can determine the cause of that allocation error.
So I managed to reproduce the "cannot allocate memory" error locally and interactively. With a bit of manipulation I was able to get stack traces and look closer at the situation. The "cannot allocate memory" error is being directly returned by the call to calloc. I'm only able to reproduce this issue via the Python script in the repo provided by @bascheibler; I haven't been able to create a reproducer with just Go.
Either way, I don't think this is related to the race condition, so the race condition should be fixed on its own and then we can look into the "cannot allocate memory" separately. I'll keep digging into this, but it just seems REALLY strange to me.
What were both of the parameters to calloc?
The first argument was somewhere between 800000 and 1200000; the second argument was 1.
@zeroshade, did that PR resolve both issues?
Gah, crap. No, it didn't resolve the unable-to-allocate issue. I forgot to update it before merging. My mistake.
apache/arrow#40902 provides an upstream fix for this issue.
Fixes #1283 by incorporating the upstream fix
I am still seeing this issue when running on machines running Ubuntu. No problem running on Mac.
I have the latest versions of pyarrow, adbc-driver-manager, and adbc-driver-snowflake. Is there something else I should be updating?
Oops, I think 0.11.0 didn't actually incorporate the fix (the release happened right before the fix got in). There are nightly wheels here; do you want to try them? (Download the artifact "python-amd64-manylinux2014", extract it, and manually install the wheel.)
Thanks for the help! That is working as expected and fixes the bug we were seeing. Any idea when we can expect this to be released?
I plan to kick off the release in the next week or two, actually. It will be called "1.0" because I believe the Snowflake driver (and a few others) are generally ready. (Still deciding whether I will spend the time to decouple our version numbers.) So, TL;DR: by mid-May, if all goes well.
Is there any chance of a patch release 0.11.1 that we could use before 1.0?
For us, all releases take an equal amount of effort, so it would functionally be the same as waiting for the next release.
Hello,
I was trying the ADBC Python Snowflake driver and I see something odd with a simple function.
Sometimes it works and sometimes it does not (the regular Snowflake Python connector works every time).
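For context, a minimal sketch of the kind of function involved, reconstructed from the traceback that follows. Only the function name and the `fetch_df` call appear in the traceback; the signature, timing, and print formatting are assumptions:

```python
import time


def query_via_adbc_arrow(conn, query: str, return_type: str = "arrow"):
    """Run a query over an ADBC connection; fetch as Arrow table or pandas."""
    print("Enters query_via_adbc_arrow")
    start = time.monotonic()
    with conn.cursor() as cursor:
        cursor.execute(query)
        if return_type == "pandas":
            # The call that raises OperationalError in the traceback below.
            result = cursor.fetch_df()
        else:
            result = cursor.fetch_arrow_table()
    rows = len(result) if return_type == "pandas" else result.num_rows
    elapsed = round(time.monotonic() - start, 6)
    print(f"Got {rows} rows in {return_type} . The execution time is {elapsed} secs")
    print("Exits query_via_adbc_arrow")
    return result
```

The intermittent failure happens inside the fetch call itself, so nothing in a wrapper like this can catch it other than retrying.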
When we run the ADBC part, there are weird errors:
(310) [aborissov@bhsys-data-dev-euw1c-lnx10 python_project2]$ python3 snow_python_adbc.py
Enters query_via_adbc_arrow
Got 10003143 rows in arrow . The execution time is 1.62057 secs
Exits query_via_adbc_arrow
Enters query_via_adbc_arrow
Traceback (most recent call last):
File "/<>/python_project2/snow_python_adbc.py", line 133, in
df = query_via_adbc_arrow(conn_adbc, "select * from private.main_td_limit_orders", return_type='pandas')
File "/bh<>/snow_python_adbc.py", line 91, in query_via_adbc_arrow
df = cursor.fetch_df() # Does not work well with large data running out of memory
File "/<>/venvs/310/lib/python3.10/site-packages/adbc_driver_manager/dbapi.py", line 1050, in fetch_df
return self._results.fetch_df()
File "<>/venvs/310/lib/python3.10/site-packages/adbc_driver_manager/dbapi.py", line 1139, in fetch_df
return self._reader.read_pandas()
File "adbc_driver_manager/_reader.pyx", line 108, in adbc_driver_manager._reader.AdbcRecordBatchReader.read_pandas
File "adbc_driver_manager/_reader.pyx", line 40, in adbc_driver_manager._reader._AdbcErrorHelper.check_error
adbc_driver_manager.OperationalError: UNKNOWN: [Snowflake] arrow/ipc: unknown error while reading: cannot allocate memory
(310) [<>python_project2]$
(310) [<>python_project2]$ python3 snow_python_adbc.py
Enters query_via_adbc_arrow
Got 10003143 rows in arrow . The execution time is 1.535479 secs
Exits query_via_adbc_arrow
Enters query_via_adbc_arrow
Traceback (most recent call last):
File "/<>/python_project2/python_project2/snow_python_adbc.py", line 133, in
df = query_via_adbc_arrow(conn_adbc, "select * from private.main_td_limit_orders", return_type='pandas')
File "/<>/python_project2/python_project2/snow_python_adbc.py", line 91, in query_via_adbc_arrow
df = cursor.fetch_df() # Does not work well with large data running out of memory
File "/<>/venvs/310/lib/python3.10/site-packages/adbc_driver_manager/dbapi.py", line 1050, in fetch_df
return self._results.fetch_df()
File "<>/venvs/310/lib/python3.10/site-packages/adbc_driver_manager/dbapi.py", line 1139, in fetch_df
return self._reader.read_pandas()
File "adbc_driver_manager/_reader.pyx", line 108, in adbc_driver_manager._reader.AdbcRecordBatchReader.read_pandas
File "adbc_driver_manager/_reader.pyx", line 40, in adbc_driver_manager._reader._AdbcErrorHelper.check_error
adbc_driver_manager.OperationalError: UNKNOWN: [Snowflake] arrow/ipc: unknown error while reading: cannot allocate memory
(310) [<> python_project2]$ python3 snow_python_adbc.py
Enters query_via_adbc_arrow
Got 10003143 rows in arrow . The execution time is 1.560988 secs
Exits query_via_adbc_arrow
Enters query_via_adbc_arrow
Got 10003143 rows in pandas . The execution time is 2.706668 secs
Exits query_via_adbc_arrow
(310) [<> python_project2]$
Has anyone experienced that? Thanks.