This repository has been archived by the owner on Oct 2, 2024. It is now read-only.
The first step towards implementing shuffle using Ray is a way to put the execution result into the Ray Object Store. Currently https://github.com/datafusion-contrib/ray-sql/blob/main/raysql/worker.py#L22 returns a string representation of the result set, which cannot be used to recover the result set from the worker's return value. We need to make the ResultSet class, and therefore the RecordBatch class, picklable in Python.
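As a rough illustration of what "picklable" could mean here, the sketch below shows one common pattern: a class whose state lives in something pickle cannot handle directly defines `__reduce__` so that it is rebuilt from plain bytes. The `ResultSet` class, its `to_bytes` method, and the use of a `memoryview` as a stand-in for native memory are all hypothetical and are not the actual ray-sql or DataFusion Python API.

```python
import pickle


class ResultSet:
    """Hypothetical stand-in for a result set backed by native memory."""

    def __init__(self, payload: bytes):
        # A memoryview cannot be pickled, which mimics a handle to
        # native (non-Python) memory held by an extension class.
        self._native = memoryview(payload)

    def to_bytes(self) -> bytes:
        # Copy the underlying data out as plain Python bytes.
        return self._native.tobytes()

    def __reduce__(self):
        # Tell pickle to rebuild the object by calling
        # ResultSet(<bytes>) instead of copying its attributes.
        return (ResultSet, (self.to_bytes(),))


original = ResultSet(b"serialized record batches")
restored = pickle.loads(pickle.dumps(original))
```

With a hook like this in place, a worker could in principle return the object itself rather than its string representation, since Ray pickles task return values before placing them in the object store.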
Hi @franklsf95. Yes, it is possible to serialize RecordBatch in Arrow IPC format. DataFusion has an IPCWriter struct that does this. You can see how this is used in the current Ray SQL ShuffleWriter.