In our application, we use the dataframe library to parse CSV and preprocess it before we transform the rows into our internal domain objects. This works really well and feels quite natural (coming from spark, pandas, polars, ...).
However, for large datasets, eagerly creating collections seems wasteful when we immediately chain further mappings or filters, i.e. when we discard the List returned from DataFrame.toListOf right away.
AFAIK it would be much more memory efficient to use an asSequenceOf in those scenarios.
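To illustrate the difference, here is a minimal sketch using plain Kotlin collections rather than the dataframe API. The `Person` type, the sample rows, and both helper functions are made up for this example; the only point is how many rows get converted when the pipeline is eager versus lazy.

```kotlin
// Hypothetical domain type, for illustration only.
data class Person(val name: String, val age: Int)

// Eager: map builds a full intermediate List<Person> before filter runs.
// Returns the first adult together with the number of rows converted.
fun firstAdultEager(rows: List<Pair<String, Int>>): Pair<Person, Int> {
    var conversions = 0
    val person = rows
        .map { (name, age) -> conversions++; Person(name, age) }
        .filter { it.age >= 18 }
        .first()
    return person to conversions
}

// Lazy: each row flows through map and filter one at a time,
// and first() stops pulling rows as soon as a match is found.
fun firstAdultLazy(rows: List<Pair<String, Int>>): Pair<Person, Int> {
    var conversions = 0
    val person = rows.asSequence()
        .map { (name, age) -> conversions++; Person(name, age) }
        .filter { it.age >= 18 }
        .first()
    return person to conversions
}

fun main() {
    val rows = listOf("Ada" to 36, "Linus" to 17, "Grace" to 45)
    println(firstAdultEager(rows)) // converts all 3 rows
    println(firstAdultLazy(rows))  // converts only the first row
}
```

With a List, every row is converted and kept in memory even if only part of the result is used afterwards; with a Sequence, no intermediate collection is allocated.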
Looking at the implementation of toListOf(), it looks like an asSequenceOf() method could trivially be added?
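As a rough sketch of the idea, and not the library's actual code: `TinyFrame`, `Row`, `Customer`, and the explicit `convert` parameter below are all stand-ins invented for this example (the real toListOf has a different signature and does the row conversion internally). The sketch only shows that a lazy variant can share the same per-row conversion as the eager one.

```kotlin
// Stand-in for a data frame row: column name -> value.
// NOT the library's row type, just an assumption for this sketch.
typealias Row = Map<String, Any?>

// Hypothetical domain type.
data class Customer(val name: String, val age: Int)

// Hypothetical miniature "data frame" holding rows in memory.
class TinyFrame(private val rows: List<Row>) {
    // Eager: converts every row and materializes the whole list.
    fun <T> toListOf(convert: (Row) -> T): List<T> = rows.map(convert)

    // Lazy: same conversion, applied only when a row is actually pulled.
    fun <T> asSequenceOf(convert: (Row) -> T): Sequence<T> =
        rows.asSequence().map(convert)
}

// Builds a small sample frame for the usage example.
fun sampleFrame(): TinyFrame = TinyFrame(
    listOf(
        mapOf("name" to "Ada", "age" to 36),
        mapOf("name" to "Linus", "age" to 17),
        mapOf("name" to "Grace", "age" to 45),
    )
)

// Chains a filter and a mapping on the lazy variant;
// no intermediate List<Customer> is created.
fun adultNames(frame: TinyFrame): List<String> =
    frame.asSequenceOf { Customer(it["name"] as String, it["age"] as Int) }
        .filter { it.age >= 18 }
        .map { it.name }
        .toList()

fun main() {
    println(adultNames(sampleFrame())) // prints [Ada, Grace]
}
```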