I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Reproducible example
import polars as pl

a = pl.LazyFrame({"a": [1, 2, 3]})
b = pl.LazyFrame({"a": [3, 4, 5]})
result = a.join(b, on="a", how="outer_coalesce")
result_broken = a.join(b, on="a", how="outer_coalesce").select(pl.all().name.map(lambda c: c.upper()))
print(result.collect(streaming=False))
print(result.collect(streaming=True))
print(result_broken.collect(streaming=False))
print(result_broken.collect(streaming=True))  # this one for some reason only has rows from b
As the code shows, an outer_coalesce join between two tables keeps only the rows from the second table when the join is followed by a column rename and then a streaming collect.
My test didn't verify that the query was actually running in streaming mode, and since collect(streaming=True) automatically falls back to non-streaming when it can't stream, that's all that was happening. Since 0.20.14 was the first version in which outer joins could stream, this never worked in streaming; it is not a regression as I originally thought.
Expected behavior
All four collect calls should return the same result.
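To make the expectation concrete, here is a minimal plain-Python sketch (not Polars code) of what the coalesced key column should contain for the inputs in the example. The helper name `outer_coalesce_keys` is hypothetical, and it assumes keys are unique within each input. A join does not guarantee row order, so the result is sorted before printing.

```python
def outer_coalesce_keys(left, right):
    """Hypothetical helper: key column of a full outer join with coalesced keys.

    Assumes keys are unique within each input, as in the example above.
    """
    merged = list(left)
    seen = set(left)
    for key in right:
        if key not in seen:
            merged.append(key)
            seen.add(key)
    return merged


a = [1, 2, 3]
b = [3, 4, 5]
expected = outer_coalesce_keys(a, b)

# The rename in the example only upper-cases the column name,
# so it should not change which rows are present.
renamed = {"a".upper(): expected}
print(sorted(renamed["A"]))  # → [1, 2, 3, 4, 5]
```

In other words, rows from both a and b should survive the join, whether or not the rename and the streaming collect are applied.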
Installed versions