duckplyr 1.0.0 #724
base: main
Furthermore, this blog post might need a benchmark. Maybe it could be structured around "why bother" (despite already having code that works without duckplyr, despite the fallbacks and some "annoying" incompatibilities like factors and timezones): duckplyr already works fairly well, and is under active development. And the choice is IMHO probably not duckplyr vs dplyr but rather duckplyr vs other dplyr backends. So large data support is crucial.
`duckdb_tibble()` needs the dot (`.prudence`), the other functions don't.
```{r}
out <- babynames |>
  duckdb_tibble(prudence = "lavish") |> # default value of prudence :-)
`as_duckdb_tibble()` or `.prudence`:
- duckdb_tibble(prudence = "lavish") |> # default value of prudence :-)
+ as_duckdb_tibble(prudence = "lavish") |> # default value of prudence :-)
oooh 🤦♀️
b0ccfba to 0df0d53
- [computation to files](https://duckplyr.tidyverse.org/reference/compute_file.html) using `compute_parquet()` or `compute_csv()`.

A drawback of analyzing large data with duckplyr is that the limitations of duckplyr won't be compensated by fallbacks, since fallbacks to dplyr necessitate putting data into memory.
Therefore, if your pipeline encounters fallbacks, you might want to work around them by converting the duck frame into a table through `compute()` then running SQL code through the experimental `read_sql_duckdb()` function. Again, over time, we expect more native support for dplyr functionality.
@krlmlr could we tweak the example to use `ceiling()`, which isn't supported I think? So it'd look more realistic. (I do not know SQL 🙈)
@krlmlr the "stingy" example does not work, it should generate an error but does not. 🤔
I'm a bit undecided regarding structure. I tried starting with basic usage, but even simply discussing `library()` vs individual activation via `duck_tibble()` is better done with some understanding of prudence I think.