-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix subassignment to an all-NA vector #903
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me.
Do we really want to make tibble behave differently from vctrs? It is not clear that we can make vctrs behave like this without extensive changes as the vctrs C code (recursive through data frames) generally assumes that assignment does not change the type of underlying C arrays. Given the small activity of the linked issues, this doesn't seem like an urgent fix. I would wait until we figure this out in vctrs to avoid having inconsistent semantics between tibble, dplyr, funs. |
I can look into fixing the vctrs issue end of August. Would that be a good timeframe @krlmlr? |
I no longer agree that this is behavior that is "different from vctrs". It was an undesired breaking change when we implemented our own version of subassignment. This PR undoes this breaking change. The corresponding issue has got a few "thumbs up"; the bug doesn't go unnoticed. It is surprising, and something that I think should "just work". I see vctrs as an infrastructure provider to simplify certain parts of the implementation; if vctrs is not ready, we can work around or prototype here. Given the number of issues in the vctrs repo, maybe there are other things that are even more important. The workaround here is small and self-contained, and could perhaps be ported to vctrs. Perhaps the goals of tibble's API are also slightly different from vctrs; we could conceivably maintain the "same type no matter what" guarantee in vctrs and relax in tibble. |
This breaking change aligned the semantics with vctrs and packages using vctrs. The PR undoes the alignment.
There's a few reactions in a 2 years old issue.
I think so too, but there's value in keeping semantics aligned. Could you take a look at #903 (comment)? |
I saw that comment. I'd prefer to keep the work decoupled as much as possible. We have tests in tibble, including tests against dev versions. If vctrs changes in an unpredicted and incompatible way, we get an early warning and can discuss. We seem to agree that What would the fix in vctrs look like? Would it be very different from this PR? I think we can gain a lot by merging here before we fix in vctrs. I just saw the following property of vctrs::vec_assign(tibble::tibble(a = 1:3, b = 4:6), 2, tibble::tibble(a = 4, b = 8))
#> # A tibble: 3 x 2
#> a b
#> <int> <int>
#> 1 1 4
#> 2 4 8
#> 3 3 6 Created on 2021-07-21 by the reprex package (v2.0.0.9000) This is not how tibble uses tbl <- data.frame(a = 1:3)
tbl$x <- data.frame(b = rep(NA_real_, 3), c = rep(NA_character_, 3))
tbl[2, "x"]$b <- 2
#> Warning in `[<-.data.frame`(`*tmp*`, 2, "x", value = structure(list(b = 2, :
#> provided 2 variables to replace 1 variables
tbl
#> a x.b x.c
#> 1 1 NA <NA>
#> 2 2 2 2
#> 3 3 NA <NA> Created on 2021-07-21 by the reprex package (v2.0.0.9000) It would be pretty cool if we could do that with the help of vctrs -- when it's ready there: library(tibble)
tbl <- tibble(a = 1:3, x = tibble(b = NA_real_, c = NA_character_))
tbl[2, "x"]$b <- 2
#> Error: Can't recycle input of size 2 to size 1. Created on 2021-07-21 by the reprex package (v2.0.0.9000) Until then I'm fine with a solution that covers the vast majority of the real-world use cases and idioms. |
Happy to collaborate on a proper solution in vctrs any time that works for you. |
Thank you. I'm 70% sure we can find a solution in vctrs. Let's start by discussing what should happen in the other 30%. Wouldn't it be unfortunate for tibble to have different assignment semantics than the rest of the tidyverse? This was my concern and reason for holding this PR until we have time to take a deep look. |
We may never be able to achieve perfect consistency, or only at too high a cost. I think this compromise gives us room to breathe for now. In the unfortunate (and less probable) case we still can make it stricter later if it's worthwhile. At the moment it seems that idiomatic behavior is more desirable. tibble 3.1.3 is on CRAN now. I'm sure it will play out one way or another. |
It is going to take quite some time until we fix in vctrs. The code here is reasonably lean and can be removed once
vec_slice<-()
has an option that implements this.Closes #773, closes #868.