Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update chk_blank() #46

Closed
gorcha opened this issue Sep 11, 2021 · 3 comments
Closed

Update chk_blank() #46

gorcha opened this issue Sep 11, 2021 · 3 comments

Comments

@gorcha
Copy link
Collaborator

gorcha commented Sep 11, 2021

Basically the same as the srcutils change for is_blank() (socialresearchcentre/srcutils#95) - using %in% is unnecessarily slow.

chk_blank <- function(x) {
  if (is.character(x)) {
    is.na(x) | x == ""
  } else {
    is.na(x)
  }
}

I noticed the final is_blank() had a couple of extra bits there, would be good to get your feedback on these @kinto-b:

  • There is a stopifnot in the character block to check for residual NA values - I think we have a guarantee of no NA values already because of the way we're checking, so going to leave this out.
  • is_blank() checks for "" in factors as well as character variables. I'm not sure how I feel about this for factors since they're conceptually quite different (i.e. I think it's reasonable to assume the user has appropriately checked their values), although this would be a breaking change.
@gorcha gorcha added this to the CRAN milestone Sep 11, 2021
@kinto-b
Copy link
Contributor

kinto-b commented Sep 12, 2021

Yeah, I don't think the stopifnot() clause is necessary. I'm not sure about factors though. I might put it to the team in an oblique way to see what they find most intuitive. I'll ask them to fill in the blanks and get back to you:

x <- c("cat", "dog", "", "cat", NA)
chk_blank(x)
#> FALSE FALSE TRUE FALSE TRUE

y <- factor(x)
chk_blank(y)
#> FALSE FALSE LGL1 FALSE LGL2

@gorcha
Copy link
Collaborator Author

gorcha commented Sep 13, 2021

I'm kind of torn - if factors are treated like character then a "" level on a factor will be treated as a blank. On the one hand this makes sense since the value is "", but on the other it's an assigned level so 🤷

I'm going to treat factors like characters for the moment since that's the current behaviour, but might change my mind at some point 😛

gorcha added a commit that referenced this issue Sep 13, 2021
@kinto-b
Copy link
Contributor

kinto-b commented Sep 13, 2021

Yeah, there weren't any strong intuitions in the team

@gorcha gorcha closed this as completed in d12eaa8 Sep 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants