highlight() trims long strings #20

Closed
moodymudskipper opened this issue Nov 6, 2022 · 4 comments · Fixed by #21

Comments

@moodymudskipper
Contributor

This causes issues in {styler}: r-lib/styler#216

x <- paste0('"', strrep("-", 1000), '"')
prettycode::highlight(x)
#> [1] "[1000 chars quoted with '\"']"

data <- getParseData(parsed, includeText = NA) is the culprit. We'd need includeText = TRUE and then a bit of wrangling.

If we want to keep the default behaviour, could we add an argument to opt out?

{prettycode} is sometimes used where faithfully representing code is important, so I hope it makes sense to fix it here.

For context, such long strings can be found in the output of sessionInfo().

@gaborcsardi
Member

tibble::as_tibble(getParseData(parse(text = paste0("fun('", strrep("-", 10), "')")), includeText=TRUE))
#> # A tibble: 7 × 9
#>   line1  col1 line2  col2    id parent token                terminal text       
#>   <int> <int> <int> <int> <int>  <int> <chr>                <lgl>    <chr>      
#> 1     1     1     1    17    10      0 expr                 FALSE    fun('-----…
#> 2     1     1     1     3     1      3 SYMBOL_FUNCTION_CALL TRUE     fun        
#> 3     1     1     1     3     3     10 expr                 FALSE    fun        
#> 4     1     4     1     4     2     10 '('                  TRUE     (          
#> 5     1     5     1    16     4      6 STR_CONST            TRUE     '---------…
#> 6     1     5     1    16     6     10 expr                 FALSE    '---------…
#> 7     1    17     1    17     5     10 ')'                  TRUE     )
tibble::as_tibble(getParseData(parse(text = paste0("fun('", strrep("-", 10), "')")), includeText=TRUE))$text[5:6]
#> [1] "'----------'" "'----------'"
tibble::as_tibble(getParseData(parse(text = paste0("fun('", strrep("-", 1000), "')")), includeText=TRUE))
#> # A tibble: 7 × 9
#>   line1  col1 line2  col2    id parent token                terminal text       
#>   <int> <int> <int> <int> <int>  <int> <chr>                <lgl>    <chr>      
#> 1     1     1     1  1007    10      0 expr                 FALSE    fun('-----…
#> 2     1     1     1     3     1      3 SYMBOL_FUNCTION_CALL TRUE     fun        
#> 3     1     1     1     3     3     10 expr                 FALSE    fun        
#> 4     1     4     1     4     2     10 '('                  TRUE     (          
#> 5     1     5     1  1006     4      6 STR_CONST            TRUE     [1000 char…
#> 6     1     5     1  1006     6     10 expr                 FALSE    '---------…
#> 7     1  1007     1  1007     5     10 ')'                  TRUE     )

Created on 2022-11-06 with reprex v2.0.2

@gaborcsardi
Member

gaborcsardi commented Nov 6, 2022

I wonder if it is the same with raw strings. Probably.

@moodymudskipper
Contributor Author

Yes, it does, and with long symbols too, though I guess it's pretty rare to have a 1000-character symbol.

This seems to do it:

# same as `getParseData(, includeText = NA)` but making sure strings and symbols are not trimmed
get_parse_data <- function(x) {
  # include text so we don't lose long strings and symbols
  data <- getParseData(x, includeText = TRUE)
  # fetch indices of potentially trimmed text
  ind <- which(data$token %in% c("STR_CONST", "SYMBOL")) 
  # replace with untrimmed
  data$text[ind] <- data$text[ind + 1]
  # remove text for non terminal tokens, as `getParseData(, includeText = NA)` would
  data$text[!data$terminal] <- ""
  data
}

Would you like a PR?

@gaborcsardi
Member

> Would you like a PR?

Yes, please. But please use the id and parent columns for the mapping; it is not guaranteed that the parent is in the next row.
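
For illustration, the id/parent mapping suggested here could look roughly like the sketch below. This is not the code merged in the fix; the name get_parse_data_untrimmed is hypothetical, and the extra check that the parent covers exactly the same source range is an added assumption, meant to avoid copying text from a parent that wraps more than the token itself.

```r
# Hypothetical helper (not part of prettycode): text for terminal tokens only,
# as with includeText = NA, but without trimming long strings or symbols.
get_parse_data_untrimmed <- function(x) {
  # includeText = TRUE also fills in the text of non-terminal `expr` rows
  data <- utils::getParseData(x, includeText = TRUE)

  # terminal tokens whose text may have been replaced by a placeholder
  ind <- which(data$token %in% c("STR_CONST", "SYMBOL"))

  # find each token's parent row through id/parent, not by row position
  par <- match(data$parent[ind], data$id)

  # assumption: only trust a parent that spans exactly the same source range
  same <- !is.na(par) &
    data$line1[ind] == data$line1[par] & data$col1[ind] == data$col1[par] &
    data$line2[ind] == data$line2[par] & data$col2[ind] == data$col2[par]
  data$text[ind[same]] <- data$text[par[same]]

  # blank the non-terminal rows, as includeText = NA would
  data$text[!data$terminal] <- ""
  data
}

# keep.source = TRUE so that parse data is available outside interactive use
code <- paste0("fun('", strrep("-", 1000), "')")
parsed <- parse(text = code, keep.source = TRUE)
pd <- get_parse_data_untrimmed(parsed)
nchar(pd$text[pd$token == "STR_CONST"])
```

Looking rows up with match() on id keeps the lookup independent of row order, which is the point of the request above.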
