highlight() trims long strings #20

moodymudskipper · 2022-11-06T15:48:57Z

This causes issues in {styler}: r-lib/styler#216

x <- paste0('"', strrep("-", 1000), '"')
prettycode::highlight(x)
#> [1] "[1000 chars quoted with '\"']"

The culprit is data <- getParseData(parsed, includeText = NA); we'd need includeText = TRUE and then a bit of wrangling.

If we want to keep the default behaviour, could we add an argument to opt out?

{prettycode} is sometimes used where faithfully representing code is important, so I hope it makes sense to fix it here.

For context, such long strings can be found in the output of sessionInfo().

gaborcsardi · 2022-11-06T21:29:09Z

tibble::as_tibble(getParseData(parse(text = paste0("fun('", strrep("-", 10), "')")), includeText=TRUE))
#> # A tibble: 7 × 9
#>   line1  col1 line2  col2    id parent token                terminal text       
#>   <int> <int> <int> <int> <int>  <int> <chr>                <lgl>    <chr>      
#> 1     1     1     1    17    10      0 expr                 FALSE    fun('-----…
#> 2     1     1     1     3     1      3 SYMBOL_FUNCTION_CALL TRUE     fun        
#> 3     1     1     1     3     3     10 expr                 FALSE    fun        
#> 4     1     4     1     4     2     10 '('                  TRUE     (          
#> 5     1     5     1    16     4      6 STR_CONST            TRUE     '---------…
#> 6     1     5     1    16     6     10 expr                 FALSE    '---------…
#> 7     1    17     1    17     5     10 ')'                  TRUE     )
tibble::as_tibble(getParseData(parse(text = paste0("fun('", strrep("-", 10), "')")), includeText=TRUE))$text[5:6]
#> [1] "'----------'" "'----------'"
tibble::as_tibble(getParseData(parse(text = paste0("fun('", strrep("-", 1000), "')")), includeText=TRUE))
#> # A tibble: 7 × 9
#>   line1  col1 line2  col2    id parent token                terminal text       
#>   <int> <int> <int> <int> <int>  <int> <chr>                <lgl>    <chr>      
#> 1     1     1     1  1007    10      0 expr                 FALSE    fun('-----…
#> 2     1     1     1     3     1      3 SYMBOL_FUNCTION_CALL TRUE     fun        
#> 3     1     1     1     3     3     10 expr                 FALSE    fun        
#> 4     1     4     1     4     2     10 '('                  TRUE     (          
#> 5     1     5     1  1006     4      6 STR_CONST            TRUE     [1000 char…
#> 6     1     5     1  1006     6     10 expr                 FALSE    '---------…
#> 7     1  1007     1  1007     5     10 ')'                  TRUE     )

Created on 2022-11-06 with reprex v2.0.2

gaborcsardi · 2022-11-06T21:30:05Z

I wonder if it is the same with raw strings. Probably.

moodymudskipper · 2022-11-10T10:50:15Z

Yes, raw strings are trimmed too, and so are long symbols, though I guess a 1000-character symbol is pretty rare.

This seems to do it:

# same as `getParseData(, includeText = NA)` but making sure strings and symbols are not trimmed
get_parse_data <- function(x) {
  # include text so we don't lose long strings and symbols
  data <- getParseData(x, includeText = TRUE)
  # fetch indices of potentially trimmed text
  ind <- which(data$token %in% c("STR_CONST", "SYMBOL")) 
  # replace with untrimmed
  data$text[ind] <- data$text[ind + 1]
  # remove text for non terminal tokens, as `getParseData(, includeText = NA)` would
  data$text[!data$terminal] <- ""
  data
}

Would you like a PR ?

gaborcsardi · 2022-11-10T10:57:56Z

> Would you like a PR ?

Yes, please. But please use the id and parent columns for the mapping; it is not guaranteed that the parent is in the next row.
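
A minimal sketch of that suggestion, for illustration only; it is not necessarily the code that was merged. The name get_parse_data_sketch is hypothetical. It looks up each token's parent by matching the parent column against id, and, as an extra assumption of this sketch, copies the parent's text only when the parent covers exactly the same source span as the token, so that a parent spanning more than the token is never copied over it. Note that parse() needs keep.source = TRUE for getParseData() to return anything in non-interactive sessions.

```r
# Sketch: like `getParseData(, includeText = NA)`, but recovering untrimmed
# strings and symbols through the id/parent columns (hypothetical helper).
get_parse_data_sketch <- function(x) {
  # include text so the wrapping `expr` rows carry the full source text
  data <- utils::getParseData(x, includeText = TRUE)
  # rows whose text may have been replaced by a placeholder
  ind <- which(data$token %in% c("STR_CONST", "SYMBOL"))
  # locate each token's parent row via id/parent rather than row position
  par <- match(data$parent[ind], data$id)
  # assumption of this sketch: only trust a parent with the identical span
  same_span <- !is.na(par) &
    data$line1[ind] == data$line1[par] &
    data$col1[ind] == data$col1[par] &
    data$line2[ind] == data$line2[par] &
    data$col2[ind] == data$col2[par]
  data$text[ind[same_span]] <- data$text[par[same_span]]
  # drop text for non-terminal tokens, as `includeText = NA` would
  data$text[!data$terminal] <- ""
  data
}

# usage: the STR_CONST row should now hold the full string literal
code <- paste0("fun('", strrep("-", 1000), "')")
pd <- get_parse_data_sketch(parse(text = code, keep.source = TRUE))
nchar(pd$text[pd$token == "STR_CONST"])
```

Tokens whose parent spans more than the token itself are left as getParseData() returned them, so this sketch does not claim to recover every possible trimmed token.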

Closes #20

moodymudskipper mentioned this issue Nov 10, 2022

Fix handling of long strings and symbols in highlight() #21

Merged

gaborcsardi closed this as completed in #21 Nov 10, 2022

gaborcsardi added a commit that referenced this issue Nov 10, 2022

Merge pull request #21 from moodymudskipper/main

cf23660

Closes #20

moodymudskipper mentioned this issue Jan 13, 2023

long strings are not constructed correctly cynkra/constructive#90

Closed

moodymudskipper mentioned this issue Sep 30, 2024

code_highlight() trims long strings r-lib/cli#726

Open
