fix load_all in windows with UTF8 codes #1378

Closed
shrektan wants to merge 3 commits

shrektan

If a source file contains non-ASCII characters, such as Chinese characters, devtools::load_all() fails on Windows with the error below, whereas Build & Reload of the package works fine.

I think this happens because source_one() calls readLines() without explicitly setting the encoding argument; see https://github.com/hadley/devtools/blob/master/R/source.r#L21

Sample error message

Loading package_abc
Error in parse(text = lines, n = -1, srcfile = srcfile) :
  invalid multibyte character in parser at line 12
In addition: Warning messages:
1: In grepl("\n", lines, fixed = TRUE) :
  input string 12 is invalid in this locale
2: In grepl("\n", lines, fixed = TRUE) :
  input string 14 is invalid in this locale
3: In grepl("\n", lines, fixed = TRUE) :
  input string 15 is invalid in this locale

Notes

  1. Add encoding = "UTF-8" to the readLines() call so the file is read correctly.
  2. Convert the lines to the native encoding so that printing the function displays correctly.
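
A minimal sketch of note 1, not the code from this pull request: it writes a temporary file as UTF-8 bytes and reads it back with and without encoding = "UTF-8". The temporary file and the greet() function are made up for illustration.

```r
# Write a small source file as UTF-8 bytes, independent of the session locale.
path <- tempfile(fileext = ".R")
src <- 'greet <- function() "\u4f60\u597d"'  # \u escapes give a UTF-8 marked string
writeLines(enc2utf8(src), path, useBytes = TRUE)

# Default call: the strings carry no encoding mark.
unmarked <- readLines(path, warn = FALSE)

# With the argument: the same bytes, now marked as UTF-8.
marked <- readLines(path, warn = FALSE, encoding = "UTF-8")

Encoding(unmarked)  # "unknown"
Encoding(marked)    # "UTF-8"
```

The encoding argument does not re-encode the input; it only marks the strings as UTF-8 so that later steps know how to interpret the bytes.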

1. readLines should read with encoding = "UTF-8"
2. convert to native encoding so that the function prints correctly
@hansharhoff
Contributor

I believe this is closely related to issue #1312

@@ -18,7 +18,8 @@ source_one <- function(file, envir = parent.frame()) {
   stopifnot(file.exists(file))
   stopifnot(is.environment(envir))
 
-  lines <- readLines(file, warn = FALSE)
+  lines <- readLines(file, warn = FALSE, encoding = "UTF-8")
+  lines <- enc2native(lines)
Member

Are you sure this line is necessary?

Author

I'm pretty sure this is necessary; I just reproduced the problem to confirm.

  1. Suppose we are working on a Windows platform whose native encoding is not UTF-8.
  2. Suppose some package has a function (e.g., futf8()) that contains non-ASCII characters.
  3. Without lines <- enc2native(lines), devtools::load_all() succeeds and futf8() works correctly.
  4. However, print(futf8) shows garbled text in the console.
  5. Adding this line fixes the display.

Without the line:

[image]

With the line:

[image]
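
As a rough illustration of what enc2native() does to UTF-8 marked text, here is a small sketch that is independent of devtools. The sample string is made up, and the result depends on the session locale, so the check only asserts anything in a UTF-8 locale.

```r
x <- "\u4f60\u597d"        # \u escapes give a UTF-8 marked string
native <- enc2native(x)    # convert to the session's native encoding

if (isTRUE(l10n_info()[["UTF-8"]])) {
  # In a UTF-8 locale the native encoding is UTF-8, so the text is unchanged.
  stopifnot(identical(enc2utf8(native), x))
} else {
  # Elsewhere the bytes may differ; inspect the mark rather than assume it.
  print(Encoding(native))
}
```

On a Windows session like the one described above, the second branch is the interesting one: the string is converted to the native code page, which is presumably what lets print() show the function source correctly.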

@shrektan
Author

@hadley Oh, I just noticed that load_all() has been moved to https://github.com/r-pkgs/pkgload

However, I have just added support for reading the encoding from DESCRIPTION, as well as a unit test.
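
One way reading the encoding from DESCRIPTION could look is sketched below. This is not the code added in this pull request: the helper name description_encoding() and the "UTF-8" fallback are assumptions made for illustration.

```r
# Hypothetical helper: returns the Encoding field of a package's DESCRIPTION,
# or `default` when the field is absent.
description_encoding <- function(pkg_dir, default = "UTF-8") {
  desc_path <- file.path(pkg_dir, "DESCRIPTION")
  stopifnot(file.exists(desc_path))
  # read.dcf() returns a character matrix; a missing field comes back as NA.
  enc <- read.dcf(desc_path, fields = "Encoding")[1, "Encoding"]
  if (is.na(enc)) default else unname(enc)
}

# Usage with a throwaway package directory.
pkg <- tempfile("pkg")
dir.create(pkg)
writeLines(
  c("Package: demo", "Version: 0.0.1", "Encoding: latin1"),
  file.path(pkg, "DESCRIPTION")
)
description_encoding(pkg)  # "latin1"
```

The returned value could then be passed as the encoding argument of readLines() instead of a hard-coded "UTF-8".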

@shrektan
Author

shrektan commented Jun 5, 2017

Closing because the related code has been moved to https://github.com/r-pkgs/pkgload

@shrektan shrektan closed this Jun 5, 2017