Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding error with local .js files after openJDK 17.0.3 #1232

Closed
midgleyc opened this issue May 28, 2022 · 6 comments
Closed

Encoding error with local .js files after openJDK 17.0.3 #1232

midgleyc opened this issue May 28, 2022 · 6 comments
Labels
Java Compatibility Issues related to Rhino being compatible to (new) Java releases

Comments

@midgleyc
Copy link
Contributor

In https://bugs.openjdk.java.net/browse/JDK-8273655, a change was made so that ".js" files now are seen as having content type "text/javascript".

This leads to a change in the UrlModuleSourceProvider where local ".js" files (or any ".js" files without a Content-Type encoding) are interpreted as having a character encoding of "8859_1" while previously they were interpreted as having a character encoding of "utf-8".

This has lead to a problem with KoLMafia's ability to parse javascript scripts, as discussed in this thread.

I've put together a repo at java-probe-content-type which shows the difference for local files.

I don't know what a good fix to this would look like (as assuming files without an explicit encoding are cp-1252 is as the spec says), but it might be a quicker fix if getCharacterEncoding were protected instead of private static. That way we could override it on our end to expect utf-8 instead of 8859_1 without affecting other uses of this library.

@p-bakker
Copy link
Collaborator

p-bakker commented May 29, 2022

One way to solve this locally without any code changes is overiding the content-types.properties file used by setting the content.types.user.table system property to point to a custom file

@midgleyc
Copy link
Contributor Author

midgleyc commented May 29, 2022

Filed #1233 to make getCharacterEncoding protected. Thinking more I think Rhino can fix the problem on its end: we only have the issue for local files, and you can use url.getProtocol() to determine whether the file is local or not (if local, the protocol is file). The text/ adjustment should likely only be done if the file comes over the HTTP protocol, so

if (contentType != null && contentType.startsWith("text/")) {
    return "8859_1";
}

could be replaced with

String protocol = url.getProtocol();
boolean isHttp = protocol != null && (protocol.equals("http") || protocol.equals("https"));
if (isHttp && contentType != null && contentType.startsWith("text/")) {
    return "8859_1";
}

Does this seem reasonable?

@p-bakker
Copy link
Collaborator

p-bakker commented Jun 1, 2022

I haven't found any documentation that local files in CommonJS ought to be stored in utf-8. However, it seems to me that NodeJS expects this.

@botic would you have more insight, given that RingoJS supports CommonJS as well?

So imho I'd think that defaulting to utf-8 for local files might be the easiest.

Note that the defaulting to 8859_1 is basically wrong as well: that used to be the default in http, but has been removed from the spec

And nowadays the BOM ought to be checked (even for utf8 where a BOM is optional)

@p-bakker p-bakker added Java Compatibility Issues related to Rhino being compatible to (new) Java releases labels Jul 1, 2022
@botic
Copy link

botic commented Jul 8, 2022

In Ringo the module loading process uses the charset specified in the Ringo config. Every module load will enforce this charset. Default is utf-8 for all text-based resources.

@alinbrici46
Copy link

Anyone have any idea when this will be fixed?

@midgleyc
Copy link
Contributor Author

midgleyc commented Jun 21, 2024

Fix (workaround) released with 1.7.15.

@p-bakker p-bakker added this to the Release 1.7.15 milestone Jun 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Java Compatibility Issues related to Rhino being compatible to (new) Java releases
Projects
None yet
Development

No branches or pull requests

4 participants