make tests more robust #1588

rbri · 2024-08-25T11:25:05Z

at least in my win environment, these tests are failing when running from the command line with jdk11.

@andreabergia i guess you are the original author of the test. Please have a look if this is still ok.

p-bakker · 2024-08-25T12:26:52Z

Might the reason why it fails be due to using Java 11, which supports an older version of the Unicode standard than which the latest EcmaScript spec demands?

In the test262 sure there are also a couple of unicode-related tests that fail on just Java 11 die to this issue

As both Java and EcmaScript track the latest unicode version, older Java versions only support older Unicode versions and as Unicode support in Rhino depends on the Unicode support of the Java version on which Rhino is run, even if all unicode logic in Rhino is according to spec, on an older version of Java Rhino may behave in an incompatible way

andreabergia · 2024-08-25T13:39:41Z

Looks good to me, although it should not be necessary - the java source files are utf8, and in the build.gradle file for tests there is:

    systemProperty 'file.encoding', 'UTF-8'

So... the real question is why this it failing for you, locally, without this change?

rbri · 2024-08-25T14:51:39Z

So... the real question is why this it failing for you, locally, without this change?

Agree, but at the moment i have no idea why this fails, and my gradle knowledge is not that good... my assumption was to simple have it like the other tests

tonygermano · 2024-08-27T07:37:15Z

I ran the test with

./gradlew :tests:test --tests org.mozilla.javascript.tests.ParserTest

On java 11, testParseUnicodeMultibyteCharacter passed. I did get a failure on testParseUnicodeIdentifierPartWhichIsNotJavaIdentifierPart with the following stack trace.

java.lang.AssertionError: extra error: illegal character: 鿫
	at org.junit.Assert.fail(Assert.java:89)
	at org.mozilla.javascript.testing.TestErrorReporter.error(TestErrorReporter.java:42)
	at org.mozilla.javascript.Parser.addError(Parser.java:241)
	at org.mozilla.javascript.Parser.addError(Parser.java:227)
	at org.mozilla.javascript.Parser.addError(Parser.java:232)
	at org.mozilla.javascript.TokenStream.getToken(TokenStream.java:1357)
	at org.mozilla.javascript.Parser.peekToken(Parser.java:380)
	at org.mozilla.javascript.Parser.name(Parser.java:3220)
	at org.mozilla.javascript.Parser.primaryExpr(Parser.java:3117)
	at org.mozilla.javascript.Parser.memberExpr(Parser.java:2722)
	at org.mozilla.javascript.Parser.unaryExpr(Parser.java:2625)
	at org.mozilla.javascript.Parser.expExpr(Parser.java:2545)
	at org.mozilla.javascript.Parser.mulExpr(Parser.java:2528)
	at org.mozilla.javascript.Parser.addExpr(Parser.java:2514)
	at org.mozilla.javascript.Parser.shiftExpr(Parser.java:2497)
	at org.mozilla.javascript.Parser.relExpr(Parser.java:2475)
	at org.mozilla.javascript.Parser.eqExpr(Parser.java:2451)
	at org.mozilla.javascript.Parser.bitAndExpr(Parser.java:2442)
	at org.mozilla.javascript.Parser.bitXorExpr(Parser.java:2433)
	at org.mozilla.javascript.Parser.bitOrExpr(Parser.java:2424)
	at org.mozilla.javascript.Parser.andExpr(Parser.java:2415)
	at org.mozilla.javascript.Parser.orExpr(Parser.java:2406)
	at org.mozilla.javascript.Parser.condExpr(Parser.java:2373)
	at org.mozilla.javascript.Parser.assignExpr(Parser.java:2331)
	at org.mozilla.javascript.Parser.expr(Parser.java:2310)
	at org.mozilla.javascript.Parser.nameOrLabel(Parser.java:2068)
	at org.mozilla.javascript.Parser.statementHelper(Parser.java:1258)
	at org.mozilla.javascript.Parser.statement(Parser.java:1122)
	at org.mozilla.javascript.Parser.parse(Parser.java:616)
	at org.mozilla.javascript.Parser.parse(Parser.java:546)
	at org.mozilla.javascript.tests.ParserTest.parse(ParserTest.java:1462)
	at org.mozilla.javascript.tests.ParserTest.parse(ParserTest.java:1429)
	at org.mozilla.javascript.tests.ParserTest.parse(ParserTest.java:1424)
	at org.mozilla.javascript.tests.ParserTest.parse(ParserTest.java:1420)
	at org.mozilla.javascript.tests.ParserTest.testParseUnicodeIdentifierPartWhichIsNotJavaIdentifierPart(ParserTest.java:1216)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)

I switched to java 17 and reran the test and everything passed.

Likewise, in the shell, java 11:

js> var a鿫='hi'
js: "<stdin>", line 4: illegal character: 鿫
js: var a鿫='hi'
js: .....^
js: "<stdin>", line 4: Compilation produced 1 syntax errors.

java 17:

js> var a鿫='hi'
js> print(a鿫)
hi

p-bakker · 2024-08-27T12:15:08Z

Isn't this because Java 11 not recognizing the astral plane character as such, thus returning only the first part of the surrogate pair, which in turn is an invalid character?

andreabergia · 2024-08-27T13:13:43Z

That passes locally for me on mac os (and I assume on linux, in the github actions CI).
So, that seems a problem with the Windows JDK?

Regardless, the changes in this PR look good to me - the in-memory String objects have the same value regardless of how we write them in the source code (I checked).
So, LGTM 👍

tonygermano · 2024-08-27T20:11:57Z

I'm running linux for my gradle tests and rhino shell.

I think the test is good and the issue is in the parser. The test that is failing does not use a surrogate pair. The one with the surrogate pair is the one that is passing.

This is run with the same java 11 jvm that failed the test and threw the exception.

js> '鿫' === '\u9FEB'
true
js> '𠮷' === '\uD842\uDFB7'
true
js> 鿫=1
js: "<stdin>", line 4: illegal character: 鿫
js: 鿫=1
js: ^
js: "<stdin>", line 4: Compilation produced 1 syntax errors.

js> 𠮷=1
1

The problem occurs when the particular unicode character appears as part of an identifier, but it's fine in a string.

tonygermano · 2024-08-27T20:44:48Z

I upgraded from java 11.0.15 to 11.0.24 and everything works now.

tonygermano · 2024-08-27T20:50:48Z

java 11.0.15

js> var unicodeString = new java.lang.String('鿫')
js> unicodeString.codePointAt(0)
40939
js> java.lang.Character.isUnicodeIdentifierStart(unicodeString.codePointAt(0))
false
js> java.lang.Character.isUnicodeIdentifierPart(unicodeString.codePointAt(0))
false

java 11.0.24

js> var unicodeString = new java.lang.String('鿫')
js> unicodeString.codePointAt(0)
40939
js> java.lang.Character.isUnicodeIdentifierStart(unicodeString.codePointAt(0))
true
js> java.lang.Character.isUnicodeIdentifierPart(unicodeString.codePointAt(0))
true

rbri · 2024-08-27T21:02:06Z

@tonygermano this means we do not need ro change the test?

tonygermano · 2024-08-27T21:21:53Z

@tonygermano this means we do not need ro change the test?

For me the test failed on an older java 11, but passed when I ran it on a current java 11. I would say that means we don't need to change it, but I'll defer to others on this.

gbrail · 2024-08-28T04:39:27Z

On Windows, with the latest JDK 11, the tests failed before and passed with this change. Plus, I agree that putting in the literal code points is more portable anyway. Thanks!

tonygermano · 2024-08-28T18:32:20Z

@gbrail I'm curious what the failure message was when running on the latest windows jdk 11. Was it different than the error I shared? What java distro were you using? I was testing with Azul Zulu.

For me there was definitely something broken in the rhino shell related to identifier names using unicode characters depending on which java version I was using. I think I would expect there to be a failing test to go along with that when the issue is java related, or we probably don't have full test coverage.

gbrail · 2024-08-28T20:06:19Z

Here's what I get on Windows 11 with Java 11.0.24: testParseUnicodeIdentifierPartWhichIsNotJavaIdentifierPart org.junit.ComparisonFailure: expected:<a[é¿«]> but was:<a[鿫]> at org.junit.Assert.assertEquals(Assert.java:117) at org.junit.Assert.assertEquals(Assert.java:146) at org.mozilla.javascript.tests.ParserTest.testParseUnicodeIdentifierPartWhichIsNotJavaIdentifierPart(ParserTest.java:1218) at java.base/java.lang.reflect.Method.invoke(Method.java:566) testParseUnicodeMultibyteCharacter org.junit.ComparisonFailure: expected:<[ð ®·]> but was:<[𠮷]> at org.junit.Assert.assertEquals(Assert.java:117) at org.junit.Assert.assertEquals(Assert.java:146) at org.mozilla.javascript.tests.ParserTest.testParseUnicodeMultibyteCharacter(ParserTest.java:1207) at java.base/java.lang.reflect.Method.invoke(Method.java:566) Wrap lines

…

On Wed, Aug 28, 2024 at 11:32 AM Tony Germano ***@***.***> wrote: @gbrail <https://github.com/gbrail> I'm curious what the failure message was when running on the latest windows jdk 11. Was it different than the error I shared? What java distro were you using? I was testing with Azul Zulu. For me there was definitely something broken in the rhino shell related to identifier names using unicode characters depending on which java version I was using. I think I would expect there to be a failing test to go along with that when the issue is java related, or we probably don't have full test coverage. — Reply to this email directly, view it on GitHub <#1588 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAD7I23LCF5WXNJM4HCKUNDZTYJUVAVCNFSM6AAAAABNCORYFOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMJWGAYDSNRSGU> . You are receiving this because you were mentioned.Message ID: ***@***.***>

gbrail · 2024-08-28T20:07:00Z

To be clear, that's the error I got before merging this PR. With this PR merged, the tests now all pass.

…

On Wed, Aug 28, 2024 at 1:06 PM Gregory Brail ***@***.***> wrote: Here's what I get on Windows 11 with Java 11.0.24: testParseUnicodeIdentifierPartWhichIsNotJavaIdentifierPart org.junit.ComparisonFailure: expected:<a[é¿«]> but was:<a[鿫]> at org.junit.Assert.assertEquals(Assert.java:117) at org.junit.Assert.assertEquals(Assert.java:146) at org.mozilla.javascript.tests.ParserTest.testParseUnicodeIdentifierPartWhichIsNotJavaIdentifierPart(ParserTest.java:1218) at java.base/java.lang.reflect.Method.invoke(Method.java:566) testParseUnicodeMultibyteCharacter org.junit.ComparisonFailure: expected:<[ð ®·]> but was:<[𠮷]> at org.junit.Assert.assertEquals(Assert.java:117) at org.junit.Assert.assertEquals(Assert.java:146) at org.mozilla.javascript.tests.ParserTest.testParseUnicodeMultibyteCharacter(ParserTest.java:1207) at java.base/java.lang.reflect.Method.invoke(Method.java:566) Wrap lines On Wed, Aug 28, 2024 at 11:32 AM Tony Germano ***@***.***> wrote: > @gbrail <https://github.com/gbrail> I'm curious what the failure message > was when running on the latest windows jdk 11. Was it different than the > error I shared? What java distro were you using? I was testing with Azul > Zulu. > > For me there was definitely something broken in the rhino shell related > to identifier names using unicode characters depending on which java > version I was using. I think I would expect there to be a failing test to > go along with that when the issue is java related, or we probably don't > have full test coverage. > > — > Reply to this email directly, view it on GitHub > <#1588 (comment)>, or > unsubscribe > <https://github.com/notifications/unsubscribe-auth/AAD7I23LCF5WXNJM4HCKUNDZTYJUVAVCNFSM6AAAAABNCORYFOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMJWGAYDSNRSGU> > . > You are receiving this because you were mentioned.Message ID: > ***@***.***> >

tonygermano · 2024-08-28T21:40:52Z

Interesting... so different than the error I was getting in the older linux jdk 11. Your error looks like an issue with character encoding (I think it's probably reading the source file as windows-1252 instead of UTF-8.)

tonygermano · 2024-08-28T21:46:52Z

js> new java.lang.String(new java.lang.String('é¿«').getBytes('windows-1252'), 'UTF-8')
鿫

tonygermano · 2024-08-28T22:09:14Z

@gbrail If you revert the change and add systemProp.file.encoding=UTF-8 to gradle.properties does it still fail?

tonygermano · 2024-08-28T23:23:25Z

Ignore that last request. I got my linux jvm to fail in the same way as your windows one by adding this

diff --git a/buildSrc/src/main/groovy/rhino.java-conventions.gradle b/buildSrc/src/main/groovy/rhino.java-conventions.gradle
index 9e6d3b49..54587f4f 100644
--- a/buildSrc/src/main/groovy/rhino.java-conventions.gradle
+++ b/buildSrc/src/main/groovy/rhino.java-conventions.gradle
@@ -17,6 +17,10 @@ dependencies {
     testImplementation "javax.xml.soap:javax.xml.soap-api:1.4.0"
 }
 
+tasks.withType(JavaCompile) {
+    options.encoding = "windows-1252"
+}
+
 compileJava {
     options.compilerArgs = [
         '-Xlint:deprecation,unchecked'

So, I think the solution to allowing ~~unicode~~ non-ASCII characters in the source files would be to make that same change, but force it to "UTF-8" instead of "windows-1252." The tasks.withType(JavaCompile) takes care of both the compileJava and compileTestJava tasks.

make tests more robust

596e258

gbrail merged commit c0936ca into mozilla:master Aug 28, 2024
3 checks passed

tonygermano mentioned this pull request Aug 29, 2024

Use UTF-8 decoding for java source files #1590

Merged

gbrail mentioned this pull request Aug 30, 2024

Revert "make tests more robust" #1595

Merged

rbri deleted the unicode_jdk11 branch September 29, 2024 14:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

make tests more robust #1588

make tests more robust #1588

rbri commented Aug 25, 2024

p-bakker commented Aug 25, 2024

andreabergia commented Aug 25, 2024

rbri commented Aug 25, 2024

tonygermano commented Aug 27, 2024 •

edited

Loading

p-bakker commented Aug 27, 2024

andreabergia commented Aug 27, 2024

tonygermano commented Aug 27, 2024 •

edited

Loading

tonygermano commented Aug 27, 2024

tonygermano commented Aug 27, 2024

rbri commented Aug 27, 2024

tonygermano commented Aug 27, 2024

gbrail commented Aug 28, 2024

tonygermano commented Aug 28, 2024

gbrail commented Aug 28, 2024 via email

gbrail commented Aug 28, 2024 via email

tonygermano commented Aug 28, 2024

tonygermano commented Aug 28, 2024

tonygermano commented Aug 28, 2024

tonygermano commented Aug 28, 2024 •

edited

Loading

make tests more robust #1588

make tests more robust #1588

Conversation

rbri commented Aug 25, 2024

p-bakker commented Aug 25, 2024

andreabergia commented Aug 25, 2024

rbri commented Aug 25, 2024

tonygermano commented Aug 27, 2024 • edited Loading

p-bakker commented Aug 27, 2024

andreabergia commented Aug 27, 2024

tonygermano commented Aug 27, 2024 • edited Loading

tonygermano commented Aug 27, 2024

tonygermano commented Aug 27, 2024

rbri commented Aug 27, 2024

tonygermano commented Aug 27, 2024

gbrail commented Aug 28, 2024

tonygermano commented Aug 28, 2024

gbrail commented Aug 28, 2024 via email

gbrail commented Aug 28, 2024 via email

tonygermano commented Aug 28, 2024

tonygermano commented Aug 28, 2024

tonygermano commented Aug 28, 2024

tonygermano commented Aug 28, 2024 • edited Loading

tonygermano commented Aug 27, 2024 •

edited

Loading

tonygermano commented Aug 27, 2024 •

edited

Loading

tonygermano commented Aug 28, 2024 •

edited

Loading