Add Conformance Test and implement L1 rule #30

behnam · 2017-05-12T06:18:20Z

Add conformance tests using BidiTest.txt

Initial results: 60494 test cases failed! (196253 passed)

Because of current limitations in Rust's test framework and the huge number of failures, the base conformance test (not including matching pairs) is executed in a single run. If there is any failure, the run panics with a summary; for the integration test to pass, it is marked as `should_panic`, with the summary of the test run as the `expected` string.
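
As a rough illustration of that setup, here is a minimal sketch of a single-run test that panics with a summary and is marked `should_panic`. The names `Summary`, `run_cases`, `failure_message` and `upper` are hypothetical and exist only for this example; the real test reads its cases from BidiTest.txt, which is not reproduced here.

```rust
/// Pass/fail counts for one whole run (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct Summary {
    passed: usize,
    failed: usize,
}

/// Run every (input, expected) case against `subject` and count the outcomes.
fn run_cases(cases: &[(&str, &str)], subject: fn(&str) -> String) -> Summary {
    let mut summary = Summary { passed: 0, failed: 0 };
    for &(input, expected) in cases {
        if subject(input) == expected {
            summary.passed += 1;
        } else {
            summary.failed += 1;
        }
    }
    summary
}

/// Build the one-line summary used as the panic message, if anything failed.
fn failure_message(summary: &Summary) -> Option<String> {
    if summary.failed == 0 {
        None
    } else {
        Some(format!(
            "{} test cases failed! ({} passed)",
            summary.failed, summary.passed
        ))
    }
}

/// Stand-in for the code under test (assumption: not the bidi algorithm).
fn upper(s: &str) -> String {
    s.to_uppercase()
}

#[test]
#[should_panic(expected = "1 test cases failed! (2 passed)")]
fn conformance_summary() {
    let cases = [("a", "A"), ("b", "B"), ("c", "x")];
    let summary = run_cases(&cases, upper);
    if let Some(message) = failure_message(&summary) {
        // One panic for the whole run; `expected` pins the current counts.
        panic!("{}", message);
    }
}
```

With this shape, any change in the counts changes the panic message, so the `expected` string has to be updated whenever the number of failures moves.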

Fix #1

Implement L1 rule

To be able to implement L1, we need access to more information from
BidiInfo, namely original_classes of the text, in visual_runs(),
which would mean it should pass through reorder_line().

The fact that information from BidiInfo is needed for both steps of
the public API (generating BidiInfo and consuming it
per-paragraph/per-level) made me change the API design and move these
methods into impl BidiInfo.

Then, since we needed access to text for every BidiInfo consumption,
I added a reference to text to BidiInfo, which also enables more
compile-time checks that a BidiInfo instance does not outlive the text in
the user code.
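
To illustrate the borrowing relationship described above, here is a minimal sketch, assuming a much reduced `BidiInfo` that holds only the text reference and one level per byte. The field layout and the `line_text` method are hypothetical and are not the crate's actual API.

```rust
use std::ops::Range;

/// Reduced stand-in for `BidiInfo`: it borrows the text it was built from.
struct BidiInfo<'text> {
    text: &'text str,
    /// One embedding level per byte of `text` (all zero in this sketch).
    levels: Vec<u8>,
}

impl<'text> BidiInfo<'text> {
    /// Build the info while keeping a reference to `text`.
    fn new(text: &'text str) -> BidiInfo<'text> {
        BidiInfo {
            text,
            levels: vec![0; text.len()],
        }
    }

    /// Hypothetical consumer: methods on `impl BidiInfo` can reach the text
    /// directly, so callers no longer pass it in separately.
    fn line_text(&self, line: Range<usize>) -> &'text str {
        &self.text[line]
    }
}
```

Because of the `'text` lifetime, the compiler rejects user code that drops the `String` while a `BidiInfo` built from it is still in use.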

NOTE: We are already breaking the API in version 0.3.0, and paving the way
for full spec support is a good reason to do so, IMHO.

The L1 rule works in one pass over the text of the line.
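
For readers unfamiliar with L1, here is a minimal sketch of a one-pass version of the rule, not the code in this PR. It assumes a reduced `BidiClass` enum and one level per character (the crate stores levels per byte), and it ignores characters removed by rule X9. The function name `apply_l1` is hypothetical.

```rust
/// Reduced set of bidi classes, enough to demonstrate L1.
#[derive(Clone, Copy, Debug, PartialEq)]
enum BidiClass {
    L,
    R,
    WS,
    B,
    S,
    FSI,
    LRI,
    RLI,
    PDI,
}

/// Reset to `para_level`: segment and paragraph separators, any run of
/// whitespace or isolate formatting characters right before them, and any
/// such run at the end of the line. `classes` holds the original classes.
fn apply_l1(classes: &[BidiClass], levels: &mut [u8], para_level: u8) {
    assert_eq!(classes.len(), levels.len());
    // Start index of the whitespace/isolate run currently being tracked.
    let mut run_start: Option<usize> = None;
    for (i, &class) in classes.iter().enumerate() {
        match class {
            BidiClass::WS
            | BidiClass::FSI
            | BidiClass::LRI
            | BidiClass::RLI
            | BidiClass::PDI => {
                if run_start.is_none() {
                    run_start = Some(i);
                }
            }
            BidiClass::B | BidiClass::S => {
                // Reset the separator and the run leading up to it.
                let start = run_start.take().unwrap_or(i);
                for level in &mut levels[start..=i] {
                    *level = para_level;
                }
            }
            // Any other class ends the run without resetting it.
            _ => run_start = None,
        }
    }
    // A run still open here is trailing whitespace at the end of the line.
    if let Some(start) = run_start {
        for level in &mut levels[start..] {
            *level = para_level;
        }
    }
}
```

The sketch works on a mutable copy of the levels, which matches the `self.levels.clone()` visible in the diff excerpt: the resolved levels stored in `BidiInfo` stay untouched.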

Conformance Test: this implementation reduces the number of failures
from 60494 to 23770 (out of a total of 256747 cases).

Fix #2


behnam · 2017-05-12T06:23:08Z

src/lib.rs

+
+        let mut levels = self.levels.clone();
+
+        // Reset some whitespace chars to paragraph level.


Here's the L1 implementation.

The `Level` struct always represents a valid bidi embedding level, with checks for the explicit and implicit boundaries (max_depth and max_depth+1, respectively), so its `new*()` constructors and its mutation methods (`raise*()`/`lower()`) can fail. So we have one level type, which always holds a valid number for *implicit embedding level resolution* but may be invalid for *explicit embedding level resolution*. This is fine because external users of the data structure normally deal with the final resolved levels, which are the *implicit* ones (that is why the type's boundary is the implicit boundary). The *explicit* boundary only needs to be checked during the early, explicit stages of the algorithm, which can do so explicitly by calling the appropriate methods.
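
A minimal sketch of such a level type follows, assuming the UAX #9 value of 125 for max_depth. The method names mirror the `new*()`/`raise*()`/`lower()` pattern mentioned above, but the exact signatures, the `Error` type and the `number` accessor are assumptions for illustration, not the crate's API.

```rust
/// Deepest explicit embedding level allowed by UAX #9 (max_depth).
const MAX_EXPLICIT_DEPTH: u8 = 125;
/// Implicit resolution may raise a level one step further.
const MAX_IMPLICIT_DEPTH: u8 = MAX_EXPLICIT_DEPTH + 1;

#[derive(Debug, PartialEq, Eq)]
enum Error {
    OutOfRange,
}

/// A level that is always valid against the implicit boundary.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Level(u8);

impl Level {
    /// Checked against the implicit boundary (the type's invariant).
    fn new(number: u8) -> Result<Level, Error> {
        if number <= MAX_IMPLICIT_DEPTH {
            Ok(Level(number))
        } else {
            Err(Error::OutOfRange)
        }
    }

    /// Checked against the stricter explicit boundary.
    fn new_explicit(number: u8) -> Result<Level, Error> {
        if number <= MAX_EXPLICIT_DEPTH {
            Ok(Level(number))
        } else {
            Err(Error::OutOfRange)
        }
    }

    /// Raise by `amount`, staying within the implicit boundary.
    fn raise(&mut self, amount: u8) -> Result<(), Error> {
        match self.0.checked_add(amount) {
            Some(n) if n <= MAX_IMPLICIT_DEPTH => {
                self.0 = n;
                Ok(())
            }
            _ => Err(Error::OutOfRange),
        }
    }

    /// Raise by `amount`, staying within the explicit boundary.
    fn raise_explicit(&mut self, amount: u8) -> Result<(), Error> {
        match self.0.checked_add(amount) {
            Some(n) if n <= MAX_EXPLICIT_DEPTH => {
                self.0 = n;
                Ok(())
            }
            _ => Err(Error::OutOfRange),
        }
    }

    /// Lower by `amount`; fails instead of wrapping below zero.
    fn lower(&mut self, amount: u8) -> Result<(), Error> {
        match self.0.checked_sub(amount) {
            Some(n) => {
                self.0 = n;
                Ok(())
            }
            None => Err(Error::OutOfRange),
        }
    }

    /// The raw level number.
    fn number(&self) -> u8 {
        self.0
    }
}
```

In this sketch the explicit stages would call `new_explicit()`/`raise_explicit()`, while later stages and external users rely on the implicit bound that every `Level` already satisfies.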


behnam · 2017-05-12T17:20:34Z

Actually, I found a bug in my L1 code and just fixed it. Updated conformance test results: 12827 test cases failed! (243920 passed)

behnam · 2017-05-12T17:38:29Z

Dependents status:

https://github.com/servo/rust-url : works fine with this stack of changes, no update needed.
https://github.com/servo/servo : Depends on unicode_bidi_serde, which I'm working on now.

mbrubeck · 2017-05-15T18:04:27Z

@bors-servo r+

This is great, thanks! Do you have any more changes planned before we publish version 0.3.0?

bors-servo · 2017-05-15T18:04:28Z

📌 Commit 6427532 has been approved by mbrubeck

bors-servo · 2017-05-15T18:04:31Z

⌛ Testing commit 6427532 with merge 0fa0cfe...


bors-servo · 2017-05-15T18:06:21Z

☀️ Test successful - status-travis
Approved by: mbrubeck
Pushing 0fa0cfe to master...

behnam · 2017-05-15T20:28:55Z

Thanks for the review, @mbrubeck. Here's what I have in mind right now:

Make sure serde works and servo builds properly.
Work on the leftover rules and try to reach 100% conformance for the base test.
Then add support for the paired brackets tests and rules.

I think I need a little help with the serde support, as it looks like the current servo repo depends on serde 0.9. I have a logically backward-compatible serde 1.0 implementation with tests right now (working on the 0.9 version this week), but then I realized that we may not need a special implementation and `derive(Serialize, Deserialize)` could actually be enough.

So, the question is: are there any cases where serde data from unicode-bidi < 0.3 would be passed on to a serde deserializer of unicode-bidi >= 0.3?

Right now, I'm organizing the serde implementation as two features, with_serde0 and with_serde1, but they cannot be enabled at the same time. So, do we need to be able to build unicode-bidi with support for both ~1.0 and <1.0 at the same time?

Also, I can see that https://github.com/serde-rs/legacy and relevant crates enable support for multiple versions of serde being linked, but I'm not sure how to do a similar thing for the serde_derive macros.

So, in short: if we only need either with_serde0 or with_serde1, and there's no need for cross-unicode-bidi-version compatibility, I can conditionally compile them into the library with short and easy derive attributes. Otherwise, I would need more details on the expectations.
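
To make the "short and easy derive attributes" option concrete, here is a minimal sketch, assuming a Cargo feature named with_serde1 that pulls in serde 1.0 with its derive macros enabled. The `Level(u8)` type is only a placeholder for whichever public types would need serialization.

```rust
// With the feature disabled, this attribute expands to nothing, so the
// crate builds without serde at all. With it enabled (assumption: serde 1.0
// with its `derive` feature is a dependency), the impls are derived.
#[cfg_attr(
    feature = "with_serde1",
    derive(serde::Serialize, serde::Deserialize)
)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Level(u8);
```

A with_serde0 variant would need a second `cfg_attr` pointing at the 0.9 derive macros, which is where the question about enabling both features at once comes in.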

mbrubeck · 2017-05-16T15:09:34Z

> So, the question is: are there any cases where serde data from unicode-bidi < 0.3 would be passed on to a serde deserializer of unicode-bidi >= 0.3?

No, I don't think we need to support this.


behnam added 2 commits May 12, 2017 11:49

[lib] Drop pub use char_data::tables

3bfedcf

behnam force-pushed the rule-l1 branch 2 times, most recently from 9907c1d to e20e2c0 Compare May 12, 2017 17:01

behnam added 2 commits May 12, 2017 12:07

[tests] Add conformance tests using BidiTest.txt

1508534

[tests] Add test for gen_char_from_bidi_class()

19d9858

behnam force-pushed the rule-l1 branch 2 times, most recently from 79d1a87 to a59b450 Compare May 12, 2017 17:13

behnam force-pushed the rule-l1 branch from a59b450 to 6427532 Compare May 12, 2017 17:20

bors-servo merged commit 6427532 into servo:master May 15, 2017

behnam deleted the rule-l1 branch May 15, 2017 20:14

behnam mentioned this pull request May 15, 2017

Conformance Rust tests added #20

Closed
