Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Conformance Test and implement L1 rule #30

Merged
merged 5 commits into from
May 15, 2017
Merged

Conversation

behnam
Copy link
Contributor

@behnam behnam commented May 12, 2017

Add conformance tests using BidiTest.txt

Initial results: 60494 test cases failed! (196253 passed)

Because of the current limitations of rust's test framework, and the huge number of failures, the base (not including matching pairs) conformance test is executed in one run, and a summary is panic'ed (if there's any failure) and for the integration test to pass, it's marked as should_panic, with summary of the test run as expected string.

Fix #1

Implement L1 rule

To be able to implement L1, we need access to more information from
BidiInfo, namely original_classes of the text, in visual_runs(),
which would mean it should pass through reorder_line().

The fact that information from BidiInfo is needed for both steps of
the public API (generating BidiInfo and consuming it
per-paragraph/per-level) made me change the API design and move these
methods into impl BidiInfo.

Then, since we needed access to text for every BidiInfo consumption,
I added a reference to text to BidiInfo, which also enables more
compile-time checks for BidiInfo isntance not outliving the text in
the user code.

NOTE: We are already breaking API in version 0.3.0 and paving for full
spec support is a good reason to do so, IMHO.

The L1 rule works by one pass on the text of the line.

Conformance Test: this implementation reduces the number of failures
from 60494 to 23770 (out of total 256747 cases).

Fix #2


This change is Reviewable


let mut levels = self.levels.clone();

// Reset some whitespace chars to paragraph level.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the L1 implementation.

behnam added 2 commits May 12, 2017 11:49
The `Level` struct always represents a valid bidi embedding level, with
checks for explicit and implicit boundries (max_depth and max_depth+1,
respectively) therefore can fail on `new*()` and mutation
(`raise*()`/`lower()`) methods.

So, we have one level type, which is always a valid number of *implicit
embedding level resolution*, but can be made invalid for *explicit
embedding level resolution*. This is fine because external usage of the
data structure would normally deal with the final resolved levels, which
are the *implicit* ones (and that's why its boundry is made the implicit
boundry here, as well), and the *explicit* bounrdy is supposed to be
checked only during the early stages of the algorithm (the explicit
stages), which can do so explicitly by calling the appropriate methods.
@behnam behnam force-pushed the rule-l1 branch 2 times, most recently from 9907c1d to e20e2c0 Compare May 12, 2017 17:01
@behnam behnam force-pushed the rule-l1 branch 2 times, most recently from 79d1a87 to a59b450 Compare May 12, 2017 17:13
To be able to implement L1, we need access to more information from
`BidiInfo`, namely `original_classes` of the `text`, in `visual_runs()`,
which would mean it should pass through `reorder_line()`.

The fact that information from `BidiInfo` is needed for both steps of
the public API (generating `BidiInfo` and consuming it
per-paragraph/per-level) made me change the API design and move these
methods into `impl BidiInfo`.

Then, since we needed access to `text` for every `BidiInfo` consumption,
I added a reference to `text` to `BidiInfo`, which also enables more
compile-time checks for `BidiInfo` isntance not outliving the text in
the user code.

NOTE: We are already breaking API in version 0.3.0 and paving for full
spec support is a good reason to do so, IMHO.

The L1 rule works by one pass on the text of the line.

Conformance Test: this implementation reduces the number of failures
from 60494 to 23770 (out of total 256747 cases).

Fix #2
@behnam
Copy link
Contributor Author

behnam commented May 12, 2017

Actually, found a bug in my L1 code that just fixed, updating the conf-test results: 12827 test cases failed! (243920 passed)

@behnam
Copy link
Contributor Author

behnam commented May 12, 2017

Dependents status:

@mbrubeck
Copy link
Contributor

@bors-servo r+

This is great, thanks! Do you have any more changes planned before we publish version 0.3.0?

@bors-servo
Copy link
Contributor

📌 Commit 6427532 has been approved by mbrubeck

@bors-servo
Copy link
Contributor

⌛ Testing commit 6427532 with merge 0fa0cfe...

bors-servo pushed a commit that referenced this pull request May 15, 2017
Add Conformance Test and implement L1 rule

Initial results: `60494 test cases failed! (196253 passed)`

Because of the current limitations of rust's test framework, and the huge number of failures, the base (not including matching pairs) conformance test is executed in one run, and a summary is panic'ed (if there's any failure) and for the integration test to pass, it's marked as `should_panic`, with summary of the test run as `expected` string.

Fix #1

To be able to implement L1, we need access to more information from
`BidiInfo`, namely `original_classes` of the `text`, in `visual_runs()`,
which would mean it should pass through `reorder_line()`.

The fact that information from `BidiInfo` is needed for both steps of
the public API (generating `BidiInfo` and consuming it
per-paragraph/per-level) made me change the API design and move these
methods into `impl BidiInfo`.

Then, since we needed access to `text` for every `BidiInfo` consumption,
I added a reference to `text` to `BidiInfo`, which also enables more
compile-time checks for `BidiInfo` isntance not outliving the text in
the user code.

NOTE: We are already breaking API in version 0.3.0 and paving for full
spec support is a good reason to do so, IMHO.

The L1 rule works by one pass on the text of the line.

Conformance Test: this implementation reduces the number of failures
from 60494 to 23770 (out of total 256747 cases).

Fix #2

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/unicode-bidi/30)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

☀️ Test successful - status-travis
Approved by: mbrubeck
Pushing 0fa0cfe to master...

@bors-servo bors-servo merged commit 6427532 into servo:master May 15, 2017
@behnam behnam deleted the rule-l1 branch May 15, 2017 20:14
@behnam
Copy link
Contributor Author

behnam commented May 15, 2017

Thanks for the review, @mbrubeck. Here's what I have in mind right now:

  1. Make sure serde works and servo builds properly.
  2. Working on the left over rules and try to reach 100% conformance for the base test.
  3. Then add support for paired brackets tests and the rules.

I think I need a little help with the serde support, as looks like the current servo repo depends on serde 0.9. I have a logically backward-compatible serde 1.0 implementation with tests right now (working on the 0.9 version this week), but then I realized that we may not need special implementation and derive(Serialize, Deserialize) could actually be enough.

So, the question is, are there any cases where a serde data from unicode-bidi < 0.3 would be passed on to a serde deserializer of unicode-bidi >= 0.3?

Right now, I'm organizing serde implementation as two features: with_serde0 and with_serde1, but they won't be able to be enabled at the same time. So, do we need to be able to build unicode-bidi with support for both ~1.0 and <1.0 at the same time?

Also, I can see that https://github.com/serde-rs/legacy and relevant crates enable support for multiple versions of serde being linked, but I'm not sure how to do a similar thing for the serde_derive macros.

So, in short: if we only need either with_serde0 or with_serde1, and there's no need for cross-unicode-bidi-version compatibility, I can condition them into the library with short and easy derive attributes. Otherwise, I would need more details on the expectation.

@mbrubeck
Copy link
Contributor

So, the question is, are there any cases where a serde data from unicode-bidi < 0.3 would be passed on to a serde deserializer of unicode-bidi >= 0.3?

No, I don't think we need to support this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants