Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New DSL plugin #492

Merged
merged 1 commit into from
Jun 27, 2023
Merged

New DSL plugin #492

merged 1 commit into from
Jun 27, 2023

Conversation

ilius
Copy link
Owner

@ilius ilius commented Jun 17, 2023

@soshial Can you please start testing this branch?

Note that we don't care about unclosed or mismatched tags (due to user error), as long as it looks right in browser. Since both HTML and DSL are permissive, and any browser or HTML rendering engine should be able to handle these problems better than us.

@ratijas I would appreciate if you can take a look at the code as well.

This design was inspired to me by this video from Rob Pike several years ago, which also inspired me to write repassgen.

@ilius ilius marked this pull request as draft June 17, 2023 19:17
@ilius ilius force-pushed the dsl2 branch 10 times, most recently from 457ff80 to 0d410e5 Compare June 18, 2023 03:28
@soshial
Copy link
Contributor

soshial commented Jun 19, 2023

I checked several dicts and it looks pretty well, awesome job!

Although, I found these 2 cases that didn't work as expected (see the files HERE):

  1. Empty lines not supported yet:
Screenshot 2023-06-19 at 09 10 43
  1. Multi line paragraph not supported yet:

Screenshot 2023-06-19 at 09 11 57\

IMPORTANT. I think we need to convert [/m] to </p> and convert empty line to <br/> and HTML would do the rest.

UPDATED: Do you have Goldendict for Windows installed? Can you check how they render those DSLs?

@ilius
Copy link
Owner Author

ilius commented Jun 20, 2023

Thanks.
I pushed.
Please try again.

Also looks like [m] is taking the last margin ([m4] in the case of fist dsl file) rather than [m0]. Right?

@ilius
Copy link
Owner Author

ilius commented Jun 20, 2023

GoldenDict crashes when I add a directory that only contains 001-empty-lines-br.dsl!

@ilius
Copy link
Owner Author

ilius commented Jun 20, 2023

The same goes for 001-m-tag_closed-multiline-paragraph.dsl!

@soshial
Copy link
Contributor

soshial commented Jun 20, 2023

Also looks like [m] is taking the last margin ([m4] in the case of fist dsl file) rather than [m0]. Right?

I will ask dictionary creators about this on a forum. But the best way to know this — is to compile DSL into LSD and open in Lingvo program on WIndows.

GoldenDict crashes when I add a directory that only contains 001-empty-lines-br.dsl!

I am clueless, because I have no problems in Goldendict mobile app. Should we try DSL compiler ourselves? Unfortunately, I won't have Windows machine till the end of summer.

@ilius
Copy link
Owner Author

ilius commented Jun 20, 2023

If this plugin is as good as the master branch or better, I can merge it and figure out the details later.
Because it's much easier to maintain.

@ilius ilius marked this pull request as ready for review June 20, 2023 16:01
@soshial
Copy link
Contributor

soshial commented Jun 20, 2023

What do the tests report? Does this module produce identical results?

@ilius
Copy link
Owner Author

ilius commented Jun 21, 2023

Most of tests in dsl_test.py are kinda useless.
We don't have parser anymore.
We have a transformer that converts DSL directly to HTML. Similar to our XDXF transformer (not in design).

@ilius
Copy link
Owner Author

ilius commented Jun 21, 2023

The easiest way is to convert small glossaries, look at their output in browser, and then add them to tests.

@ilius ilius force-pushed the dsl2 branch 2 times, most recently from 8aecea0 to 990c067 Compare June 21, 2023 00:40
@ilius
Copy link
Owner Author

ilius commented Jun 21, 2023

I think this plugin will almost never produce the same HTML codes as the current one, because of the reasons I explained (we don't mess with openings and closings) but it's probably better in most cases.

@ilius
Copy link
Owner Author

ilius commented Jun 21, 2023

Question: what do lines starting with - should look like? Should we add a newline before them?
Here is the first entry in 100-RussianAmericanEnglish-ru-en.dsl:

а
	[m1][p]союз[/p][/m]
	[m1]1) [trn]but[/trn][/m]
	[m2][*][ex][lang id=1049]крас[']и[/']в, а не умён[/lang] — he is handsome but not clever[/ex][/*][/m]
	[m1]2) [trn]and[/trn][/m]
	[m2][*][ex][lang id=1049]что ты сег[']о[/']дня д[']е[/']лаешь? а з[']а[/']втра?[/lang] — what are you doing today? and tomorrow?[/ex][/*][/m]
	[m1]3) [trn]while[/trn][/m]
	[m2][*][ex][lang id=1049]я пошёл гул[']я[/']ть, а он ост[']а[/']лся д[']о[/']ма[/lang] — I went for a walk while he stayed at home[/ex][/*][/m]
	[m1][*]•[/*][/m]
	[m1][*]- [ref]а именно[/ref][/*][/m]
	[m1][*]- [ref]а также[/ref]
	- [ref]а то[/ref][/*][/m]
	[m1][*]- [ref]а не то[/ref][/*][/m]
	[m1][*]- [ref]А Васька слушает да ест[/ref][/*][/m]

@ilius
Copy link
Owner Author

ilius commented Jun 21, 2023

We also transform the whole entry text rather parsing one line at a time which makes no sense to me.

@ilius ilius changed the title New DSL plugin (WIP) New DSL plugin Jun 21, 2023
@soshial
Copy link
Contributor

soshial commented Jun 21, 2023

We also transform the whole entry text rather parsing one line at a time which makes no sense to me.

I agree.

because DSL are permissive

DSL is exactly the opposite. Only what is DSL parser allows — only this formatting is allowed. In case of any mistake, parser returns error and doesn't compile a dictionary.

Should we add a newline before them?

[m1][*]- [ref]а также[/ref]
- [ref]а то[/ref][/*][/m]

I think that probably we need to add <br/> between those 2 lines and contain it all inside <p> with 1 indent.

@ilius
Copy link
Owner Author

ilius commented Jun 23, 2023

Confirmed that GoldenDict does add a newline as well.

Now we add a <br/> on each newline that does not start with [m (after whitespaces).

Can you also confirm this fixes #491?

I have seen many examples that this PR fixes existing issues with the old code.
So I want to merge it ASAP.

@soshial
Copy link
Contributor

soshial commented Jun 23, 2023

The script crashes while converting 001-unclosed-m-tags.dsl from DSL tests.

@soshial
Copy link
Contributor

soshial commented Jun 23, 2023

proccessTagClose() -> processTagClose()
proccessTag() -> processTag()

@soshial
Copy link
Contributor

soshial commented Jun 23, 2023

Screenshot 2023-06-23 at 12 16 41

@ilius ilius force-pushed the dsl2 branch 2 times, most recently from 1b4da4c to b2bee47 Compare June 23, 2023 11:23
@ilius
Copy link
Owner Author

ilius commented Jun 23, 2023

Fixed.

@soshial
Copy link
Contributor

soshial commented Jun 24, 2023

I am driving today and tomorrow and I will be free to test tomorrow.
You have done an amazing job, but I don't think we should hurry before some extensive testing.

You wrote this code, therefore I have a question, whether you have thought about adding unit testing of the new DSL plugin? Testing "DSl-to-tabfile" is too unstable (test files change and tests fail each time formatting changes). I really liked the tests that @ratijas

@soshial
Copy link
Contributor

soshial commented Jun 26, 2023

I am looking at the resulting HTML. I am wondering, why you don't use CSS styles instead of inline <p style="padding-left:3em;margin:0"> or <font color="green">, @ilius? This would make glossary files thinner.

@ilius
Copy link
Owner Author

ilius commented Jun 26, 2023

That is CSS.
If you mean css file, it's not supported by every dictionary app/format.
We can optimize it in AppleDict or slob plugin for example.

@soshial
Copy link
Contributor

soshial commented Jun 26, 2023

OK, I didn't know that! That would be great, but it's not that critical.
I will prepare tests in 10-15 mins! I found one bug with [m] tags

@ilius
Copy link
Owner Author

ilius commented Jun 26, 2023

Actually for slob it's not going to make a huge difference because of compression. Only makes sense to me to optimize if the output is not compressable.

@soshial
Copy link
Contributor

soshial commented Jun 26, 2023

BMP file is disappeared (see sample dir). <img src= attribute is without res/ folder.

@ilius
Copy link
Owner Author

ilius commented Jun 26, 2023

BMP file is disappeared (see sample dir). <img src= attribute is without res/ folder.

What dictionary app are you testing with? Are you sure it works with master?
The res/ is never included (that folder name is for StarDict)

@soshial
Copy link
Contributor

soshial commented Jun 26, 2023

I used this dsl2 branch to convert DSL into HtmlDir.

@ilius
Copy link
Owner Author

ilius commented Jun 26, 2023

Oh okay. That's a bug in html dir plugin.

@ilius ilius merged commit 53c2fca into master Jun 27, 2023
@ilius ilius deleted the dsl2 branch October 21, 2023 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants