USX to Validate - PrintSpecialVerseSummary #48

Closed
Michahel opened this issue Mar 29, 2021 · 4 comments
@Michahel

I am converting from USX to Validate with the PrintSpecialVerseSummary argument.
When I run the converter, I get the following error:

...
Exception in thread "main" java.lang.RuntimeException: Validation error at Gen 1:1
        at biblemulticonverter.data.FormattedText.validate(FormattedText.java:86)
        at biblemulticonverter.data.Chapter.validate(Chapter.java:37)
        at biblemulticonverter.data.Book.validate(Book.java:35)
        at biblemulticonverter.data.Bible.validate(Bible.java:51)
        at biblemulticonverter.tools.Validate.doExport(Validate.java:130)
        at biblemulticonverter.Main.main(Main.java:67)
Caused by: java.lang.IllegalStateException: No whitespace allowed at end of element
        at biblemulticonverter.data.FormattedText$ValidatingVisitor.visitEnd(FormattedText.java:950)
        at biblemulticonverter.data.FormattedText.accept(FormattedText.java:46)
        at biblemulticonverter.data.FormattedText$CSSFormatting.acceptThis(FormattedText.java:279)
        at biblemulticonverter.data.FormattedText.accept(FormattedText.java:45)
        at biblemulticonverter.data.FormattedText$Footnote.acceptThis(FormattedText.java:233)
        at biblemulticonverter.data.FormattedText.accept(FormattedText.java:45)
        at biblemulticonverter.data.FormattedText.validate(FormattedText.java:84)
        ... 5 more
schierlm added a commit that referenced this issue Mar 29, 2021
- In USX, <ref> tags can appear everywhere, even in verses or headlines.
  Cross references in most formats are only allowed as parts of
  footnotes. Therefore wrap cross reference in a footnote. If you prefer
  the old behaviour, you can pass
  `-Dparatext.allowrefsoutsidefootnotes=true`.

- Eliminate whitespace at end of verses when exporting to non-Paratext
  format. This can happen when importing from USX and a paragraph
  contains multiple verses separated by spaces.

- Move whitespace at the end of formatting outside the formatting. This
  can happen mostly for \fr tags from USX.

See #48.
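
The two whitespace fixes in this commit message can be pictured roughly as follows. This is a minimal Python sketch of the idea only, not BibleMultiConverter's actual Java code: the function names are hypothetical, and `str.rstrip()` is assumed to be a close enough stand-in for whatever the validator counts as whitespace.

```python
from typing import Tuple


def move_trailing_whitespace_out(formatted: str) -> Tuple[str, str]:
    """Split the text inside a formatting span into (content, trailing whitespace).

    The trailing whitespace is meant to be re-emitted AFTER the span closes,
    so the span itself no longer ends in whitespace.
    (Hypothetical helper, for illustration only.)
    """
    stripped = formatted.rstrip()
    return stripped, formatted[len(stripped):]


def strip_verse_end(verse: str) -> str:
    """Drop whitespace at the very end of a verse (hypothetical helper)."""
    return verse.rstrip()


# A reference marker such as "1:1 " inside a footnote: the space moves outside.
inner, after = move_trailing_whitespace_out("1:1 ")
print(repr(inner), repr(after))
print(repr(strip_verse_end("In the beginning ")))
```

The point of moving the whitespace rather than deleting it is that the space between the formatted text and whatever follows it is preserved.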
@schierlm
Owner

I found some validation errors when importing from USX 2 and fixed them (used your example module from #42). If there are more left, feel free to share them in this issue.

You can skip all whitespace validation by passing -Dbiblemulticonverter.validate.ignore.whitespace=true at the beginning of the command line (i.e. before the -jar). However, this is not a fix, just a way to get past whitespace issues to find more pressing issues first (like printing the special verse summary).
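
The ordering matters because the `java` launcher treats everything after the jar file name as arguments for the program, so a `-D` option placed there would not be set as a system property. As a small sketch of assembling the command in the right order (the helper name `build_command` is hypothetical; the property, jar name and arguments are the ones used in this thread):

```python
def build_command(jar, app_args, system_properties=None):
    """Build a `java` command line with -D options placed before -jar."""
    cmd = ["java"]
    # JVM options (including -Dkey=value) must come before -jar.
    for key, value in (system_properties or {}).items():
        cmd.append(f"-D{key}={value}")
    cmd += ["-jar", jar]
    # Everything from here on is passed to the application itself.
    cmd += list(app_args)
    return cmd


cmd = build_command(
    "BibleMultiConverter.jar",
    ["USX", r"N:\Bibles\CARS\Text", "Validate", "PrintSpecialVerseSummary"],
    {"biblemulticonverter.validate.ignore.whitespace": "true"},
)
print(" ".join(cmd))
```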

@Michahel
Author

I don’t know which change is causing the error. Here's what happens now:

C:\PROGS\BibleMultiConverter>java -Dbiblemulticonverter.validate.ignore.whitespace=true -jar BibleMultiConverter.jar USX N:\Bibles\CARS\Text Validate PrintSpecialVerseSummary
WARNING: Unsupported structured reference format at 1CH.usx line 12, column 255 - replaced by plain text: 1CH 11-9
WARNING: Unsupported structured reference format at 2CH.usx line 582, column 260 - replaced by plain text: 1KI 17-2
WARNING: Unsupported structured reference format at LUK.usx line 69, column 864 - replaced by plain text: 1KI 17-2
WARNING: Unsupported structured reference format at MAL.usx line 91, column 586 - replaced by plain text: 1KI 17-2
WARNING: Unsupported structured reference format at MAT.usx line 603, column 1831 - replaced by plain text: 1KI 17-2
WARNING: Unsupported structured reference format at MRK.usx line 360, column 366 - replaced by plain text: 1KI 17-2
WARNING: Unsupported book abbreviation Нач., using Gen instead
WARNING: Unsupported book abbreviation Исх., using Exod instead
WARNING: Unsupported book abbreviation Лев., using Lev instead
WARNING: Unsupported book abbreviation Чис., using Num instead
WARNING: Unsupported book abbreviation Втор., using Deut instead
WARNING: Unsupported book abbreviation Иеш., using Josh instead
WARNING: Unsupported book abbreviation Суд., using Judg instead
WARNING: Unsupported book abbreviation Руфь, using Ruth instead
WARNING: Unsupported book abbreviation 1Цар., using 1Sam instead
WARNING: Unsupported book abbreviation 2Цар., using 2Sam instead
WARNING: Unsupported book abbreviation 3Цар., using 1Kgs instead
WARNING: Unsupported book abbreviation 4Цар., using 2Kgs instead
WARNING: Unsupported book abbreviation 1Лет., using 1Chr instead
WARNING: Unsupported book abbreviation 2Лет., using 2Chr instead
WARNING: Unsupported book abbreviation Узайр, using Ezra instead
WARNING: Unsupported book abbreviation Неем., using Neh instead
WARNING: Unsupported book abbreviation Есф., using Esth instead
WARNING: Unsupported book abbreviation Аюб, using Job instead
WARNING: Unsupported book abbreviation Заб., using Ps instead
WARNING: Unsupported book abbreviation Мудр., using Prov instead
WARNING: Unsupported book abbreviation Разм., using Eccl instead
WARNING: Unsupported book abbreviation Песн., using Song instead
WARNING: Unsupported book abbreviation Ис., using Isa instead
WARNING: Unsupported book abbreviation Иер., using Jer instead
WARNING: Unsupported book abbreviation Плач, using Lam instead
WARNING: Unsupported book abbreviation Езек., using Ezek instead
WARNING: Unsupported book abbreviation Дан., using Dan instead
WARNING: Unsupported book abbreviation Ос., using Hos instead
WARNING: Unsupported book abbreviation Иоиль, using Joel instead
WARNING: Unsupported book abbreviation Ам., using Amos instead
WARNING: Unsupported book abbreviation Авд., using Obad instead
WARNING: Unsupported book abbreviation Юнус, using Jonah instead
WARNING: Unsupported book abbreviation Мих., using Mic instead
WARNING: Unsupported book abbreviation Наум, using Nah instead
WARNING: Unsupported book abbreviation Авв., using Hab instead
WARNING: Unsupported book abbreviation Соф., using Zeph instead
WARNING: Unsupported book abbreviation Агг., using Hag instead
WARNING: Unsupported book abbreviation Зак., using Zech instead
WARNING: Unsupported book abbreviation Мал., using Mal instead
WARNING: Unsupported book abbreviation Мат., using Matt instead
WARNING: Unsupported book abbreviation Мк., using Mark instead
WARNING: Unsupported book abbreviation Лк., using Luke instead
WARNING: Unsupported book abbreviation Ин., using John instead
WARNING: Unsupported book abbreviation Деян., using Acts instead
WARNING: Unsupported book abbreviation Рим., using Rom instead
WARNING: Unsupported book abbreviation 1Кор., using 1Cor instead
WARNING: Unsupported book abbreviation 2Кор., using 2Cor instead
WARNING: Unsupported book abbreviation Гал., using Gal instead
WARNING: Unsupported book abbreviation Эф., using Eph instead
WARNING: Unsupported book abbreviation Флп., using Phil instead
WARNING: Unsupported book abbreviation Кол., using Col instead
WARNING: Unsupported book abbreviation 1Фес., using 1Thess instead
WARNING: Unsupported book abbreviation 2Фес., using 2Thess instead
WARNING: Unsupported book abbreviation 1Тим., using 1Tim instead
WARNING: Unsupported book abbreviation 2Тим., using 2Tim instead
WARNING: Unsupported book abbreviation Тит, using Titus instead
WARNING: Unsupported book abbreviation Флм., using Phlm instead
WARNING: Unsupported book abbreviation Евр., using Heb instead
WARNING: Unsupported book abbreviation Якуб, using Jas instead
WARNING: Unsupported book abbreviation 1Пет., using 1Pet instead
WARNING: Unsupported book abbreviation 2Пет., using 2Pet instead
WARNING: Unsupported book abbreviation 1Ин., using 1John instead
WARNING: Unsupported book abbreviation 2Ин., using 2John instead
WARNING: Unsupported book abbreviation 3Ин., using 3John instead
WARNING: Unsupported book abbreviation Иуда, using Jude instead
WARNING: Unsupported book abbreviation Отк., using Rev instead
Exception in thread "main" java.lang.NullPointerException
        at biblemulticonverter.format.paratext.ParatextBook$ParatextCharacterContentContainer.accept(ParatextBook.java:613)
        at biblemulticonverter.format.paratext.AbstractParatextFormat$1.visitParatextCharacterContent(AbstractParatextFormat.java:231)
        at biblemulticonverter.format.paratext.ParatextCharacterContent.acceptThis(ParatextCharacterContent.java:35)
        at biblemulticonverter.format.paratext.ParatextBook.accept(ParatextBook.java:116)
        at biblemulticonverter.format.paratext.AbstractParatextFormat.importParatextBook(AbstractParatextFormat.java:129)
        at biblemulticonverter.format.paratext.AbstractParatextFormat.doImport(AbstractParatextFormat.java:112)
        at biblemulticonverter.Main.main(Main.java:66)

C:\PROGS\BibleMultiConverter>

schierlm added a commit that referenced this issue Mar 30, 2021
@schierlm
Owner

Thank you for sharing the subset in #47. I was able to fix this issue, and that subset now validates perfectly.

In general, for sharing modules that are under NDA, BibleMultiConverter has a ScrambledDiffable option that replaces all letters and digits with constants, but

  1. so far this module only covered Latin and Greek and left Cyrillic unscrambled
  2. it did not interact well with Paratext formats (it first converted to the verse-based format, which already triggered the error in your example)

I have now changed the scrambling to scramble all Unicode letters and digits regardless of script, and created a ScrambledParatextDump module that can do the same before converting Paratext to verse-based.
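
To illustrate what script-independent scrambling means, here is a minimal Python sketch. It is not the ScrambledDiffable or ScrambledParatextDump implementation: the function name is hypothetical, and while `X` and `x` for letters match what the dump is said to contain, the replacement constant `0` for digits is an assumption made for this example.

```python
def scramble(text: str) -> str:
    """Replace every Unicode letter and digit with a constant, keeping everything else."""
    out = []
    for ch in text:
        if ch.isdigit():
            out.append("0")  # assumed constant for digits
        elif ch.isalpha():
            # str.isalpha()/isupper() are Unicode-aware, so Cyrillic and
            # Greek are handled the same way as Latin.
            out.append("X" if ch.isupper() else "x")
        else:
            out.append(ch)  # punctuation and whitespace stay readable
    return "".join(out)


print(scramble("Нач. 1:1"))
print(scramble("Gen 1:1"))
```

Because punctuation, whitespace and the overall structure survive, whitespace and structure-related bugs should in principle still be reproducible from the scrambled text.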

So in case you still have trouble, you can run

java -jar BibleMultiConverter.jar ParatextConverter USX N:\Bibles\CARS\Text ScrambledParatextDump dump.txt =23

(IMPORTANT: If you leave out ParatextConverter, you will convert from Paratext to verse-based and back, producing a different dump file that will likely not reproduce the bugs.)

Check whether the bug also appears if you use

... ParatextDump dump.txt ...

instead of

... USX N:\Bibles\CARS\Text ...

and if yes, you can share the created dump file without sharing any actual text (you can check for yourself that it contains mostly X and x).

@Michahel
Author

... if yes, you can share the created dump file

Yes, the bug appears. dump.zip

schierlm added a commit that referenced this issue Apr 2, 2021
When a validation error occurs, print a summary of which other verses
also contain validation errors. Also, when additional output (like
special verse summary) is requested, try to print this even when
validation fails.

Validation errors are grouped into categories, and it is possible to
ignore each of them separately (like you can ignore whitespace issues).

See #48.
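
The behaviour described in this commit message (collect all validation errors, group them by category, allow individual categories to be ignored, then report the affected verses) can be sketched as follows. This is a hedged Python illustration only: the function name, the category names `whitespace` and `structure`, and the `(verse, category, message)` tuple layout are assumptions, not the tool's actual data model.

```python
from collections import defaultdict


def summarize_validation(errors, ignored_categories=()):
    """Group validation errors by category, skipping ignored categories.

    errors: iterable of (verse_ref, category, message) tuples (assumed layout).
    Returns a dict mapping category -> list of affected verse references.
    """
    ignored = set(ignored_categories)
    by_category = defaultdict(list)
    for verse_ref, category, _message in errors:
        if category not in ignored:
            by_category[category].append(verse_ref)
    return dict(by_category)


errors = [
    ("Gen 1:1", "whitespace", "No whitespace allowed at end of element"),
    ("Gen 1:2", "whitespace", "No whitespace allowed at end of element"),
    ("Exod 2:3", "structure", "hypothetical non-whitespace problem"),
]

# Ignore whitespace issues to surface the more pressing ones first.
summary = summarize_validation(errors, ignored_categories=["whitespace"])
for category, refs in summary.items():
    print(f"{category}: {len(refs)} verse(s): {', '.join(refs)}")
```

Collecting errors instead of stopping at the first one is what makes it possible to still print additional output, such as the special verse summary, after validation has failed.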
@schierlm schierlm added this to the v0.0.8 milestone Jun 2, 2021