Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subtitles show an extra empty line #45

Closed
dagwieers opened this issue Dec 21, 2018 · 37 comments
Closed

Subtitles show an extra empty line #45

dagwieers opened this issue Dec 21, 2018 · 37 comments
Labels
bug Something isn't working inputstream.adaptive Related to the inputstream.adaptive add-on released A fix for this issue has been released by upstream

Comments

@dagwieers
Copy link
Collaborator

dagwieers commented Dec 21, 2018

Describe the bug
The subtitles show an extra empty line between subtitles.
Most likely this is an issue with DOS (\r\n) vs Unix text-files (\n).
(And this issue is likely in Kodi's subtitle handling, not in the addon)

(please complete the following information):

  • Operating system: LibreELEC v8.90.009 on RPi3
  • Kodi Version: v18.0-RC2
@dagwieers
Copy link
Collaborator Author

Looking into Kodi and inputstream.adaptive, I think the problem is with the conversion to TTML (https://github.com/peak3d/inputstream.adaptive/blob/master/src/parser/TTML.cpp#L87) and SRT (https://github.com/peak3d/inputstream.adaptive/blob/master/src/parser/TTML.cpp#L262).

Without actual debugging (can't find an easy way to dump the files) I think it is adding both \r\n and additionally an \n.

@mediaminister
Copy link
Collaborator

mediaminister commented Dec 22, 2018

I noticed that this problem only occurs with the on-demand streams, the livestreams don't show the extra empty line. So, I thought there might something wrong with the TTML-subtitles in the on-demand streams from VRT.

To investigate, I extracted the TTML-stream from the last episode of "De Ideale Wereld" with ffmpeg:

ffmpeg -i "https://remix-vrt.akamaized.net/remix/f1d863ef-4a58-464a-ab80-c6ab2c044ac1/remix.ism/.mpd" -map 0:7 -c:d copy -copy_unknown -f data deidealewereld.ttml

I did the same for the livestream of channel "Eén" while "FC De Kampioenen" was on air.

ffmpeg -i "https://live-vrt.akamaized.net/groupc/live/8edf3bdf-7db3-41c3-a318-72cb7f82de66/live.isml/.mpd" -map 0:8 -c:d copy -copy_unknown -f data live.ttml

Then I compared the TTML-subtitles from "De Ideale Wereld" and the "Eén"-livestream.

De Ideale Wereld:

<p begin="00:06:55.268" end="00:06:58.148" region="speaker" xml:id="s108"><span style="textStyle">Daar heb je niks aan,<br></br>
aan valse Hollanders.</span></p>

Eén Live:

<p begin="429307:51:09.720" end="429307:51:11.120" region="speaker">Niet overdrijven. Er zijn <br></br>geen cowboys en indianen meer.</p>

I determined that in "De Ideale Wereld"-stream there is an (invisible) newline character after every <br></br> line break. In the livestream, there is no newline character.

The inputstream.adaptive TTML-parser treats the newline character inside a <span></span> tag as a newline in the subtitle text and that causes the extra empty line between the subtitles.
Since the <span></span> tag is an explicit inline-element in all HTML-standards, I think the inputstream.adaptive TTML-parser should definitely ignore these newline characters inside <span></span> tags.

@mediaminister
Copy link
Collaborator

I fixed this bug already: mediaminister/inputstream.adaptive@be10ef1

And I fixed two other bugs in the inputstream.adaptive TTML parser:

  1. No subtitles on Ketnet livestream cause fractional seconds with 3 digits were not supported correctly
  2. Fixed typo to support font colors.

@pietje666
Copy link
Collaborator

fixed in inputstream.adaptive by @mediaminister

@dagwieers
Copy link
Collaborator Author

@pietje666 I understand the wish to close issues (especially if they are not in this project), but closing them hides them from existing users. And the issue is not fixed upstream, nor is there an easy fix available from anyone so people will experience this issue in the field.

So maybe it's better to keep it open until a fix is available to users from default repositories, and maybe label these issues as fix_upstream or waiting_on_upstream.

@pietje666
Copy link
Collaborator

The problem i see in this approach is that there will be alot of open issues, and we might loose track of the real issues. So personally i do not like it :), and people can still view the closed issues and comments.

@dagwieers
Copy link
Collaborator Author

dagwieers commented Mar 15, 2019

So it appears that the VRT is following the official standard very closely: https://www.w3.org/TR/ttml1/#ttml-example-body

Strange they insist on doing <br></br> instead of simply using <br/>.

Let us hope this is closed upstream real soon now.

@dagwieers
Copy link
Collaborator Author

dagwieers commented Mar 15, 2019

Fixed, wh00p wh00p ! I think we can close this one now ;-)

@dagwieers dagwieers added the bug Something isn't working label Mar 19, 2019
@dagwieers dagwieers pinned this issue Mar 27, 2019
@dagwieers
Copy link
Collaborator Author

The fix is still not released :-(

@dagwieers
Copy link
Collaborator Author

Not for ARM. I noticed the newer 18.1 builds also shipped with newer inputstream.adaptive, so I guess we need to wait for newer LibeELEC snapshot releases...

@dagwieers
Copy link
Collaborator Author

Yesterday, 2019-05-02, inpustream.adaptive-2.3.17.1 was released for Kodi Leia (on 18.1) which fixes this issue finally for me. Thanks everyone, especially @mediaminister :-)

@dagwieers dagwieers unpinned this issue May 3, 2019
@mediaminister mediaminister reopened this Jun 27, 2019
@mediaminister
Copy link
Collaborator

mediaminister commented Jun 27, 2019

I think VRT NU changed something to MPEG-DASH TTML subtitling last week and for some programs the extra empty line is back again. Can someone comfirm this?

@mediaminister mediaminister added the inputstream.adaptive Related to the inputstream.adaptive add-on label Jun 27, 2019
@dagwieers
Copy link
Collaborator Author

I noticed this as well yesterday. I assumed recent inputstream.adaptive update had a regression, but first wanted to investigate the cause before reporting.

@mediaminister
Copy link
Collaborator

mediaminister commented Jun 27, 2019

Okay, I found the cause already. VRT NU is now using multiple span tags and separate subtitle colours for different characters in a soap like Thuis.

OLD VRT NU TTML (Thuis, June 20)

<p begin="00:04:37.674" end="00:04:40.954" region="speaker" xml:id="s79"><span style="textStyle">Amai. Mooie zwembroek.<br></br>
- Ah, ja. Ja, merci.</span></p><p begin="00:04:41.034" end="00:04:43.994" region="speaker" xml:id="s80"><span style="textStyle">Mannekes, gaan wij zien?<br></br>
Bob is aan het wachten.</span></p>

NEW VRT NU TTML (Thuis, June 21)

<p begin="00:07:33.333" end="00:07:35.773" region="region-11" tts:textAlign="center">
<span style="singleHeightStyle" xml:space="preserve" tts:color="cyan" tts:backgroundColor="black">Eddy. Kom eens.</span>
</p><p begin="00:07:40.733" end="00:07:42.213" region="region-11" tts:textAlign="center">
<span style="singleHeightStyle" xml:space="preserve" tts:backgroundColor="black">Wat heb ik nu weer misdaan?</span>
</p>

InputStream Adaptive doesn't process these <span> tags right. Multiple span tags inside a <p> tag should be treated as a single line subtitle, only a <br> tag can break a subtitle line. I think InputStream Adaptive breaks these <span> tags into multiple lines.

I will fix this.

@dagwieers
Copy link
Collaborator Author

I guess we are blessed with VRT being on the forefront of subtitle technology ;-)

@mediaminister
Copy link
Collaborator

The bar can always be set higher! No subtitles today for Thuis on VRT NU: https://www.vrt.be/vrtnu/a-z/thuis/24/thuis-s24a4663/

@mediaminister
Copy link
Collaborator

mediaminister commented Jun 28, 2019

I have a fix. Can you test this?
Zips: (install Kodi add-on from zip file)
Linux x64: inputstream.adaptive-2.3.22_linux_x64.zip
Windows x64: inputstream.adaptive-2.3.22_winx64.zip

Source: https://github.com/mediaminister/inputstream.adaptive/tree/multispan

Compiling instructions for Linux:

git clone https://github.com/xbmc/xbmc.git
git clone https://github.com/mediaminister/inputstream.adaptive/tree/multispan
cd ~/inputstream.adaptive && mkdir build && cd build
cmake -DADDONS_TO_BUILD=inputstream.adaptive -DADDON_SRC_PREFIX=../.. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../../xbmc/addons -DPACKAGE_ZIP=1 ../../xbmc/cmake/addons
make
cd ~/xbmc/addons/
zip -r inputstream.adaptive-2.3.22.zip inputstream.adaptive

@dagwieers
Copy link
Collaborator Author

dagwieers commented Jun 28, 2019

I'm afraid I won't be able to test this easily. I would need to cross-compile to ARM to make that work, and that's not something I would like to spend my time on. And I can't easily run things on my dated laptop either. That's why it would have been nice if I could run LibreELEC on VirtualBox or KVM. I don't understand why they don't invest in making that possible, rather than supporting only VMware.

(I may be able to test this evening on Windows though)

@mediaminister
Copy link
Collaborator

No problem, I tested this already on Windows and Linux and it seems to work completely as expected.
I just have to test if it doesn't break other TTML streams like Netflix or HBO. Then I will create a pull request on the official inputstream.adaptive repo.

@mediaminister
Copy link
Collaborator

Added a pull request to the inputstream.adaptive repo: xbmc/inputstream.adaptive#284

@dagwieers dagwieers pinned this issue Jul 2, 2019
@dagwieers
Copy link
Collaborator Author

This is now fixed in the just released inputstream.adaptive v2.4.0.
Thanks to @mediaminister and @peak3d !

@dagwieers
Copy link
Collaborator Author

It appears inputstream.adaptive v2.4.0 is not available for Raspberry Pi (or ARM in general).

@dagwieers dagwieers reopened this Sep 2, 2019
@dagwieers
Copy link
Collaborator Author

dagwieers commented Sep 24, 2019

With LibreELEC v9.1.501 (v9.2 Beta 1) on Raspberry Pi using Kodi v18.4 and inputstream.adaptive v2.4.2.1 this issue is still present :-(

@dagwieers
Copy link
Collaborator Author

When watching Terzake live on Canvas, I also noticed that the subtitles appeared later, and there was a large gap between the next subtitle. I then compared to the online player, and there this doesn't happen. Subtitles quickly follow up, with less pause, and they appear longer on screen. Much more in sync with the spoken words.

And I also noticed that the interview of Bart De Wever appeared in blue, whereas the interviewer appears as white. I was surprised that this wasn't supported, as I have seen Kodi do this for other subtitles as well, but never saw this on VRT NU.

@mediaminister
Copy link
Collaborator

mediaminister commented Sep 24, 2019

With LibreELEC v9.1.501 (v9.2 Beta 1) on Raspberry Pi using Kodi v18.4 and inputstream.adaptive v2.4.2.1 this issue is still present :-(

That's right, unfortunately, my multispan commit is not merged in the Leia branch, it's only merged in the Matrix branch: xbmc/inputstream.adaptive@f90cab1

When watching Terzake live on Canvas, I also noticed that the subtitles appeared later, and there was a large gap between the next subtitle. I then compared to the online player, and there this doesn't happen.

I'll investigate this, but for current affair programs, only subtitles that are prepared in advance of the live broadcast will be in sync. Subtitles for live interviews or live reports are created live and will appear a couple of seconds later.

Differences between the VRT NU web player and Kodi should not occur provided that you are approximately simultaneously watching in Kodi and the VRT NU web player. If you watch a current affair program at a later time than it's possible you get a different on demand stream. I'll check the subtitle timings for the live streams.

And I also noticed that the interview of Bart De Wever appeared in blue, whereas the interviewer appears as white. I was surprised that this wasn't supported, as I have seen Kodi do this for other subtitles as well, but never saw this on VRT NU.

Subtitle colors are not yet implemented in InputStream Adaptive's TTML parser. I can make a pull request for this.

@peak3d
Copy link

peak3d commented Sep 24, 2019

@mediaminister colors are implemented in TTML, but maybe not in the format used in @dagwieers sample. https://github.com/peak3d/inputstream.adaptive/blob/master/src/parser/TTML.cpp#L47

@mediaminister
Copy link
Collaborator

@peak3d You're right, it's implemented in the <style> tag, but not yet in the <span> tag: https://www.w3.org/TR/2018/REC-ttml1-20181108/#style-attribute-color

I'm currently testing an experimental fix for this: https://github.com/mediaminister/inputstream.adaptive/tree/ttmlcolor

@dagwieers
Copy link
Collaborator Author

I'll investigate this, but for current affair programs, only subtitles that are prepared in advance of the live broadcast will be in sync. Subtitles for live interviews or live reports are created live and will appear a couple of seconds later.

Well, in this case it appeared longer and with shorter delays between subtitles when watching online on vrtnu.be. So there must be something wrong with the timings as subtitles seem to have a delayed start in Kodi.

Differences between the VRT NU web player and Kodi should not occur provided that you are approximately simultaneously watching in Kodi and the VRT NU web player. If you watch a current affair program at a later time than it's possible you get a different on demand stream. I'll check the subtitle timings for the live streams.

I was watching simultaneously. And to ensure I was seeing this correctly, paused one of the streams so that both streams would be synchronous.

@mediaminister
Copy link
Collaborator

Well, in this case it appeared longer and with shorter delays between subtitles when watching online on vrtnu.be. So there must be something wrong with the timings as subtitles seem to have a delayed start in Kodi.

I did a test with the livestreams of Eén en Canvas at 8 p.m.
I suspect there is something wrong with the timings of the livestreams from VRT NU, but I didn't see a difference between VRT NU live webplayer(https://www.vrt.be/vrtnu/livestream/) and Kodi VRT NU Live TV on a Linux pc. So, I think it's a problem with the source.

I didn't see this problem with on demand streams like Thuis, subtitling is perfectly in sync.

You can easily distinguish live and on demand subtitle streams in VRT NU webplayer: On demand subtitles have colors and a black background, livestream subtitles are white with a thin black border.

@dagwieers
Copy link
Collaborator Author

@mediaminister I will check next time I see this happening.

Also, In case you need more examples using colours, I noticed that De Ideale Wereld (met Stef Kamil Carlens) on-demand also has coloured subtitles online.

@mediaminister
Copy link
Collaborator

Colors is implemented in mediaminister/inputstream.adaptive@fca16d9 and works flawlessly now, but I still need to improve the C++ code and test if this doesn't break other TTML subtitle streams.

@dagwieers
Copy link
Collaborator Author

The Raspberry Pi build of inputstream.adaptive from @peno64 (at https://github.com/michaelarnauts/plugin.video.vtm.go/issues/1#issuecomment-536220725) seems to have fixed the subtitles-newlines.

@mediaminister
Copy link
Collaborator

Obviously @peno64 compiled the source from master branch. Still not fixed in the Leia branch.

@peno64
Copy link

peno64 commented Sep 29, 2019

@mediaminister
I took a branch as follows:
git clone https://github.com/peak3d/inputstream.adaptive
Did this a day or two ago

@dagwieers dagwieers unpinned this issue Jan 14, 2020
@dagwieers dagwieers added merged A fix for this issue has been merged by upstream and removed fixed An upstream fix is available but not yet released labels Jan 17, 2020
@dagwieers dagwieers pinned this issue Jan 19, 2020
@mediaminister
Copy link
Collaborator

@dagwieers I think this can be closed because InputStream Adaptive 2.4.3 is released.

@dagwieers
Copy link
Collaborator Author

Indeed, thanks everyone, especially @mediaminister !

@dagwieers dagwieers unpinned this issue Mar 9, 2020
@dagwieers dagwieers added released A fix for this issue has been released by upstream and removed merged A fix for this issue has been merged by upstream labels May 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working inputstream.adaptive Related to the inputstream.adaptive add-on released A fix for this issue has been released by upstream
Projects
None yet
Development

No branches or pull requests

5 participants