ORF Radiothek has changed the URLs #29394

Closed
jogitgl opened this issue Jun 25, 2021 · 4 comments · Fixed by #32802

Comments

@jogitgl

jogitgl commented Jun 25, 2021

youtube-dl version 2021.06.06

ORF has changed the URLs for Radiothek.
I have tested the URLs in the file "orf.py.new", and downloads now work without problems.
How can these URLs be changed in the official youtube-dl version?

diff orf.py orf.py.new
191c191
< 'http://audioapi.orf.at/%s/api/json/current/broadcast/%s/%s'
---
> 'https://audioapi.orf.at/%s/api/json/current/broadcast/%s/%s'
228c228
< _VALID_URL = r'https?://(?Pfm4).orf.at/player/(?P[0-9]+)/(?P4\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Pfm4)/(?P[0-9]+)/(?P4\w+)'
252c252
< _VALID_URL = r'https?://(?Pnoe).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Pnoe)/(?P[0-9]+)/(?P\w+)'
265c265
< _VALID_URL = r'https?://(?Pwien).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Pwie)/(?P[0-9]+)/(?P\w+)'
278c278
< _VALID_URL = r'https?://(?Pburgenland).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Pbgl)/(?P[0-9]+)/(?P\w+)'
291c291
< _VALID_URL = r'https?://(?Pooe).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Pooe)/(?P[0-9]+)/(?P\w+)'
304c304
< _VALID_URL = r'https?://(?Psteiermark).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Pstm)/(?P[0-9]+)/(?P\w+)'
317c317
< _VALID_URL = r'https?://(?Pkaernten).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Pktn)/(?P[0-9]+)/(?P\w+)'
330c330
< _VALID_URL = r'https?://(?Psalzburg).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Psbg)/(?P[0-9]+)/(?P\w+)'
343c343
< _VALID_URL = r'https?://(?Ptirol).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Ptir)/(?P[0-9]+)/(?P\w+)'
356c356
< _VALID_URL = r'https?://(?Pvorarlberg).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Pvbg)/(?P[0-9]+)/(?P\w+)'
369c369
< _VALID_URL = r'https?://(?Poe3).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Poe3)/(?P[0-9]+)/(?P\w+)'
382c382
< _VALID_URL = r'https?://(?Poe1).orf.at/player/(?P[0-9]+)/(?P\w+)'
---
> _VALID_URL = r'https?://radiothek.orf.at/(?Poe1)/(?P[0-9]+)/(?P\w+)'
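The renaming in the diff above amounts to a mapping from each station subdomain to a short path segment under radiothek.orf.at. A minimal sketch of that mapping, assuming the date and show parts of the URL carry over unchanged; the regex group names and the helper `to_radiothek_url` are hypothetical and not part of youtube-dl:

```python
import re

# Station subdomain -> radiothek path segment, taken from the diff above.
STATION_TO_RADIOTHEK = {
    'fm4': 'fm4', 'noe': 'noe', 'wien': 'wie', 'burgenland': 'bgl',
    'ooe': 'ooe', 'steiermark': 'stm', 'kaernten': 'ktn', 'salzburg': 'sbg',
    'tirol': 'tir', 'vorarlberg': 'vbg', 'oe3': 'oe3', 'oe1': 'oe1',
}

# Hypothetical pattern for the station player URLs; group names are assumptions.
_STATION_URL_RE = re.compile(
    r'https?://(?P<station>[a-z0-9]+)\.orf\.at/player/'
    r'(?P<date>[0-9]+)/(?P<show>\w+)')


def to_radiothek_url(url):
    """Return the radiothek form of a station player URL, or None."""
    m = _STATION_URL_RE.match(url)
    if m is None or m.group('station') not in STATION_TO_RADIOTHEK:
        return None
    return 'https://radiothek.orf.at/%s/%s/%s' % (
        STATION_TO_RADIOTHEK[m.group('station')],
        m.group('date'), m.group('show'))
```

Unknown subdomains and non-player URLs return `None` rather than a guessed URL.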

@jkirk

jkirk commented Jul 28, 2021

Actually the station URLs still work fine, but you are right: the radiothek URLs are not supported (I am working on it).

For example, this station URL (currently) works: https://oe1.orf.at/player/20210723/645522, whereas this radiothek URL does not: https://radiothek.orf.at/oe1/20210723/645522. Both URLs provide the same mp3 file.

So your patch would break the ability to download from the station URLs. That said, I was not able to apply your patch because whitespace is missing; you should use fenced code blocks when pasting code.
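To illustrate the point about keeping the station URLs working, a single pattern can accept both forms instead of replacing one with the other. This is only a sketch for OE1, not the actual extractor code; the group names and the helper `parse_orf_url` are assumptions made for the example:

```python
import re

# Accept either the station form or the radiothek form (OE1 only here).
# Python's re module does not allow duplicate group names, hence two
# station groups that are merged after matching.
_EITHER_URL_RE = re.compile(
    r'''(?x)https?://
        (?:
            (?P<station>oe1)\.orf\.at/player|
            radiothek\.orf\.at/(?P<station_rt>oe1)
        )
        /(?P<date>[0-9]+)/(?P<show>\w+)''')


def parse_orf_url(url):
    """Return (station, date, show) for either URL form, or None."""
    m = _EITHER_URL_RE.match(url)
    if m is None:
        return None
    station = m.group('station') or m.group('station_rt')
    return station, m.group('date'), m.group('show')
```

Both example URLs from this comment then yield the same station, date and show values.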

But you inspired me to look at an older issue (#26043), where we already talked about the radiothek URLs, and I will now try to implement the extractor.

You also asked:

How can these URLs be changed in the official youtube-dl version?

You might want to read the GitHub Docs first, but basically you fork and clone the project, make changes and submit a pull request.

jkirk added a commit to jkirk/youtube-dl that referenced this issue Jul 29, 2021
The ORF radio stations are also provided via https://radiothek.orf.at.
When doing so, the URLs are different. Currently only OE1 and OE3 have
been tested.

The ORFRADIOTHEK extractor has been tested with:

  ❯ python -m unittest -v test/test_download.py -k ORFRADIOTHEK
  test_ORFRADIOTHEK (test.test_download.TestDownload): ... [orf:radiothek] 645522: Downloading JSON metadata
  [download] Downloading playlist: Pandemie, Fossilien, Mars
  [orf:radiothek] playlist Pandemie, Fossilien, Mars: Collected 1 video ids (downloading 1 of them)
  [download] Downloading video 1 of 1
  [info] Writing video description metadata as JSON to: test_ORFRADIOTHEK_2021-07-23_1355_tl_51_7DaysFri25_1594313.info.json
  [debug] Invoking downloader on 'https://loopstream01.apa.at/?channel=oe1&id=2021-07-23_1355_tl_51_7DaysFri25_1594313.mp3'
  [download] Destination: test_ORFRADIOTHEK_2021-07-23_1355_tl_51_7DaysFri25_1594313.mp3
  [download] 100% of 10.00KiB in 00:00
  [download] Finished downloading playlist: Pandemie, Fossilien, Mars
  ok
  test_ORFRADIOTHEK_1 (test.test_download.TestDownload): ... [orf:radiothek] 3SDL: Downloading JSON metadata
  [download] Downloading playlist: Der Song deines Lebens
  [orf:radiothek] playlist Der Song deines Lebens: Collected 1 video ids (downloading 1 of them)
  [download] Downloading video 1 of 1
  [info] Writing video description metadata as JSON to: test_ORFRADIOTHEK_1_2021-07-28_1200_tl_53_7DaysWed3_1610654.info.json
  [debug] Invoking downloader on 'https://loopstream01.apa.at/?channel=oe3&id=2021-07-28_1200_tl_53_7DaysWed3_1610654.mp3'
  [download] Destination: test_ORFRADIOTHEK_1_2021-07-28_1200_tl_53_7DaysWed3_1610654.mp3
  [download] 100% of 10.00KiB in 00:00
  [download] Finished downloading playlist: Der Song deines Lebens
  ok

  ----------------------------------------------------------------------
  Ran 2 tests in 0.863s

  ❯ flake8 youtube_dl/extractor/orf.py

Because the files are only available for 7 days, "only_matching" was set to
true. While at it, 'skip' was removed in favour of it.

Closes: ytdl-org#29394
jkirk added a commit to jkirk/youtube-dl that referenced this issue Mar 28, 2022
@jkirk

jkirk commented May 31, 2024

Thanks @dirkf for the clean up!

This prompted me to take another look at this issue.
I've tested an ORF OE1 (station) URL, https://oe1.orf.at/player/20210723/645522, with youtube-dl 2021.12.17 (I haven't tested the latest version). This still works.

ORF Radiothek does not exist anymore. It seems it has been superseded by ORF Sound: https://sound.orf.at/.

I think this issue can be closed; I will close my PR: #29679.

@dirkf
Contributor

dirkf commented May 31, 2024

Is https://oe1.orf.at/player/20210723/645522 an actual URL that can be found from sound.orf.at or any other orf.at site?

Currently I'm just finding sound.orf.at/radio/{station}/sendung/{id} which may also have a final /{slug}. However I see that the player URL is still valid (with no redirect) and playable, so I'll have to make sure it still gets extracted.

There are also /collection/... URLs where /collection/{id} brings up a playlist page and /collection/{collection_id}/{item_id} is one item in the playlist. Such an item may be an upload or podcast-episode, or it may be a broadcast, broadcastitem or program associated with a station, each with its own API.
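The URL shapes described above could be told apart roughly as follows. This is a sketch under assumptions: the ID and slug formats are not specified here, so each path segment is matched loosely, and the function name `classify_sound_url` and the group names are made up for illustration. It does not cover the per-type APIs mentioned above.

```python
import re

# /radio/{station}/sendung/{id} with an optional final /{slug}
_SENDUNG_RE = re.compile(
    r'https?://sound\.orf\.at/radio/(?P<station>[^/?#]+)/sendung/'
    r'(?P<id>[^/?#]+)(?:/(?P<slug>[^/?#]+))?/?(?:[?#]|$)')

# /collection/{id} (playlist page) or /collection/{collection_id}/{item_id}
_COLLECTION_RE = re.compile(
    r'https?://sound\.orf\.at/collection/(?P<collection_id>[^/?#]+)'
    r'(?:/(?P<item_id>[^/?#]+))?/?(?:[?#]|$)')


def classify_sound_url(url):
    """Return (kind, parts) for a recognised sound.orf.at URL, else None."""
    m = _SENDUNG_RE.match(url)
    if m:
        return 'sendung', m.groupdict()
    m = _COLLECTION_RE.match(url)
    if m:
        kind = 'collection_item' if m.group('item_id') else 'collection'
        return kind, m.groupdict()
    return None
```

A collection item would still need a further lookup to decide whether it is an upload, podcast episode, broadcast, broadcast item or program.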

I'll close the issue when a new ORF extractor module gets merged.

dirkf mentioned this issue Jun 1, 2024
@dirkf
Contributor

dirkf commented Jun 1, 2024

Please test #32802 and report there.
