Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor search/provider.py #1150

Closed
wants to merge 8 commits into from
Closed

Refactor search/provider.py #1150

wants to merge 8 commits into from

Conversation

s1as3r
Copy link
Contributor

@s1as3r s1as3r commented Jan 25, 2021

Refactored provider.py

  • Dropped Entire Fuzzy String Matching
  • Compare results on the basis of song length

@s1as3r s1as3r added this to the Better Code Hub 6/10 milestone Jan 25, 2021
@s1as3r s1as3r requested review from aklajnert and a user January 25, 2021 06:33
@Silverarmor Silverarmor added the Enhancement Enhancing spotDL label Jan 25, 2021
@s1as3r
Copy link
Contributor Author

s1as3r commented Jan 25, 2021

I also dropped getting useless metadata from ytmusic since it is not used anywhere.

@s1as3r s1as3r changed the title Refactor provider.py Refactor search/provider.py Jan 25, 2021
@s1as3r s1as3r marked this pull request as draft January 25, 2021 06:48
Copy link
Contributor

@aklajnert aklajnert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure that all CI checks pass. In this case, mypy is failing.

spotdl/search/provider.py Outdated Show resolved Hide resolved
@s1as3r
Copy link
Contributor Author

s1as3r commented Jan 25, 2021

Please make sure that all CI checks pass. In this case, mypy is failing.

There are still some things i need to do. At the end I'll make sure the tests pass.

@s1as3r s1as3r marked this pull request as ready for review January 25, 2021 12:51
@s1as3r s1as3r requested a review from aklajnert January 25, 2021 12:54
@ghost
Copy link

ghost commented Jan 25, 2021

What is the reason for dropping fuzzy string matching? (Just asking).

[EDIT]
See also #1149, Ytmapi missed out on some very obvious songs like "Samuel Kim - Boba Fett Epic theme", the guy has a popular YouTube channel and has released the exact same song tagged as a song and not as a video.

Results show on YTM but spotDL never gets the correct song - Ytmapi is not recieving the same "display results" from YouTube music. That might affect the result filtering process.

You don't have to take that into consideration for this PR, just letting you know.

[EDIT 2]
#1113 (comment)

@s1as3r
Copy link
Contributor Author

s1as3r commented Jan 25, 2021

What is the reason for dropping fuzzy string matching? (Just asking).

It really doesn't provide any advantage. The thing is that YouTube's search algorithm is already pretty good at finding stuff, we really don't need to do some hard-core filtering.
Comparing the duration should be enough.
It also gets rid of unnecessary complexity and also cuts down on the dependencies.

In my testing the results without fuzzy matching and with fuzzy matching were mostly identical ( though i didn't test much, this needs more Testing )

@s1as3r
Copy link
Contributor Author

s1as3r commented Jan 25, 2021

[EDIT]
See also #1149, Ytmapi missed out on some very obvious songs like "Samuel Kim - Boba Fett Epic theme", the guy has a popular YouTube channel and has released the exact same song tagged as a song and not as a video.

Results show on YTM but spotDL never gets the correct song - Ytmapi is not recieving the same "display results" from YouTube music. That might affect the result filtering process.

You don't have to take that into consideration for this PR, just letting you know.

I looked into this. It seems when we have filter="video" in the ytm search, it misses out on some results and that's the reason why it wasn't able to find the song. It's pretty easy to fix. Should i add it to this PR or create a seperate one?

@ghost
Copy link

ghost commented Jan 25, 2021

Tried that, still doesn't work.

@s1as3r
Copy link
Contributor Author

s1as3r commented Jan 25, 2021

Tried that, still doesn't work.

Well, it worked for me.
Heres the tracking file generated:
gist

Btw, i used the non-rapidfuzz version of spotdl.

@ghost
Copy link

ghost commented Jan 25, 2021

Hmmm. What did you change the filters to? I'll check again.

@s1as3r
Copy link
Contributor Author

s1as3r commented Jan 25, 2021

Hmmm. What did you change the filters to? I'll check again.

I removed the filters. But while mapping the song data instead of directly subscrpting, i used '.get()' method.

@ghost
Copy link

ghost commented Jan 25, 2021

Could you share a gist

@s1as3r
Copy link
Contributor Author

s1as3r commented Jan 25, 2021

Could you share a gist

here

@ghost
Copy link

ghost commented Jan 25, 2021

It really doesn't provide any advantage. The thing is that YouTube's search algorithm is already pretty good at finding stuff, we really don't need to do some hard-core filtering.
Comparing the duration should be enough.
It also gets rid of unnecessary complexity and also cuts down on the dependencies.

In my testing the results without fuzzy matching and with fuzzy matching were mostly identical ( though i didn't test much, this needs more Testing )

I think you should check for :

P.S
I agree that in most cases, a simple time Delta will give accurate results but considering that the additional checks don't take a lot of time or processing power and might be helpful for that taste case when time Delta isn't enough (I think I came across few of those during the rewrite) - these would help.

@ghost
Copy link

ghost commented Jan 25, 2021

Could you share a gist

here

Thanks

@s1as3r
Copy link
Contributor Author

s1as3r commented Jan 25, 2021

It really doesn't provide any advantage. The thing is that YouTube's search algorithm is already pretty good at finding stuff, we really don't need to do some hard-core filtering.
Comparing the duration should be enough.
It also gets rid of unnecessary complexity and also cuts down on the dependencies.
In my testing the results without fuzzy matching and with fuzzy matching were mostly identical ( though i didn't test much, this needs more Testing )

I think you should check for :

It does provide different links for the last 2 cases but provides the same link for the first case, but so does the original spotdl. They're probably the same song

@ghost
Copy link

ghost commented Jan 25, 2021

There aren't the same. It isn't available on YouTube yet though.

This reverts commit 7cfa274.
@ghost
Copy link

ghost commented Jan 27, 2021

Hey @sigma67, could please help us out with this?

The changes is this commit cause error on parsing playlists, we would like to get song as well as video results.

@ghost
Copy link

ghost commented Jan 27, 2021

Also, @s1as3r is sigma67/ytmusicapi#121 the same issue you faced when including sung results?

@sigma67
Copy link

sigma67 commented Jan 27, 2021

I'm not sure about the rest of your code, but looks like you're now using default search without a filter. That means you will also get playlist/artist results etc., which do not have duration fields for example.

I admit that the docs example is not very good at showing what you get for default search, but essentially the resultType indicates what type of result it is. You'd have to filter by songs and videos if you only want those. Keep in mind that you'll only get top 3 results for each without a filter. If you want more you need to use the filters and two requests (one with filter songs, one with filter videos). These are limitations of YouTube Music's API, that's how the client works.

Hope that helps.

Edit: Oh and I just saw you're matching Spotify results with YouTube results, I also wrote some code for that purpose for my playlist transfer app. It's a great idea to use song duration for matching, that improved my results a lot as well. The code is here

@s1as3r
Copy link
Contributor Author

s1as3r commented Jan 27, 2021

Keep in mind that you'll only get top 3 results for each without a filter. If you want more you need to use the filters and two requests (one with filter songs, one with filter videos).

This would be the appropriate solution. Would try this out.

@sigma67
Copy link

sigma67 commented Jan 27, 2021

Also keep in mind that YouTube Music Premium subscribers get higher audio quality for song type videos (256kbps AAC), while free users are limited to normal quality (128kbps AAC). https://support.google.com/youtubemusic/thread/338369?msgid=348540

@ghost
Copy link

ghost commented Jan 27, 2021

Edit: Oh and I just saw you're matching Spotify results with YouTube results, I also wrote some code for that purpose for my playlist transfer app. It's a great idea to use song duration for matching, that improved my results a lot as well. The code is here

Used to be part of the old filtering that's getting an over haul rn.

@ghost
Copy link

ghost commented Jan 27, 2021

I'm not sure about the rest of your code, but looks like you're now using default search without a filter. That means you will also get playlist/artist results etc., which do not have duration fields for example.

That change was reverted as it causes error after error when getting playlists and works just fine with individual songs. Which is what we couldn't figure out.

1fd776a

@sigma67
Copy link

sigma67 commented Jan 27, 2021

It's a pretty long thread now so I might have missed something. Was the source of the error within ytmusicapi? If yes can you elaborate what calls were performed (i.e. which search terms or similar)?

@ghost
Copy link

ghost commented Jan 27, 2021

Minor changes (commit refered to below): 7cfa274

So the problem lies with the mentioned commit - it runs a general search, i.e - top result, 3 song results, 3 video results and the rest, the absence of a duration field would explain the errors but there are 2 standouts:

  • it works just fine for single songs, but when we get a series of songs (fetch a playlist so to say), it errors out on the 5th or 7th song
  • when I personally tried
    #rough code, am using GitHub Android rn
    results = YTM.search(search term, filter="video")
    results += YTM.search(search term, filter="song")
    It still errored out. I can provide you more detail by tmrw.
  • The calls are formatted as {songArtist1, SongArtist2, ...} - {songName acc to Spotify}

@s1as3r s1as3r marked this pull request as draft January 27, 2021 16:41
@sigma67
Copy link

sigma67 commented Jan 29, 2021

There was an issue with unavailable songs in 0.13.1, perhaps it works better with 0.14.0

@s1as3r s1as3r closed this Feb 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Enhancing spotDL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants