Search: avoid indexing spam and giving spam results #9899
Comments
The easiest way to achieve this, without having to introduce a new field or keep track of projects to filter by (which probably won't scale), is to have a task that removes the files from the index when a project is marked as spam or has a score greater than x. If there was a mistake, we can just trigger the re-index task after un-marking the project as spam.
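A rough sketch of that idea, not the actual Read the Docs code: `Project`, `SearchIndex`, `sync_index_for_project`, and the value of `SPAM_SCORE_THRESHOLD` are all hypothetical names made up for illustration, and the index is an in-memory dict standing in for the real search backend.

```python
from dataclasses import dataclass, field

# Hypothetical threshold (the "x" above); no real value is given in this thread.
SPAM_SCORE_THRESHOLD = 500


@dataclass
class Project:
    slug: str
    spam_score: int = 0
    marked_as_spam: bool = False


@dataclass
class SearchIndex:
    """In-memory stand-in for the search index: project slug -> indexed file paths."""

    documents: dict = field(default_factory=dict)

    def index_project(self, project, paths):
        self.documents[project.slug] = list(paths)

    def remove_project(self, project):
        # Removing a project that is not indexed is a no-op.
        self.documents.pop(project.slug, None)


def is_spam(project, threshold=SPAM_SCORE_THRESHOLD):
    """A project counts as spam if it is flagged or its score exceeds the threshold."""
    return project.marked_as_spam or project.spam_score > threshold


def sync_index_for_project(index, project, paths):
    """Remove a spam project's files from the index, or (re)index a clean one."""
    if is_spam(project):
        index.remove_project(project)
        return "removed"
    index.index_project(project, paths)
    return "indexed"
```

Running the same task again after un-marking a project puts its files back, which covers the "there was a mistake" case without a new field or a separate list of projects to filter by.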
Can this be closed? I think since this was opened, we've leaned more towards deprioritizing this UI in favor of bringing some of these features to our in-doc, Addons search instead. This UI doesn't get much use and global project search is community specific.
If we are not going to do this work or similar, I'd propose to kill this view completely then. At this point, I think this is just bad UX and exposes the search as a broken and useless feature; degrading its trust.
We hit this recently, so we should find a good way to move forward here. The goal should be removing these projects from the search index, because it will make our search faster, and not take up space. This could also include work on #11533, if that makes sense as well?
Also a note, we are probably still not talking about trying to tune the global search UI, as that view is going away with the new dashboard. The work here is indeed solely working to avoid indexing and surfacing search results. There is some slightly related work in not surfacing spam projects in our dashboard as well.
Currently, when using the global search, a lot of spam projects are shown in the search results. None of them should appear in these results.
We could define a new threshold, RTD_SPAM_THRESHOLD_DONT_SHOW_SEARCH_RESULTS, to skip these:
readthedocs.org/readthedocs/settings/base.py, lines 993 to 998 in 2f009cd
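As a minimal sketch of how such a threshold could be applied at query time: only the setting name comes from this issue. The value 300, the `filter_spam_results` helper, and the shape of the result dicts are assumptions for illustration and do not reflect the existing settings in base.py.

```python
# Setting name as proposed in this issue; the value is a hypothetical placeholder.
RTD_SPAM_THRESHOLD_DONT_SHOW_SEARCH_RESULTS = 300


def filter_spam_results(
    results,
    spam_scores,
    threshold=RTD_SPAM_THRESHOLD_DONT_SHOW_SEARCH_RESULTS,
):
    """Drop search results whose project spam score is at or above the threshold.

    `results` is a list of dicts with a "project" key (hypothetical shape), and
    `spam_scores` maps project slugs to scores. Projects without a score are
    treated as score 0 and kept.
    """
    return [
        result
        for result in results
        if spam_scores.get(result["project"], 0) < threshold
    ]
```

For example, with scores `{"casino-docs": 900, "pip": 0}`, a result list containing both projects would come back with only the `pip` entry.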