Search: avoid indexing spam and giving spam results #9899

Closed
humitos opened this issue Jan 16, 2023 · 5 comments
Labels: Needed: design decision (a core team decision is required)

Comments

@humitos
Member

humitos commented Jan 16, 2023

Currently, the global search shows a lot of spam projects in its results. None of them should appear there.

Screenshot_2023-01-09_18-09-04

We could define a new threshold, RTD_SPAM_THRESHOLD_DONT_SHOW_SEARCH_RESULTS, to skip these:

RTD_SPAM_THRESHOLD_DONT_SHOW_ADS = 100
RTD_SPAM_THRESHOLD_DENY_ON_ROBOTS = 200
RTD_SPAM_THRESHOLD_DONT_SHOW_DASHBOARD = 300
RTD_SPAM_THRESHOLD_DONT_SERVE_DOCS = 500
RTD_SPAM_THRESHOLD_DELETE_PROJECT = 1000
RTD_SPAM_MAX_SCORE = 9999
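
As a rough sketch of how such a threshold might be applied at query time: the value 300, the helper names, and the shape of the result and score data below are all assumptions made for illustration, not existing Read the Docs code.

```python
# Proposed setting from this issue; the value 300 is a placeholder, not a decision.
RTD_SPAM_THRESHOLD_DONT_SHOW_SEARCH_RESULTS = 300


def show_in_search(spam_score):
    """Assumption: a project is hidden once its score reaches the threshold."""
    return spam_score < RTD_SPAM_THRESHOLD_DONT_SHOW_SEARCH_RESULTS


def filter_results(results, spam_scores):
    """Keep only results whose project is below the search threshold.

    `results` is a list of (project_slug, title) pairs and `spam_scores`
    maps slug -> score. Both shapes are hypothetical. Projects without a
    recorded score are treated as score 0.
    """
    return [
        (slug, title)
        for slug, title in results
        if show_in_search(spam_scores.get(slug, 0))
    ]
```

Filtering at query time like this leaves spam documents in the index, so it only hides them; it does not reduce index size.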

@stsewd
Member

stsewd commented Sep 13, 2023

The easiest way to achieve this without having to introduce a new field or keep track of projects to filter by (which probably won't scale) is to just have a task that removes the files from the index when a project is marked as spam or has a score greater than x. If there was a mistake, we can just trigger the re-index task after un-marking the project as spam.
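
A minimal sketch of that idea, using an in-memory stand-in for the search index. The `Project` and `InMemoryIndex` classes, the `sync_project` function, and the limit of 300 standing in for "x" are all hypothetical; the real task would talk to the actual search backend.

```python
from dataclasses import dataclass, field

SPAM_SCORE_LIMIT = 300  # hypothetical stand-in for "x"


@dataclass
class Project:
    slug: str
    spam_score: int = 0
    marked_as_spam: bool = False
    files: list = field(default_factory=list)


class InMemoryIndex:
    """Toy replacement for the search index, keyed by (project slug, file path)."""

    def __init__(self):
        self.docs = set()

    def index_project(self, project):
        for path in project.files:
            self.docs.add((project.slug, path))

    def remove_project(self, slug):
        self.docs = {key for key in self.docs if key[0] != slug}

    def indexed_slugs(self):
        return {slug for slug, _ in self.docs}


def sync_project(index, project):
    """Remove a spam project's files from the index, or (re-)index it otherwise."""
    if project.marked_as_spam or project.spam_score > SPAM_SCORE_LIMIT:
        index.remove_project(project.slug)
    else:
        index.index_project(project)
```

Running `sync_project` again after un-marking a project re-adds its files, which mirrors the "trigger the re-index task" recovery path described above.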

@agjohnson
Contributor

Can this be closed? I think since this was opened, we've leaned more towards deprioritizing this UI in favor of bringing some of these features to our in-doc Addons search instead. This UI doesn't get much use, and global project search is community-specific.

@humitos
Member Author

humitos commented Mar 4, 2024

If we are not going to do this work or something similar, I'd propose killing this view completely. At this point, I think it is just bad UX and exposes the search as a broken and useless feature, which degrades trust in it.

@ericholscher
Member

ericholscher commented Aug 13, 2024

We hit this recently, so we should find a good way to move forward here. The goal should be removing these projects from the search index, because it will make our search faster, and not take up space. This could also include work on #11533, if that makes sense as well?

@agjohnson
Contributor

Also, a note: we are probably still not talking about trying to tune the global search UI, as that view is going away with the new dashboard. The work here is solely about avoiding indexing spam and surfacing it in search results.

There is some slightly related work in not surfacing spam projects in our dashboard as well:

@agjohnson changed the title from "Search: do not show spam projects on search results" to "Search: avoid indexing spam and giving spam results" Aug 13, 2024
@stsewd stsewd closed this as completed Aug 22, 2024
@github-project-automation github-project-automation bot moved this from Needs review to Done in 📍Roadmap Aug 22, 2024