Build: track imported files for external versions #10690

Closed
stsewd opened this issue Aug 30, 2023 · 0 comments · Fixed by #10696
Labels: Accepted (accepted issue on our roadmap), Improvement (minor improvement to code)

stsewd commented Aug 30, 2023

What's the problem this feature will solve?

We are using our DB instead of storage when serving 404s. This is great, but we aren't creating imported files for external versions:

https://github.com/readthedocs/readthedocs.org/blob/f6e04302f3d0a8cf148f13f400894c15d081eda7/readthedocs/projects/tasks/search.py#L24C1-L26

This breaks index file redirects and custom 404 serving for external versions.
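
To illustrate why missing records break both features, here is a toy model of a DB-backed 404 handler. It is only a sketch: the `tracked_files` set stands in for the imported-file table, and `resolve_404` and the version names are hypothetical, not Read the Docs code.

```python
from typing import Optional

# Stand-in for the imported-file records in the DB: (version slug, path).
# Only the "latest" version has records; the external version has none.
tracked_files = {
    ("latest", "404.html"),
    ("latest", "index.html"),
    ("latest", "guide/index.html"),
}


def resolve_404(version: str, requested_path: str) -> Optional[str]:
    """Return a tracked file to serve for a missing path, or None."""
    # Index file redirect: a request for /guide may map to /guide/index.html.
    candidate = requested_path.strip("/") + "/index.html"
    if (version, candidate) in tracked_files:
        return candidate
    # Otherwise fall back to the version's custom 404 page, if it is tracked.
    if (version, "404.html") in tracked_files:
        return "404.html"
    # No records for this version: neither feature can work.
    return None
```

In this model, a lookup against a version with no records (for example a pull request build) always returns `None`, which is the behavior this issue describes for external versions.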

Describe the solution you'd like

Start creating imported files for external versions. Maybe solve this together with #10623, so we don't fill our DB with things we don't use or need.

Alternative solutions

Just enable indexing for external versions. This is a quick solution, but it will fill our DB with things that we don't need.

Additional context

@humitos humitos added Improvement Minor improvement to code Accepted Accepted issue on our roadmap labels Aug 31, 2023
@github-project-automation github-project-automation bot moved this to Planned in 📍Roadmap Aug 31, 2023
stsewd added a commit that referenced this issue Aug 31, 2023
@agjohnson agjohnson moved this from Planned to Needs review in 📍Roadmap Sep 13, 2023
@github-project-automation github-project-automation bot moved this from Needs review to Done in 📍Roadmap Sep 14, 2023
stsewd added a commit that referenced this issue Sep 14, 2023
- Removed the "wipe" actions from the admin instead of porting them, since I'm not sure that we need an action in the admin just to delete the search index of a project. Re-index seems useful.
- `fileify` was replaced by `index_build`, which only requires the build id to be passed; any other information can be retrieved from the build/version object.
- `fileify` isn't removed in this PR to avoid downtime during deploy. It's safe to keep it around until the next deploy.
- New code is avoiding any deep connection to the django-elasticsearch-dsl package, since it doesn't make sense anymore to have it, and I'm planning on removing it.
- We are no longer tracking all files in the DB, only the ones of interest.
- Re-indexing a version will also re-evaluate the files from the DB, useful for old projects that are out of sync.
- The reindex command now generates tasks per version rather than per collection of files, since we no longer track all files in the DB.
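
A minimal sketch of the split described in the list above, under stated assumptions: the files "of interest" are taken to be `index.html` and `404.html` (inferred from the redirect and custom 404 features mentioned in the issue), and search indexing is assumed to stay disabled for external versions. `BuildPlan`, `plan_build`, and `FILES_OF_INTEREST` are hypothetical names, not the actual `index_build` implementation.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List

# Assumption: only these file names need a DB record.
FILES_OF_INTEREST = {"index.html", "404.html"}


@dataclass
class BuildPlan:
    track_in_db: List[str]      # files that get a DB record
    index_in_search: List[str]  # files whose contents go to the search index


def plan_build(paths: List[str], is_external: bool) -> BuildPlan:
    """Decide what to track and what to index for one build's output."""
    html = [p for p in paths if p.endswith(".html")]
    # Track files of interest for every version, external ones included.
    track = [p for p in html if PurePosixPath(p).name in FILES_OF_INTEREST]
    # Assumption: external versions are still not indexed for search.
    search = [] if is_external else html
    return BuildPlan(track_in_db=track, index_in_search=search)
```

With this shape, an external version gets the DB records needed for redirects and 404s without adding every HTML file to the DB or to the search index.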


- Closes #10623
- Closes #10690

We don't need to do anything special during deploy; this is zero downtime out of the box. We can trigger a re-index for all versions if we want to delete the HTML files that we don't need from the DB, but that operation will also re-index their contents in ES, so it's probably better to do that after we have settled any changes to ES.