Reducing Archivist content to increase performance and creating an archive #814

Open
6 tasks
HayleyMills opened this issue Sep 28, 2023 · 7 comments

@HayleyMills
Contributor

HayleyMills commented Sep 28, 2023

The main Archivist is slow because of the amount of content it holds. The plan is to have one Archivist instance act as an archive containing everything that is in Discovery, and smaller instances that contain only work-in-progress (wip) content.

  • Clone main Archivist instance to hold Discovery content as an archive @spuddybike
  • Load all Discovery instruments from all instances into new archive instance
      - [ ] check list of exported content from Archivist (Jenny's script output)
      - [ ] use Jenny's script to load instruments (do we also want to move the datasets and mappings? The final versions of these are on the dev server) @jli755
  • Rename main Archivist to 'ucl' for wip (~80) @spuddybike
  • Rename alspac to 'lps' for wip (~95) @spuddybike
  • Export and re-load remaining wip instruments, datasets and mappings @HayleyMills
      - [ ] from genscotland, heaf, onsls, wirral to 'lps'
      - [ ] from alspac to 'lps'
      - [ ] BCS, NCDS, NS, MCS, NSHD to remain in 'ucl'
      - [ ] ALSPAC, WHII, SWS, HCS from main to 'lps'
  • Update Airtable with correct IDs @HayleyMills (need a list of before and after IDs)
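
The "before and after IDs" list mentioned in the last task could be built along these lines. This is only a sketch: it assumes each instrument keeps the same prefix when it is re-loaded, and the input shape (dicts with `prefix`, `instance` and `id` keys) and both function names are hypothetical, not part of Archivist or of Jenny's script.

```python
import csv
import io


def build_id_mapping(before, after):
    """Pair each instrument's old and new ID, matching on its prefix.

    `before` and `after` are lists of dicts with the hypothetical keys
    "prefix", "instance" and "id" (one dict per instrument).
    """
    new_by_prefix = {row["prefix"]: row for row in after}
    mapping = []
    for old in before:
        new = new_by_prefix.get(old["prefix"])
        mapping.append({
            "prefix": old["prefix"],
            "old_instance": old["instance"],
            "old_id": old["id"],
            # Left blank when the instrument was not re-loaded anywhere.
            "new_instance": new["instance"] if new else "",
            "new_id": new["id"] if new else "",
        })
    return mapping


def mapping_to_csv(mapping):
    """Render the mapping as CSV text with one row per instrument."""
    fields = ["prefix", "old_instance", "old_id", "new_instance", "new_id"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(mapping)
    return buffer.getvalue()
```

Rows with a blank `new_id` flag instruments that were exported but never re-loaded, which is worth checking before the Airtable records are changed.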
@spuddybike
Member

Is there a problem with this approach when it comes to updating questionnaires in Discovery?

@HayleyMills
Contributor Author

@spuddybike can I check whether the process above is still the plan, or are we simply saving all the files that are in Discovery rather than moving them to another instance?

@spuddybike
Member

spuddybike commented May 21, 2024 via email

@HayleyMills
Contributor Author

HayleyMills commented May 21, 2024

  • Export all questionnaire and mapping .txt files from main Archivist as a backup - Jenny using script
  • Export all questionnaire and mapping .txt files from other Archivists as a backup - Jenny using script
  • Create list of Discovery questionnaires to be deleted
  • Delete all Discovery questionnaires from main Archivist - Jenny using script
  • Delete all Discovery questionnaires from other Archivists - @jli755 using script
  • Rename main Archivist to 'ucl' for wip (~80) @spuddybike
  • Rename alspac to 'lps' for wip (~95) @spuddybike
  • Export and re-load remaining wip instruments, datasets and mappings @HayleyMills
    • from genscotland to 'lps' @beckyoldroyd
    • from heaf to 'lps'
    • from onsls to 'lps'
    • from wirral to 'lps' @beckyoldroyd
    • BCS, NCDS, NS, MCS, NSHD to remain in 'ucl'
    • ALSPAC, WHII, SWS, HCS from main to 'lps'
    • BIB from heaf to 'lps' @beckyoldroyd
  • Update Airtable with correct IDs @HayleyMills @beckyoldroyd
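
The instance moves in the list above could be written down as a small lookup so that an export/re-load script applies them consistently. This is a sketch only: the function name and the use of `"main"` as the label for the main Archivist are assumptions, and the study and instance names are taken directly from the list.

```python
# Instances whose remaining wip content all moves to 'lps'
# (BIB is held in heaf, so it is covered by the heaf entry).
WHOLE_INSTANCE_TO_LPS = {"genscotland", "heaf", "onsls", "wirral", "alspac"}

# Studies held in the main Archivist, split by destination.
MAIN_STAYS_IN_UCL = {"BCS", "NCDS", "NS", "MCS", "NSHD"}
MAIN_MOVES_TO_LPS = {"ALSPAC", "WHII", "SWS", "HCS"}


def target_instance(source_instance, study):
    """Return 'ucl' or 'lps' for a wip instrument, following the plan above."""
    if source_instance in WHOLE_INSTANCE_TO_LPS:
        return "lps"
    if source_instance == "main":
        if study in MAIN_STAYS_IN_UCL:
            return "ucl"
        if study in MAIN_MOVES_TO_LPS:
            return "lps"
    # Anything not named in the plan needs a manual decision.
    raise ValueError(f"no destination planned for {study!r} in {source_instance!r}")
```

Raising on unknown combinations, rather than defaulting to one instance, keeps anything the plan does not mention from being moved silently.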

@spuddybike just checking we are only doing this for the instrument XML. No, we also need to export the .txt files.

@HayleyMills
Contributor Author

HayleyMills commented Sep 19, 2024

To archive (S:\IOECLOSER_Share\USP\Software\Archivist_instrument_backup), then delete after content update

delete_after_content_update.txt

@spuddybike
Member

Performance is still not great, so I'm looking at database bloat/vacuum.

@simonreed
Contributor

@spuddybike a significant amount of the database space is the storage of documents used for import/export. I wonder if we should look to delete those beyond a certain timeframe?
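
A retention rule of that kind might look roughly like the sketch below. It is illustrative only: the record shape (`id` and `created_at` keys), the function name and the 180-day figure are assumptions, not Archivist's actual schema or an agreed timeframe.

```python
from datetime import datetime, timedelta, timezone


def expired_document_ids(documents, max_age_days, now=None):
    """Return the IDs of import/export documents older than the cutoff.

    `documents` is a list of dicts with hypothetical "id" and
    "created_at" (timezone-aware datetime) keys.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [doc["id"] for doc in documents if doc["created_at"] < cutoff]


# Example: pick out documents more than 180 days old (figure is illustrative).
docs = [
    {"id": 1, "created_at": datetime(2023, 1, 10, tzinfo=timezone.utc)},
    {"id": 2, "created_at": datetime(2024, 9, 1, tzinfo=timezone.utc)},
]
reference = datetime(2024, 9, 19, tzinfo=timezone.utc)
stale = expired_document_ids(docs, max_age_days=180, now=reference)
```

Selecting the IDs first, rather than deleting directly, leaves room to review the list or back the files up before anything is removed.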
