Streamlining Data Archival Requests #2722

Open
4 tasks
balajialg opened this issue Sep 8, 2021 · 5 comments
Labels
automation Manual things that shouldn't be

Comments

@balajialg
Contributor

balajialg commented Sep 8, 2021

Summary

We currently receive around 5-10 requests every month from students who want to retrieve data from previous semesters. Common challenges in servicing these requests:

  • Students request data from a previous semester or year. We currently have no clear policy on how many semesters' worth of data is archived and shared, so we keep all previous semesters' data without deleting it regularly. Storage keeps growing, which costs us money in the long run.
  • Students are not aware of our existing archival policies. Without that clarity, they request data from courses they took multiple years ago.
  • The entire end-to-end process for retrieving archived data falls on @felder, which creates a bandwidth bottleneck.

User Stories

  • As a current or former datahub user, I'd like to retrieve a copy of my archived files so that I can use them as a reference for my personal/professional goals.
  • As an infrastructure admin, I want to control the growth of the hub disk space in order to reduce the money spent.

Important information

Some of the proposed solutions from the sprint planning meeting:

Tasks to be done:

  • Define a policy around how many months/years' worth of data can be archived and retrieved from Datahub.
  • Notify students (preferably by email) at a set time (probably the end of the semester) to download their data for that semester.
  • For course-specific hubs, get permission from instructors to wipe the disk at the end of the semester.
  • Develop a UI-based solution that allows students to retrieve their data without any manual intervention!
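
To make the first task concrete, here is a minimal sketch of what a retention check could look like once a policy exists. The 365-day window, the function names, and the `{username: archive date}` mapping are all assumptions for illustration; nothing here reflects how Datahub actually stores or tracks archives.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

# Hypothetical retention window; the actual policy is still to be defined.
RETENTION_DAYS = 365


def is_expired(archived_on: date, today: date,
               retention_days: int = RETENTION_DAYS) -> bool:
    """Return True if an archive is older than the retention window."""
    return today - archived_on > timedelta(days=retention_days)


def split_archives(archives: Dict[str, date],
                   today: date) -> Tuple[List[str], List[str]]:
    """Split {username: archived_on} into (retrievable, deletable) users."""
    keep: List[str] = []
    drop: List[str] = []
    for user, archived_on in sorted(archives.items()):
        if is_expired(archived_on, today):
            drop.append(user)
        else:
            keep.append(user)
    return keep, drop
```

With a check like this, the policy becomes a single number that can be stated in the end-of-semester email and applied mechanically when cleaning up storage.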
@balajialg balajialg added the automation Manual things that shouldn't be label Sep 8, 2021
@felder
Contributor

felder commented Sep 8, 2021

I'd say the user story here should be modified.

As a current or former datahub user, I'd like to retrieve a copy of my archived files.

I believe accounts are only archived after 12 months of inactivity, so it's more than just the previous semester. In fact, requests could also come in from people who have graduated!

@balajialg
Contributor Author

@felder Thanks for the correction. Updated!

@yuvipanda
Contributor

In the long run, it might be nice to just have a URL in the note we leave the user that they can use to fetch the files themselves, without any involvement from us whatsoever.
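
One possible shape for such a URL is a signed, expiring link. The sketch below is only an illustration under stated assumptions: the secret key, the `example.invalid` base URL, and the function names are hypothetical, and it uses only Python's standard `hmac`, `hashlib`, and `urllib.parse` modules rather than anything Datahub-specific.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import urlencode

# Hypothetical values; a real deployment would load the key from a secret store.
SECRET_KEY = b"replace-me"
BASE_URL = "https://example.invalid/archive"  # placeholder, not a real endpoint


def sign(user: str, expires: int) -> str:
    """Compute an HMAC-SHA256 signature over the user name and expiry time."""
    message = f"{user}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_download_url(user: str, ttl_seconds: int,
                      now: Optional[int] = None) -> str:
    """Build a link that is valid for ttl_seconds after it is issued."""
    issued = int(time.time()) if now is None else now
    expires = issued + ttl_seconds
    query = urlencode({"user": user, "expires": expires,
                       "sig": sign(user, expires)})
    return f"{BASE_URL}?{query}"


def verify(user: str, expires: int, sig: str,
           now: Optional[int] = None) -> bool:
    """Accept the request only if the link is unexpired and the signature matches."""
    current = int(time.time()) if now is None else now
    if current > expires:
        return False
    return hmac.compare_digest(sign(user, expires), sig)
```

The link carries its own proof of validity, so whatever serves the archive would not need a per-request lookup or any manual step from an admin.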

@balajialg
Contributor Author

@yuvipanda Exactly, that's the ideal endpoint for this user story. We should not be involved in users' requests for their data!

@yuvipanda
Contributor

Related to #2866

@balajialg balajialg changed the title from "[EPIC] Automating Data Archival Requests" to "Data Archival Requests Path Forward" Mar 31, 2022
@balajialg balajialg changed the title from "Data Archival Requests Path Forward" to "Streamlining Data Archival Requests" Mar 31, 2022

3 participants