Add copy of release_files.path to file_registry for long-term storage #17750
Background
A Filename registry is maintained and persists even after files have been removed, to prevent re-upload or re-use of that exact filename.
When a release's files are removed, the ability to surface their storage location is effectively lost, because the path generator uses hashers to derive the placement from the file's hashes, which are not preserved either.
warehouse/warehouse/forklift/legacy.py
Lines 1525 to 1536 in aeb9ccd
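To illustrate why the path cannot be recovered once the hashes are gone, here is a minimal sketch of a hash-derived storage path. It assumes the path is built from a BLAKE2b-256 hex digest split into two short prefix directories plus the remainder, followed by the filename. The function name `storage_path` and the exact layout are assumptions for illustration; the referenced lines in legacy.py are the authoritative version.

```python
import hashlib


def storage_path(filename: str, content: bytes) -> str:
    """Derive a storage path from the file's content hash (illustrative only).

    Assumption: a 32-byte BLAKE2b digest, hex-encoded, split as
    first 2 chars / next 2 chars / remaining chars / filename.
    """
    digest = hashlib.blake2b(content, digest_size=32).hexdigest()
    return "/".join([digest[:2], digest[2:4], digest[4:], filename])


if __name__ == "__main__":
    # Without the original bytes (or the stored digest), this value
    # cannot be recomputed from the filename alone.
    print(storage_path("demo-1.0.tar.gz", b"example file contents"))
```

Since the filename is only the last path segment, a registry that keeps just the filename has no way to rebuild the first three segments.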
The path data exists in the BigQuery Project Metadata Table so it's semi-available, but harder to get to during routine operations and investigations. This could also feasibly be used via Inspector or something similar, if made available via some API.
Proposal
A few steps to tackle the problem; these can definitely change based on further findings or ideas.
- Add a Filename.path column (easy)
- Populate Filename.path during file upload, around here (easy)
- Backfill from File.path for filenames that match (easyish)
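The steps above can be sketched end to end with an in-memory SQLite database. This is only a rough illustration, not Warehouse's actual models or migrations: the table names `release_files` and `file_registry` come from the issue title, while the `filename` column, the helper names, and the use of SQLite are assumptions made to keep the example self-contained.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create simplified stand-ins for the two tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE release_files (filename TEXT PRIMARY KEY, path TEXT NOT NULL);
        CREATE TABLE file_registry (filename TEXT PRIMARY KEY);
        """
    )
    return conn


def add_path_column(conn: sqlite3.Connection) -> None:
    """Step 1: add a nullable path column to the registry."""
    conn.execute("ALTER TABLE file_registry ADD COLUMN path TEXT")


def register_upload(conn: sqlite3.Connection, filename: str, path: str) -> None:
    """Step 2: record the path in the registry at upload time."""
    conn.execute(
        "INSERT INTO release_files (filename, path) VALUES (?, ?)", (filename, path)
    )
    conn.execute(
        "INSERT INTO file_registry (filename, path) VALUES (?, ?)", (filename, path)
    )


def backfill_paths(conn: sqlite3.Connection) -> int:
    """Step 3: copy paths for registry rows whose file still exists."""
    cur = conn.execute(
        """
        UPDATE file_registry
        SET path = (
            SELECT rf.path FROM release_files AS rf
            WHERE rf.filename = file_registry.filename
        )
        WHERE path IS NULL
          AND EXISTS (
            SELECT 1 FROM release_files AS rf
            WHERE rf.filename = file_registry.filename
          )
        """
    )
    return cur.rowcount  # number of registry rows that were filled in
```

Registry rows whose files were already removed stay NULL after the backfill, since their path is no longer available in `release_files`.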