-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extract and display original document filenames #452
Conversation
Files submitted to the SecureDrop server are gzipped with their original name, then encrypted with GPG. When downloaded, we can extract the original filename from the gzip header, so they can be opened with an appropriate application. This change adds the db.File.original_filename column, and populates it when downloading and decrypting files. The file is still stored as it has been, under the db.File.filename value stripped of extensions, but when opened, that file is first hard linked to the original filename. Since we can't wait for the file to be opened (starting the disposable VM is pretty slow), we can't immediately remove that link, so storage.delete_single_submission_or_reply_on_disk now cleans it up if the object being deleted has original_filename populated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two quick notes
@@ -166,7 +168,7 @@ def call_download_api(self, api: API, db_object: Reply) -> Tuple[str, str]: | |||
sdk_object.source_uuid = db_object.source.uuid | |||
return api.download_reply(sdk_object) | |||
|
|||
def call_decrypt(self, filepath: str, session: Session = None) -> None: | |||
def call_decrypt(self, filepath: str, session: Session = None) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to update the comment for this method in the DownloadJob
base class to explain what kind of string it should return since calling decrypt will return both the original filename and filepath strings. It would be helpful to know that we want to get the decrypted original filename so we can store it in the local db.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Whatever we decide to return from decrypt_submission_or_reply
should be explained here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with a few nits
Works as advertised, but one last question: Is there a reason to not show file extension? It seems like it would be more useful to show it so the journalist knows what they're downloading. An image, pdf, etc. |
ah, I know you were unable to access zeplin before, but it looks like the prototypes show file extension (see below) https://app.zeplin.io/project/5c807ea562f734bd2756b243/screen/5cd3b87518e9f734ae0218bb |
Description
Files submitted to the SecureDrop server are gzipped with their
original name, then encrypted with GPG. When downloaded, we can
extract the original filename from the gzip header, so they can be
opened with an appropriate application.
This change adds the db.File.original_filename column, and populates
it when downloading and decrypting files. The file is still stored as
it has been, under the db.File.filename value stripped of extensions,
but when opened, that file is first hard linked to the original
filename.
Since we can't wait for the file to be opened (starting the disposable
VM is pretty slow), we can't immediately remove that link, so
storage.delete_single_submission_or_reply_on_disk now cleans it up if
the object being deleted has original_filename populated.
Fixes #163.
Test Plan
Check out this branch. Run the client. Upload a file with a
.txt
extension via the source interface, and confirm that when you refresh the client, click on the source, then click on the Download item, that the name changes to the name of the file as you uploaded it.Then check that clicking on the file again opens it up in a suitable application in the disposable VM.
For bonus points, delete the source, or at least the submission, and confirm that upon refresh the data directory is properly cleaned up -- there should be no files with names matching the File's
filename
ororiginal_filename
.Checklist
If these changes modify code paths involving cryptography, the opening of files in VMs, network (via the RPC service) traffic, or fine tuning of the graphical user interface, Qubes testing is required. Please check as applicable: