Warn or stop a source from submitting documents with compromising metadata #122

bxjx · 2013-11-07T01:08:09Z

Uploading documents containing metadata may reveal information about the source. E.g. http://www.theguardian.com/world/2012/dec/04/john-mcafee-confirms-guatemala-vice

Metadata can be scrubbed by a journalist on the viewing station using the Metadata Anonymization Toolkit (MAT), but I think it would be better if the source were prevented from, or at least warned about, submitting these documents in the first place. Ideally the source would use MAT, but this might be too much to expect.

Perhaps SecureDrop could have a configuration option to prevent the upload of any document except those that could be scanned for metadata on the client side?

I've written some code that scans PDFs before they are uploaded and asks the user to confirm that none of the metadata in the document compromises their identity. The code requires JavaScript but does not use any extensions or server communication. It does require a modern browser. It works on the version of Firefox included in the Tor Browser Bundle. I think I could write similar code for JPGs and possibly other formats.
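To make the idea of a pre-upload scan concrete, here is a minimal sketch in Python (the proof of concept described above is JavaScript running in the browser; this is not that code). It assumes the PDF stores its document information dictionary uncompressed and uses simple literal strings, so it will miss metadata held in compressed object streams, XMP packets, hex strings, or encrypted files. The function name `scan_pdf_metadata` is hypothetical.

```python
import re

# Standard keys of the PDF document information dictionary.
INFO_KEYS = (
    "Title", "Author", "Subject", "Keywords",
    "Creator", "Producer", "CreationDate", "ModDate",
)

# Matches entries such as "/Author (Jane Doe)".
# Assumption: values are literal strings without nested parentheses or escapes.
_PATTERN = re.compile(
    rb"/(" + "|".join(INFO_KEYS).encode("ascii") + rb")\s*\(([^()\\]*)\)"
)


def scan_pdf_metadata(data: bytes) -> dict:
    """Return the first value found for each known Info key in raw PDF bytes."""
    found = {}
    for key, value in _PATTERN.findall(data):
        # Keep the first occurrence of each key; later ones are ignored.
        found.setdefault(key.decode("ascii"), value.decode("latin-1"))
    return found


if __name__ == "__main__":
    sample = (
        b"%PDF-1.4\n1 0 obj\n"
        b"<< /Author (Col Biggins) /Producer (ExampleWriter 1.0) >>\n"
        b"endobj\n"
    )
    for field, value in scan_pdf_metadata(sample).items():
        print(f"{field}: {value}")
```

An empty result from a scan like this does not mean the file is clean, only that this naive pattern found nothing, so any warning shown to a source would need to say so.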

On a related note, it might be worth preventing, or warning about, the upload of certain document types that may contain hidden information other than metadata. E.g. Microsoft Word documents may contain past edits that can be reverted and therefore retrieved.
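As an illustration of the Word case, the sketch below checks a `.docx` file for tracked changes, which WordprocessingML records as `w:ins` and `w:del` elements in `word/document.xml`. It is a rough heuristic under stated assumptions: it only covers `.docx` (not the legacy `.doc` format), it assumes the conventional `w:` namespace prefix, and it only looks at the main document part. The function name `has_tracked_changes` is hypothetical.

```python
import io
import zipfile

# Opening tags of tracked insertions and deletions. The trailing space avoids
# matching unrelated elements such as <w:insideH> or <w:delText>.
TRACKED_CHANGE_TAGS = (b"<w:ins ", b"<w:del ")


def has_tracked_changes(docx_bytes: bytes) -> bool:
    """Return True if the main document part contains tracked-change markup."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        try:
            body = archive.read("word/document.xml")
        except KeyError:
            # Not a Word document in the expected layout; nothing to report.
            return False
    return any(tag in body for tag in TRACKED_CHANGE_TAGS)
```

A positive result would justify a warning rather than a rejection, since comments, embedded objects, and other hidden content are not covered by this check.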

This issue is probably also related to #101 and #119.

bxjx · 2013-11-07T01:15:00Z

Screenshot..

dolanjs · 2013-11-07T01:26:01Z

@bxjx as you noted, we are working on issue #119, and we recommend that the journalist scrub files prior to transferring them off the secure viewing station for publication. At the same time, we need to take into consideration that metadata can also have journalistic value (though we try to be as clear as possible to the source about various threats, including metadata).

bxjx · 2013-11-07T02:51:42Z

Thanks @dolanjs, I think resolving #119 will really help!

I also take the point that if SecureDrop were to encourage the user to disable JavaScript, then it would be counterproductive to build functionality that requires it.

Asking the journalist to scrub the data does mean that there is an encrypted version of a potentially compromised file on the server until the journalist deletes it. Perhaps this is not a big issue.

I also do worry about sources having to read about and understand metadata and scrubbing rather than being prompted if the metadata can be detected.

garrettr · 2013-11-07T18:42:36Z

@bxjx That is a really cool PoC! Will you share your metadata-detecting JavaScript?

The proposed interface needs careful consideration. As @dolanjs points out, metadata can have journalistic value. We should also keep the submission process as simple as possible to encourage sources, and popups or confirmations make it more complex.

I could see this being useful either as an optional service (maybe a "screen my submission for metadata" button on the upload page) or as something on the journalist interface (to give them an overview, easier to use than a separate program like MAT but complementary in function). Ultimately I think the responsibility of handling metadata lies with the journalists.

> I also do worry about sources having to read about and understand metadata and scrubbing rather than being prompted if the metadata can be detected.

Your UX is certainly nicer. Again, I think both the documentation on metadata and any metadata-detecting service should be optional.

bxjx · 2013-11-07T23:10:58Z

@garrettr, it's on a branch at https://github.com/TheGlobalMail/securedrop/tree/metadata-scan. See TheGlobalMail@0c7d6d9. The UX is still pretty rough. Works on latest versions of Webkit, Firefox and IE10+.

The "screen my submission" is an interesting idea!

psivesely · 2016-11-02T18:53:03Z

Our stance on JS in the source interface has not shifted in years, so I'm going to go ahead and close this one, even though it's a cool idea.

redshiftzero · 2016-11-02T19:00:00Z

Does this need to be done in JS? What about having this done on the server side? I realize that this is not ideal, but until we have a magical browser extension that executes signed JS, it's probably better than just having sources submit documents with all kinds of metadata they might not even realize are there. App code examines the document, returns some useful feedback to the source ("Hey Col Biggins, you might want to remove your name from the Author field"), and lets them strip off the metadata they don't want the journalist to know?

psivesely · 2016-11-02T19:08:49Z

Processing documents adds greatly to the attack surface. Submissions may take hours to transfer, only to warn the source they may want to use some tool they've never heard of and then re-submit. How do we distinguish useful metadata (e.g., DKIM on someone else's emails) vs harmful metadata (e.g., time data in JPG metadata in a photo taken covertly by the source)?

redshiftzero · 2016-11-02T19:14:25Z

The processing you just described could be done on the server side, no? It seems like a good first step would be just to display to the source "Here's the metadata your file has in it". Instead of referring them to another tool, we could just ask them to check the fields they'd like to be wiped. If we worry about handling every possible type of file then this could get really unwieldy, but if we stick to the most common files, e.g. PDFs, then this could be a very nice and useful feature for sources to understand what they are actually doing and protect their identity when they are leaking documents.
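The "show the fields, let the source tick the ones to wipe" flow can be sketched as two small functions. This is only an illustration of the interaction, assuming the metadata has already been extracted into a plain dictionary by some format-aware step. It does not rewrite the file itself (that would need a tool such as MAT), and it does not address the attack-surface concern raised above. The names `describe_metadata` and `wipe_selected_fields` are hypothetical.

```python
def describe_metadata(metadata: dict) -> str:
    """Build the plain-text listing shown to the source, one checkbox per field."""
    if not metadata:
        return "No metadata fields were detected in your file."
    lines = ["Here's the metadata your file has in it:"]
    lines += [f"  [ ] {name}: {value}" for name, value in sorted(metadata.items())]
    return "\n".join(lines)


def wipe_selected_fields(metadata: dict, fields_to_wipe: set) -> dict:
    """Return a copy of the metadata without the fields the source ticked."""
    return {k: v for k, v in metadata.items() if k not in fields_to_wipe}


if __name__ == "__main__":
    extracted = {"Author": "Col Biggins", "Producer": "ExampleWriter 1.0"}
    print(describe_metadata(extracted))
    print(wipe_selected_fields(extracted, {"Author"}))
```

Keeping the choice with the source leaves room for metadata that has journalistic value, which is the trade-off raised earlier in the thread.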

psivesely · 2016-11-02T19:29:20Z

The processing I described is heavily problematic for the reasons I described. Processing of documents on the Application Server opens up a huge vector for compromise, not to mention how this may hurt UX. I think we'll just end up scaring away sources who won't be able to grasp the significance of all the metadata we display to them.

garrettr mentioned this issue Nov 13, 2013

MAT integration for binary metadata removal #151

Closed

garrettr modified the milestones: 0.4, 0.3pre Dec 3, 2014

redshiftzero added the hackathon label Nov 1, 2016

redshiftzero mentioned this issue Nov 2, 2016

Improve UX for sources #1437

Closed

psivesely closed this as completed Nov 2, 2016

psivesely removed this from the 0.4 milestone Dec 15, 2016
