Reproducible builds #641

Closed
massar opened this issue Mar 3, 2015 · 26 comments

Comments

@massar

massar commented Mar 3, 2015

https://github.com/WhisperSystems/Signal-iOS/releases/tag/2.0

mentions "Git tag was signed with @FredericJacobs's GPG key"

Which is great, but people will get the IPA either from the Apple App Store or maybe from that release page (few will bother compiling it themselves, and those who do will get a different edition due to signing).

Neither version can be verified as to matching to the source code though. And they are VERY different.

Signal.ipa (Github) = 10.287.224 bytes, uncompressed 25 MiB, SHA512 = b8663105def290c866c38f95016ae45c17fed1a5fd8de1dd8f632457b03727dc2c9f8bd50ba15d256bed519da30ab9e6d869beaa86a741403c5376cd59e2bda9

Signal.ipa (AppStore) = 15.541.382 bytes, uncompressed 15 MiB, SHA512 = 7f5152b41a81e7fe4625e89a80e28cca1285779083880af608ee8fb1f42e6502a54aa34a19da121cb75bb2b91820235e718d15e9ec6b0af180a811dd22899982

That does not even remotely match. And the compression is rather odd: the Github edition with Symbols compresses smaller than the AppStore one without...

In the AppStore version there is a "iTunesArtwork" file of 33k + iTunesMetadata.plist, while the Github edition has Symbols included (great for debugging, not really a 'release' thing).

Inside the payload:
Binary files ./Signal and /Users/jeroen/Downloads/sa/Payload/Signal.app/Signal differ

and various other files that are diffable also differ.

The size difference "apparently" comes primarily from a different compression technique.

All that said:

  • even if there are minor differences between the published edition and the Github edition, please document these differences and why they exist
  • Please publish the SHA512 hashes, PGP-signed by your keys, so that people can see that what they get from AppStore/Github is really what they should be having
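For illustration, a minimal sketch of how a downloaded file could be checked against a published SHA512 value. The function names `sha512_of_file` and `matches_published` are hypothetical, and the sketch assumes the published digest has already been authenticated (for example by verifying its PGP signature separately).

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a local file against a published hex digest (case-insensitive)."""
    return sha512_of_file(path) == published_hex.strip().lower()
```

Calling `matches_published("Signal.ipa", "<published digest>")` would return `True` only when the whole file is byte-identical to the one that was hashed.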

Open Source means nothing when the source does not match the binary. (Next to actually having an audit of the system and hoping that iOS is not working against the whole security model ;)

@FredericJacobs
Contributor

Looking at matching IPAs is a lost cause. But I want to be able to get reproducible binaries.

@massar
Author

massar commented Mar 5, 2015

Looking at matching IPAs is a lost cause. But I want to be able to get reproducible binaries.

While reproducible binaries would be awesome, it is impossible due to the embedded signatures.
As we cannot sign with your key we will not be able to reproduce them.

IMHO the best path would be:

  • you tag a version
  • you compile it
  • you create a hash of what you submit to the App Store
  • you submit the edition to the App Store + upload it to github as a release + PGP-signed hash
  • you see the new version appearing in the App Store
  • you check the hash of what you submitted to the released edition

They should be identical. Though looking at the 2.0 tag, the one published on github as mentioned above has symbols, the one in the App Store does not. From there we already know they are completely different things.

Hence, if AppStore removes or adds things, we should have a list of what these changes are and then the best thing you can do is to:

  • publish the compiled binary on Github + pgp-signed hash
  • download the App Store version, pgp-sign that and publish it also on the App Store.

That way, we at least know what the differences are supposed to be and that the edition on our phone and in the App Store is the one you at least 'blessed' and that we did not get MITM on the connection (maybe only in the App Store... but Apple&cohorts can do a lot more than subverting Signal anyway...)

@FredericJacobs
Contributor

I took a quick look at the differences between the binaries, but it looks like, before processing them for the App Store, Apple encrypts the binary as part of its FairPlay DRM protection. This not only changes the signature but also makes it terribly difficult to diff the binary.

This would mean that to verify the decrypted binary, you will need a jailbroken iPhone running Clutch ...

@massar
Author

massar commented Mar 5, 2015

I think it will be a rather hard problem to solve, hence the closest you can get is the 'blessing' strategy, where you simply assume that Apple did not change it and that the IPA in the App Store is your unmodified binary built from your unmodified source...

One can btw extract the IPA from the phone by doing a backup with iTunes. Hence, if we trust that an iTunes backup gives us the same binary, we could have an OSX/Windows tool that verifies/checks what is in the IPA. (yes, that will be after-the-fact and the app is already on your phone by then, but we'll have to trust Apple in this case anyway...)

@FredericJacobs FredericJacobs changed the title from "Publish PGP-signed SHA512 hash of edition on Github and AppStore" to "Reproducible builds" Mar 5, 2015
@belenko

belenko commented Mar 5, 2015

Here is an observation: although Apple applies DRM, the actual (encrypted) binaries seem to be the same when downloaded by different accounts (the .ipa would change, but the main binary wouldn't).

I have verified this by downloading app using two different accounts, in both cases SHA1 of the binary is 56587600848a61c52a19da0f54c57fc056f2aa71 (this is version 2.0 as served by AppStore at the time of this writing).

So while truly reproducible builds are not possible, one option to bring it one step closer would be:

  • Purchase/download .ipa from the AppStore, compute SHA1 of the main binary (and other files, if necessary);
  • Install .ipa on a jailbroken device and decrypt main binary (using tools like Clutch for example);
  • Compare decrypted binary to a known good one (e.g. built from a corresponding git tag and using same compiler and settings as the person submitting the app). Their checksums/digests won't match as binaries have different code signatures but other than that they should be pretty close (note that binary built from sources will differ from the binary inside the .ipa published on the GitHub in pretty much the same way);
  • If that verification goes well then SHA1 obtained at the first step can be marked as "known good" one (but not the .ipa – those contain per-account data necessary for FairPlay to decrypt the binary).
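The first step above (computing the SHA1 of the main binary inside a downloaded .ipa) could look roughly like the sketch below. It assumes the usual layout of one `Payload/<Name>.app/` bundle whose `Info.plist` names the executable in `CFBundleExecutable`; the function name `main_binary_sha1` is made up for this example.

```python
import hashlib
import plistlib
import zipfile


def main_binary_sha1(ipa_path):
    """Return (member_name, sha1_hex) for the main executable inside an .ipa."""
    with zipfile.ZipFile(ipa_path) as ipa:
        # The app bundle's Info.plist sits directly inside Payload/<Name>.app/.
        plist_names = [
            name for name in ipa.namelist()
            if name.startswith("Payload/")
            and name.endswith(".app/Info.plist")
            and name.count("/") == 2
        ]
        if len(plist_names) != 1:
            raise ValueError("expected exactly one app bundle in Payload/")
        # plistlib.loads handles both XML and binary property lists.
        info = plistlib.loads(ipa.read(plist_names[0]))
        bundle_dir = plist_names[0][: -len("Info.plist")]
        binary_name = bundle_dir + info["CFBundleExecutable"]
        return binary_name, hashlib.sha1(ipa.read(binary_name)).hexdigest()
```

Two people with different accounts could each run this on their own download and compare the resulting digest, without exchanging the .ipa itself.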

@bertrandmt

As others have pointed out, you can successfully use a jailbroken device to decrypt, as it were, the App Store version of your app. I would expect some differences to remain between the decrypted App Store version and the original submitted version. In particular, the layout of the binary is changed (IIRC) when DRM encryption is applied.

So straight hashes would not immediately match, but you could verify a small number of meaningful differences between the two and convince yourself that the App Store distributed binary was, in fact, not tampered with.

@comex

comex commented Mar 11, 2015

I'm fairly familiar with this. It's messy but shouldn't be too hard to do correctly, similarly to the way @belenko described. The changes I know of between the submitted and downloaded IPAs are:

  • Apple runs codesign to change the code signature of each Mach-O.
    • A code signature consists of a LC_CODE_SIGNATURE load command plus a blob with an offset and size defined in the load command. If there weren't an existing LC_CODE_SIGNATURE command, it would have to add one, which might occasionally require adding more space in the header, but binaries submitted to the App Store always have one already. It also needs to make space for the code signature blob, which might be bigger than the previous one, but since code signature blobs are at the end of binaries (due to being inserted after linking), this just increases the size of the binary, and shouldn't change any other offsets in the binary. Therefore, this should preserve a binary diff outside of the load commands and CS blob.
    • Fat (multi-arch) binaries are just an archive containing N completely independent Mach-Os; adding more space to the end of one of them would shift all following members, so they should be extracted before comparison.
    • The CS blob itself mostly does not affect the binary's operation; however, it contains entitlements, which are mostly just boolean permissions to access various parts of the system, but could change behavior in some cases. I don't believe Apple changes entitlements from the submitted binary, so you can use codesign to dump the entitlements of each binary and check that they are equal. There are also designated requirements, which, AFAIK, do get changed, but are just for keychain stuff.
  • Apple encrypts most of the __TEXT segment in place, and adds a LC_ENCRYPTION_INFO load command, which identifies which part of the binary is encrypted. I think this could hypothetically need to add space in the header for the new load command, but it's very unlikely since the linker puts in a lot of extra space.
    • The other data related to DRM is stored in directories called SC_Info. The code dealing with this is heavily obfuscated, but I can't think of any obvious attack by changing SC_Info besides changing the key/IV to something that produces useful garbage as code.
    • There is no way to verify the text without decrypting it on a jailbroken device. The code signature doesn't do this because (in an instance of questionable design) it signs the encrypted pages, not the originals.
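As a rough sketch of the load-command comparison described above: the code below splits a fat binary into its thin Mach-Os and lists each one's load commands, filtering out the commands that are expected to differ. The numeric constants come from Apple's `<mach-o/loader.h>` and `<mach-o/fat.h>`; the function names are hypothetical, and the sketch assumes little-endian slices (as on iOS devices) and 32-bit fat headers. Any other difference it reports would still need manual inspection.

```python
import struct

LC_CODE_SIGNATURE = 0x1D
LC_ENCRYPTION_INFO = 0x21
LC_ENCRYPTION_INFO_64 = 0x2C
EXPECTED_TO_DIFFER = {LC_CODE_SIGNATURE, LC_ENCRYPTION_INFO, LC_ENCRYPTION_INFO_64}

FAT_MAGIC = 0xCAFEBABE     # fat header fields are stored big-endian
MH_MAGIC = 0xFEEDFACE      # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # 64-bit Mach-O


def thin_slices(data):
    """Split a fat binary into per-architecture Mach-Os; thin input yields itself."""
    if struct.unpack_from(">I", data, 0)[0] != FAT_MAGIC:
        return [data]
    (count,) = struct.unpack_from(">I", data, 4)
    slices = []
    for index in range(count):
        # fat_arch: cputype, cpusubtype, offset, size, align (big-endian uint32 each)
        _, _, offset, size, _ = struct.unpack_from(">5I", data, 8 + 20 * index)
        slices.append(data[offset:offset + size])
    return slices


def load_commands(macho):
    """Return [(cmd, raw_bytes)] for one little-endian thin Mach-O."""
    (magic,) = struct.unpack_from("<I", macho, 0)
    if magic == MH_MAGIC:
        header_size = 28
    elif magic == MH_MAGIC_64:
        header_size = 32
    else:
        raise ValueError("not a little-endian Mach-O")
    (ncmds,) = struct.unpack_from("<I", macho, 16)  # ncmds is the fifth header field
    commands, offset = [], header_size
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", macho, offset)
        commands.append((cmd, macho[offset:offset + cmdsize]))
        offset += cmdsize
    return commands


def comparable_commands(macho):
    """Load commands with the ones expected to differ filtered out."""
    return [(cmd, raw) for cmd, raw in load_commands(macho)
            if cmd not in EXPECTED_TO_DIFFER]
```

Comparing `comparable_commands(a)` with `comparable_commands(b)` for each pair of matching slices then shows whether anything besides the signature and encryption commands changed in the header.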

This all happens once and produces a .ipa which iTunes (for Mac or iOS, i.e. the App Store app on the latter) downloads over HTTPS; it's not terribly important, but you can see what's going on on a Mac or jailbroken iOS device by forcibly disabling SSL verification, e.g. using this tool.

  • When iTunes downloads a .ipa, it also gets a customized .sinf file which is added to the main SC_Info directory; therefore, only .sinf should differ between users. Same note above about key/IV applies.
  • iTunes also adds iTunesArtwork and iTunesMetadata.plist. The former is a PNG; the latter is not obfuscated and you can see it just contains metadata.

TLDR: unzip both IPAs, ignore iTunes{Artwork,Metadata.plist} and SC_Info, everything else should be the same except the executable(s); for those, extract each architecture if fat and verify that the load commands are the same except for LC_CODE_SIGNATURE and LC_ENCRYPTION_INFO, and that all data is the same but the encrypted text and code signature blob; compare the entitlements from the code signature blobs; use a jailbroken device to decrypt the text and compare that. Only one person needs to have a jailbroken device, since others should get the same .ipa contents but for iTunes* and .sinf files.
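The first half of that TLDR (unzip both IPAs, ignore the iTunes files and SC_Info, compare everything else) can be sketched as below. The names `is_ignored`, `content_hashes` and `diff_ipas` are hypothetical. The executables are expected to show up in the differing list and need the separate Mach-O level comparison.

```python
import hashlib
import zipfile

IGNORED_NAMES = {"iTunesArtwork", "iTunesMetadata.plist"}


def is_ignored(member):
    """True for iTunes-added files and anything inside an SC_Info directory."""
    parts = member.split("/")
    return parts[-1] in IGNORED_NAMES or "SC_Info" in parts


def content_hashes(ipa_path):
    """Map each non-ignored file in the .ipa to the SHA-512 of its contents."""
    with zipfile.ZipFile(ipa_path) as ipa:
        return {
            info.filename: hashlib.sha512(ipa.read(info)).hexdigest()
            for info in ipa.infolist()
            if not info.is_dir() and not is_ignored(info.filename)
        }


def diff_ipas(submitted_path, store_path):
    """Return (only_in_submitted, only_in_store, differing) as sorted lists."""
    a, b = content_hashes(submitted_path), content_hashes(store_path)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_a, only_b, differing
```

Because the comparison works on decompressed contents, differences in zip compression or archive ordering do not affect the result.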

Edit: Just for completeness, if for whatever reason you really don't like jailbreaks, there is a small trick you could use to make the binary verifiable without: Because load commands (which must be available unencrypted) are part of the text segment, and encryption operates at page boundaries, bytes either within load commands or between the end of load commands and the start of the next page are both executable and unencrypted. You could theoretically stick the entry point in there and have a small routine that hashes the rest of the text, quitting if it doesn't match a known value; then someone verifying the .ipa could check that part and assume the rest of the text is intact (or the binary won't work at all). Probably no point, though.

@massar
Author

massar commented Mar 11, 2015

@comex great writeup! Most extensive Github comment I have ever seen :)

Note that you can "download" the .ipa without jailbreak by just backing up the device. (Though here we assume that is the version actually running on the device and does not get any injections etc ;)

You then still need a tool to do all the verifications that @comex describes, though, which would be a new project separate from Signal IMHO, as it can apply to other tools too. (Signal-Toolkit? :)

@FredericJacobs minimal thing you should have in a short amount of time: a PGP-signed statement on Github with the SHA512 hash of the .IPA that gets installed by iTunes; then people can at least check that their version matches yours and at least know that they did not get a custom edition just for them.

@comex

comex commented Mar 11, 2015

You may know this already, but you can also download IPAs using iTunes (on the desktop) and install them to a device over USB. What you need a jailbreak for is decrypting the text segments of executables, which is done on the fly (at page-in time) by obfuscated code on device with, I believe, at least some involvement of hardware keys.

Note that the IPAs that iTunes saves to disk already have the additional (personalized) files I mentioned inserted; getting at the original version (AFAIK) requires messing with the iTunes process. For the minimal solution you mention, it would be much better to hash the files inside.
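One possible shape for "hashing the files inside", as a sketch only: build a sorted, `sha512sum`-style manifest of the .ipa contents, skipping the files described earlier in this thread as added or personalized by iTunes. The helper names are hypothetical, and treating every `.sinf` file as personalized is an assumption based on the comments above.

```python
import hashlib
import zipfile


def is_personalized(member):
    """Files that iTunes adds or customizes per account, per the comments above."""
    base = member.rsplit("/", 1)[-1]
    return base in ("iTunesArtwork", "iTunesMetadata.plist") or base.endswith(".sinf")


def manifest(ipa_path):
    """Return sha512sum-style text: one '<hex digest>  <member name>' line per file."""
    lines = []
    with zipfile.ZipFile(ipa_path) as ipa:
        for name in sorted(ipa.namelist()):
            if name.endswith("/") or is_personalized(name):
                continue
            digest = hashlib.sha512(ipa.read(name)).hexdigest()
            lines.append(f"{digest}  {name}")
    return "\n".join(lines) + "\n"
```

The manifest text is independent of archive ordering and of the per-account files, so it is the kind of artifact that could be PGP-signed and published once, then compared by any user against their own download.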

@dylancarlson

@comex thank you for that. I've been waiting for someone smarter than myself to put together something on iOS deterministic builds and finally. I could never understand why the unpacked IPAs didn't match my build. Very helpful. In the old days, we'd buy you a pizza.

@n8fr8

n8fr8 commented Mar 12, 2015

Maybe this type of process could be automated by a cloud-based Mac, and submitted to a system along the lines of https://androidobservatory.org/ or other global app hash repository? This could be a great service for other iOS-oriented security and privacy apps.

Here's what TextSecure looks like there: https://androidobservatory.org/app/BD4DA569966EE123685F8B15BCF21B8FA7909E8A

@paulshapiro

paulshapiro commented Feb 2, 2018

Paul from MyMonero here.

It would seem to me and my Monero co-contributor @hyc that us Cocoa devs must require that Apple address and support verifiable / reproducible builds. A radar has probably been opened but without any real demand I doubt anything will be done.

However, there are way too many scams out there (e.g. Freewallet) and there is far too much incentive to compromise builds (especially of security and finance apps) for build verification not to be supported.

Practically speaking, how can we make this happen? Anyone know of a radar? Once we open a radar, what's to stop it from being ignored and languishing?

@arrtchiu

arrtchiu commented Jul 17, 2019

Has anyone had any further thoughts on this? I can't remember when Apple introduced LLVM bitcode (was it before or after the OP's post?), but they always touted the capability of thinning, optimising or compiling for new CPUs on their servers. One must assume that it makes reproducible builds impossible - not just because of DRM, but also because Apple will be using some unknown and probably unreleased compiler in their backend to do so.

As another person previously posted here, Apple could cause us harm in worse ways than backdooring a single app, so let's assume we trust them and can rule that out of the threat model.

What could OWS do in order to prove that the code they shipped to Apple is indeed the same code that's listed here on GitHub? What if Apple simply gave its developers the option of making the submitted .ipa available to the public somewhere? That'd give us end-users the ability to compile ourselves, and validate against what Apple was supplied. Sounds simple but could we really convince Apple to do it?

@tomj

tomj commented Jan 31, 2020

Might be worth looking into replicating how Telegram implements this for their iOS app. (Telegram post from Dec 2019)

@AndreasGassmann

Has there been any progress here? As part of the WalletScrutiny project, which tests the reproducibility of cryptocurrency wallets, we would also like to add iOS apps to the list. I have searched far and wide for an answer, but the only app that provides a tutorial on how to do it is Telegram (which I have not tried out myself just yet).

It looks like a lot of apps are looking into this, but apparently no real progress is made (neither on a technical level, nor on a "political" level by reaching out to Apple). This issue seems to contain the most information. Some of the apps I found that are currently working on this:

NL Covid 19 app (it sounds like they have managed to do it? But it's not merged because of missing licenses)
ImmuniApp
AirGap Vault
Threema

Are there any other projects that are working on this that I'm not aware of? It would probably be a good idea to collect the available information somewhere. It looks like Telegram and the NL Covid 19 were able to do it. If we can bring together enough projects/developers, maybe we could reach out to apple to let them know that there is a need for this and it would be great if they could make it easier.

@rpc31

rpc31 commented Dec 11, 2021

Any news on this topic? Telegram has since been able to do it. Isn't this topic supposed to be crucial for an app like Signal that journalists and activists are encouraged to use? Are they supposed to just trust the Signal team not to change the code that is then submitted to Apple?

@Cerberus0
Contributor

Cerberus0 commented Feb 1, 2022

Hello everyone,

This issue tracker is no longer used for feature requests, as stated in the project's contribution guidelines:

The GitHub issue tracker is not used for feature requests, but new ideas can be submitted and discussed on the community forum. The purpose of this issue tracker is to track bugs in the iOS client. Bug reports should only be submitted for existing functionality that does not work as intended. Comments that are relevant and concise will help the developers solve issues more quickly.

Please continue this discussion on the community forum:

https://community.signalusers.org/t/add-reproducible-builds-to-the-app/20516

You can easily join the forum with your existing GitHub account. Thanks!

@stale

stale bot commented Apr 2, 2022

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the wontfix label Apr 2, 2022
@JimLoose

JimLoose commented Apr 4, 2022

^ bump - there is no reason to close this issue! It's still a big one though

@stale stale bot removed the wontfix label Apr 4, 2022
@rpc31

rpc31 commented Apr 4, 2022

Exactly. I don't even understand how Signal is not prioritizing this. It's the pinnacle of the open source / trust foundation that they expect to build with users and experts.

@stale

stale bot commented Jun 3, 2022

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the wontfix label Jun 3, 2022
@massar
Author

massar commented Jun 3, 2022

Still relevant, and likely for a while, keep!

@stale stale bot removed the wontfix label Jun 3, 2022
@Cerberus0
Contributor

Cerberus0 commented Jul 13, 2022

Please ignore the stale bot, it does not know the difference between bugs and feature requests because this is no longer the right place to submit or discuss feature requests. Please let this GitHub issue be closed and continue this discussion on the community forum:

https://community.signalusers.org/t/add-reproducible-builds-to-the-app/20516

Thanks!

@stale

stale bot commented Oct 11, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 11, 2022
@JimLoose

^ bump - there is no reason to close this issue! It's still a big one though

^

@stale stale bot removed the stale label Oct 12, 2022
@EvanHahn-Signal
Contributor

As was mentioned, we'd like to use GitHub for bug reports, not feature requests. Please use this community forum thread to discuss this idea further, and the community forums to discuss other feature requests in general.

@EvanHahn-Signal EvanHahn-Signal closed this as not planned Oct 12, 2022
@signalapp signalapp locked as off-topic and limited conversation to collaborators Oct 12, 2022