PKCS#7 encryption versus W3C standard? #18
Comments
Here are answers from Artem to Taehyun. You should be able to talk to him directly once he has access to this repo.
Yes, you are right: the W3C Encryption standard describes a more general padding mechanism, and PKCS#7 is in fact a special case of it. PKCS#7 is the de facto standard for block cipher padding, so it will be enough in most cases, but I agree that we can implement the padding scheme from the W3C document to avoid problems in the future. Most crypto libraries do not support this padding scheme, but we can implement it on our own. In fact it looks like the ISO 10126 padding standard, which was withdrawn in 2007.
Yes, we know the original size of some fields, but usually if the data size is a multiple of the block size we add one dummy block; otherwise the decryptor will read the last byte, use it as the padding byte count, and remove that many bytes from the end. The decryptor has no way to find out that the encrypted data has no padding, except in CBC-CTS mode. Correct me if I'm wrong. |
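A minimal sketch of the behaviour described above, assuming a 16-byte AES block. The function names `w3c_pad` and `w3c_unpad` are hypothetical and not taken from the LCP code base; they only illustrate why padding is added even to block-aligned data, and what goes wrong when a decryptor strips "padding" from data that was never padded.

```python
import os

BLOCK_SIZE = 16  # AES block size in bytes


def w3c_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append n-1 arbitrary bytes plus one final byte holding n (1 <= n <= block_size)."""
    n = block_size - (len(data) % block_size)  # never 0, so padding is always added
    return data + os.urandom(n - 1) + bytes([n])


def w3c_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Read the last byte as the padding length and strip that many bytes."""
    if not padded:
        raise ValueError("empty input")
    n = padded[-1]
    if not 1 <= n <= block_size or n > len(padded):
        raise ValueError("invalid padding length")
    return padded[:-n]


# Block-aligned input still gains one full padding block.
aligned = bytes(range(100, 116))  # 16 bytes
padded = w3c_pad(aligned)
print(len(padded), padded[-1])
print(w3c_unpad(padded) == aligned)

# Hazard: data that was never padded but ends in a plausible length byte.
never_padded = bytes(15) + b"\x04"
print(len(w3c_unpad(never_padded)))  # real data is silently truncated
```

The last call shows the ambiguity: without out-of-band knowledge, the decryptor cannot tell a length byte from a data byte.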
Yes, W3C document explains ISO 10126, but it does not use the term ISO any longer.
If we already know the data is a multiple of the block size, it can be decrypted by a decryptor with the ‘nopad’ option, even though it has no extra padding. In this case, the decryptor regards the padding length as 0, with no padding data. And we have to remember that if the encryption algorithm is a block cipher, ‘Encryption/content_key/encrypted_value’ of the LCP license is used only for the encrypted key value, whose size is always a multiple of the block size. |
My own answers, taken from our Slack channel: About 2: I do not have the same interpretation of the relevant sentence:
The language here is unclear about content whose (fixed) size is already a multiple of the block size, as it just assumes the content is arbitrary. But, given that our profiles do not require the EPUB resources and the content key in the license document to be encrypted with the same algorithm (and thus they are not required to have matching block lengths), it might be safer to require padding there as well. |
You're right that PKCS#7 is a subset of the W3C padding algorithm, so you may keep it as it is on the server side. But on the client side, if you use PKCS#7 for the padding, a resource encrypted with the W3C padding algorithm would not decrypt. I think this needs to be made clear in the specification.
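The asymmetry can be sketched as follows. All function names are hypothetical, the block size is assumed to be 16 bytes, and a fixed filler byte (0xAA) stands in for the arbitrary bytes the W3C scheme allows, purely to keep the example deterministic.

```python
BLOCK_SIZE = 16  # assumed AES block size in bytes


def pkcs7_pad(data: bytes) -> bytes:
    """PKCS#7: every padding byte equals the padding length n."""
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([n]) * n


def w3c_pad_fixed_filler(data: bytes, filler: int = 0xAA) -> bytes:
    """W3C style: n-1 arbitrary bytes (here a fixed filler), then the length byte n."""
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([filler]) * (n - 1) + bytes([n])


def pkcs7_unpad_strict(padded: bytes) -> bytes:
    """Strict check: all n trailing bytes must equal n."""
    n = padded[-1]
    if not 1 <= n <= BLOCK_SIZE or padded[-n:] != bytes([n]) * n:
        raise ValueError("bad PKCS#7 padding")
    return padded[:-n]


def w3c_unpad(padded: bytes) -> bytes:
    """Lenient check: only the final length byte is interpreted."""
    n = padded[-1]
    if not 1 <= n <= BLOCK_SIZE or n > len(padded):
        raise ValueError("bad padding length")
    return padded[:-n]


message = b"LCP resource"  # 12 bytes, so n = 4

# A W3C-style unpadder accepts PKCS#7 output.
print(w3c_unpad(pkcs7_pad(message)) == message)

# A strict PKCS#7 unpadder rejects W3C-style output.
try:
    pkcs7_unpad_strict(w3c_pad_fixed_filler(message))
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Some PKCS#7 implementations only look at the last byte and would accept both inputs; the strict variant above is the case the comment warns about.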
How decryption works always depends on the specific encryption algorithm. If an LCP profile uses a stream cipher instead of a block cipher, it doesn't need any kind of padding. If it uses an asymmetric algorithm with OAEP, the decrypter only has to follow the OAEP algorithm. So this is a matter for the profile specification, not the general LCP specification. Therefore, if an LCP profile decides to use any block cipher algorithm specified in 5.2 Block Encryption Algorithms of the W3C document, we need to define clearly whether the Encryption/content_key/encrypted_value part is with or without padding. As you know, the key size of a block cipher is always a multiple of the block size. There will be no harm or malfunction when a block cipher key is encrypted and decrypted without padding. |
We can decrypt data with the no-padding option if it is a multiple of the block size, that's true. But usually padding is added in that case too, unless there are specific requirements. This is standard behavior. And as you mentioned before, we have to remember that we use a block cipher and we decrypt the specific value ‘Encryption/content_key/encrypted_value’. I think that saving 16 bytes (which is the padding in this case) for one specific field is not a reason to complicate the encryption/decryption process and to create additional profile specifications. I guess it may mislead developers and cause errors. |
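To put a number on the trade-off, here is a small sketch that compares the ciphertext size of the content key with and without padding. It assumes a 256-bit (32-byte) content key and a 16-byte AES block, and it ignores any IV stored alongside the ciphertext; `cbc_payload_size` is a hypothetical helper, not an LCP function.

```python
AES_BLOCK_SIZE = 16  # bytes


def cbc_payload_size(plaintext_size: int, padded: bool = True) -> int:
    """Size of the CBC ciphertext (IV not counted) for a given plaintext size."""
    remainder = plaintext_size % AES_BLOCK_SIZE
    if padded:
        # Always-pad schemes add between 1 and 16 bytes.
        return plaintext_size + (AES_BLOCK_SIZE - remainder)
    if remainder:
        raise ValueError("unpadded CBC needs a whole number of blocks")
    return plaintext_size


content_key_size = 32  # assumption: a 256-bit content key
print(cbc_payload_size(content_key_size, padded=True))   # 48
print(cbc_payload_size(content_key_size, padded=False))  # 32
```

The difference is exactly one block, which is the 16 bytes mentioned in the comment above.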
Yes, as I mentioned before, I agree that we need to implement this padding scheme. |
According to the W3C document,
I don't think we should interpret this sentence as saying that padding is always the default. This is a matter of how the implementer interprets it, not a standard default, which may lead to a lack of interoperability in the end. |
I mean that most implementations of padding schemes for block ciphers behave the same way for all data, whether it's a multiple of the block size or not; for example PKCS7, ANSI X.923, etc. We do not have a standard in our case, but I believe that it's better to use the same padding scheme throughout the project, no matter what kind of field we are trying to decrypt. |
Now, I think we have a consensus to always use the ANSI X.923 padding scheme when encrypting a resource, no matter what its size. And this should be specified in a document as normative information. Am I correct? |
I mentioned ANSI X.923 as an example. This standard does not match the W3C document as far as I can see, and if we use it we get the same problem we had with PKCS7. As you wrote before, using PKCS7 (or ANSI X.923) may lead to errors during decryption. So I guess we can specify in a document that the padding scheme from the W3C method should be used, but clarify that padding must always be added. Actually I guess we can interpret it as saying that padding should be added even if the data to encrypt is a multiple of the block size. It's not very clear though:
We can see that N can be equal to B, which is the block size here.
In the last example padding is equal to the block size. |
Yes, you're right. ANSI X.923 is just another subset of the W3C method, because it puts zeros, not random bytes, before the last byte. I misunderstood. So we have to use the term W3C padding method. However, even though the text includes the case where N is B, that is just one special case among all those that can arise when the source size is arbitrary. Padding may still be unnecessary when we know the size is not arbitrary and is fixed to a multiple of the block size. |
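The subset relationship between the three schemes can be sketched with a small classifier that inspects an already padded buffer. The helper name `conforming_schemes` is hypothetical, and a 16-byte block is assumed.

```python
def conforming_schemes(padded: bytes, block_size: int = 16) -> set:
    """Return the padding schemes that the tail of `padded` is consistent with."""
    n = padded[-1]
    if not 1 <= n <= block_size or n > len(padded):
        return set()  # the final byte is not a valid length for any scheme
    filler = padded[-n:-1]  # the n-1 bytes before the length byte
    schemes = {"W3C"}  # W3C places no constraint on the filler
    if all(b == n for b in filler):
        schemes.add("PKCS#7")  # filler repeats the length value
    if all(b == 0 for b in filler):
        schemes.add("ANSI X.923")  # filler is all zeros
    return schemes


body = b"x" * 11  # 11 data bytes leave n = 5 in a 16-byte block

print(sorted(conforming_schemes(body + b"\x05" * 5)))
print(sorted(conforming_schemes(body + b"\x00" * 4 + b"\x05")))
print(sorted(conforming_schemes(body + b"\xde\xad\xbe\xef\x05")))
```

Every buffer that is valid PKCS#7 or ANSI X.923 is also valid W3C padding, but not the other way around. When n is 1 there is no filler at all, so the three schemes coincide.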
Anyway, we need to make a decision about the scheme we will use. As I mentioned before, I think the padding scheme should be self-sufficient, which means that we do not need to know what kind of data we are trying to encrypt or decrypt. For me this means that we have an unambiguous formula, and to apply it we do not need to refer to LCP-specific information. If we look at existing padding standards, like PKCS7 http://tools.ietf.org/html/rfc5652#section-6.3, they have statements like:
Here we have the formula k-(lth mod k), which describes padding without any additional references. I think this is the right way to formulate padding schemes. This is a standardized document which is widely used in cryptography. Ultimately, the only difference between PKCS7 and our padding scheme is that we pad with random data while PKCS7 pads with a fixed value. So I think we can specify our padding byte count with the same formula. |
I am writing a specification for an LCP profile which includes additional encryption algorithms (GCM), a padding method and a unique secret key for the User Key. What do you think about the following description for the padding method? Padding Method
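A quick sketch of the k-(lth mod k) formula, assuming k = 16 for AES. The function name `pad_length` is hypothetical; the point is only that the result never depends on what kind of data is being padded, and is never zero.

```python
def pad_length(data_length: int, k: int = 16) -> int:
    """Number of padding bytes: k - (data_length mod k), always in 1..k."""
    return k - (data_length % k)


for length in (0, 1, 15, 16, 17, 31, 32):
    n = pad_length(length)
    print(f"input={length:2d}  pad={n:2d}  total={length + n}")
```

For block-aligned inputs (0, 16, 32) the formula yields a full block of padding, which is the "always pad" behaviour discussed in this thread.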
|
@thkim2015 the term "shall" seems inappropriate. Is MUST a better conformance requirement, or is the intent to indeed express optionality? (MAY) |
@thkim2015 have you reached a conclusion about CBC (with padding rule) vs. GCM? Any performance gain? Or any particular technical rationale for choosing one or the other for LCP? |
@danielweck padding is necessary for the CBC mode, so it cannot be optional. GCM could be a better choice in terms of padding because it does not need padding. |
@thkim2015 oh sure, I understand that CBC requires padding. I was wondering about choosing between CBC and GCM in the LCP server and client implementations. Right now, I believe it is CBC, which offers better decryption performance. So it looks like we should stick to this current encryption profile, right? |
@danielweck Recently I found that GCM has become the only encryption algorithm in the basic encryption profile. |
I put GCM instead of CBC in the basic profile in light of recent attacks on CBC, and the fact that GCM is now the recommended widespread cipher mode. Of course, that change can be reverted given enough objective evidence. Do you have any data on the performance difference between CBC and GCM, and how impactful it is on the whole performance of rendering an EPUB? In the typical text epub, and in one with large media elements? That could help us justify a decision, one way or another. |
I see, thank you guys for the clarifications. |
@jpbougie What kind of attack on CBC did you see? A padding oracle attack? If so, removing that vulnerability of PKCS#7 is one of the main reasons for the W3C padding method. Regarding rendering performance, theoretically CBC is better than GCM because GCM needs an HMAC operation and more counter calculation work. We also have test results for CBC and GCM on Android and iOS phones. Even though we removed the HMAC verification process for large media resources, since it requires reading the whole data (which is almost impossible), some iOS versions showed lower performance with GCM than with CBC (almost 30% as I remember, although I don't know the reason). I'll upload the detailed test results later. |
We should not forget that it is not in the Readium LCP charter to be the ultimate rock-solid DRM technology. If the performance loss is really 30% with GCM in some instances, the choice will be easy. |
I saw your discussion and decided to add my opinion. As @thkim2015 said, GCM supports an HMAC operation which provides integrity for the ciphertext. It's a so-called authenticated encryption scheme. But since we use the RSA-SHA256 digital signature algorithm, it already provides authenticity of data for us. So I think CBC is a good option to provide confidentiality without performance loss. If performance is very important, though, there is another option: CTR mode. It is faster than CBC because it supports parallel processing, and it can give a great performance gain on modern multi-core processors. |
Please refer to the performance test results for CBC and GCM on the client side. We tested with 400MB and 1GB resources. device (CBC/GCM sec) |
@artem-brazhnikov we use RSA-SHA256 to sign the license itself, which means that we only get authenticity for that file. In the case of GCM, it would provide it for all assets in the container, it's really not the same. @thkim2015 are these decryption time for the whole resource? Seems too fast for the whole resource. |
@HadrienGardeur It is average time per 1MB |
@thkim2015 thanks for the additional info. So even though it can be relatively quite a bit slower, the difference won't really be perceivable, not even for video (1080p video is around 8 Mb/s). GCM is the de facto standard these days for SSL; worst case scenario we'll be on par with that. |
@HadrienGardeur The above result was measured after excluding HMAC processing, so it is closer to CBC vs. CTR than to CBC vs. GCM. And SSL is a protocol which necessarily needs message authentication, so GCM could be a proper algorithm for it. But for normal resource encryption, CBC is still the de facto standard, not GCM. Moreover, LCP has an RSA-SHA256 signature in the License Document. |
@HadrienGardeur Oh yes, you're right, I totally forgot about the file encryption, it was a long time ago. Then GCM is the right choice for you, I guess. |
@thkim2015 why would we need to give up the random access feature with GCM? It's designed to support it; that's part of the reason it uses GHASH instead of a true HMAC. I've already replied to the same comment from @artem-brazhnikov regarding RSA-SHA256: we use it strictly for the license, whereas GCM would cover the resources. The only part where I agree with you is that if we don't need authentication for resources, then CTR is faster. |
Hi experts, we must come to a choice and finalize the readium LCP 1.0 Profile, so that we can begin interop tests shortly. |
Note that I have "ported" @jpbougie 's PR to the new LSD / LCP server architecture. |
Hello all! Currently, both the LCP Go server and C++ client use
Just to clarify: we want to use the exact same padding method in both cases (1) and (2), right? So, we can either continue to use the current CBC padding method, or update our implementations to support another padding scheme. Either way: we need to update the profile specification to explicitly document the type of padding used, right? Do we also need to refine the algorithm URI (which only generically references CBC), or can a decryptor simply look at the encrypted content to discover the padding method used? Any other points of discussion? |
We'll close this issue with the following choice, which will be written in the Readium LCP Encryption Profile 1.0: |
I will add comments in Go and CryptoPP code. |
FYI, client-side comment: readium/readium-lcp-client@7b258d5 |
I am sorry for the late comment. I disagree with the conclusion that the padding should be PKCS#7. According to the discussion in this issue, if we use the CBC algorithm ID, "http://www.w3.org/2001/04/xmlenc#aes256-cbc", we should follow the W3C guideline as I described in the above comment on 8 Jul 2016. And personally, I think the W3C padding method is stronger than PKCS#7 against the padding oracle attack, which is one of the weaknesses of the CBC algorithm. Even though the padding scheme is explained well in the W3C document, IMO the proposed text (of 8 Jul 2016) should be added to the LCP Encryption Profile 1.0 specification as a reminder of the importance of the padding scheme. Alternatively, in the LCP Encryption Profile 1.0, we can use the GCM algorithm even for the Content Key so that padding does not have to be considered. |
@thkim2015 thanks for chiming in! Regarding AES-256-GCM, we have a perfectly fine Go server implementation. But, please see: #109 If DRM-Inside successfully implemented non-authenticated GCM decryption of partial cypher buffers / HTTP byte-range responses, could you please share details? Thank you very much. |
@thkim2015 regarding the use of the URI I am copying the "padding method" explanation you wrote further up in this discussion thread:
Would you be able to create a pull request for the Go server?
On the client side, CryptoPP "natively" handles PKCS (PKCS#7, even though it is internally named as synonymous PKCS#5, see https://github.com/readium/readium-lcp-client/blob/develop/src/third-parties/cryptopp/filters.h#L457 ), and Readium's existing code would need to be updated to handle the W3C padding algorithm (i.e. reading the correct offset inside the padding to determine the length, etc.):
(function So, can anybody create a pull request for this too? Thanks for your contributions! |
@thkim2015 in DRM-Inside client implementation, you are using |
@danielweck Right! ... DRM inside uses <openssl/aes.h>. However, since there is just a little difference between PKCS#7 and the W3C standard, we might be able to make a pull request on the Go server for it. BTW, would you add our company id 'drminside' to the private git 'readium/readium-lcp-server' and 'readium/readium-lcp-client' projects? |
@rkwright can you please add drm-inside? |
I am not sure mailing lists can be added. Will check when I get to NYPL |
organisation: |
@thkim2015 , @danielweck But that is their repo, not adding them to a team with access to a repo. I just checked and tried to add "drm-inside" , but github refuses, saying "drm-inside is not a github team member". I can add the members of the team, but need to do it person by person with their github username. |
have you tried drminside without the hyphen character? |
@rkwright @danielweck We added new e-mail address "[email protected]" for "drminside" github account. So could you add our new e-mail address? |
Did you actually create a "drminside" github user account? That is the key |
@rkwright We have a valid account 'drminside' that you can verify at https://github.com/drminside |
@thkim2015 that worked and I sent an invite. Let me know if that doesn't work. |
@rkwright Now, 'drminside' account can access the lcp-server and lcp-client repo. Thanks. |
Thanks to @drminside @thkim2015 for helping with the client-side implementation of W3C padding :) |
Thanks to @drminside @thkim2015 for also helping with the server-side implementation of W3C padding. |
WebCryptoAPI supports only the PKCS#7 padding scheme for AES-CBC. See section 26, AES-CBC, of https://www.w3.org/TR/WebCryptoAPI. See more: readium#18. PKCS#7 is a subset of the W3C padding scheme, i.e. if a client supports the W3C padding scheme it will be able to decrypt content encrypted using the PKCS#7 padding scheme.
Thanks to @drminside for reporting this! CC @Mantano @mmenu-mantano @edrlab @clebeaupin @TEA-ebook @RemiBauzac
Verbatim quote from Taehyun Kim:
(Note: as the readium/readium-lcp-server repository is currently private, I am not hiding the content server URLs, so the links are quoted below in plain text.)
The encrypted EPUB and license were downloaded from http://www.neovento.net:8989/manage/
Here are test results which differ from our understanding of the W3C encryption standard.
But the W3C Encryption standard requires another method: the last block is filled with arbitrary bytes up to the second-to-last byte, and the padding size is specified by the last byte. (Please see 5.2 Block Encryption Algorithms, padding part.)
PKCS#7 is a stricter padding method than the W3C Encryption standard. So a decryption algorithm expecting PKCS#7 may raise an error when decrypting data padded with the W3C method, but not vice versa.
Encryption/content_key/encrypted_value in the license has padding. That is normal when the original size is arbitrary and not a multiple of the block size. But when the size of the original data is always known to be a multiple of the block size, it doesn’t need padding. (Please see 5.2 Block Encryption Algorithms, padding part.)
And we need to share the root certificate for the interoperability of signature validation.
Apart from these mismatches, I have succeeded in decrypting resources with the content_key in the license.
And if you want to test with encrypted EPUBs from DRM inside, you may download them from
http://www.drminside.com/LCP/
=>Download=>Test Material
The passphrase for all of them is ‘lcptest’
I am not asserting that our understanding is correct. But I am worried that other implementers of the LCP may interpret these parts differently.
Note that Mantano's test file in GitHub ( https://github.com/Mantano/mantano-lcp-client/tree/develop/test/lcp-client-lib/data ) may need replacing with updated versions from http://www.neovento.net:8989/manage/ (just to make sure the provider certificate etc. is up to date)
Also note that DRM-Inside have test files in GitHub (private repository): https://github.com/drminside/readium-drm/tree/master/LCP/TestData