An empty Content-Encoding header is set in object's metadata after uploading the object to s3 with extra argument ChecksumAlgorithm #3629

Closed
ronanliu opened this issue Mar 14, 2023 · 8 comments
Labels
bug This issue is a confirmed bug. p2 This is a standard priority issue s3 service-api This issue is caused by the service API, not the SDK implementation.


ronanliu commented Mar 14, 2023

Describe the bug

This issue is similar to #3568. When upload_file is called with the ChecksumAlgorithm argument, an empty Content-Encoding record is set in the uploaded object's metadata.
[Image: Screen Shot 2023-03-15 at 9 26 14 pm]

Expected Behavior

I would expect no Content-Encoding record to be set in the object's metadata if I don't pass it as an argument when uploading an object.

Current Behavior

Even when no extra argument for Content-Encoding is specified, the object's metadata still has an empty record for this header.

Reproduction Steps

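import boto3

# Reuse the low-level client that backs the resource.
s3 = boto3.resource('s3')
s3_client = s3.meta.client

# No ContentEncoding is requested here: only a checksum algorithm,
# server-side encryption, and a content type.
extra_args = {
    'ChecksumAlgorithm': 'SHA1',
    'ServerSideEncryption': 'AES256',
    'ContentType': 'text/css',
}

# Placeholders: substitute a real local file, bucket, and object key.
s3_client.upload_file(
    'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
    ExtraArgs=extra_args,
)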
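
To confirm the symptom without relying on the console, one option is to read the object back with head_object after the upload and look at the ContentEncoding field. The sketch below is only illustrative: the helper names has_empty_content_encoding and check_object are hypothetical, and it assumes boto3 surfaces the empty header as an empty string under the "ContentEncoding" key, which may not hold in every SDK version.

```python
def has_empty_content_encoding(head_response):
    """Return True if the response has a ContentEncoding key whose value is blank.

    Assumption: an empty Content-Encoding header shows up as "" in the
    parsed head_object response rather than being dropped.
    """
    value = head_response.get("ContentEncoding")
    return value is not None and value.strip() == ""


def check_object(bucket, key):
    """Fetch the object's metadata and report whether Content-Encoding is empty."""
    import boto3  # imported lazily so the pure helper above works without boto3

    s3_client = boto3.client("s3")
    response = s3_client.head_object(Bucket=bucket, Key=key)
    return has_empty_content_encoding(response)
```

For the upload above, this would be called as check_object('BUCKET_NAME', 'OBJECT_NAME') once the upload has finished.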

Possible Solution

No response

Additional Information/Context

No response

SDK version used

1.26.90

Environment details (OS name and version, etc.)

Linux

@ronanliu ronanliu added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Mar 14, 2023
@tim-finnigan tim-finnigan self-assigned this Mar 16, 2023
@tim-finnigan tim-finnigan added investigating This issue is being investigated and/or work is in progress to resolve the issue. s3 and removed needs-triage This issue or PR still needs to be triaged. labels Mar 16, 2023
@tim-finnigan
Contributor

Hi @ronanliu, thanks for reporting this issue. I could reproduce the behavior you described. To clarify how this differs from #3568: the specific issue reported there now appears to be resolved, as the ContentEncoding header now gets set when specified along with ChecksumAlgorithm. However, specifying only a ChecksumAlgorithm using the code snippet in #3568 also adds an empty ContentEncoding header, as you described here. I will mark this issue for further investigation.

@tim-finnigan tim-finnigan added needs-review p2 This is a standard priority issue and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. needs-review labels Mar 16, 2023
@tim-finnigan
Contributor

Upon further discussion with a colleague, we concluded that this is likely an issue with the S3 console. The payloads generated by the code snippets referenced above include 'Content-Encoding': b'aws-chunked' when using a ChecksumAlgorithm. This documentation provides some more context: https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html.
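
For readers who want to see the outgoing header themselves, one possible approach is botocore's before-send event, which hands the prepared request to a handler just before it goes on the wire. This is a sketch under assumptions, not how the payloads in this thread were captured: the names captured, record_content_encoding, and attach_recorder are hypothetical, and the event name targets PutObject only, so a multipart upload would go through other operations.

```python
captured = []  # header values seen so far, in request order


def record_content_encoding(request, **kwargs):
    """before-send handler: note the Content-Encoding header of each request.

    Returns None so botocore sends the request as usual.
    """
    value = request.headers.get("Content-Encoding")
    if isinstance(value, bytes):
        # Header values may arrive as bytes, e.g. b'aws-chunked'.
        value = value.decode("ascii")
    captured.append(value)


def attach_recorder(s3_client):
    """Register the handler on an existing boto3 S3 client."""
    s3_client.meta.events.register(
        "before-send.s3.PutObject", record_content_encoding
    )
```

After calling attach_recorder(s3_client) and running the upload, the captured list holds whatever Content-Encoding value the SDK sent for each PutObject request.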

I created a new issue in our cross-SDK repository (aws/aws-sdk#498) and will reach out to S3 for further review. For updates please refer to that issue going forward. Thanks!

@ronanliu
Author

Hi @tim-finnigan, thanks for looking into it! I'll keep an eye on the new issue.

@tim-finnigan tim-finnigan reopened this Oct 30, 2024
@tim-finnigan tim-finnigan removed their assignment Oct 30, 2024
@tim-finnigan tim-finnigan added the service-api This issue is caused by the service API, not the SDK implementation. label Oct 30, 2024

jmklix commented Oct 30, 2024

Re-opening this issue to track updates here. Closing the old tracking issue.

P83760338 / D119267049


ima747 commented Oct 31, 2024

Since aws/aws-sdk#498 (comment) has been closed in favor of tracking here, I'm raising again that this happens from Boto3 as well, so it is not limited to console uploads as theorized above. There are reports of the same behavior with SDKs for other languages too, implying the cause is more fundamental.

@alex-keeler

Since I haven't seen anyone mention this -- it does appear to be documented behavior.
https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html

"If you specify Content-Encoding in your request as Content-Encoding : aws-chunked, S3 adds an empty value for Content-Encoding and stores the object metadata (Content-Encoding : ) to the resulting object."

The issue is that an empty Content-Encoding header is invalid and causes undefined behavior, e.g. when it conflicts with CloudFront's compression headers.

Just throwing this out there in case it helps locate the root cause.

@ianbotsf

This issue has now been resolved by S3. Going forward, uploading objects via chunked encoding will no longer result in storing or retrieving Content-Encoding: . The service documentation has also been updated to reflect this:

"Amazon S3 stores the resulting object without the aws-chunked value in the content-encoding header. If aws-chunked is the only value that you pass in the content-encoding header, S3 considers the content-encoding header empty and does not return this header when you retrieve the object."
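
The updated documentation quoted above can be read as a small rule on the header value. The following is only a model of that documented behavior for illustration, not S3's implementation; the function name stored_content_encoding is hypothetical, and treating the header as a comma-separated list of tokens is an assumption.

```python
def stored_content_encoding(request_value):
    """Model the documented rule: drop aws-chunked, omit the header if nothing is left.

    Returns the value that would be stored and returned, or None when the
    header would not be returned at all.
    """
    if request_value is None:
        return None
    tokens = [token.strip() for token in request_value.split(",")]
    kept = [t for t in tokens if t and t.lower() != "aws-chunked"]
    # Before the fix, the aws-chunked-only case ended up as an empty
    # Content-Encoding value; under the updated rule the header is absent.
    return ",".join(kept) if kept else None
```

Under this reading, a request sent with only aws-chunked yields no Content-Encoding on retrieval, while aws-chunked combined with gzip keeps gzip.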


This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.

6 participants