Log skipped checksum validation only when a checksum header is provided #3387

Closed

Conversation

jonathan343
Contributor

@jonathan343 jonathan343 commented Feb 11, 2025

Related Issue: #3382

Summary

In some cases, such as ranged GetObject requests to S3, ChecksumMode is set to ENABLED by default but S3 doesn't send us a checksum header. If multiple ranged requests are made for a single object, botocore floods the logs with the skipped checksum validation message. Ranged GetObject requests are common for customers using high-level S3 commands that use s3transfer under the hood.

To prevent unwanted logging, we will only log the skipped checksum validation message when an actual checksum header is provided AND we determine we can't perform checksum validation due to an unsupported algorithm.
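
The condition described above can be sketched roughly as follows. This is an illustration only, not botocore's actual implementation: the function name `should_log_skipped_validation` and both constants are hypothetical, and the supported-algorithm list is taken from the log message shown in the example output. As a simplification, any header starting with `x-amz-checksum-` is treated as naming an algorithm.

```python
# Hypothetical names for illustration; not taken from botocore.
CHECKSUM_HEADER_PREFIX = "x-amz-checksum-"
SUPPORTED_ALGORITHMS = ("crc32", "sha1", "sha256")


def should_log_skipped_validation(response_headers, supported=SUPPORTED_ALGORITHMS):
    """Return True only if a checksum header is present and none is supported."""
    # Collect the algorithm suffix of every checksum header in the response.
    present = [
        name.lower()[len(CHECKSUM_HEADER_PREFIX):]
        for name in response_headers
        if name.lower().startswith(CHECKSUM_HEADER_PREFIX)
    ]
    if not present:
        # No checksum header at all (e.g. a ranged GetObject): stay silent.
        return False
    # A header exists, but none of its algorithms can be validated: log.
    return not any(algorithm in supported for algorithm in present)


# Ranged response without any checksum header: nothing to log.
print(should_log_skipped_validation({"Content-Range": "bytes 0-4194303/16777216"}))
# Checksum header with an algorithm outside the supported list: log the skip.
print(should_log_skipped_validation({"x-amz-checksum-crc64nvme": "placeholder"}))
```

Under these assumptions the first call returns False and the second returns True.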

Example

  1. Upload a 16MB file using a multi-part upload and the default CRC32 checksum algorithm. This gets uploaded using two 8MB parts.
  2. Download the 16MB file using four ranged get_object requests of 4MB size. S3 doesn't send us checksum headers for the individual get_object requests, so we log the skipped validation message.
import boto3
import io
from boto3.s3.transfer import TransferConfig, MB

boto3.set_stream_logger("botocore.httpchecksum")


TEST_BUCKET = "aws-example-bucket"
TEST_KEY = "aws-example-file.txt"

client = boto3.client("s3")

transfer_config = TransferConfig(
    multipart_chunksize=8 * MB  # 8MB is the default, setting this to be explicit
)
# Upload a 16MB object using MPU. This will upload using two 8MB parts.
client.upload_fileobj(
    Fileobj=io.BytesIO(b"*" * (16 * MB)),
    Bucket=TEST_BUCKET,
    Key=TEST_KEY,
    Config=transfer_config,
)

# Download a 16MB object using four 4MB ranged get_object requests.
transfer_config.multipart_chunksize = 4 * MB
client.download_file(TEST_BUCKET, TEST_KEY, TEST_KEY, Config=transfer_config)
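
For reference, the arithmetic behind "four 4MB ranged requests" can be sketched as below. The helper `ranged_get_headers` is hypothetical and only illustrates how a 16MB object splits into 4MB byte ranges; s3transfer may format the final range differently.

```python
MB = 1024 * 1024


def ranged_get_headers(object_size, chunk_size):
    """Return the Range header values needed to cover object_size bytes."""
    ranges = []
    for start in range(0, object_size, chunk_size):
        # HTTP byte ranges are inclusive, so the end offset is one less
        # than the start of the next chunk (or the object size).
        end = min(start + chunk_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


for header in ranged_get_headers(16 * MB, 4 * MB):
    print(header)
# bytes=0-4194303
# bytes=4194304-8388607
# bytes=8388608-12582911
# bytes=12582912-16777215
```

Each of these four requests produces one "Skipping checksum validation" line in the output without the PR.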

Output without PR:

2025-02-13 12:46:59,096 botocore.httpchecksum [DEBUG] Skipping checksum validation. Response did not contain one of the following algorithms: ['crc32', 'sha1', 'sha256'].
2025-02-13 12:46:59,102 botocore.httpchecksum [DEBUG] Skipping checksum validation. Response did not contain one of the following algorithms: ['crc32', 'sha1', 'sha256'].
2025-02-13 12:46:59,237 botocore.httpchecksum [DEBUG] Skipping checksum validation. Response did not contain one of the following algorithms: ['crc32', 'sha1', 'sha256'].
2025-02-13 12:46:59,244 botocore.httpchecksum [DEBUG] Skipping checksum validation. Response did not contain one of the following algorithms: ['crc32', 'sha1', 'sha256'].

Output with PR:

  • Nothing is logged since no checksum header (x-amz-checksum-<algo_name>) exists in the response for us to validate.

@codecov-commenter

codecov-commenter commented Feb 11, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (develop@19140a9). Learn more about missing BASE report.
Report is 200 commits behind head on develop.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #3387   +/-   ##
==========================================
  Coverage           ?   93.05%           
==========================================
  Files              ?       66           
  Lines              ?    14481           
  Branches           ?        0           
==========================================
  Hits               ?    13475           
  Misses             ?     1006           
  Partials           ?        0           


@jonathan343
Contributor Author

Closing. The team will be reevaluating improvements on how and when we provide customers information related to checksum validation.
