Use HeadObject for lookup #69

Merged
jorajeev merged 1 commit into main from lookup-head-object
Feb 6, 2023


jamesbornholt
Member

Our current lookup does two concurrent ListObjects requests. After
thinking about it a bit more carefully, one of them can be replaced with
a cheaper, faster HeadObject request. The "unsuffixed" request we were
doing was purely to discover whether an object of the exact looked-up
name existed, which is what HeadObject does. Switching to HeadObject
reduces the request costs of a lookup.

One disadvantage of HeadObject shows up when looking up directories. The
unsuffixed ListObjects we're replacing here could discover a common
prefix and return it immediately without waiting for the other request
to complete. But in practice, the two requests were dispatched
concurrently, so the customer still paid for both requests, and the
latency was the minimum latency of two concurrent ListObjects requests.
HeadObject cannot see common prefixes, so a directory lookup now has to
wait for both requests: its latency will be the maximum of a concurrent
ListObjects and HeadObject.
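
To make the request pattern concrete, here is a minimal sketch of the new lookup. It is not the actual code from this change: `MockBucket`, `head_object`, `list_objects_any`, and the `Lookup` enum are hypothetical stand-ins, scoped threads stand in for the real concurrent requests, and letting an exact-key match win over a directory is an assumption made only for this sketch.

```rust
use std::collections::BTreeSet;
use std::thread;

/// What a lookup can resolve a name to (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Lookup {
    File,
    Directory,
    NotFound,
}

/// Hypothetical in-memory stand-in for a bucket: just a set of keys.
struct MockBucket {
    keys: BTreeSet<String>,
}

impl MockBucket {
    /// Stand-in for HeadObject: succeeds only if the exact key exists.
    fn head_object(&self, key: &str) -> Result<(), &'static str> {
        if self.keys.contains(key) {
            Ok(())
        } else {
            Err("404")
        }
    }

    /// Stand-in for a ListObjects request: is there any key under `prefix`?
    fn list_objects_any(&self, prefix: &str) -> bool {
        self.keys.iter().any(|k| k.starts_with(prefix))
    }
}

/// Issue both requests concurrently and wait for both before deciding.
fn lookup(bucket: &MockBucket, name: &str) -> Lookup {
    let dir_prefix = format!("{name}/");
    let (head, listed) = thread::scope(|s| {
        let head = s.spawn(|| bucket.head_object(name));
        let list = s.spawn(|| bucket.list_objects_any(&dir_prefix));
        (head.join().unwrap(), list.join().unwrap())
    });
    // Every HeadObject error is treated as "no such object", mirroring the
    // temporary error swallowing described in this pull request.
    if head.is_ok() {
        Lookup::File // assumption for this sketch: an exact match wins
    } else if listed {
        Lookup::Directory
    } else {
        Lookup::NotFound
    }
}
```

Because `lookup` joins both threads before deciding, a directory lookup in this sketch costs the slower of the two requests, which is the latency trade-off described above.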

An issue in this change is that we expect HeadObject to return 404 when
doing directory lookups, but right now the way our error types are
structured gives us no way to distinguish 404s from other errors. For
now, I'm just swallowing all errors on the HeadObject request, and I'll
follow up with a broader change to fix our error handling story to make
this work.

This is a partial fix for #12, but in future we can do better for
lookups against objects we've seen before by remembering their type.

Signed-off-by: James Bornholt [email protected]


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jamesbornholt jamesbornholt marked this pull request as ready for review February 1, 2023 23:14
// TODO: 404s currently become client errors, but they are expected when looking
// up a directory, so we just swallow all errors for now. Fix when we model
// service errors correctly.
if let Ok(result) = result.map_err(|e| InodeError::ClientError(e.into())) {
Member


Unclear why you're doing a map_err when you just care about the Ok case. Maybe just warn! with the error code and say we're ignoring the errors for now?

Member Author


Ah, this is just leftover from the old code, which mapped the error so it could return it with ?. I’d prefer not to add the warning since it will be very noisy (every directory lookup) and I’m fast-following this change with the one that handles this case correctly, so this change is really a temporary state.
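
For illustration, the point about `map_err` can be sketched in isolation. This is a hypothetical, simplified example rather than the code under review: `HeadResult` and `file_size_if_present` are made-up names, and the error type is a plain `String`.

```rust
/// Hypothetical stand-in for the result of a successful HeadObject call.
#[derive(Debug)]
struct HeadResult {
    size: u64,
}

/// Treat any HeadObject error as "no such object" for now.
fn file_size_if_present(result: Result<HeadResult, String>) -> Option<u64> {
    // No `map_err` is needed here: only the `Ok` case is inspected, and
    // the error value is dropped without being converted.
    if let Ok(head) = result {
        Some(head.size)
    } else {
        None
    }
}
```

Mapping the error only matters when it is propagated, for example with `?`, which is what the old code did.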

@jorajeev jorajeev merged commit 2c2c23c into main Feb 6, 2023
@jamesbornholt jamesbornholt deleted the lookup-head-object branch February 6, 2023 18:28
jamesbornholt added a commit that referenced this pull request Feb 14, 2023
Our current ObjectClient allows each implementer to provide its own
error types for each request. This is nice and flexible, but prevents
callers of an ObjectClient from being generic if they want to detect
common service errors like NoSuchKey -- they must know the concrete
error type of the particular client they're using to match on these
errors. We've been getting away with this until #69, where we need to be
able to distinguish (expected) 404 errors from other errors on
HeadObject.

This change refactors ObjectClient to provide a common service error
type for each operation. ObjectClients now return an error that is
*either* a modeled service error like NoSuchKey *or* a client-specific
error. This allows callers to be generic over the ObjectClient and still
discriminate on the interesting error types, where by "interesting" I
mean things I think it's likely a caller might want to know about.
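
As a rough sketch of the shape described above, and not necessarily the names or definitions used in the actual commit, an error type that is either a modeled service error or a client-specific error could look like this. `ObjectClientError`, `HeadObjectError`, and `object_exists` are illustrative assumptions.

```rust
/// Either a modeled service error `S` or a client-specific error `C`.
#[derive(Debug, PartialEq)]
enum ObjectClientError<S, C> {
    ServiceError(S),
    ClientError(C),
}

/// Modeled service errors for HeadObject (only one case in this sketch).
#[derive(Debug, PartialEq)]
enum HeadObjectError {
    NotFound,
}

/// A caller that is generic over the client-specific error type `C` but can
/// still tell an expected "not found" apart from every other failure.
fn object_exists<C>(
    result: Result<(), ObjectClientError<HeadObjectError, C>>,
) -> Result<bool, C> {
    match result {
        Ok(()) => Ok(true),
        // Expected when the looked-up name is a directory or is missing.
        Err(ObjectClientError::ServiceError(HeadObjectError::NotFound)) => Ok(false),
        // Anything else is a real failure and is passed to the caller.
        Err(ObjectClientError::ClientError(e)) => Err(e),
    }
}
```

With a type like this, the lookup path no longer needs to swallow every HeadObject error: it can match on the not-found case and propagate the rest.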

The diff was getting pretty big so I'm splitting this into two commits.
This is Part 1, which just does the refactoring, but doesn't change our
S3CrtClient to return the new modeled service errors. That means this
change shouldn't cause any functional change -- every error will be a
client error, like it was before this commit.  I'll follow up with Part
2 that adds the service errors to S3CrtClient (so does XML parsing etc).

Signed-off-by: James Bornholt <[email protected]>
passaro added a commit to passaro/mountpoint-s3 that referenced this pull request Feb 5, 2025
Submodule mountpoint-s3-crt-sys/crt/aws-c-auth 5bc67797..b513db4b:
  > A bunch of CMake fixes (awslabs#258)
  > Add Account Id to Credentials (awslabs#260)
  > Skip Transfer-Encoding from signing (awslabs#261)
Submodule mountpoint-s3-crt-sys/crt/aws-c-cal fbbe2612..7299c6ab:
  > Fix Findcrypto.cmake (awslabs#205)
  > A bunch of CMake fixes (awslabs#203)
  > Switch CI to use roles (awslabs#202)
Submodule mountpoint-s3-crt-sys/crt/aws-c-common 7a6f5df2..0e7637fa:
  > A bunch of CMake fixes (awslabs#1178)
  > Fix heap overflow on uri parsing (awslabs#1185)
  > (take 2) Detect when AVX is disabled via OSXSAVE (awslabs#1184)
  > Fixup IPv6 validation logic (awslabs#1180)
  > Detect when AVX is disabled via OSXSAVE (awslabs#1182)
  > proof_ci.yaml must use latest upload-artifact (awslabs#1183)
  > change PR template to ask for clearer wording (awslabs#1177)
Submodule mountpoint-s3-crt-sys/crt/aws-c-compression c6c1191e..f951ab2b:
  > A bunch of CMake fixes (awslabs#72)
  > Switch CI to use roles (awslabs#71)
  > chore: Modified bug issue template to add checkbox to report potential regression. (awslabs#69)
Submodule mountpoint-s3-crt-sys/crt/aws-c-http fc3eded2..590c7b59:
  > A bunch of CMake fixes (awslabs#497)
  > Fix CI for GCC-13 on Ubuntu-18  (awslabs#496)
  > Switch CI to use roles (awslabs#494)
Submodule mountpoint-s3-crt-sys/crt/aws-c-io fcb38c80..3041dabf:
  > A bunch of CMake fixes (awslabs#701)
  > Event Loop & Socket Type Multi-Support (awslabs#692)
  > fix typo in log message (awslabs#702)
  > Fix CI for GCC-13 on Ubuntu-18 (awslabs#700)
  > Switch CI to use roles (awslabs#698)
Submodule mountpoint-s3-crt-sys/crt/aws-c-s3 a3b401bf..6eb8be53:
  > A bunch of CMake fixes (awslabs#480)
  > S3Express CreateSession Allowlist Headers (awslabs#492)
  > Auto - Update S3 Ruleset & Partition (awslabs#491)
Submodule mountpoint-s3-crt-sys/crt/aws-c-sdkutils 1ae8664f..ba6a28fa:
  > A bunch of CMake fixes (awslabs#50)
Submodule mountpoint-s3-crt-sys/crt/aws-checksums 3e4101b9..fb8bd0b8:
  > A bunch of CMake fixes (awslabs#101)
  > Switch CI to use roles (awslabs#100)
Submodule mountpoint-s3-crt-sys/crt/aws-lc ffd6fb71..138a6ad3:
  > Prepare AWS-LC v1.44.0 (#2153)
  > Fix issue with ML-DSA key parsing (#2152)
  > Add support for PKCS7_set/get_detached (#2134)
  > Prepare Docker image for CI integration jobs (#2126)
  > Delete OpenVPN mainline patch from our integration build (#2149)
  > SHA3/SHAKE Init Updates via FIPS202 API layer (#2101)
  > Support keypair calculation for PQDSA PKEY (#2145)
  > Optimize x86/aarch64 MD5 implementation (#2137)
  > Check for MIPSEB in target.h (#2143)
  > Ed25519ph and Ed25519ctx Support (#2120)
  > Support for ML-DSA public key generation from private key (#2142)
  > Avoid mixing SSE and AVX in XTS-mode AVX512 implementation (#2140)
  > Remove remaining support for Trusty and Fuchsia operating systems (#2136)
  > ACVP test harness for ML-DSA (#2127)
  > Minor symbols to work with Ruby's mainline (#2132)

Signed-off-by: Alessandro Passaro <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Feb 5, 2025
Update the CRT libraries to the latest releases. In particular, include:
* S3Express CreateSession Allowlist Headers
([awslabs/aws-c-s3#492](awslabs/aws-c-s3#492))

<details>
  <summary>Full CRT changelog:</summary>
  
```
Submodule mountpoint-s3-crt-sys/crt/aws-c-auth 5bc67797..b513db4b:
  > A bunch of CMake fixes (#258)
  > Add Account Id to Credentials (#260)
  > Skip Transfer-Encoding from signing (#261)
Submodule mountpoint-s3-crt-sys/crt/aws-c-cal fbbe2612..7299c6ab:
  > Fix Findcrypto.cmake (#205)
  > A bunch of CMake fixes (#203)
  > Switch CI to use roles (#202)
Submodule mountpoint-s3-crt-sys/crt/aws-c-common 7a6f5df2..0e7637fa:
  > A bunch of CMake fixes (#1178)
  > Fix heap overflow on uri parsing (#1185)
  > (take 2) Detect when AVX is disabled via OSXSAVE (#1184)
  > Fixup IPv6 validation logic (#1180)
  > Detect when AVX is disabled via OSXSAVE (#1182)
  > proof_ci.yaml must use latest upload-artifact (#1183)
  > change PR template to ask for clearer wording (#1177)
Submodule mountpoint-s3-crt-sys/crt/aws-c-compression c6c1191e..f951ab2b:
  > A bunch of CMake fixes (#72)
  > Switch CI to use roles (#71)
  > chore: Modified bug issue template to add checkbox to report potential regression. (#69)
Submodule mountpoint-s3-crt-sys/crt/aws-c-http fc3eded2..590c7b59:
  > A bunch of CMake fixes (#497)
  > Fix CI for GCC-13 on Ubuntu-18  (#496)
  > Switch CI to use roles (#494)
Submodule mountpoint-s3-crt-sys/crt/aws-c-io fcb38c80..3041dabf:
  > A bunch of CMake fixes (#701)
  > Event Loop & Socket Type Multi-Support (#692)
  > fix typo in log message (#702)
  > Fix CI for GCC-13 on Ubuntu-18 (#700)
  > Switch CI to use roles (#698)
Submodule mountpoint-s3-crt-sys/crt/aws-c-s3 a3b401bf..6eb8be53:
  > A bunch of CMake fixes (#480)
  > S3Express CreateSession Allowlist Headers (#492)
  > Auto - Update S3 Ruleset & Partition (#491)
Submodule mountpoint-s3-crt-sys/crt/aws-c-sdkutils 1ae8664f..ba6a28fa:
  > A bunch of CMake fixes (#50)
Submodule mountpoint-s3-crt-sys/crt/aws-checksums 3e4101b9..fb8bd0b8:
  > A bunch of CMake fixes (#101)
  > Switch CI to use roles (#100)
Submodule mountpoint-s3-crt-sys/crt/aws-lc ffd6fb71..138a6ad3:
  > Prepare AWS-LC v1.44.0 (#2153)
  > Fix issue with ML-DSA key parsing (#2152)
  > Add support for PKCS7_set/get_detached (#2134)
  > Prepare Docker image for CI integration jobs (#2126)
  > Delete OpenVPN mainline patch from our integration build (#2149)
  > SHA3/SHAKE Init Updates via FIPS202 API layer (#2101)
  > Support keypair calculation for PQDSA PKEY (#2145)
  > Optimize x86/aarch64 MD5 implementation (#2137)
  > Check for MIPSEB in target.h (#2143)
  > Ed25519ph and Ed25519ctx Support (#2120)
  > Support for ML-DSA public key generation from private key (#2142)
  > Avoid mixing SSE and AVX in XTS-mode AVX512 implementation (#2140)
  > Remove remaining support for Trusty and Fuchsia operating systems (#2136)
  > ACVP test harness for ML-DSA (#2127)
  > Minor symbols to work with Ruby's mainline (#2132)
```
</details>


### Does this change impact existing behavior?

No.

### Does this change need a changelog entry? Does it require a version change?

No.

---

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and I agree to the terms of
the [Developer Certificate of Origin
(DCO)](https://developercertificate.org/).

Signed-off-by: Alessandro Passaro <[email protected]>