Implement mkdir #202

Merged
merged 5 commits into from
Apr 27, 2023

Contributor

@passaro passaro commented Apr 11, 2023

Address #77 by creating local-only directories:

  • mkdir will create a new empty directory in the file system, but not affect the content of the bucket (i.e. no "directory marker" is created in the bucket).
  • If a file is created under the new directory or a nested directory, the local directory will revert to a normal, implicit directory.

The new behavior is implemented using the same mechanism as for writing files: a directory is created in the `WriteStatus::LocalUnopened` state and added to its parent's `writing_files` set. When a new file is uploaded and its state transitions to `Remote` in `finish_writing`, we also traverse its ancestors and transition any local directories to `Remote`.
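
The creation half of this mechanism can be sketched in isolation. The following is a minimal, self-contained illustration and not the PR's code: `Tree`, `Dir`, `Ino`, `ROOT` and the `mkdir` signature are hypothetical names chosen for the example, and the set of in-progress children is called `writing_children` here, following the snippets quoted in the review below.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical inode number type and root constant, for illustration only.
type Ino = u64;
const ROOT: Ino = 1;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WriteStatus {
    /// Exists only in the local file system; nothing in the bucket backs it.
    LocalUnopened,
    /// Backed by at least one object in the bucket.
    Remote,
}

#[derive(Debug)]
struct Dir {
    write_status: WriteStatus,
    children: HashMap<String, Ino>,
    /// Children that exist locally but have not reached the bucket yet.
    writing_children: HashSet<Ino>,
}

impl Dir {
    fn new(write_status: WriteStatus) -> Self {
        Dir {
            write_status,
            children: HashMap::new(),
            writing_children: HashSet::new(),
        }
    }
}

struct Tree {
    dirs: HashMap<Ino, Dir>,
    next_ino: Ino,
}

impl Tree {
    fn new() -> Self {
        let mut dirs = HashMap::new();
        // The root is treated as remote in this sketch.
        dirs.insert(ROOT, Dir::new(WriteStatus::Remote));
        Tree { dirs, next_ino: ROOT + 1 }
    }

    /// Create a local-only directory under `parent`.
    /// Returns `None` if the parent is unknown or the name is already taken.
    fn mkdir(&mut self, parent: Ino, name: &str) -> Option<Ino> {
        let ino = self.next_ino;
        let parent_dir = self.dirs.get_mut(&parent)?;
        if parent_dir.children.contains_key(name) {
            return None;
        }
        parent_dir.children.insert(name.to_owned(), ino);
        // Track the new directory the same way an unfinished file is tracked.
        parent_dir.writing_children.insert(ino);
        self.next_ino += 1;
        self.dirs.insert(ino, Dir::new(WriteStatus::LocalUnopened));
        Some(ino)
    }
}
```

Nothing in this sketch touches the bucket, which matches the description above: the directory exists only in local state until a file beneath it finishes uploading.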


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

@passaro passaro temporarily deployed to PR integration tests April 11, 2023 10:25 — with GitHub Actions Inactive
Contributor

@dannycjones dannycjones left a comment

Just some very high level comments. Would like @monthonk's feedback here since he's just explored the local-only changes with readdir.

Comment on lines 364 to 369
let existing = self.lookup(client, dir, name).await;
match existing {
    Ok(lookup) => return Err(InodeError::FileAlreadyExists(lookup.inode.ino())),
    Err(InodeError::FileDoesNotExist) => (),
    Err(e) => return Err(e),
}
Contributor

i imagine you plan to change this anyway, but i assume we need new InodeError types

Contributor Author

Not too sure about new errors: wouldn't `FileAlreadyExists` already cover directories as well? Even if we wanted to differentiate, I think they would map to the same error code anyway.

Contributor

Agree it'll return the same error.

I'm just wondering how we can make the error type's meaning clearer. Maybe a Rustdoc on it, or renaming "FileAlreadyExists" as "PathAlreadyExists" and "FileDoesNotExist" as "EntryDoesNotExist".
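
For illustration, the renaming and Rustdoc suggestion could look roughly like the sketch below. It is hypothetical: the variant names `PathAlreadyExists` and `EntryDoesNotExist` come from the suggestion above and are not confirmed as what was merged, `to_errno` is an invented helper, and the errno values are the Linux ones, inlined so the example needs no external crate.

```rust
// Linux errno values, inlined for the example (an assumption about the target platform).
const ENOENT: i32 = 2;
const EEXIST: i32 = 17;
const ENOTDIR: i32 = 20;

// Hypothetical alias for an inode number.
type InodeNo = u64;

#[derive(Debug, PartialEq, Eq)]
enum InodeError {
    /// A file *or* directory with this name already exists under the parent.
    /// Carries the inode number of the existing entry.
    PathAlreadyExists(InodeNo),
    /// No file or directory with this name exists under the parent.
    EntryDoesNotExist,
    /// The inode was expected to be a directory but is not.
    NotADirectory(InodeNo),
}

impl InodeError {
    /// Map the error to the code returned to the caller.
    /// Files and directories share `EEXIST`, so a separate variant per kind
    /// would not change what the caller observes.
    fn to_errno(&self) -> i32 {
        match self {
            InodeError::PathAlreadyExists(_) => EEXIST,
            InodeError::EntryDoesNotExist => ENOENT,
            InodeError::NotADirectory(_) => ENOTDIR,
        }
    }
}
```

The doc comments carry the "covers directories too" meaning, so the clarification does not depend on adding new variants.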

Comment on lines 379 to 387
// Check again for the child now that the parent is locked, since we might have lost to a
// racing lookup. (It would be nice to lock the parent and *then* lookup, but we'd have to
// hold that lock across the remote API calls).
let InodeKindData::Directory { children, .. } = &mut parent_state.kind_data else {
    return Err(InodeError::NotADirectory(dir));
};
if let Some(inode) = children.get(name) {
    return Err(InodeError::FileAlreadyExists(inode.ino()));
}
Contributor

I think we may need this code also for rmdir and unlink. So if it makes sense, we may want to have this as a reusable method.
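
One way such a reusable check might be factored is sketched below. This is an assumption-laden illustration rather than the project's code: `children_of`, `ensure_vacant` and `ensure_present` are hypothetical helper names, the types are simplified stand-ins for the ones in the snippet above, and the caller is assumed to already hold the parent's lock and pass in the locked state.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real types, for illustration only.
type InodeNo = u64;

#[derive(Debug, PartialEq, Eq)]
enum InodeError {
    FileAlreadyExists(InodeNo),
    FileDoesNotExist,
    NotADirectory(InodeNo),
}

enum InodeKindData {
    File,
    Directory { children: HashMap<String, InodeNo> },
}

struct InodeState {
    kind_data: InodeKindData,
}

/// Borrow the children of an already-locked parent, or fail if it is not a directory.
fn children_of(
    parent_ino: InodeNo,
    state: &InodeState,
) -> Result<&HashMap<String, InodeNo>, InodeError> {
    match &state.kind_data {
        InodeKindData::Directory { children } => Ok(children),
        InodeKindData::File => Err(InodeError::NotADirectory(parent_ino)),
    }
}

/// For create-style operations (e.g. mkdir): the name must not be taken.
fn ensure_vacant(parent_ino: InodeNo, state: &InodeState, name: &str) -> Result<(), InodeError> {
    match children_of(parent_ino, state)?.get(name) {
        Some(ino) => Err(InodeError::FileAlreadyExists(*ino)),
        None => Ok(()),
    }
}

/// For remove-style operations (e.g. rmdir, unlink): the name must exist.
fn ensure_present(
    parent_ino: InodeNo,
    state: &InodeState,
    name: &str,
) -> Result<InodeNo, InodeError> {
    children_of(parent_ino, state)?
        .get(name)
        .copied()
        .ok_or(InodeError::FileDoesNotExist)
}
```

Both helpers share the "is the parent a directory" check, so the re-check after locking is written once and the two call sites differ only in which outcome counts as an error.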

@dannycjones dannycjones requested a review from monthonk April 11, 2023 11:01
Contributor

@monthonk monthonk left a comment

Looks good, only have a few comments but I think you can continue this path.

        children: Default::default(),
        writing_children: Default::default(),
    },
    write_status: WriteStatus::LocalUnopened,
Contributor

Not sure what status we should be using, but we might want to update this to WriteStatus::Remote when the first file is created.

Member

Yeah, this transition is the tricky one to think through -- a local directory becomes remote when any child (including recursive children) finishes writing. So the finish_writing path on a file probably needs to walk up the directory tree until it finds an already remote directory?

Contributor Author

Transition from local to remote implemented in finish_writing of nested files.
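
The walk described in this thread can be sketched as follows. It is a simplified model, not the merged implementation: `promote_ancestors`, `Dir` and `Ino` are hypothetical names, locking is omitted, and the tree is a plain map from inode number to directory state.

```rust
use std::collections::HashMap;

// Hypothetical inode number type; ROOT_INODE_NO mirrors the constant quoted in the review.
type Ino = u64;
const ROOT_INODE_NO: Ino = 1;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WriteStatus {
    LocalUnopened,
    Remote,
}

struct Dir {
    parent: Ino,
    write_status: WriteStatus,
}

/// After a file under `parent` has finished uploading, mark every local
/// ancestor as remote. The walk starts at `parent` and stops at the first
/// directory that is already remote (or at the root).
/// Returns the inodes whose status changed, nearest ancestor first.
fn promote_ancestors(dirs: &mut HashMap<Ino, Dir>, parent: Ino) -> Vec<Ino> {
    let mut promoted = Vec::new();
    let mut current = parent;
    while let Some(dir) = dirs.get_mut(&current) {
        if dir.write_status == WriteStatus::Remote {
            // Everything above an already-remote directory is remote too.
            break;
        }
        dir.write_status = WriteStatus::Remote;
        promoted.push(current);
        if current == ROOT_INODE_NO {
            break;
        }
        current = dir.parent;
    }
    promoted
}
```

Stopping at the first remote ancestor keeps the cost proportional to the depth of the local-only chain rather than the depth of the whole path.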

@passaro passaro temporarily deployed to PR integration tests April 11, 2023 17:26 — with GitHub Actions Inactive
Member

@jamesbornholt jamesbornholt left a comment

Couple of things:

  • How do we think readdir should work on local directories? Can we skip the ListObjects call in those cases?
  • How to think about concurrent mutations -- what if the remote side creates a new object that overlaps the local directory?

Contributor

@sauraank sauraank left a comment

Just wanted to know, where are these shell commands like mkdir, touch etc mapped to these methods? Is it in the fuse?

@passaro passaro had a problem deploying to PR integration tests April 18, 2023 13:29 — with GitHub Actions Failure
@passaro passaro added this to the Beta (Write support) milestone Apr 18, 2023
@passaro passaro linked an issue Apr 18, 2023 that may be closed by this pull request
@passaro passaro had a problem deploying to PR integration tests April 20, 2023 15:33 — with GitHub Actions Failure
@passaro passaro had a problem deploying to PR integration tests April 21, 2023 09:08 — with GitHub Actions Error
@passaro passaro temporarily deployed to PR integration tests April 21, 2023 09:15 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 24, 2023 07:40 — with GitHub Actions Inactive
@passaro passaro marked this pull request as ready for review April 24, 2023 08:08
Contributor

@monthonk monthonk left a comment

I like your approach on the test! added a few more comments.

@passaro passaro temporarily deployed to PR integration tests April 24, 2023 17:08 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests April 25, 2023 14:39 — with GitHub Actions Inactive
monthonk previously approved these changes Apr 25, 2023
Contributor

@monthonk monthonk left a comment

LGTM

Comment on lines +587 to +634
let ancestor = self.inner.get(ancestor_ino)?;
ancestors.push(ancestor.clone());
if ancestor.ino() == ROOT_INODE_NO
    || ancestor.inner.sync.read().unwrap().write_status == WriteStatus::Remote
{
    break;
}
ancestor_ino = ancestor.parent();
Member

Might be worth a little bit of infinite loop paranoia here: track a set of seen ancestors and assert! that we never visit an ancestor we've already seen. That should be impossible today, but can imagine it changing if we ever did symlinks, for example.
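
A minimal sketch of the suggested guard, under simplifying assumptions: the tree is reduced to a hypothetical `parents` map from child inode to parent inode, and `ancestors_of` is an invented name. The final commit list mentions a "check for cycles", but its exact form is not shown here.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical inode number type; ROOT_INODE_NO mirrors the constant quoted above.
type Ino = u64;
const ROOT_INODE_NO: Ino = 1;

/// Collect the chain of inodes from `start` up to the root.
/// Panics if the same inode is visited twice, which would otherwise
/// make the walk loop forever.
fn ancestors_of(parents: &HashMap<Ino, Ino>, start: Ino) -> Vec<Ino> {
    let mut seen = HashSet::new();
    let mut chain = Vec::new();
    let mut current = start;
    loop {
        // `insert` returns false if the value was already present.
        assert!(seen.insert(current), "cycle detected at inode {}", current);
        chain.push(current);
        if current == ROOT_INODE_NO {
            break;
        }
        match parents.get(&current) {
            Some(&parent) => current = parent,
            None => break, // unknown inode: stop rather than guess
        }
    }
    chain
}
```

The `seen` set costs one hash insert per level, which is small next to the lock acquisitions the real walk performs.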

} => {
    writing_children.remove(&inode.ino());

    // traverse ancestors from parent to first remote ancestor
Member

This comment says what, not why.

Comment on lines 599 to 600
// acquire locks on ancestors first
// ancestors_states goes from first remote ancestor to parent
Member

Comment should say why
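
The locking order these comments describe can be illustrated with the sketch below, where the code comment also states a reason. Two caveats: the types and the names `lock_top_down` and `promote` are hypothetical, and the stated reason (keeping one consistent lock order to avoid deadlocks) is a common motivation for such ordering, not something this thread confirms.

```rust
use std::sync::{Arc, RwLock, RwLockWriteGuard};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WriteStatus {
    LocalUnopened,
    Remote,
}

// Simplified stand-in for an inode with lockable state.
struct Inode {
    name: &'static str,
    state: RwLock<WriteStatus>,
}

/// `bottom_up` lists ancestors as the walk found them: parent first, then
/// grandparent, and so on. Locks are taken in the reverse order, top-most
/// ancestor first.
///
/// Why (assumed): if every code path locks a directory before its
/// descendants, two threads cannot each hold a lock the other is waiting for.
fn lock_top_down(
    bottom_up: &[Arc<Inode>],
) -> Vec<(&'static str, RwLockWriteGuard<'_, WriteStatus>)> {
    bottom_up
        .iter()
        .rev()
        .map(|inode| (inode.name, inode.state.write().unwrap()))
        .collect()
}

/// Lock all ancestors top-down, mark them remote, and report the lock order.
fn promote(bottom_up: &[Arc<Inode>]) -> Vec<&'static str> {
    let mut guards = lock_top_down(bottom_up);
    let mut order = Vec::new();
    for (name, guard) in guards.iter_mut() {
        **guard = WriteStatus::Remote;
        order.push(*name);
    }
    order
    // All guards are released here, when `guards` goes out of scope.
}
```

Collecting the guards into a `Vec` before mutating means every ancestor is locked before any status changes, so no other thread can observe a partially promoted chain.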

@passaro passaro temporarily deployed to PR integration tests April 26, 2023 08:15 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt merged commit 330c632 into awslabs:main Apr 27, 2023
sauraank pushed a commit to sauraank/mountpoint-s3 that referenced this pull request Apr 27, 2023
* Implement mkdir to create local directories

Signed-off-by: Alessandro Passaro <[email protected]>

* Acquire locks from top to bottom in finish_writing

Signed-off-by: Alessandro Passaro <[email protected]>

* Use mount time for local directory stat

Signed-off-by: Alessandro Passaro <[email protected]>

* Add section for `mkdir` to SEMANTICS.md

Signed-off-by: Alessandro Passaro <[email protected]>

* Improve comments and add check for cycles

Signed-off-by: Alessandro Passaro <[email protected]>

---------

Signed-off-by: Alessandro Passaro <[email protected]>
Signed-off-by: sauraank <[email protected]>
@passaro passaro deleted the mkdir branch April 28, 2023 09:22
passaro added a commit to passaro/mountpoint-s3 that referenced this pull request Feb 5, 2025
Submodule mountpoint-s3-crt-sys/crt/aws-c-auth 5bc67797..b513db4b:
  > A bunch of CMake fixes (awslabs#258)
  > Add Account Id to Credentials (awslabs#260)
  > Skip Transfer-Encoding from signing (awslabs#261)
Submodule mountpoint-s3-crt-sys/crt/aws-c-cal fbbe2612..7299c6ab:
  > Fix Findcrypto.cmake (awslabs#205)
  > A bunch of CMake fixes (awslabs#203)
  > Switch CI to use roles (awslabs#202)
Submodule mountpoint-s3-crt-sys/crt/aws-c-common 7a6f5df2..0e7637fa:
  > A bunch of CMake fixes (awslabs#1178)
  > Fix heap overflow on uri parsing (awslabs#1185)
  > (take 2) Detect when AVX is disabled via OSXSAVE (awslabs#1184)
  > Fixup IPv6 validation logic (awslabs#1180)
  > Detect when AVX is disabled via OSXSAVE (awslabs#1182)
  > proof_ci.yaml must use latest upload-artifact (awslabs#1183)
  > change PR template to ask for clearer wording (awslabs#1177)
Submodule mountpoint-s3-crt-sys/crt/aws-c-compression c6c1191e..f951ab2b:
  > A bunch of CMake fixes (awslabs#72)
  > Switch CI to use roles (awslabs#71)
  > chore: Modified bug issue template to add checkbox to report potential regression. (awslabs#69)
Submodule mountpoint-s3-crt-sys/crt/aws-c-http fc3eded2..590c7b59:
  > A bunch of CMake fixes (awslabs#497)
  > Fix CI for GCC-13 on Ubuntu-18  (awslabs#496)
  > Switch CI to use roles (awslabs#494)
Submodule mountpoint-s3-crt-sys/crt/aws-c-io fcb38c80..3041dabf:
  > A bunch of CMake fixes (awslabs#701)
  > Event Loop & Socket Type Multi-Support (awslabs#692)
  > fix typo in log message (awslabs#702)
  > Fix CI for GCC-13 on Ubuntu-18 (awslabs#700)
  > Switch CI to use roles (awslabs#698)
Submodule mountpoint-s3-crt-sys/crt/aws-c-s3 a3b401bf..6eb8be53:
  > A bunch of CMake fixes (awslabs#480)
  > S3Express CreateSession Allowlist Headers (awslabs#492)
  > Auto - Update S3 Ruleset & Partition (awslabs#491)
Submodule mountpoint-s3-crt-sys/crt/aws-c-sdkutils 1ae8664f..ba6a28fa:
  > A bunch of CMake fixes (awslabs#50)
Submodule mountpoint-s3-crt-sys/crt/aws-checksums 3e4101b9..fb8bd0b8:
  > A bunch of CMake fixes (awslabs#101)
  > Switch CI to use roles (awslabs#100)
Submodule mountpoint-s3-crt-sys/crt/aws-lc ffd6fb71..138a6ad3:
  > Prepare AWS-LC v1.44.0 (#2153)
  > Fix issue with ML-DSA key parsing (#2152)
  > Add support for PKCS7_set/get_detached (#2134)
  > Prepare Docker image for CI integration jobs (#2126)
  > Delete OpenVPN mainline patch from our integration build (#2149)
  > SHA3/SHAKE Init Updates via FIPS202 API layer (#2101)
  > Support keypair calculation for PQDSA PKEY (#2145)
  > Optimize x86/aarch64 MD5 implementation (#2137)
  > Check for MIPSEB in target.h (#2143)
  > Ed25519ph and Ed25519ctx Support (#2120)
  > Support for ML-DSA public key generation from private key (#2142)
  > Avoid mixing SSE and AVX in XTS-mode AVX512 implementation (#2140)
  > Remove remaining support for Trusty and Fuchsia operating systems (#2136)
  > ACVP test harness for ML-DSA (#2127)
  > Minor symbols to work with Ruby's mainline (#2132)

Signed-off-by: Alessandro Passaro <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Feb 5, 2025
Update the CRT libraries to the latest releases. In particular, include:
* S3Express CreateSession Allowlist Headers
([awslabs/aws-c-s3#492](awslabs/aws-c-s3#492))

<details>
  <summary>Full CRT changelog:</summary>
  
```
Submodule mountpoint-s3-crt-sys/crt/aws-c-auth 5bc67797..b513db4b:
  > A bunch of CMake fixes (#258)
  > Add Account Id to Credentials (#260)
  > Skip Transfer-Encoding from signing (#261)
Submodule mountpoint-s3-crt-sys/crt/aws-c-cal fbbe2612..7299c6ab:
  > Fix Findcrypto.cmake (#205)
  > A bunch of CMake fixes (#203)
  > Switch CI to use roles (#202)
Submodule mountpoint-s3-crt-sys/crt/aws-c-common 7a6f5df2..0e7637fa:
  > A bunch of CMake fixes (#1178)
  > Fix heap overflow on uri parsing (#1185)
  > (take 2) Detect when AVX is disabled via OSXSAVE (#1184)
  > Fixup IPv6 validation logic (#1180)
  > Detect when AVX is disabled via OSXSAVE (#1182)
  > proof_ci.yaml must use latest upload-artifact (#1183)
  > change PR template to ask for clearer wording (#1177)
Submodule mountpoint-s3-crt-sys/crt/aws-c-compression c6c1191e..f951ab2b:
  > A bunch of CMake fixes (#72)
  > Switch CI to use roles (#71)
  > chore: Modified bug issue template to add checkbox to report potential regression. (#69)
Submodule mountpoint-s3-crt-sys/crt/aws-c-http fc3eded2..590c7b59:
  > A bunch of CMake fixes (#497)
  > Fix CI for GCC-13 on Ubuntu-18  (#496)
  > Switch CI to use roles (#494)
Submodule mountpoint-s3-crt-sys/crt/aws-c-io fcb38c80..3041dabf:
  > A bunch of CMake fixes (#701)
  > Event Loop & Socket Type Multi-Support (#692)
  > fix typo in log message (#702)
  > Fix CI for GCC-13 on Ubuntu-18 (#700)
  > Switch CI to use roles (#698)
Submodule mountpoint-s3-crt-sys/crt/aws-c-s3 a3b401bf..6eb8be53:
  > A bunch of CMake fixes (#480)
  > S3Express CreateSession Allowlist Headers (#492)
  > Auto - Update S3 Ruleset & Partition (#491)
Submodule mountpoint-s3-crt-sys/crt/aws-c-sdkutils 1ae8664f..ba6a28fa:
  > A bunch of CMake fixes (#50)
Submodule mountpoint-s3-crt-sys/crt/aws-checksums 3e4101b9..fb8bd0b8:
  > A bunch of CMake fixes (#101)
  > Switch CI to use roles (#100)
Submodule mountpoint-s3-crt-sys/crt/aws-lc ffd6fb71..138a6ad3:
  > Prepare AWS-LC v1.44.0 (#2153)
  > Fix issue with ML-DSA key parsing (#2152)
  > Add support for PKCS7_set/get_detached (#2134)
  > Prepare Docker image for CI integration jobs (#2126)
  > Delete OpenVPN mainline patch from our integration build (#2149)
  > SHA3/SHAKE Init Updates via FIPS202 API layer (#2101)
  > Support keypair calculation for PQDSA PKEY (#2145)
  > Optimize x86/aarch64 MD5 implementation (#2137)
  > Check for MIPSEB in target.h (#2143)
  > Ed25519ph and Ed25519ctx Support (#2120)
  > Support for ML-DSA public key generation from private key (#2142)
  > Avoid mixing SSE and AVX in XTS-mode AVX512 implementation (#2140)
  > Remove remaining support for Trusty and Fuchsia operating systems (#2136)
  > ACVP test harness for ML-DSA (#2127)
  > Minor symbols to work with Ruby's mainline (#2132)
```
</details>


### Does this change impact existing behavior?

No.

### Does this change need a changelog entry? Does it require a version change?

No.

---

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and I agree to the terms of
the [Developer Certificate of Origin
(DCO)](https://developercertificate.org/).

Signed-off-by: Alessandro Passaro <[email protected]>
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for mkdir
5 participants