
feat(fs): add azure blob storage #8415

Open · wants to merge 20 commits into master

Conversation

NadineYasser1

Description


  • Introduced Azure Blob Storage integration
  • Added Azurite installation and usage documentation

Checklist

  • Commit
    • Title follows commit conventions
    • Reference the relevant issue (Fixes #007, See xoa-support#42, See https://...)
    • If bug fix, add Introduced by
  • Changelog
    • If visible by XOA users, add changelog entry
    • Update "Packages to release" in CHANGELOG.unreleased.md
  • PR
    • If UI changes, add screenshots
    • If not finished or not tested, open as Draft

Review process

This two-pass review process aims to:

  • develop skills of junior reviewers
  • limit the workload for senior reviewers
  • limit the number of unnecessary changes by the author
  1. The author creates a PR.
  2. Review process:
    1. The author assigns the junior reviewer.
    2. The junior reviewer conducts their review:
      • Resolves their comments if they are addressed.
      • Adds comments if necessary or approves the PR.
    3. The junior reviewer assigns the senior reviewer.
    4. The senior reviewer conducts their review:
      • If there are no unresolved comments on the PR → merge.
      • Otherwise, we continue with 3.
  3. The author responds to comments and/or makes corrections, and we go back to 2.

Notes:

  1. The author can request a review at any time, even if the PR is still a Draft.
  2. In theory, there should not be more than one reviewer at a time.
  3. The author should not make any changes:
    • When a reviewer is assigned.
    • Between the junior and senior reviews.

@NadineYasser1 requested a review from fbeauchamp on March 6, 2025 at 09:09
this.#container = parts.shift()
this.#dir = join(...parts)
this.#containerClient = this.#blobServiceClient.getContainerClient(this.#container)
this.#createContainer()
Collaborator:

You can't call an async method without an await or a .then(), and you can't call an async method in a constructor. Please move this into _sync().
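
A minimal sketch of that move, assuming the handler overrides the async _sync() hook quoted just below and that #createContainer() wraps the container-creation call (names as in the PR):

async _sync() {
  await super._sync()
  // container creation is async, so it belongs here rather than in the constructor
  await this.#createContainer()
}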

Comment on lines 64 to 65
await super._sync()
await this.#containerClient.createIfNotExists()
Collaborator:

Can you test if this works without retry:

Suggested change:
- await super._sync()
- await this.#containerClient.createIfNotExists()
+ await this.#containerClient.createIfNotExists()
+ await super._sync()

const prefix = path === '/' ? '' : path + '/'
const result = []
for await (const item of this.#containerClient.listBlobsByHierarchy('/', { prefix })) {
const strippedName = item.name.startsWith(`${path}/`) ? item.name.replace(`${path}/`, '') : item.name
Collaborator:

I think item.name will always start with prefix, no?
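
If that holds, the conditional collapses to a plain slice. A sketch, reusing the prefix variable from the snippet above:

// prefix is '' for the root, so slice(prefix.length) is a no-op there
const strippedName = item.name.slice(prefix.length)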

}

const blobClient = this.#containerClient.getBlockBlobClient(file)
const blockCount = Math.ceil(data.length / MAX_BLOCK_SIZE)
Collaborator:

The stream doesn't always have a length property. Please reuse the parameters of s3._outputStream: async _outputStream(path, input, { streamLength, maxStreamLength = streamLength, validator })
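
For reference, a sketch of that signature on this handler, copied from the reviewer's quote of the s3 handler; the body is left to the loop sketched after the next comment:

async _outputStream(path, input, { streamLength, maxStreamLength = streamLength, validator }) {
  // input is a stream: there is no length property to derive a block count from,
  // so the blocks have to be read off the stream one at a time
}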


const start = i * MAX_BLOCK_SIZE
const end = Math.min(start + MAX_BLOCK_SIZE, data.length)
const chunk = data.slice(start, end)
Collaborator:

data is a stream, so you can't use data.slice here. You can use readChunkStrict from the @vates/read-chunk package to get a part of a stream as a buffer.
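
A sketch of that streaming loop, assuming the stageBlock/commitBlockList flow from @azure/storage-blob that the PR appears to use; readChunk is readChunkStrict's sibling from the same package and tolerates a shorter final block:

import { readChunk } from '@vates/read-chunk'

// inside _outputStream(path, input, ...):
const blobClient = this.#containerClient.getBlockBlobClient(path)
const blockIds = []
let blockIndex = 0
let chunk
while ((chunk = await readChunk(input, MAX_BLOCK_SIZE)) !== null) {
  // Azure block ids must be base64 strings of identical length within a blob
  const blockId = Buffer.from(String(blockIndex++).padStart(10, '0')).toString('base64')
  await blobClient.stageBlock(blockId, chunk, chunk.length)
  blockIds.push(blockId)
}
await blobClient.commitBlockList(blockIds)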

@@ -94,6 +111,12 @@ export const format = ({ type, host, path, port, username, password, domain, pro
string = protocol === 'https' ? 's3://' : 's3+http://'
string += `${encodeURIComponent(username)}:${encodeURIComponent(password)}@${host}`
}
if (type === 'azure') {
// used a double slash to separate the path because the password might contain slashes
Collaborator:

Slashes in the password should be encoded by encodeURIComponent.
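
In other words, the azure branch can mirror the s3 branch above. A sketch; the 'azure://' scheme string and the trailing path handling are assumptions, the rest reuses format()'s parameters:

if (type === 'azure') {
  // '/' becomes '%2F' under encodeURIComponent, so an encoded password can
  // never clash with the path separator and the double slash is unnecessary
  string = 'azure://'
  string += `${encodeURIComponent(username)}:${encodeURIComponent(password)}@${host}`
  string += path
}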


// list blobs in container
async _list(path) {
const prefix = path === '/' ? '' : path + '/'
Collaborator:

You can use makePrefix here if you prefer.
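
Sketched, assuming makePrefix is the prefix-normalizing helper the reviewer refers to (one that maps '/' to '' and anything else to path + '/', matching the line it replaces):

async _list(path) {
  const prefix = makePrefix(path)
  // ...rest unchanged
}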

}

async _rmtree(path) {
const iter = this.#containerClient.listBlobsFlat({ prefix: path?.endsWith('/') ? path : `${path}/` })
Collaborator:

You can use makePrefix here too.
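
Same helper here, sketched:

async _rmtree(path) {
  const iter = this.#containerClient.listBlobsFlat({ prefix: makePrefix(path) })
  // ...rest unchanged
}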
