Add Multipart support #2
# This is the 1st commit message:
Write the irods-S3-bridge. This is the initial release, and thus limited in fairly many important ways. Right now it particularly needs attention to support for different payload signature schemes, as it does not handle streaming ones at the moment. It also does not yet work with the iRODS S3 resource as a client.
# This is the commit message #2:
Add boost::url
Continue working on getObject
# This is the commit message #3:
Initial commit
# This is the commit message #4:
initial goals
# This is the commit message #5:
Add some debug branches to make sure the request discrimination logic is correct
# This is the commit message #6:
progress
# This is the commit message #7:
Add bucket resolver (placeholder for now)
# This is the commit message #8:
Various fixes. Getobject works properly
# This is the commit message #9:
Write some more of the S3 authentication stuff.
# This is the commit message #10:
write handle_listobjects_v2
# This is the commit message #11:
Working on the authentication stuff
# This is the commit message #12:
Steps towards authentication
# This is the commit message #13:
Big changes
* Move authentication code out of main.cpp
* Write hex_encode function
* Authentication works now
* Use a library that is not openssl for signature verification
# This is the commit message #14:
Progress towards growing a plugin interface
Noting here the potential need for the following ...
|
initial design... 20240104
sequenceDiagram
participant client as S3 Client
participant s3 as S3 API
participant irods as iRODS
client ->>+ s3: CreateMultipartUpload
activate client
s3 ->>+ irods: initiate_parallel_transfer
irods ->>- s3: replica_token
s3 ->>- client: uploadID
deactivate client
par Part 1
client ->>+ s3: UploadPart_1
activate client
s3 ->>+ irods: part_1
irods ->>- s3: response
s3 ->>- client: response
and Part 2
client ->>+ s3: UploadPart_2
s3 ->>+ irods: part_2
irods ->>- s3: response
s3 ->>- client: response
and Part 3
client ->>+ s3: UploadPart_3
s3 ->>+ irods: part_3
irods ->>- s3: response
s3 ->>- client: response
deactivate client
end
client ->>+ s3: CompleteMultipartUpload
activate client
s3 ->>+ irods: complete_parallel_transfer
irods ->>- s3: response
s3 ->>- client: response
deactivate client
|
Yesterday we discussed different options (originally listed on UGM2023 slides)...
a. Multiobject - write all parts individually to iRODS, then complete triggers copy/concatenate/whatever
b. Store-and-forward - write it all down in the bridge, then send it to iRODS
c. Efficient store-and-forward - write down / hold non-contiguous parts in bridge - send contiguous parts to iRODS when ready
d. Store-and-register - write it all down where iRODS can see it, then just register it in iRODS
e. ugm2023 persistence layer - efficient ring buffers in the middle for use of space and restart...
S3 protocol works for Amazon because concatenate() is FREE - AWS just stores its objects as a 'list' of parts, transparently.
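To make option (c) concrete, here is a minimal sketch in Python (chosen for brevity; the bridge itself is C++). It assumes parts are numbered consecutively from 1 with no gaps and that a part is never re-uploaded once it has been forwarded; neither assumption comes from this thread. `ContiguousForwarder` and `sink` are hypothetical names, and `sink` only stands in for a sequential write to iRODS.

```python
class ContiguousForwarder:
    """Hold out-of-order parts; forward the contiguous prefix as soon as it exists."""

    def __init__(self, sink):
        self._sink = sink        # hypothetical: callable taking bytes, stands in for an iRODS write
        self._pending = {}       # part_number -> bytes for parts that arrived early
        self._next_part = 1      # assumption: parts are numbered 1, 2, 3, ... with no gaps

    def upload_part(self, part_number, data):
        if part_number < self._next_part:
            # Already sent downstream; this sketch cannot replace it.
            raise ValueError(f"part {part_number} was already forwarded")
        self._pending[part_number] = data
        # Drain every part that is now contiguous with what has been forwarded.
        while self._next_part in self._pending:
            self._sink(self._pending.pop(self._next_part))
            self._next_part += 1

    def complete(self):
        # Anything still held means an earlier part never arrived.
        if self._pending:
            raise ValueError(f"missing part {self._next_part}")


forwarded = bytearray()
forwarder = ContiguousForwarder(forwarded.extend)
forwarder.upload_part(2, b"BB")   # held: part 1 has not arrived yet
forwarder.upload_part(1, b"AA")   # forwards part 1, then the held part 2
forwarder.upload_part(3, b"CC")   # contiguous, forwarded immediately
forwarder.complete()
```

Only the parts that arrive ahead of a gap occupy space in the bridge, which is the saving over plain store-and-forward (b).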
Zoey/Hao approach
|
I don't think the SQLite part is in scope at the moment. As for the multipart checklist, all items have been implemented except ListParts, ListMultipartUploads, and AbortMultipartUpload. I will bump this to 0.3.0. |
or we close it for 0.2.0... and make a new one for any remaining work? |
b) store-and-forward has been implemented for 0.2.0 |
GitHub gives us a button to export bullet list items as new issues. |
We've created new issues for remaining multipart-related tasks. Closing this issue. |
Here is my best attempt at a sequence diagram for the current behavior
sequenceDiagram
participant client as S3 Client
participant s3 as S3 API
participant thread_pool as S3 API Thread Pool
participant irods as iRODS
client ->>+ s3: CreateMultipartUpload
activate client
s3 ->> s3: generate uploadID
s3 ->>- client: uploadID
deactivate client
par Part 1
client ->>+ s3: UploadPart_1
activate client
s3 ->> s3: write part to disk
s3 ->>- client: response
and Part 2
client ->>+ s3: UploadPart_2
s3 ->> s3: write part to disk
s3 ->>- client: response
deactivate client
end
client ->>+ s3: CompleteMultipartUpload
activate client
s3 ->>+ thread_pool: write part 1 upload task to thread pool
s3 ->>+ thread_pool: write part 2 upload task to thread pool
par Thread 1
thread_pool ->>+ irods: stream bytes for part 1
irods ->>- thread_pool: response
and Thread 2
thread_pool ->>+ irods: stream bytes for part 2
irods ->>- thread_pool: response
end
thread_pool ->>- s3: complete
s3 ->>- client: response
deactivate client
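
The completion step in the diagram above can be sketched roughly as follows. This is a minimal Python sketch (the bridge itself is C++), assuming each part is written at a byte offset equal to the combined size of the parts before it; the thread does not spell out how offsets are handled. `part_offsets`, `write_part`, and `complete_upload` are hypothetical names, and a local file stands in for the iRODS data object.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def part_offsets(sizes):
    """Byte offset of each part: the running total of the sizes before it."""
    return list(accumulate(sizes, initial=0))[:-1]


def write_part(part_path, dest_path, offset):
    # Each task opens its own handles so seek positions never interfere across threads.
    with open(part_path, "rb") as src, open(dest_path, "r+b") as dst:
        dst.seek(offset)
        dst.write(src.read())


def complete_upload(part_paths, dest_path, workers=4):
    sizes = [os.path.getsize(p) for p in part_paths]
    # Pre-size the destination so every offset is valid before any worker starts.
    with open(dest_path, "wb") as dst:
        dst.truncate(sum(sizes))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(write_part, path, dest_path, offset)
            for path, offset in zip(part_paths, part_offsets(sizes))
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    # Stand-ins for the parts that UploadPart wrote to disk.
    for number, payload in enumerate([b"aaa", b"bbb", b"cc"], start=1):
        path = os.path.join(tmp, f"part_{number}")
        with open(path, "wb") as f:
            f.write(payload)
        paths.append(path)
    dest = os.path.join(tmp, "object")
    complete_upload(paths, dest)
    with open(dest, "rb") as f:
        assembled = f.read()
```

Because the offsets depend only on part sizes, the parts can be streamed in any order and in parallel once CompleteMultipartUpload arrives.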
|
The 'current' behavior being |