Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S3 backend with TUS handler fails to handle PATCH #1067

Closed
pcfreak30 opened this issue Jan 20, 2024 · 10 comments · Fixed by #1020
Closed

S3 backend with TUS handler fails to handle PATCH #1067

pcfreak30 opened this issue Jan 20, 2024 · 10 comments · Fixed by #1020
Labels

Comments

@pcfreak30
Copy link
Contributor

Describe the bug

I have been digging through a lot of routing code and found the s3 backend creates a compound ID at

info.ID = objectId + "+" + multipartId
.

However, pat, the router, only supports alphanumeric characters with :id

Mux:
https://github.com/bmizerany/pat/blob/e4b6760bdd6f5467d6b19ff8f0f85a00938eb64a/mux.go#L236

Register:

mux.Add("PATCH", ":id", http.HandlerFunc(handler.PatchFile))

I see a few fixes:

  1. Don't use compound IDs in the store
  2. Patch pat
  3. Replace pat

I would recommend replacing pat. More people, including me, are using https://github.com/julienschmidt/httprouter, which does not have the same restriction.

I will have to fork this, and I can submit a PR with feedback on how the project maintainers want to proceed on this bug.

To Reproduce

Could you create an embedded TUS server with the handler and a s3 backend?

Expected behavior
It should route the PATCH request for a file properly.

Setup details
Please provide following details, if applicable to your situation:

  • Operating System: Linux
  • Used tusd version: v2.2.2
  • Used tusd data storage: S3
  • Used tusd configuration: N/A
  • Used tus client library: tusc
@pcfreak30 pcfreak30 added the bug label Jan 20, 2024
@Acconut
Copy link
Member

Acconut commented Jan 23, 2024

To be frank, I am not able to follow your description here. We have been running tusd with S3 like this for years without issues, even though URLs regularly contain non-alphanumeric letters like + and _. I think to remember that some S3-compatible (other than AWS S3) might use other letters in their multipart IDs as well. Are you using such a service? Do you have a concrete error message that you are running into?

Mux:
https://github.com/bmizerany/pat/blob/e4b6760bdd6f5467d6b19ff8f0f85a00938eb64a/mux.go#L236

Not sure what this link should show us. It links to an implementation of the Tail method, which is not used by tusd AFAIK. Am I missing something here?

@pcfreak30
Copy link
Contributor Author

To be frank, I am not able to follow your description here. We have been running tusd with S3 like this for years without issues, even though URLs regularly contain non-alphanumeric letters like + and _. I think to remember that some S3-compatible (other than AWS S3) might use other letters in their multipart IDs as well. Are you using such a service? Do you have a concrete error message that you are running into?

Mux:
https://github.com/bmizerany/pat/blob/e4b6760bdd6f5467d6b19ff8f0f85a00938eb64a/mux.go#L236

Not sure what this link should show us. It links to an implementation of the Tail method, which is not used by tusd AFAIK. Am I missing something here?

I used a debugger and followed the code paths.

And I'm currently using Digital Oceans S3 service, so yes non "official"

I have forked tusd and replaced it with httprouter successfully and am using it in development now just fine too.

It's not an error message; it's the fact the router never matches the route due to non-alphanumeric characters, which is a fundamentally low-level problem. That o/c means a 404 and breaking everything.

See main...LumeWeb:tusd:replaced-router

@Acconut
Copy link
Member

Acconut commented Jan 23, 2024

Can you please provide an exemplary upload URL that is returned by tusd when used with DigitalOcean Spaces? Using that, we can attempt to make the routing work. Some related work has already started in the past in #1020, but is still far from being completed.

@pcfreak30
Copy link
Contributor Author

https://DOMAIN/s5/upload/tus/02099f522cd0206742ee00a0f5943c1e+2~-r4HDPO3D0dP9BtlwQbk7KltxE4v0r3?auth_token=X

However, my opinion is to try and replace pat because you then don't even need a middleware hack to sidestep its algorithm.

@Acconut
Copy link
Member

Acconut commented Jan 24, 2024

I tried to reproduce this with DigitalOcean Spaces but uploads were working. Here are the logs to show that it was able to route the URL:

$ go run cmd/tusd/main.go -s3-endpoint=https://fra1.digitaloceanspaces.com -s3-bucket=tusdtest
2024/01/24 12:12:04.336304 Using 'https://fra1.digitaloceanspaces.com/tusdtest' as S3 endpoint and bucket for storage.
2024/01/24 12:12:04.336691 Using 0.00MB as maximum size.
2024/01/24 12:12:04.336762 Supported tus extensions: creation,creation-with-upload,termination,concatenation,creation-defer-length
2024/01/24 12:12:04.336766 Using 0.0.0.0:8080 as address to listen.
2024/01/24 12:12:04.336768 Using /files/ as the base path.
2024/01/24 12:12:04.336783 Using /metrics as the metrics path.
2024/01/24 12:12:04.337088 You can now upload files to: http://[::]:8080/files/
2024/01/24 12:12:07.214735 level=INFO event=RequestIncoming method=OPTIONS path="" requestId=""
2024/01/24 12:12:07.214986 level=INFO event=ResponseOutgoing method=OPTIONS path="" requestId="" status=200 body=""
2024/01/24 12:12:07.216772 level=INFO event=RequestIncoming method=POST path="" requestId=""
2024/01/24 12:12:07.340869 level=INFO event=UploadCreated method=POST path="" requestId="" id=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El id=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El size=88184 url=http://localhost:8080/files/70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El
2024/01/24 12:12:07.340960 level=INFO event=ResponseOutgoing method=POST path="" requestId="" id=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El status=201 body=""
2024/01/24 12:12:07.356294 level=INFO event=RequestIncoming method=OPTIONS path=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El requestId=""
2024/01/24 12:12:07.356410 level=INFO event=ResponseOutgoing method=OPTIONS path=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El requestId="" status=200 body=""
2024/01/24 12:12:07.357957 level=INFO event=RequestIncoming method=PATCH path=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El requestId=""
2024/01/24 12:12:07.387843 level=INFO event=ChunkWriteStart method=PATCH path=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El requestId="" id=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El maxSize=88184 offset=0
2024/01/24 12:12:07.476994 level=INFO event=ChunkWriteComplete method=PATCH path=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El requestId="" id=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El bytesWritten=88184
2024/01/24 12:12:07.507552 level=INFO event=UploadFinished method=PATCH path=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El requestId="" id=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El size=88184
2024/01/24 12:12:07.507597 level=INFO event=ResponseOutgoing method=PATCH path=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El requestId="" id=70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El status=204 body=""

The URL also looked similar to yours: http://localhost:8080/files/70a8dc8076ccf0deec0b93397068ebd7+2~Qz-9TvpBUZI7jhe4GL2DDxgQmplp0El Is there anything I am missing?

That being said, there is a bug resulting in a 404 being returned when trying to GET a completed upload.

@pcfreak30
Copy link
Contributor Author

I reverted the patches I made to test again:
Selection_1818

On a PATCH request to /s5/upload/tus/221e2a31b396eae8eeabc75220e762e8+2~D7JsaTCm27aogDwnYEfX3PZZlzXr6tl?auth_token=eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJleHAiOjE3MDYxOTMwNjQsImlzcyI6InBvcnRhbCIsInN1YiI6Nn0.XNXum8AIq3DgUdfF6xkjMc6bDX4E_7rR-H6oe8Wcqgjps4g4bQWgbXxOU979wPEbuGR3xMV32WPaBsIeTrUqCw (the jwt is on a dev site)

It partially matches the route, but when it goes to pat val, _, i = match(path, matchPart(nextc), i) returns an empty string, and it never fully matches.

You can find my configuration/setup of TUS with this debug test at https://git.lumeweb.com/LumeWeb/portal/src/commit/9002064937ae0f869af09964c5c72694cd10f97f/storage/storage.go#L124

@pcfreak30
Copy link
Contributor Author

Additionally #1071 seems to be a new edge case unrelated to mine (outside being S3) 😅

@Acconut
Copy link
Member

Acconut commented Jan 24, 2024

Alright, this clear the fog. The problem is not that the URL contains non-alphanumeric characters but that in your case the path begin with a slash (`path = "/221e2a31b396eae8eeab..."). But the routes do not expect a starting slash:

mux.Add("PATCH", ":id", http.HandlerFunc(handler.PatchFile))

That's why the route does not match. If I modify tusd to manually add the beginning slash, I get the same 404 errors:

image
image
image

I would recommend you to look into http.StripPrefix to get rid of the beginning slash. That is also done by our example code:

http.Handle("/files/", http.StripPrefix("/files/", handler))

In tusd, we should probably make our logic handle these cases as well. I hope that helps for now.

@pcfreak30
Copy link
Contributor Author

Thanks, ill prob stick to my fork for now because I prefer them all on the same router, and the changes were minor, and I fixed the routing issues that came up after. I had http.StripPrefix in use... but it also caused issues. I'm using a forked version of it now too.

Sorry for the goose chase, though at-least 1 bug fix came from it.

@Acconut Acconut linked a pull request Jan 25, 2024 that will close this issue
@Acconut
Copy link
Member

Acconut commented Jan 25, 2024

Alright, whatever you prefer. I am glad the situation has been cleared up. #1020 contains an improvement which makes tusd not choke on the leading slash. Feel free to try it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants