Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve thumbnail reading performance with more direct access #31054

Open
CarlSchwan opened this issue Feb 7, 2022 · 12 comments
Open

Improve thumbnail reading performance with more direct access #31054

CarlSchwan opened this issue Feb 7, 2022 · 12 comments

Comments

@CarlSchwan
Copy link
Member

CarlSchwan commented Feb 7, 2022

Current state

When loading a thumbnail, we do:

  • Authentication DB requests
  • File system setup DB requests
  • Check if the user can read the file
  • Load the thumbnail from the appdata or generate the thumbnail and store it in the appdata
  • Serve the thumbnail

Idea how to fix it

The general idea is that we communicate a secret token with the clients when they fetch the list of files in a folder. This secret token is unique for each file and only valid for 5 min. When requesting a thumbnail, we need to pass this token to the request and then with it we can skip the authorization and authentication, as it was already done during the file listing.

Technical explanation

Add a new dav property that contains for each image a thumbnail_access_token. The thumbnail_access_token is generated with the following expression: hash(fileId + rounded_time + server_secret) where rounded_time is the intval(current_timestamp / 60 / 5) so that validity of the token is only 5 minutes and secret_server is secret from the config.php (an unique identifier for each instance).

When requesting a thumbnail, the clients and the webui would need to provide the thumbnail_access_token that they previously got from the dav request. If the token is valid, we can skip the authentification and the full file system setup call and only do the appdata setup. If it's not valid or for legacy reasons are not set, we do the authentification and the file system setup.

For the web UI, the HTTP header can be set using a web worker. For the mobile and desktop client, it shouldn't be an issue to add an additional header to the requests (I hope)

Advantages

  • Performance: A lot less of DB requests are needed. Optimally none are actually needed as the fileid is already provided by the requests and this allows to find the path in the local storage and s3 storage. But I guess it will be a bit complicated to fight against the framework to not fetch that app config
  • Backward compatibility: it's not a new API, just an optional new parameter. It also means we can add support progressively to the apps and clients
  • Caching friendly: since the token is not part of the URL, the cache won't get invalidated for every request
  • Security: The token is only valid for 5 minutes and after that, a user will need the new token to fetch the thumbnail or be authorized by the server. Apps like access_control will send an invalid token, if the user doesn't have access to the dav requests. The only way for unauthorized users to access the thumbnail would be to get the secret token from the server or that an authorized user share the thumbnail_access_token quickly enough (5 min lifetime) but in that case it's easier to just share the file directly.

Disavantages

  • This will need some adaptions in every client and app that use thumbnails. Even if it can be done progressively
  • ???
@Kuphi

This comment was marked as resolved.

@juliusknorr
Copy link
Member

Makes sense 👍

A few additional thoughts:

  • Taking the etag of the file into the hash might avoid the rather low risk that you can get a thumbnail for a file that was updated after as users access has been revoked
  • files_accesscontrol might still list the file in the PROPFIND even though access is being blocked, just something to double check that the token is only issued when read permissions are actually there

@T-bond
Copy link

T-bond commented Feb 16, 2022

I think the token is valid for 5 min maximum, and most of the time less than 5 min.

For example:
Let the current timestamp be 299 in seconds when the token generated, then the rounded time is 0.
Next second (300) the rounded time is 1, so our token was valid for only 1 seconds this this case.

But I like the idea, as loading the previews are very slow for me on web and on mobile too.

@PVince81
Copy link
Member

This will need some adaptions in every client and app that use thumbnails. Even if it can be done progressively

would be good to also get feedback from @tobiasKaminsky here then

@PVince81
Copy link
Member

PVince81 commented Feb 16, 2022

clients to adjust eventually (raise tickets):

  • viewer app
  • photos app
  • Android files app
  • iOS files app
  • Desktop (Windows with VFS)

@tobiasKaminsky
Copy link
Member

I see some troubles.

Current state on e.g. Android:

  • open a folder with 500 images
  • propfind -> all items are stored in db
  • quit app and navigate back to said folder
  • etag is checked -> same -> no new refresh
  • scroll through 500 images -> as soon as an image is visible -> request thumbnail

With new approach this means:

  • always update all files in db -> heavy load on server and client for maybe no reason (if e.g. all thumbnails are already fetched)
  • scroll through list may not take more than 5min, otherwise we do not benefit from it

@PVince81
Copy link
Member

this seems useful mostly for the web UI:

Because there we do a propfind on the folder every time, so we already setup the FS and at that moment we know the permissions, so when the thumbnails from that folder are requested, we don't need to check permissions.
The background issue we're trying to solve is that currently even when we read only the thumbnail, the whole setupFS logic (mount points, etc) is triggered just to check permissions (files_accesscontrol app, etc). Having the token makes it possible to bypass that logic because we trust that the permissions were already checked.

@tobiasKaminsky
Copy link
Member

What about binding this to eTag of parent folder?

  • client do propfind or even asks for specific thumbnail
  • server grants access to all those files within same folder that are not incoming shares

This then has no implementation on client, but will still help a lot.
Only e.g. in our talk subfolder this is "useless", but not slower than now.
On all other folders, e.g. even with 20% shares, it will be way faster.

@CarlSchwan
Copy link
Member Author

Results of the brainstorming with @tobiasKaminsky

  1. Unlike the WebUI which is doing a WebDAV requests on each page load, the clients are aggressively caching PROPFIND results (a very good thing btw) and only refresh it when the root directory's ETag changes so communicating the direct access token during the PROPFIND might not bring a big improvement as in many cases the token will be outdated :(

  2. Due to ACLs (files_accesscontrol, groupfolders) that can change at any moment and won't update the ETag, we can't really use ETags instead of the 5min time-based approach for the access right expiration. Unless we consider that a user who had access to a file in the past, can also access the preview in the future as long as the ETag/content didn't change (the client will probably still have the preview in its cache anyway if the content didn't change).

  3. To simplify the deployment to the clients, we might want to cache the fact that a user has read access to a file inside Redis instead of a token-based approach so that we don't need to change the clients at all. This might suffer from the same issue as 1. if the token is valid with a time-based approach :/

  4. We also briefly considered adding an endpoint to fetch multiple previews at the same time. This would reduce the number of queries, but this might be tricky to implement correctly from the client side (and probably also on the server side).

@skjnldsv
Copy link
Member

skjnldsv commented Mar 1, 2022

  • This will need some adaptions in every client and app that use thumbnails. Even if it can be done progressively

Good idea to standardize this!
I think we should have proper methods so anyone can get the thumbnail Url from a file.
If we have to copy/paste the method in the front, this would make maintaining this cumbersome. Maybe have this as a dav property? <oc:thumbnailUrl />

@tobiasKaminsky
Copy link
Member

As the url is rather fixed/can be easily computed it it big overhead to have it in every response.
What about adding the available sizes in capabilities?
This way different server can have different max values / steps inbetween.
Clients will then round to next available size.

@skjnldsv
Copy link
Member

skjnldsv commented Mar 8, 2022

Clients will then round to next available size.

We already do, we generate only for 64, 256, 1024 and 4096.
Whatever you request in between is rounded up and capped at 4096

@juliusknorr juliusknorr removed their assignment Jul 25, 2022
@PVince81 PVince81 changed the title Improve thumbnail reading performance Improve thumbnail reading performance with more direct access Aug 15, 2022
@PVince81 PVince81 removed their assignment Jan 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants