This repository has been archived by the owner on Oct 22, 2021. It is now read-only.
generated from beyondstorage/go-service-example
Add gdrive for go-storage design #14
Merged
5 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
- Author: (fill me in with `name <mail>`, e.g., Xuanwo <[email protected]>) | ||
- Start Date: (fill me in with today's date, YYYY-MM-DD) | ||
- RFC PR: [beyondstorage/go-service-gdrive#0](https://github.com/beyondstorage/go-service-gdrive/issues/0) | ||
- Tracking Issue: [beyondstorage/go-service-gdrive#0](https://github.com/beyondstorage/go-service-gdrive/issues/0) | ||
|
||
# RFC-0: <proposal name> | ||
|
||
- Updates: (delete this part if not applicable) | ||
- [RFC-20](./20-abc): Deletes something | ||
- Updated By: (delete this part if not applicable) | ||
- [RFC-10](./10-do-be-do-be-do): Adds something | ||
- [RFC-1000](./1000-lalala): Deprecates this RFC | ||
|
||
## Background | ||
|
||
Explain why we are doing this. | ||
|
||
Related issues and early discussions can be linked, but the RFC should try to be self-contained if possible. | ||
|
||
## Proposal | ||
|
||
<proposal's content> | ||
|
||
## Rationale | ||
|
||
<proposal's rationale content, other implementations> | ||
|
||
Possible content: | ||
|
||
- Design Principles | ||
- Drawbacks | ||
- Alternative implementations and comparison | ||
- Possible Q&As | ||
|
||
## Compatibility | ||
|
||
<proposal's compatibility statement> | ||
|
||
## Implementation | ||
|
||
Explain what steps should be done to implement this proposal. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
- Author: Jun [email protected] | ||
- Start Date: 2021-07-18 | ||
- RFC PR: [beyondstorage/go-service-gdrive#14](https://github.com/beyondstorage/go-service-gdrive/issues/14) | ||
- Tracking Issue: [beyondstorage/go-service-gdrive#15](https://github.com/beyondstorage/go-service-gdrive/issues/15) | ||
|
||
# RFC-14: Gdrive for go-storage design | ||
|
||
## Background | ||
|
||
The Google Drive API has many notions that differ from those of `go-storage`, which we briefly discussed in [Gdrive use FileId to manipulate data instead of file name #11](https://github.com/beyondstorage/go-service-gdrive/issues/11). I would like to start an RFC so that we can make the design clearer. | ||
|
||
In the Google Drive API, `FileID` is the key attribute of a file (or directory): data is manipulated by `FileID` rather than by path. A path carries little weight in gdrive, and files with the same name can be created in the same location. In other words, paths can be duplicated in gdrive, which causes problems for our path-based API. | ||
|
||
## Proposal | ||
|
||
**We manually stipulate that every path is unique.** | ||
|
||
When users call `Write` on an existing file, we update its content instead of creating another file with the same name. | ||
|
||
**We will do a conversion between path and `FileID`.** | ||
|
||
In this way, every path can be converted to a `FileID`, which lets us bridge the `go-storage` API and the gdrive API. | ||
|
||
**We will cache `path -> id` in memory with TTL.** | ||
|
||
For performance reasons, we will cache the ids of the files as they are created, and we will only look up their ids when the cache expires. | ||
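A minimal sketch of such a cache, assuming a plain map guarded by a mutex with a per-entry expiry time. The names `idCache`, `newIDCache`, `Set` and `Get` are hypothetical and not part of `go-storage` or the Drive client; a real implementation might use a dedicated cache library instead.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cacheEntry holds one cached FileID and the moment it stops being valid.
type cacheEntry struct {
	id       string
	expireAt time.Time
}

// idCache is a hypothetical path -> FileID cache with a fixed TTL.
type idCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]cacheEntry
}

func newIDCache(ttl time.Duration) *idCache {
	return &idCache{ttl: ttl, entries: make(map[string]cacheEntry)}
}

// Set records the FileID for a path and restarts its TTL.
func (c *idCache) Set(path, id string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[path] = cacheEntry{id: id, expireAt: time.Now().Add(c.ttl)}
}

// Get returns the cached FileID, or ok == false when the entry is
// missing or expired. Expired entries are dropped on access.
func (c *idCache) Get(path string) (id string, ok bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, found := c.entries[path]
	if !found {
		return "", false
	}
	if !time.Now().Before(e.expireAt) {
		delete(c.entries, path)
		return "", false
	}
	return e.id, true
}

func main() {
	c := newIDCache(5 * time.Minute)
	c.Set("foo/bar", "id-of-bar")
	id, ok := c.Get("foo/bar")
	fmt.Println(id, ok)
}
```

On a miss (`ok == false`) the caller would fall back to a Drive search and call `Set` with the result.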
|
||
## Implementation | ||
|
||
When users try to call `Write("foo/bar/test.txt")`, we will do this: | ||
|
||
First, we look up the `FileID` of `foo` in the cache, and search for it through the API if the entry is missing or has expired. Then we do the same for `bar` and `test.txt`. Note that when we cannot find the `FileID` of a directory, we do not continue searching its subdirectories. In this case, we consider the file not to exist. | ||
|
||
After that, there are two possibilities: | ||
|
||
When `foo/bar/test.txt` doesn't exist, we create the folders one by one and cache their `FileID`s as we go. | ||
|
||
When `foo/bar/test.txt` already exists, we update its content instead of creating another file. | ||
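The two cases can be sketched against an in-memory stand-in for Drive. Everything here (`fakeDrive`, `find`, `create`, `write`) is hypothetical and only illustrates the create-or-update flow; the real service would call the Drive API and also consult the cache, which is omitted for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// fakeDrive is an in-memory stand-in for Google Drive.
type fakeDrive struct {
	nextID   int
	children map[string]string // "parentID/name" -> FileID
	content  map[string]string // FileID -> content
}

func newFakeDrive() *fakeDrive {
	return &fakeDrive{children: map[string]string{}, content: map[string]string{}}
}

// find returns the FileID of the child called name under parentID.
func (d *fakeDrive) find(parentID, name string) (string, bool) {
	id, ok := d.children[parentID+"/"+name]
	return id, ok
}

// create adds a new entry under parentID and returns its FileID.
func (d *fakeDrive) create(parentID, name string) string {
	d.nextID++
	id := fmt.Sprintf("id-%d", d.nextID)
	d.children[parentID+"/"+name] = id
	return id
}

// write resolves path segment by segment. Once a segment is missing,
// it stops searching and creates the remaining segments instead.
// An existing file keeps its FileID and only has its content replaced.
func write(d *fakeDrive, path, data string) string {
	names := strings.Split(strings.Trim(path, "/"), "/")
	parent := "root"
	missing := false
	for i, name := range names {
		var id string
		found := false
		if !missing {
			id, found = d.find(parent, name)
		}
		if !found {
			missing = true
			id = d.create(parent, name)
		}
		if i == len(names)-1 {
			d.content[id] = data
			return id
		}
		parent = id
	}
	return parent // not reached: Split always yields at least one segment
}

func main() {
	d := newFakeDrive()
	first := write(d, "foo/bar/test.txt", "v1")
	second := write(d, "foo/bar/test.txt", "v2")
	fmt.Println(first == second, d.content[first])
}
```

Writing the same path twice yields the same `FileID`, which is what keeps paths unique on our side even though gdrive itself would allow duplicates.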
|
||
The key function, `pathToId`, can be implemented like this: | ||
|
||
If the file is in the root folder, we just do a simple search using `drive.service.Files.List().Q(searchArgs).Do()`. In the v3 Go client, this call returns a `*drive.FileList`; each element of its `Files` field is a `*drive.File`, whose `Id` field is what we need. | ||
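The RFC does not spell out `searchArgs`. As a sketch, assuming the Drive v3 query syntax (`name = '...'`, `'<id>' in parents`, `trashed = false`) and the `root` alias for the root folder, a hypothetical helper `buildSearchArgs` could look like this:

```go
package main

import (
	"fmt"
	"strings"
)

// buildSearchArgs is a hypothetical helper that builds a Drive query
// matching a non-trashed child called name directly under parentID.
// Backslashes and single quotes are escaped with a backslash.
func buildSearchArgs(parentID, name string) string {
	esc := strings.NewReplacer(`\`, `\\`, `'`, `\'`)
	return fmt.Sprintf("name = '%s' and '%s' in parents and trashed = false",
		esc.Replace(name), esc.Replace(parentID))
}

func main() {
	fmt.Println(buildSearchArgs("root", "demo.txt"))
}
```

Restricting the query to a single parent is what makes the later segment-by-segment lookup possible.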
|
||
If the file path is nested, like `foo/bar/demo.txt`, the lookup is a little more complex. | ||
|
||
First, we get the `FileID` of the directory `foo` as described above, then use this `FileID` to list its contents. This way, we can find a directory named `bar` and its `FileID`. Finally, we repeat the same step for `demo.txt` and get the `FileID` we want. | ||
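A sketch of this walk, with the Drive search abstracted behind a hypothetical `lookupFunc` so the example runs without network access. `pathToID`, `errNotFound`, `fakeTree` and `fakeLookup` are illustrative names, and `root` as the starting `FileID` is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errNotFound = errors.New("file not found")

// lookupFunc stands in for one Drive search: it returns the FileID of
// the child called name directly under the folder parentID.
type lookupFunc func(parentID, name string) (id string, ok bool)

// pathToID resolves a path one segment at a time, starting at the root.
// It stops at the first segment that cannot be found.
func pathToID(path string, lookup lookupFunc) (string, error) {
	id := "root"
	for _, name := range strings.Split(strings.Trim(path, "/"), "/") {
		if name == "" {
			continue // tolerate empty segments such as "foo//bar"
		}
		next, ok := lookup(id, name)
		if !ok {
			return "", fmt.Errorf("%q: %w", path, errNotFound)
		}
		id = next
	}
	return id, nil
}

// fakeTree maps "parentID/name" to a FileID for demonstration only.
var fakeTree = map[string]string{
	"root/foo":        "id-foo",
	"id-foo/bar":      "id-bar",
	"id-bar/demo.txt": "id-demo",
}

func fakeLookup(parentID, name string) (string, bool) {
	id, ok := fakeTree[parentID+"/"+name]
	return id, ok
}

func main() {
	fmt.Println(pathToID("foo/bar/demo.txt", fakeLookup))
}
```

Each segment costs one search, so a path with N segments needs up to N API calls when nothing is cached.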
|
If the path is complex, like `a/b/c/d/e/f/h/q.txt`, it looks like we need to repeat the search many times. How about caching `path -> id` in memory with a TTL?
ping @junaire for a look.
OK, I'll work on it. However, I don't have much experience with this, so I would like to ask whether we should use a third-party library like go-cache, or simply use a map from the standard library?
To simplify the implementation, using a map with a Mutex is OK for now.
We can implement the TTL logic later with https://github.com/dgraph-io/ristretto or other libs.