Add gdrive for go-storage design #14
@@ -0,0 +1,41 @@
- Author: (fill me in with `name <mail>`, e.g., Xuanwo <[email protected]>)
- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: [beyondstorage/go-storage#0](https://github.com/beyondstorage/go-storage/issues/0)
- Tracking Issue: [beyondstorage/go-storage#0](https://github.com/beyondstorage/go-storage/issues/0)

# GSP-0: <proposal name>

- Updates: (delete this part if not applicable)
  - [GSP-20](./20-abc): Deletes something
- Updated By: (delete this part if not applicable)
  - [GSP-10](./10-do-be-do-be-do): Adds something
  - [GSP-1000](./1000-lalala): Deprecates this RFC

## Background

Explain why we are doing this.

Related issues and early discussions can be linked, but the RFC should try to be self-contained if possible.

## Proposal

<proposal's content>

## Rationale

<proposal's rationale content, other implementations>

Possible content:

- Design Principles
- Drawbacks
- Alternative implementations and comparison
- Possible Q&As

## Compatibility

<proposal's compatibility statement>

## Implementation

Explain what steps should be done to implement this proposal.

@@ -0,0 +1,47 @@
- Author: Jun [email protected]
- Start Date: 2021-07-18
- RFC PR: [beyondstorage/go-service-gdrive#14](https://github.com/beyondstorage/go-service-gdrive/issues/14)
- Tracking Issue: [beyondstorage/go-service-gdrive#15](https://github.com/beyondstorage/go-service-gdrive/issues/15)

# GSP-14: Gdrive for go-storage design

## Background

The Google Drive API has many notions that differ from those of `go-storage`, which we briefly discussed in [Gdrive use FileId to manipulate data instead of file name #11](https://github.com/beyondstorage/go-service-gdrive/issues/11). This RFC aims to make these differences, and how we handle them, clear.

In the Google Drive API, `FileID` is a critical attribute of a file (or directory). We will use it, rather than the path, to manipulate data. A path carries little meaning in gdrive: files with the same name can be created in the same location. In other words, paths can be duplicated in gdrive, which causes problems for our path-based API.

## Proposal

**We stipulate that every path is unique.**

When users call `Write` on an existing file, we update its content instead of creating another file with the same name.

**We will convert between path and `FileID`.**

Every path can then be converted to a `FileID`, which bridges the `go-storage` API and the gdrive API.

**We will cache `path -> id` in memory with a TTL.**

For performance reasons, we will cache the IDs of files as they are created, and only look them up again once the cache entry has expired.
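
A minimal sketch of such a cache, assuming a plain map guarded by a `sync.Mutex` with a per-entry expiry time. The names `idCache`, `newIDCache`, `Get`, `Set`, and `defaultTTL` are hypothetical and not part of `go-storage`; a caching library could replace this later.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultTTL is an arbitrary value chosen for this sketch.
const defaultTTL = time.Minute

// cacheEntry holds a FileID and the moment it stops being trusted.
type cacheEntry struct {
	id        string
	expiresAt time.Time
}

// idCache maps a path to a FileID and forgets entries after ttl.
type idCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]cacheEntry
}

func newIDCache(ttl time.Duration) *idCache {
	return &idCache{ttl: ttl, entries: make(map[string]cacheEntry)}
}

// Set records the FileID for path and restarts its TTL.
func (c *idCache) Set(path, id string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[path] = cacheEntry{id: id, expiresAt: time.Now().Add(c.ttl)}
}

// Get returns the cached FileID, or false if it is missing or expired.
func (c *idCache) Get(path string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[path]
	if !ok {
		return "", false
	}
	if !time.Now().Before(e.expiresAt) {
		// Expired: drop the entry so the caller searches for the ID again.
		delete(c.entries, path)
		return "", false
	}
	return e.id, true
}

func main() {
	c := newIDCache(defaultTTL)
	c.Set("foo/bar", "id-of-bar")
	id, ok := c.Get("foo/bar")
	fmt.Println(id, ok)
}
```

A miss and an expired entry look the same to the caller, so the lookup path only needs one fallback: search gdrive and call `Set` again.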

## Implementation

When users call `Write("foo/bar/test.txt")`, we will do the following.

First, we look up the `FileID` of `foo` in the cache, and search for it through the API if the cache entry has expired. Then we do the same for `bar` and `test.txt`. Note that when we cannot find the `FileID` of a directory, we do not continue searching its subdirectories; in that case, we consider the file not to exist.

After that, there are two possibilities:

If `foo/bar/test.txt` does not exist, we create the folders one by one, caching their `FileID`s as we go.

If `foo/bar/test.txt` already exists, we update its content instead of creating another file.
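
The two cases can be sketched against an in-memory stand-in for gdrive. Everything here (`fakeDrive`, `lookup`, `create`, `write`) is hypothetical and only illustrates the control flow; the cache and the real API calls are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// fakeDrive is a hypothetical in-memory stand-in for gdrive: every node has
// an ID, and children are found by (parent ID, name).
type fakeDrive struct {
	nextID   int
	children map[string]string // "parentID/name" -> FileID
	content  map[string]string // FileID -> file content
}

func newFakeDrive() *fakeDrive {
	return &fakeDrive{children: map[string]string{}, content: map[string]string{}}
}

// lookup returns the FileID of name under parentID, if any.
func (d *fakeDrive) lookup(parentID, name string) (string, bool) {
	id, ok := d.children[parentID+"/"+name]
	return id, ok
}

// create adds a new node under parentID and returns its FileID.
func (d *fakeDrive) create(parentID, name string) string {
	d.nextID++
	id := fmt.Sprintf("id-%d", d.nextID)
	d.children[parentID+"/"+name] = id
	return id
}

// write stores content at path. It reuses every node that already exists and
// creates the missing ones, so a path never maps to two files.
// It reports whether the file itself was newly created.
func (d *fakeDrive) write(path, content string) (created bool) {
	parentID := "root"
	for _, name := range strings.Split(path, "/") {
		id, ok := d.lookup(parentID, name)
		if !ok {
			id = d.create(parentID, name)
		}
		created = !ok // after the loop this refers to the last segment, the file
		parentID = id
	}
	d.content[parentID] = content
	return created
}

func main() {
	d := newFakeDrive()
	fmt.Println(d.write("foo/bar/test.txt", "v1")) // first write creates the file
	fmt.Println(d.write("foo/bar/test.txt", "v2")) // second write updates it
	fmt.Println(len(d.children))                   // foo, bar and test.txt only
}
```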

The key function, `pathToId`, can be implemented like this:

If the file is in the root folder, we do a simple search using `drive.service.Files.List().Q(searchArgs).Do()`. The call returns a `*drive.FileList`; each entry in its `Files` field is a `*drive.File`, whose `Id` field is what we need.
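
As an illustration, the search expression passed to `Q` could be built as below. This is a sketch under assumptions: `buildSearchArgs` and `escapeQuery` are hypothetical helpers, the expression uses the Drive v3 query terms `name`, `in parents`, and `trashed`, and `root` is used as the alias for the root folder. The escaping of `\` and `'` follows my reading of the Drive search documentation and should be checked against it.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeQuery escapes backslashes and single quotes so that s can be placed
// inside a single-quoted literal of a Drive search expression.
func escapeQuery(s string) string {
	return strings.NewReplacer(`\`, `\\`, `'`, `\'`).Replace(s)
}

// buildSearchArgs returns an expression matching non-trashed entries called
// name that sit directly under the folder parentID.
func buildSearchArgs(parentID, name string) string {
	return fmt.Sprintf("name = '%s' and '%s' in parents and trashed = false",
		escapeQuery(name), escapeQuery(parentID))
}

func main() {
	fmt.Println(buildSearchArgs("root", "test.txt"))
}
```

Restricting the search to one parent keeps same-named files in other folders out of the result, which matters because names are not unique in gdrive.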

If the file path is nested, like `foo/bar/demo.txt`, the lookup is a little more complex.

> If the path is complex like [...] How about cache the [...]

> ping @junaire for a look.

> OK, I'll work on it. However, I don't have much experience with this, and I would like to ask whether we should use a third-party library like go-cache, or simply use a map from the standard library?

> To simplify the implementation, using a map with a Mutex is OK for now. We can implement the TTL logic later with https://github.com/dgraph-io/ristretto or other libs.

First, we get the `FileID` of the directory `foo` as described above, then use this `FileID` to list its contents. This way, we find the directory named `bar` and its `FileID`. Finally, we repeat the same step for `demo.txt` and get the `FileID` we want.
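
The walk described above could look roughly like this. It is a sketch, not the service's actual code: `lookupFunc`, `mapLookup`, and `errNotFound` are hypothetical, and a real `lookupFunc` would consult the cache and then search gdrive for the child of the given parent.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errNotFound = errors.New("file not found")

// lookupFunc returns the FileID of the entry called name inside parentID.
type lookupFunc func(parentID, name string) (string, error)

// mapLookup builds a lookupFunc backed by a "parentID/name" -> FileID map,
// standing in for real API calls.
func mapLookup(tree map[string]string) lookupFunc {
	return func(parentID, name string) (string, error) {
		if id, ok := tree[parentID+"/"+name]; ok {
			return id, nil
		}
		return "", errNotFound
	}
}

// pathToId resolves path one segment at a time, starting at the root folder.
// It stops at the first missing segment, so nothing below a missing
// directory is ever searched.
func pathToId(lookup lookupFunc, path string) (string, error) {
	id := "root"
	for _, name := range strings.Split(strings.Trim(path, "/"), "/") {
		next, err := lookup(id, name)
		if err != nil {
			return "", fmt.Errorf("resolve %q in %q: %w", name, path, err)
		}
		id = next
	}
	return id, nil
}

func main() {
	lookup := mapLookup(map[string]string{
		"root/foo":        "id-foo",
		"id-foo/bar":      "id-bar",
		"id-bar/demo.txt": "id-demo",
	})
	fmt.Println(pathToId(lookup, "foo/bar/demo.txt"))
	fmt.Println(pathToId(lookup, "foo/missing/demo.txt"))
}
```

Each segment costs one lookup, so a deep path needs as many lookups as it has segments; this is the cost the `path -> id` cache is meant to reduce.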