Store sharding function used in the repo. #13
What are some example strings? We should make sure that these are somewhat nice multicodecs, cc @Kubuxu
This is looking good; a few things I noticed:
path: path,
getDir: getDir,
sync: sync,
func New(path string, fun0 string, sync bool) (*Datastore, error) {
Let's have a New that detects the type from disk, and a NewWithFunc that has this signature.
Okay, but I will still need the "auto" test case as that is used by the conversion utility.
I didn't do this yet. Do you basically want:
func New(path string, sync bool) (*Datastore, error) {
	return NewWithFunc(path, "auto", sync)
}
func NewWithFunc(path string, fun0 string, sync bool) (*Datastore, error) {...}
Is it that you don't like the special "auto" value? (I don't want to eliminate the use of that string as it makes writing the conversion function/utility easier, so using two separate functions won't really simplify any code.)
str := padding + noslash
return str[len(str)-suffixLen:]
fun := ""
if fun0 == "auto" && fun1 == "auto" {
We should make this a switch statement to be more idiomatic Go.
Okay. I don't particularly see the difference, but I don't object either. In the future I will try to remember that switch is generally preferred over multiple else ifs.
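For reference, a minimal sketch of the tagless switch form being suggested. The helper name resolveShardFunc, the "prefix/2" example value, and the mismatch case are assumptions for illustration and are not taken from the PR:

```go
package main

import (
	"errors"
	"fmt"
)

// resolveShardFunc is a hypothetical helper, not the PR's code. It picks the
// effective shard function from the requested value and the value found on
// disk, where "auto" means "no preference".
func resolveShardFunc(requested, onDisk string) (string, error) {
	switch {
	case requested == "auto" && onDisk == "auto":
		return "", errors.New("shard function not specified")
	case requested == "auto":
		return onDisk, nil
	case onDisk == "auto":
		return requested, nil
	case requested == onDisk:
		return requested, nil
	default:
		// Assumed behavior: two different explicit values are a conflict.
		return "", fmt.Errorf("shard function mismatch: requested %q, found %q", requested, onDisk)
	}
}

func main() {
	fun, err := resolveShardFunc("auto", "prefix/2")
	fmt.Println(fun, err)
}
```

Each case reads as one condition with one outcome, which is the main readability gain over a chain of else ifs.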
What is the content of the SHARDING file exactly? It isn't clear from the code.
Example file:
As discussed in ipfs/kubo#3463
Could you change it to the previously proposed:
@whyrusleeping, do you agree with @Kubuxu? I have no problem doing it.
Yeap, agreed with @Kubuxu
Done. Also refactored the parsing of the shard identifier string to make it more obvious what is going on.
	return nil, fmt.Errorf("shard function not specified")
case fun0 == "auto":
	fun = fun1
case fun1 == "auto":
I think we should separate the task of opening an existing datastore and creating a new one. That way we can remove a lot of the weirdness here
I would rather not. The "weirdness" is also there to allow opening an existing repo without the SHARDING file.
Mmm... We're going to have to run a migration anyway. I think that's acceptable.
@whyrusleeping, okay. To do things the way you want, I will basically have to redo this pull request, and the conversion code will require some adjusting. Let me know. (I'm only asking because you indicated in a private conversation that you might do it, and I don't want to duplicate work.)
I started work on this. Will have it ready some time Wed.
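To make the proposed split concrete, here is a rough sketch of separate create and open paths. The function names, the "prefix/2" value, and the exact file handling are assumptions for illustration; only the SHARDING file name comes from this PR:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// shardingFile is the file name discussed in this PR. The helpers below are
// hypothetical and only illustrate the create/open split.
const shardingFile = "SHARDING"

// create makes a new datastore directory and records its shard function.
// It fails if a SHARDING file is already present.
func create(dir, shardFunc string) error {
	if err := os.MkdirAll(dir, 0777); err != nil {
		return err
	}
	// O_EXCL makes the call fail if the file already exists.
	f, err := os.OpenFile(filepath.Join(dir, shardingFile), os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0666)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.WriteString(shardFunc + "\n")
	return err
}

// open reads the shard function of an existing datastore. No "auto" value is
// needed: a missing or empty SHARDING file is simply an error.
func open(dir string) (string, error) {
	buf, err := os.ReadFile(filepath.Join(dir, shardingFile))
	if err != nil {
		return "", err
	}
	fun := strings.TrimSpace(string(buf))
	if fun == "" {
		return "", fmt.Errorf("empty %s file in %s", shardingFile, dir)
	}
	return fun, nil
}

// demo creates a datastore in a temporary directory and opens it again.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "flatfs-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	if err := create(dir, "prefix/2"); err != nil {
		return "", err
	}
	if err := create(dir, "prefix/2"); err == nil {
		return "", fmt.Errorf("second create unexpectedly succeeded")
	}
	return open(dir)
}

func main() {
	fun, err := demo()
	fmt.Println(fun, err)
}
```

With this shape, a repo without a SHARDING file cannot be opened directly, which is why a migration step comes up in the discussion above.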
@whyrusleeping, okay, please have another look. There are two fixmes, but I think this is close to what you want. There is a separate This fix also makes the flatfs a little more robust, especially after the fixme to check that an existing directory without a SHARDING file is empty.
5a92831 to bf2369a
GitHub's review thing is really annoying me... I thought I submitted this a day ago...
func parseShardFunc(str string) shardId {
	str = strings.TrimSpace(str)
	parts := strings.Split(str, "/")
	// ignore prefix for now
we should check to make sure that the string has the expected prefix
}
}

func parseShardFunc(str string) shardId {
we should return a pointer to the shardId and an error here
	return shardId{funName: parts[0], param: parts[1]}
case 1:
	return shardId{funName: parts[0]}
default: // can only happen for len == 0
this case should be an error
}
switch len(parts) {
case 3:
	return shardId{version: parts[0], funName: parts[1], param: parts[2]}
instead of returning strings here, we should probably do the parsing from Func() in here. That way we don't end up creating an 'invalid' shardId object.
Okay, I will see what I can do. I originally had parseShardFunc return an error but decided to push all the error checking into Func(). Will try refactoring again and see if I can make it work better the way I think you want.
Fixed now.
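A sketch of what the review comments in this thread ask for: the parser validates everything up front and returns a pointer plus an error, so an invalid identifier never escapes. The type name shardID, the "v1/<function>/<param>" layout, and the rule that the parameter is a positive integer are assumptions for illustration:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// shardID is a hypothetical validated form of a shard identifier.
type shardID struct {
	funName string
	param   int
}

// parseShardID parses "v1/<function>/<param>" (an assumed layout). It
// returns an error instead of a half-filled struct.
func parseShardID(str string) (*shardID, error) {
	parts := strings.Split(strings.TrimSpace(str), "/")
	if len(parts) != 3 {
		return nil, fmt.Errorf("invalid shard identifier: %q", str)
	}
	if parts[0] != "v1" {
		return nil, fmt.Errorf("unsupported shard identifier version: %q", parts[0])
	}
	if parts[1] == "" {
		return nil, fmt.Errorf("missing function name in shard identifier: %q", str)
	}
	param, err := strconv.Atoi(parts[2])
	if err != nil || param <= 0 {
		return nil, fmt.Errorf("invalid parameter in shard identifier: %q", parts[2])
	}
	return &shardID{funName: parts[1], param: param}, nil
}

func main() {
	id, err := parseShardID("v1/prefix/2\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id.funName, id.param)
}
```

Converting the parameter to an int at parse time means later code never has to re-check it.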
}

func ReadShardFunc(dir string) (string, error) {
	fun := "auto"
We shouldn't ever return 'auto' from reading the func from disk. Either it's a valid function or it's an error. Try to validate input as much as possible, as soon as possible.
if err != nil {
	return err
}
if str == IPFS_DEF_SHARD {
Let's separate the step of writing the readme from writing the sharding function, and also make sure to error out or print a log message if we try to write a readme for an unsupported sharding type.
@whyrusleeping I don't think returning an error is a good idea. Especially if the flatfs datastore could be used outside of IPFS. I also don't like the idea of a warning for the same reason, but can add one if you still think it is a good idea.
I will separate the step from the writing of the sharding function.
@@ -214,40 +218,42 @@ func testStorage(p *params, t *testing.T) {
	if !seen {
		t.Error("did not see the data file")
	}
	if fs.ShardFunc() == flatfs.IPFS_DEF_SHARD && !haveREADME {
		t.Error("expected _README file")
	} else if fs.ShardFunc() == flatfs.IPFS_DEF_SHARD && !haveREADME {
Are these two cases the same?
Oops. Yes. Fixed.
8f329f3 to ee4b8f3
@whyrusleeping I should have addressed all your concerns in the latest review except for returning an error if no README can be created, because I am not sure that it is a good idea. See my comment.
Note the two special files are named:
I use '_README' because that is what @jbenet wanted. For consistency I can also rename 'SHARDING' to '_SHARDING' if you want. Let me know. For the record, I think using the special name '_README' is a good idea to help make it clear there is something special about it.
getDir: getDir,
sync: sync,
var DatastoreExists = errors.New("Datastore already exist")
var DatastoreDoesNotExist = errors.New("Datastore directory does not exist")
Please use one var block and the Err name prefix.
Okay.
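For illustration, the requested shape might look like the sketch below. The renamed identifiers, the lowercase messages, and the checkDir helper are assumptions, not the names the PR ended up with:

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Sentinel errors grouped in one var block and named with the Err prefix.
// The exact names and messages here are assumed for illustration.
var (
	ErrDatastoreExists       = errors.New("datastore already exists")
	ErrDatastoreDoesNotExist = errors.New("datastore directory does not exist")
)

// checkDir is a hypothetical helper that maps a missing directory to the
// sentinel error so that callers can compare against it.
func checkDir(path string) error {
	_, err := os.Stat(path)
	if os.IsNotExist(err) {
		return ErrDatastoreDoesNotExist
	}
	return err
}

func main() {
	err := checkDir("/nonexistent/flatfs-sketch-dir")
	fmt.Println(errors.Is(err, ErrDatastoreDoesNotExist))
}
```

The Err prefix lets callers spot sentinel errors at a glance, and errors.Is keeps working if the error is later wrapped.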
	str = str[len(PREFIX):]
}
parts := strings.Split(str, "/")
if len(parts) == 3 {
Is the v1 optional? It doesn't have to be. Let's make it simpler.
I don't agree (with "it doesn't have to be") but @whyrusleeping would probably say the same thing so I will change it.
}

func ReadShardFunc(dir string) (*ShardIdV1, error) {
	buf, err := ioutil.ReadFile(filepath.Join(dir, "SHARDING"))
Let's make the name of the file a constant, as it is reused.
Do we have a policy on this? In this particular case it seems like overkill to me. We also don't seem to be very consistent about it elsewhere.
We try to use constants for these sorts of things when we can. There are definitely cases where we forget to do so, but in general it's a good idea.
Fine, done. However, I think it makes the code harder to read, and I am also horrible at naming things.
	return nil, err
}

if err != nil {
Double err handler.
Oops. Will fix.
		return nil, fmt.Errorf("invalid prefix in shard identifier: %s", str)
	}
	str = str[len(PREFIX):]
}
Let's use TrimPrefix and check whether the output string differs from the input; if it does, the prefix was trimmed.
Okay.
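A small sketch of the TrimPrefix idiom suggested here. The prefix value and the helper name stripPrefix are assumptions for illustration, since the PR's PREFIX constant is not shown in this thread:

```go
package main

import (
	"fmt"
	"strings"
)

// prefix is an assumed value for illustration only.
const prefix = "/repo/flatfs/shard/"

// stripPrefix removes prefix from str. If TrimPrefix returns its input
// unchanged, the (non-empty) prefix was not present, which is an error here.
func stripPrefix(str string) (string, error) {
	trimmed := strings.TrimPrefix(str, prefix)
	if trimmed == str {
		return "", fmt.Errorf("invalid prefix in shard identifier: %s", str)
	}
	return trimmed, nil
}

func main() {
	rest, err := stripPrefix("/repo/flatfs/shard/v1/prefix/2")
	fmt.Println(rest, err)
}
```

On Go 1.20 or newer, strings.CutPrefix returns the trimmed string and a found flag in one call, which removes the need for the comparison.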
}

func Prefix(prefixLen int) ShardFunc {
	padding := strings.Repeat("_", prefixLen)
Is there a better way to do it? It seems clever, but it might have a performance penalty (string concatenation), and it is done for every key, potentially multiple times.
This applies to other ShardFuncs too.
There might be, yes. But this code has not changed, just moved. The same thing was also done before any of my sharding changes. So for now (at least in this p.r.) I would rather leave it alone.
[Sorry if this is a duplicate. GitHub is being annoying.]
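For a possible follow-up, one way to avoid the concatenation on the common path is to pad only when the key is shorter than the prefix length. This is a sketch under assumptions: prefixPadded is a guess at the shape of the existing approach, prefixFast is a hypothetical alternative, and neither has been benchmarked:

```go
package main

import (
	"fmt"
	"strings"
)

// ShardFunc maps a key (without slashes) to a directory name.
type ShardFunc func(noslash string) string

// prefixPadded is assumed to mirror the approach under review: it
// concatenates the padding for every key, then slices.
func prefixPadded(prefixLen int) ShardFunc {
	padding := strings.Repeat("_", prefixLen)
	return func(noslash string) string {
		return (noslash + padding)[:prefixLen]
	}
}

// prefixFast is a hypothetical alternative that concatenates only when the
// key is shorter than prefixLen.
func prefixFast(prefixLen int) ShardFunc {
	padding := strings.Repeat("_", prefixLen)
	return func(noslash string) string {
		if len(noslash) >= prefixLen {
			// Slicing a string shares the underlying bytes; nothing is copied.
			return noslash[:prefixLen]
		}
		return noslash + padding[:prefixLen-len(noslash)]
	}
}

func main() {
	slow, fast := prefixPadded(3), prefixFast(3)
	for _, key := range []string{"abcdef", "ab", ""} {
		fmt.Printf("%q %q %q\n", key, slow(key), fast(key))
	}
}
```

Both functions are meant to return the same directory name for every key, so a switch like this should not change the on-disk layout.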
@Kubuxu @whyrusleeping I should have addressed all your concerns. I also implemented the fixme. The only outstanding issue is returning an error if no README can be created, because I am not sure that it is a good idea. From my comment above:

func WriteReadme(dir string, id *ShardIdV1) error {
	if id.String() == IPFS_DEF_SHARD {
		err := ioutil.WriteFile(filepath.Join(dir, README_FN), []byte(IPFS_DEF_SHARD), 0444)
It should be README_DEF_SHARD
Okay. Told you I was bad at naming things. :)
Oops. Misunderstood and didn't see my own error. Will fix.
if err != nil {
	return err
}
return nil
All of the above can be just return err.
if err != nil {
	return err
}
return nil
can be just return err too (no need for if).
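A short sketch of the simplification from the last two comments. The helper name writeReadme and the placeholder text are assumptions; the point is only that the error from the final call can be returned directly:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// readmeName matches the _README name mentioned in this thread; the helper
// itself is hypothetical.
const readmeName = "_README"

// writeReadme returns the result of os.WriteFile directly. The error is
// already nil on success, so no extra if block is needed.
func writeReadme(dir, text string) error {
	return os.WriteFile(filepath.Join(dir, readmeName), []byte(text), 0444)
}

// demo writes a readme into a temporary directory and reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "flatfs-readme")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	if err := writeReadme(dir, "placeholder readme text\n"); err != nil {
		return "", err
	}
	buf, err := os.ReadFile(filepath.Join(dir, readmeName))
	return string(buf), err
}

func main() {
	text, err := demo()
	fmt.Print(text)
	fmt.Println("error:", err)
}
```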

err := os.Mkdir(path, 0777)
if err != nil && !os.IsExist(err) {
	return nil
Shouldn't it be return err?
Yes.
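A sketch of the corrected pattern, using a hypothetical ensureDir helper: an "already exists" error is tolerated, while any other Mkdir failure is returned to the caller instead of being swallowed:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureDir is a hypothetical helper. It creates path if needed and treats
// "already exists" as success; every other error is propagated.
func ensureDir(path string) error {
	err := os.Mkdir(path, 0777)
	if err != nil && !os.IsExist(err) {
		return err
	}
	return nil
}

// demo exercises the three interesting cases inside a temporary directory.
func demo() (first, second, missingParent error) {
	root, err := os.MkdirTemp("", "flatfs-mkdir")
	if err != nil {
		return err, err, err
	}
	defer os.RemoveAll(root)
	dir := filepath.Join(root, "blocks")
	first = ensureDir(dir)  // directory is created
	second = ensureDir(dir) // directory already exists
	// os.Mkdir does not create parents, so this one must fail.
	missingParent = ensureDir(filepath.Join(root, "no-such-parent", "blocks"))
	return first, second, missingParent
}

func main() {
	first, second, missingParent := demo()
	fmt.Println(first, second, missingParent != nil)
}
```

Note that os.IsExist is also true when the path exists as a regular file, so a caller that needs a directory may want an extra os.Stat check.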
00c1048 to f3396ad
@Kubuxu anything else? :)
c221959 to 574569a
Store the sharding function used in the file "SHARDING" in the repo. To make this work the sharding function is now always specified as a string.
This avoids the need for the special "auto" string for the shard func.
Also use *ShardIdV1 instead of string for the shard fun parameter to WriteShardFunc and WriteReadme.
Other minor cleanups.
574569a to 6db29c0
Note: Just rebased to clean up the commits, nothing else has changed.
Note: I just pushed some more commits to avoid the use of strings wherever possible.