-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add link traversal option #96
Conversation
e7f2f8b
to
d04eb43
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I don't have many opinions on the flag (maybe --dereference
and -L
like in the cp
command?).
cli/parse.go
Outdated
@@ -475,6 +480,13 @@ func appendFile(fpath string, argDef *cmdkit.Argument, recursive, hidden bool) ( | |||
|
|||
fpath = filepath.ToSlash(filepath.Clean(fpath)) | |||
|
|||
if travLinks { | |||
fpath, err := filepath.EvalSymlinks(fpath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, calling that function for every file added is going to be slow. It recursively evaluates symlinks for every component in the path.
I'd write this as:
stat, err := os.Lstat(fpath)
if err != nil {...}
if travLinks {
for stat.Mode().IsSymlink() {
// read link
stat, err = ...
// handle error.
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, as far as I can tell, this won't conserve the filename (the first argument to NewSerialFile below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know about the speed, but I agree that the filename probably isn't preserved. @djdv could you add tests that verify that the code works as inteded?
Sorry for the delay, I had no notification on this for some reason. @Stebalien, @keks
I'd be interested in adding
For sure, I'll try and come up with some automated tests after we decide on the above^ |
Ah. For some reason I thought this was traversing all links. This would be the Although... actually, I believe the fast and correct way to do this is to simply call I'd add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this PR only derefences links in the command line and not those inside the tree, I'd change the flags to -H (as in cp, ls, chown, ...) and maybe --dereference-command-line (as in ls).
b202622
to
4160477
Compare
@Stebalien, @keks Using Stat over Lstat seems to work the same, except it messes up the progress bar for Edit: |
I think I'd go for long-form only right now and see if we come up with a good short flag later.
Yes, that's what I'm thinking, too. The progress bar get's the total file size here, you could log the file size there and see if that's what you expect. Also, using One minor nitpick: In |
@keks, @Stebalien I have something working but I'm not sure if I'm approaching this right, I'd like some input on this. I managed to get accurate sizes by modifying serialFile.Size() to either deep-resolve the size of symlinks or not, depending on their depth. Currently they're just ignored and sized as 0. However, I did this by hard coding a depth limit. func NewSerialFile(name, path string, hidden bool, stat os.FileInfo) (File, error) I would likely need something like this func NewSerialFile(name, path string, hidden, deref, derefAll bool, stat os.FileInfo) (File, error) I figured if this has to change anyway it may be best to do something like this instead: type Options struct {
handleHiddenFiles bool
resolveRootLinks bool
resolveAllLinks bool
}
func NewSerialFile(name, path string, options Options, stat os.FileInfo) (File, error) In either case, I could just switch off of them inside of Size() like this //if file is a symlink {
// -H
if f.options.resolveRootLinks {...conditionalySize(link)...}
// -L
if f.options.resolveAllLinks {...alwaysSize(link)...}
// neither
// just return the size of the link itself
//} |
+1 to get some eyes on this again! |
@djdv (please bug me if I don't respond) I see why you'd need this for the recursive case but I think the issue here is really walk. That is, serialFile can stat |
@Stebalien For a recursive approach, we would need to know inside of I have to revisit this though, the patch I had working before, doesn't anymore, because the types have split since then (Need to change method for I believe we're going to have that same requirement though, since the only other option is to dereference only the commandline (during initial parsing). That is to say, I think we can have |
4160477
to
b6c0dc4
Compare
I pushed a current WIP that depends on these: But seems to work. I need to test it more, but we also have to come to a decision on what we want to support, and how to change the NewSerialFile options. Edit: Note: I forgot to implement -1 in the current patchset. |
b6c0dc4
to
790b550
Compare
Latest patch works, but is inaccurate on the progressbar for 1 of my test scenarios.
Calling Semi-related, I may need to add link type-detection up front, in |
790b550
to
bceb240
Compare
The names of flags need to be decided on as well. This is my current thought: Something like |
I'm not sure depth-limited link traversal is really all that useful, from a user's perspective. I'd really just make these two different options:
The second feature will require an additional option to |
Note: to avoid breaking everything, we may want to add a new constructor that takes options using the "functional option" pattern (like we do here: https://github.com/libp2p/go-libp2p-kad-dht/blob/master/opts/options.go). That way, we can stop breaking this every time we need a new option. |
I'm wondering this myself. It's only exposed at the moment for testing, but seemed like it might be useful if people wanted to import complex datasets. However, I don't really know if anyone structures data like this. In implementation, I don't think it will change much if we omit it. But I have to look at it.
I need to double check, but I think Go's
👍 |
bceb240
to
e156fdf
Compare
@djdv can we try adding a sharness test in go-ipfs for a recursive link ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I get all of this. How are the two other branches related to this PR?
Also can we name the option --dereference-links
?
@keks
Resolving N levels deep doesn't seem useful, so it's been removed to make the I believe this should all be good. But the There's also no tests at the moment. The All in all, I guess we need to layout if there are any more unfulfilled requirements for |
(ping me when there is a PR against go-ipfs and @keks has reviewed it). |
cli/parse.go
Outdated
func resolveCommandLine(req *cmds.Request) bool { | ||
linkOpt, ok := req.Options[cmds.DerefLong].(bool) | ||
return linkOpt && ok | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linkOpt
will automatically be false
if ok
is false
. So you could just
linkOpt, _ := ...
return linkOpt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^ this comment is still valid (really, I'm not sure if we even need a function for this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shortened that and another section.
Edit:
Hold on this. The change is causing problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Stebalien
Sorry about that, the semantic difference between casting the return from the map if req.Options[cmds.DerefLong].(bool)
, and casting the return of a type assertion
safeVar, _ := req.Options[cmds.DerefLong].(bool)
tripped me up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the long period of silence.
Regarding the code in this PR, it looks pretty solid, but I left two comments. Also, tests would be really nice.
We need a better test infrastructure for this repo, but for now you could add a command to http/handler_test.go
and add a tests case in http/http_test.go
. This will not test the local case but at least it tests something and when we sit down to build better testing, we already have a bunch of test cases that we can re-use and generalize.
I also have some more questions on the deep-resolve-links branch, but let's discuss that in a new PR :)
opts.go
Outdated
) | ||
|
||
// options that are used by this package | ||
var OptionEncodingType = cmdkit.StringOption(EncLong, EncShort, "The encoding type the output should be encoded with (json, xml, or text)").WithDefault("text") | ||
var OptionRecursivePath = cmdkit.BoolOption(RecLong, RecShort, "Add directory paths recursively").WithDefault(false) | ||
var OptionStreamChannels = cmdkit.BoolOption(ChanOpt, "Stream channel output") | ||
var OptionTimeout = cmdkit.StringOption(TimeoutOpt, "set a global timeout on the command") | ||
var OptionTimeout = cmdkit.StringOption(TimeoutOpt, "Set a global timeout on the command") | ||
var OptionDerefArgs = cmdkit.BoolOption(DerefLong, "Resolve link arguments, instead of adding links as links").WithDefault(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.WithDefault(false)
doesn't do anything because the zero value is implicit default.
5889674
to
8923429
Compare
@keks I rebased this patch, and removed some default bool calls. Do you think this is sufficient? Edit: FWIW I couldn't figure out a good way to handle tests in |
8923429
to
4504227
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two nits but otherwise LGTM.
opts.go
Outdated
var OptionStreamChannels = cmdkit.BoolOption(ChanOpt, "Stream channel output") | ||
var OptionTimeout = cmdkit.StringOption(TimeoutOpt, "set a global timeout on the command") | ||
var OptionTimeout = cmdkit.StringOption(TimeoutOpt, "Set a global timeout on the command") | ||
var OptionDerefArgs = cmdkit.BoolOption(DerefLong, "Resolve link arguments, instead of adding links as links") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "Dereference symlinks appearing in arguments instead of adding them as symlinks".
Not sure about the "appearing in" part but I think it's slightly better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opinions on "Symlinks supplied in arguments, are dereferenced"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM, minus, the, comma 😄.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh — more of an em dash kind of person I see.
Changed.
cli/parse.go
Outdated
func resolveCommandLine(req *cmds.Request) bool { | ||
linkOpt, ok := req.Options[cmds.DerefLong].(bool) | ||
return linkOpt && ok | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^ this comment is still valid (really, I'm not sure if we even need a function for this).
9847a9f
to
0e426e5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be a little late in the game here, but could we consider renaming --dereference-command-line
to the less tedious --dereference-args
. As far as I can tell --dereference-command-line
is only used by GNU ls as a long name for -H
. Here we do not have the option to use the shorter -H
so I think it warrants more careful consideration to something less tedious to use.
@Stebalien @keks what do you think? If you don't agree feel free to dismiss my review.
I'd like to be as consistent with other tools as possible (but really, they're both a mouthful). (at the end of the day, I don't really care much either way) |
GNU |
Ah, well, in that case, I'd be happy either way. |
Likewise, I have no opinion on this. Options like this seem very script heavy to me, I doubt it will get much interactive (shell) use. So I'm not against verbosity. Tab completion is also a thing anyway. But there's nothing wrong with brevity either. |
Yeah, unless someone is against it I think it is better to go with the shorter Sorry for the trouble and being late with my review. |
License: MIT Signed-off-by: Dominic Della Valle <[email protected]>
0e426e5
to
be09a40
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
This allows us to follow filesystem links (symlinks, junctions, etc.).
The logic should be good but I need a suggestions on what the flag(s) should be.
traverse-links
is a placeholder.This is needed in
go-ipfs
so that we can traverse the link and add the target instead of adding the link itself.