-
Notifications
You must be signed in to change notification settings - Fork 537
worktree, status and reset implementation based on merkletrie #339
Conversation
…ckout implementations
I don't think putting all the implementations in the same place is worth it.
I don't think so. @smola and I talked about it a few months ago when I was working on merkletrie. We agreed that the hash method should not return errors, as valid objects should have valid hashes, while objects, with a hash that cannot be calculated, should return errors upon construction. Let us know if you would rather do it any other way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the PR looks good.
With respect to your questions:
- Moving newTreeNoder/NewTreeRootNode to other package will probably result in a circular dependency.
- For the Hash thing, let's look first if we should compute the Hash in Hash() at all. If it can error, maybe hashes should be calculated on TreeNoder creation.
plumbing/object/treenoder.go
Outdated
@@ -21,10 +21,11 @@ type treeNoder struct { | |||
name string // empty string for the root node | |||
mode filemode.FileMode | |||
hash plumbing.Hash | |||
children []noder.Noder // memoized | |||
children []noder.Noder // memorized |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memoized was actually the right term ;-)
plumbing/object/treenoder.go
Outdated
} | ||
|
||
func newTreeNoder(t *Tree) *treeNoder { | ||
// NewTreeRootNode returns the root node of a Tree | ||
func NewTreeRootNode(t *Tree) *treeNoder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be just NewTreeNode
? The Root
part seems redundant?
plumbing/object/treenoder.go
Outdated
@@ -74,7 +75,7 @@ func (t *treeNoder) Children() ([]noder.Noder, error) { | |||
return noder.NoChildren, nil | |||
} | |||
|
|||
// children are memoized for efficiency | |||
// children are memorized for efficiency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/memorized/memoized/
utils/merkletrie/filesystem/node.go
Outdated
} | ||
|
||
func (n *Node) Hash() []byte { | ||
if n.IsDir() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't Hash() be cached? I think DiffTree does a lot of repeated calls to Hash. So either implementations cache Hash, or DiffTree caches them internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, but I could see any performance improvement.
options.go
Outdated
// Branch to be checked out, if empty uses `master` | ||
Branch plumbing.ReferenceName | ||
Hash plumbing.Hash | ||
// RemoteName is the name of the remote to be pushed to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is godoc for RemoteName and the variable is Force, copypasta?
options.go
Outdated
|
||
// CheckoutOptions describes how a checkout operation should be performed. | ||
type CheckoutOptions struct { | ||
// Branch to be checked out, if empty uses `master` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
End the sentence with a full stop.
options.go
Outdated
// Branch to be checked out, if empty uses `master` | ||
Branch plumbing.ReferenceName | ||
Hash plumbing.Hash | ||
// RemoteName is the name of the remote to be pushed to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment does not match any code.
options.go
Outdated
type CheckoutOptions struct { | ||
// Branch to be checked out, if empty uses `master` | ||
Branch plumbing.ReferenceName | ||
Hash plumbing.Hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing documentation.
options.go
Outdated
Branch plumbing.ReferenceName | ||
Hash plumbing.Hash | ||
// RemoteName is the name of the remote to be pushed to. | ||
Force bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing documentation.
} | ||
|
||
// Validate validates the fields and sets the default values. | ||
func (o *CheckoutOptions) Validate() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this method return an error?
On top of that, maybe validate is not the best name for this method, when validating you return true or false according to some checks, while this method overwrites some attributes (and only some of them, leaving other invalid values untouched) probably because there is not constructor or setters to this properly. Maybe a better name will be FillInDefaults
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was already discussed on previous PRs, and is not related to this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it was in #178.
I'll open an issue, as we agreed back then.
utils/merkletrie/filesystem/node.go
Outdated
".git": true, | ||
} | ||
|
||
func IsEquals(a, b noder.Hasher) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be documented extensively, including what it will be the algorithm for comparing hashes in directories.
utils/merkletrie/filesystem/node.go
Outdated
|
||
func (n *Node) NumChildren() (int, error) { | ||
files, err := n.readDir() | ||
return len(files), err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about this, be careful with submodules and empty dirs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, since now the children are pre-computed, return just the len of the children slice.
The submodule are not tested, and yes for sure are failing since, the hash of a submodule is not computed in the same way. I am implementing this.
@@ -0,0 +1,128 @@ | |||
package filesystem |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should probably be moved to the billy package, but definately it should not be under merkletrie utilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't make any sense at billy, is a very specific things to git, maybe outside of the merkletrie package.
@@ -0,0 +1,113 @@ | |||
package index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should probably no go under the merkletrie package, but on the index.
Please document extensively how are you planning to map an index to a merkletrie, the challenges, surprises and your solutions. Otherwise, I can only guess your intentions and it is very time consuming and error prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't place this in index, you will end with circular dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably place it in an internal package under the index package.
utils/merkletrie/filesystem/node.go
Outdated
func IsEquals(a, b noder.Hasher) bool { | ||
pathA := a.(noder.Path) | ||
pathB := b.(noder.Path) | ||
if pathA[len(pathA)-1].IsDir() || pathB[len(pathB)-1].IsDir() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this, reviewing this is too time consuming due to lack of documentation.
Can you can explain the general guidelines of how are you planning to map a filesystem into a merkletrie at the very begining?, also the main challenges, surprises and your solutions.
I will continue reviewing this later, it is too long. I advise to break this kind of PR into smaller contributions that can be reviewed easily and faster. |
@alcortesm with this goal I created nice commits, to allow you review commit by commit if you wish. |
Codecov Report
@@ Coverage Diff @@
## master #339 +/- ##
==========================================
+ Coverage 77.3% 77.31% +<.01%
==========================================
Files 117 122 +5
Lines 8062 8268 +206
==========================================
+ Hits 6232 6392 +160
- Misses 1164 1184 +20
- Partials 666 692 +26
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the hash calculation is universal to all the Noder implementations in go-git, otherwise you can compare it. That's why I believe that all the implementations should be in the same place.
I see. Then we are in a very different page. The whole point of the Hasher
interface, defining the IsEqual
function and passing it to Difftree
was to not have the same hash implementation for all the types. We will need to refactor all that part so it is consistent with your comment above.
@@ -178,6 +178,72 @@ type SubmoduleUpdateOptions struct { | |||
RecurseSubmodules SubmoduleRescursivity | |||
} | |||
|
|||
// CheckoutOptions describes how a checkout 31operation should be performed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/31//
@@ -178,6 +178,72 @@ type SubmoduleUpdateOptions struct { | |||
RecurseSubmodules SubmoduleRescursivity | |||
} | |||
|
|||
// CheckoutOptions describes how a checkout 31operation should be performed. | |||
type CheckoutOptions struct { | |||
// Hash to be checked out, if used HEAD will in detached mode. Branch and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing a verb:
if used HEAD will be in detached mode.
// Hash to be checked out, if used HEAD will in detached mode. Branch and | ||
// Hash are mutual exclusive. | ||
Hash plumbing.Hash | ||
// Branch to be checked out, if Branch and Hash are empty is set to `master`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing subject:
if Branch and Hash are empty it is set to master
.
// Branch to be checked out, if Branch and Hash are empty is set to `master`. | ||
Branch plumbing.ReferenceName | ||
// Force, if true when switching branches, proceed even if the index or the | ||
// working tree differs from HEAD. This is used to throw away local changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
End the sentence with a full stop.
} | ||
|
||
// Validate validates the fields and sets the default values. | ||
func (o *CheckoutOptions) Validate() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it was in #178.
I'll open an issue, as we agreed back then.
}, nil | ||
} | ||
|
||
func (n *node) calculateHash(path string, file billy.FileInfo) ([]byte, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On line 71, the method calculateChildren
sets the children of the node and only returns an error.
Here, the method calculateHash
doesn't set the hash but return it instead.
I think we should use the same strategy in both methods, for coherency, or just make this a function instead of a method.
|
||
func Test(t *testing.T) { TestingT(t) } | ||
|
||
type NoderSuite struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to call it NodeSuite
instead.
|
||
func IsEquals(a, b noder.Hasher) bool { | ||
if bytes.Equal(a.Hash(), empty) || bytes.Equal(b.Hash(), empty) { | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very weird, as it is defeating the whole purpose of using the merkletrie package and implementing Noder.
Maybe it is ok for the tests, but in that case I would mention it in a comment.
|
||
var _ = Suite(&NoderSuite{}) | ||
|
||
func (s *NoderSuite) TestDiff(c *C) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this way of testing node
will make debugging harder than if we do unit testing of its public methods instead: it would be easier to detect errors with the unit tests than having to trace back difftree's results mismatches in the tests to see what errors in node
caused them.
I would write unit tests for node
and maybe left a cople of these as integrations tests.
@@ -0,0 +1,113 @@ | |||
package index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably place it in an internal package under the index package.
I am exhausted, I will continue reviewing this the next day. |
I think you merged before my review, you should consider reopening for my comments today. I will probably add a few more in the next days. |
This is a implementation of the status and the reset implementation based on merkletrie.
Things to be agreed:
Two new implementations of
merkletrie.Noder
have been implemented, those implementations are inside ofutils/merkletrie/<use-case>
, and the original one is atplumbing/objects/
, worth to put all the implementations in the same place?Due to the lack of error in the
noder.Hash
interface method, the implementation contains panic, We should add an error to this method?NOTE: hard reset is not implemented, I am finishing this and some additional
Worktree.Reset
tests tomorrowRequired by #256