worktree, status and reset implementation based on merkletrie #339

mcuadros · 2017-04-11T02:45:54Z

This is an implementation of status and reset based on merkletrie.

Things to be agreed:

Two new implementations of merkletrie.Noder have been added. They live inside utils/merkletrie/<use-case>, while the original one is at plumbing/object/. Is it worth putting all the implementations in the same place?
Because the noder.Hash interface method does not return an error, the implementation contains a panic. Should we add an error to this method?

NOTE: hard reset is not implemented yet; I will finish it and add some more Worktree.Reset tests tomorrow.

Required by #256

alcortesm · 2017-04-11T07:51:08Z

> Two new implementations of merkletrie.Noder has been implemented, those implementations are inside of utils/merkletrie/, and the original one is at plumbing/objects/, worth to put all the implementations in the same place?

I don't think putting all the implementations in the same place is worth it.

> Due to the lack of error in the noder.Hash interface method, the implementation contains panic, We should add an error to this method?

I don't think so.

@smola and I talked about it a few months ago when I was working on merkletrie. We agreed that the hash method should not return errors: valid objects should have valid hashes, while objects whose hash cannot be calculated should return an error upon construction.

Let us know if you would rather do it any other way.

smola

I think the PR looks good.

With respect to your questions:

Moving newTreeNoder/NewTreeRootNode to another package will probably result in a circular dependency.
For the Hash thing, let's first check whether we should compute the hash in Hash() at all. If it can error, maybe hashes should be calculated on TreeNoder creation.
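
The idea of computing hashes at creation time can be sketched as follows. This is only an illustration under assumptions: the `node` type, the `newNode` constructor and the use of SHA-1 over raw content are hypothetical and are not go-git's actual code.

```go
package main

import (
	"crypto/sha1"
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for a Noder implementation.
type node struct {
	name string
	hash []byte // computed once, at construction
}

// newNode computes the hash up front, so any failure surfaces here
// rather than inside Hash().
func newNode(name string, content []byte) (*node, error) {
	if content == nil {
		return nil, errors.New("cannot hash node without content: " + name)
	}
	sum := sha1.Sum(content)
	return &node{name: name, hash: sum[:]}, nil
}

// Hash cannot fail: a node that exists always carries a valid hash.
func (n *node) Hash() []byte { return n.hash }

func main() {
	n, err := newNode("README", []byte("hello"))
	if err != nil {
		fmt.Println("construction failed:", err)
		return
	}
	fmt.Printf("%s %x\n", n.name, n.Hash())

	if _, err := newNode("broken", nil); err != nil {
		fmt.Println("construction failed:", err)
	}
}
```

With this shape, Hash() needs neither an error return nor a panic, because the only place a hash can fail to be computed is the constructor.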

smola · 2017-04-11T08:49:15Z

plumbing/object/treenoder.go

@@ -21,10 +21,11 @@ type treeNoder struct {
 	name     string // empty string for the root node
 	mode     filemode.FileMode
 	hash     plumbing.Hash
-	children []noder.Noder // memoized
+	children []noder.Noder // memorized


memoized was actually the right term ;-)

smola · 2017-04-11T08:51:12Z

plumbing/object/treenoder.go

 }

-func newTreeNoder(t *Tree) *treeNoder {
+// NewTreeRootNode returns the root node of a Tree
+func NewTreeRootNode(t *Tree) *treeNoder {


Shouldn't this be just NewTreeNode? The Root part seems redundant?

smola · 2017-04-11T08:51:23Z

plumbing/object/treenoder.go

@@ -74,7 +75,7 @@ func (t *treeNoder) Children() ([]noder.Noder, error) {
 		return noder.NoChildren, nil
 	}

-	// children are memoized for efficiency
+	// children are memorized for efficiency


s/memorized/memoized/
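
For context on the term: memoization means storing the result of a computation the first time it runs and reusing it on later calls. A minimal sketch, assuming a hypothetical `treeNode` type with a `load` callback (neither is go-git's treeNoder):

```go
package main

import "fmt"

// treeNode is a hypothetical, simplified node.
type treeNode struct {
	name     string
	load     func() []string // expensive lookup of child names
	children []string        // memoized result of load
	loaded   bool
}

// Children calls load at most once and reuses the stored result afterwards.
func (t *treeNode) Children() []string {
	if !t.loaded {
		t.children = t.load()
		t.loaded = true
	}
	return t.children
}

func main() {
	calls := 0
	n := &treeNode{name: "root", load: func() []string {
		calls++
		return []string{"a", "b"}
	}}
	n.Children()
	n.Children()
	fmt.Println("children:", n.Children(), "load calls:", calls)
}
```

The sketch is not safe for concurrent use; it only illustrates why "memoized" describes the comment accurately.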

smola · 2017-04-11T08:58:07Z

utils/merkletrie/filesystem/node.go

+}
+
+func (n *Node) Hash() []byte {
+	if n.IsDir() {


Shouldn't Hash() be cached? I think DiffTree does a lot of repeated calls to Hash. So either implementations cache Hash, or DiffTree caches them internally.

Done, but I couldn't see any performance improvement.
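
The alternative mentioned above, caching outside the implementations, could look roughly like the wrapper below. It is a sketch under assumptions: `Hasher` only mirrors the shape of the Hash method discussed here, and `slowHasher` and `cachedHasher` are hypothetical names.

```go
package main

import "fmt"

// Hasher mirrors the shape of the Hash method discussed in this thread.
type Hasher interface {
	Hash() []byte
}

// slowHasher is a hypothetical implementation that counts how often
// its hash is recomputed.
type slowHasher struct {
	data  string
	count int
}

func (s *slowHasher) Hash() []byte {
	s.count++
	return []byte(s.data) // placeholder for a real digest
}

// cachedHasher wraps any Hasher and asks it for its hash at most once.
type cachedHasher struct {
	inner Hasher
	hash  []byte
	done  bool
}

func (c *cachedHasher) Hash() []byte {
	if !c.done {
		c.hash = c.inner.Hash()
		c.done = true
	}
	return c.hash
}

func main() {
	slow := &slowHasher{data: "abc"}
	h := &cachedHasher{inner: slow}
	for i := 0; i < 3; i++ {
		h.Hash()
	}
	fmt.Println("underlying Hash calls:", slow.count)
}
```

Whether this pays off depends on how expensive the wrapped Hash is; if hashes are already precomputed fields, the wrapper adds nothing.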

smola · 2017-04-11T08:59:39Z

options.go

+	// Branch to be checked out, if empty uses `master`
+	Branch plumbing.ReferenceName
+	Hash   plumbing.Hash
+	// RemoteName is the name of the remote to be pushed to.


This is godoc for RemoteName and the variable is Force, copypasta?

alcortesm · 2017-04-11T07:52:06Z

options.go

+
+// CheckoutOptions describes how a checkout operation should be performed.
+type CheckoutOptions struct {
+	// Branch to be checked out, if empty uses `master`


End the sentence with a full stop.

alcortesm · 2017-04-11T07:52:50Z

options.go

+	// Branch to be checked out, if empty uses `master`
+	Branch plumbing.ReferenceName
+	Hash   plumbing.Hash
+	// RemoteName is the name of the remote to be pushed to.


This comment does not match any code.

alcortesm · 2017-04-11T07:53:05Z

options.go

+type CheckoutOptions struct {
+	// Branch to be checked out, if empty uses `master`
+	Branch plumbing.ReferenceName
+	Hash   plumbing.Hash


Missing documentation.

alcortesm · 2017-04-11T07:53:37Z

options.go

+	Branch plumbing.ReferenceName
+	Hash   plumbing.Hash
+	// RemoteName is the name of the remote to be pushed to.
+	Force bool


Missing documentation.

alcortesm · 2017-04-11T07:57:05Z

options.go

+}
+
+// Validate validates the fields and sets the default values.
+func (o *CheckoutOptions) Validate() error {


Why does this method return an error?

On top of that, maybe Validate is not the best name for this method. When validating, you return true or false according to some checks, while this method overwrites some attributes (and only some of them, leaving other invalid values untouched), probably because there is no constructor or setters to do this properly. Maybe a better name would be FillInDefaults.

This was already discussed on previous PRs, and is not related to this PR

You are right, it was in #178.

I'll open an issue, as we agreed back then.
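
One way to picture the naming concern is to split checking from defaulting. The sketch below is hypothetical throughout: `checkoutOptions`, `check`, `fillInDefaults`, the string-typed fields and the `refs/heads/master` default are assumptions for illustration, not the API under review.

```go
package main

import (
	"errors"
	"fmt"
)

// checkoutOptions is a hypothetical, simplified version of the struct
// under review; plain strings stand in for the plumbing types.
type checkoutOptions struct {
	Hash   string
	Branch string
	Force  bool
}

var errBranchHashExclusive = errors.New("branch and hash are mutually exclusive")

// defaultBranch is an assumed default reference name.
const defaultBranch = "refs/heads/master"

// check only inspects the options and reports conflicts.
func (o *checkoutOptions) check() error {
	if o.Hash != "" && o.Branch != "" {
		return errBranchHashExclusive
	}
	return nil
}

// fillInDefaults only fills in missing values and never fails.
func (o *checkoutOptions) fillInDefaults() {
	if o.Hash == "" && o.Branch == "" {
		o.Branch = defaultBranch
	}
}

func main() {
	opts := &checkoutOptions{}
	if err := opts.check(); err != nil {
		fmt.Println("invalid options:", err)
		return
	}
	opts.fillInDefaults()
	fmt.Println("branch:", opts.Branch)

	bad := &checkoutOptions{Hash: "abc123", Branch: defaultBranch}
	fmt.Println("conflict:", bad.check())
}
```

In this reading, an error return makes sense for the checking half (conflicting fields), while the defaulting half has nothing to report.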

alcortesm · 2017-04-11T09:15:43Z

utils/merkletrie/filesystem/node.go

+	".git": true,
+}
+
+func IsEquals(a, b noder.Hasher) bool {


This should be documented extensively, including what the algorithm for comparing the hashes of directories will be.

alcortesm · 2017-04-11T09:22:26Z

utils/merkletrie/filesystem/node.go

+
+func (n *Node) NumChildren() (int, error) {
+	files, err := n.readDir()
+	return len(files), err


Not sure about this, be careful with submodules and empty dirs.

Done. Since the children are now pre-computed, it just returns the length of the children slice.

Submodules are not tested, and yes, they are surely failing, since the hash of a submodule is not computed in the same way. I am implementing this.

alcortesm · 2017-04-11T09:25:04Z

utils/merkletrie/filesystem/node.go

@@ -0,0 +1,128 @@
+package filesystem


This should probably be moved to the billy package, but it definitely should not be under the merkletrie utilities.

It doesn't make any sense in billy; it is very specific to git. Maybe it belongs outside of the merkletrie package.

alcortesm · 2017-04-11T09:26:57Z

utils/merkletrie/index/node.go

@@ -0,0 +1,113 @@
+package index


This should probably not go under the merkletrie package, but under index.

Please document extensively how you are planning to map an index to a merkletrie, along with the challenges, surprises and your solutions. Otherwise, I can only guess your intentions, which is very time consuming and error prone.

You can't place this in index, you will end with circular dependencies.

I would probably place it in an internal package under the index package.

alcortesm · 2017-04-11T09:28:13Z

utils/merkletrie/filesystem/node.go

+func IsEquals(a, b noder.Hasher) bool {
+	pathA := a.(noder.Path)
+	pathB := b.(noder.Path)
+	if pathA[len(pathA)-1].IsDir() || pathB[len(pathB)-1].IsDir() {


I don't understand this; reviewing it is too time consuming due to the lack of documentation.

Can you explain, at the very beginning, the general guidelines of how you are planning to map a filesystem into a merkletrie? Also the main challenges, surprises and your solutions.

alcortesm · 2017-04-11T09:29:11Z

I will continue reviewing this later; it is too long. I advise breaking this kind of PR into smaller contributions that can be reviewed more easily and quickly.

mcuadros · 2017-04-11T19:12:37Z

@alcortesm with this goal in mind I created clean commits, so you can review commit by commit if you wish.

codecov · 2017-04-12T13:17:54Z

Codecov Report

Merging #339 into master will increase coverage by <.01%.
The diff coverage is 65.64%.

@@            Coverage Diff             @@
##           master     #339      +/-   ##
==========================================
+ Coverage    77.3%   77.31%   +<.01%     
==========================================
  Files         117      122       +5     
  Lines        8062     8268     +206     
==========================================
+ Hits         6232     6392     +160     
- Misses       1164     1184      +20     
- Partials      666      692      +26

Impacted Files	Coverage Δ
plumbing/format/index/index.go	`0% <0%> (ø)`
plumbing/object/difftree.go	`77.77% <100%> (ø)`	⬆️
submodule.go	`65% <100%> (ø)`	⬆️
plumbing/object/treenoder.go	`80.35% <100%> (ø)`	⬆️
plumbing/object/tree.go	`77.88% <100%> (+3.51%)`	⬆️
status.go	`50% <50%> (ø)`
options.go	`78.57% <54.54%> (-8.53%)`	⬇️
utils/merkletrie/filesystem/node.go	`63.33% <63.33%> (ø)`
worktree.go	`66.51% <66.42%> (+9.82%)`	⬆️
worktree_status.go	`67.16% <67.16%> (ø)`
... and 8 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9b45f46...5bcf802.

alcortesm

the hash calculation is universal to all the Noder implementations in go-git, otherwise you can compare it. That's why I believe that all the implementations should be in the same place.

@smola

I see. Then we are on very different pages. The whole point of the Hasher interface, of defining the IsEqual function and passing it to DiffTree, was to not have the same hash implementation for all the types. We will need to refactor all that part so it is consistent with your comment above.

alcortesm · 2017-04-12T10:27:48Z

options.go

@@ -178,6 +178,72 @@ type SubmoduleUpdateOptions struct {
 	RecurseSubmodules SubmoduleRescursivity
 }

+// CheckoutOptions describes how a checkout 31operation should be performed.


alcortesm · 2017-04-12T10:28:51Z

options.go

@@ -178,6 +178,72 @@ type SubmoduleUpdateOptions struct {
 	RecurseSubmodules SubmoduleRescursivity
 }

+// CheckoutOptions describes how a checkout 31operation should be performed.
+type CheckoutOptions struct {
+	// Hash to be checked out, if used HEAD will in detached mode. Branch and


missing a verb:
if used HEAD will be in detached mode.

alcortesm · 2017-04-12T10:29:48Z

options.go

+	// Hash to be checked out, if used HEAD will in detached mode. Branch and
+	// Hash are mutual exclusive.
+	Hash plumbing.Hash
+	// Branch to be checked out, if Branch and Hash are empty is set to `master`.


missing subject:
if Branch and Hash are empty it is set to master.

alcortesm · 2017-04-12T10:30:46Z

options.go

+	// Branch to be checked out, if Branch and Hash are empty is set to `master`.
+	Branch plumbing.ReferenceName
+	// Force, if true when switching branches, proceed even if the index or the
+	// working tree differs from HEAD. This is used to throw away local changes


End the sentence with a full stop.

alcortesm · 2017-04-12T10:38:22Z

options.go

+}
+
+// Validate validates the fields and sets the default values.
+func (o *CheckoutOptions) Validate() error {


You are right, it was in #178.

I'll open an issue, as we agreed back then.

alcortesm · 2017-04-12T13:11:04Z

utils/merkletrie/filesystem/node.go

+	}, nil
+}
+
+func (n *node) calculateHash(path string, file billy.FileInfo) ([]byte, error) {


On line 71, the method calculateChildren sets the children of the node and only returns an error.

Here, the method calculateHash doesn't set the hash but returns it instead.

I think we should use the same strategy in both methods, for consistency, or just make this a function instead of a method.
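
The two strategies can be compared side by side. The sketch below is an assumption-laden illustration: the `node` fields, the parameterless signatures and the `hashOf` helper are hypothetical and differ from the code under review, which takes a path and a billy.FileInfo.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// node is a hypothetical, simplified filesystem node.
type node struct {
	content  []byte
	hash     []byte
	children []string
}

// Strategy A: both helpers are methods that set a field and only
// return an error.
func (n *node) calculateChildren() error {
	n.children = []string{"a", "b"} // placeholder for a directory listing
	return nil
}

func (n *node) calculateHash() error {
	sum := sha1.Sum(n.content)
	n.hash = sum[:]
	return nil
}

// Strategy B: a plain function that returns the value and leaves the
// assignment to the caller.
func hashOf(content []byte) ([]byte, error) {
	sum := sha1.Sum(content)
	return sum[:], nil
}

func main() {
	n := &node{content: []byte("hello")}
	if err := n.calculateChildren(); err != nil {
		panic(err)
	}
	if err := n.calculateHash(); err != nil {
		panic(err)
	}
	h, err := hashOf(n.content)
	if err != nil {
		panic(err)
	}
	fmt.Printf("method:   %x\nfunction: %x\n", n.hash, h)
}
```

Either form works; the point of the review comment is only that mixing the two within one type makes the code harder to follow.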

alcortesm · 2017-04-12T13:15:53Z

utils/merkletrie/filesystem/node_test.go

+
+func Test(t *testing.T) { TestingT(t) }
+
+type NoderSuite struct{}


I think it is better to call it NodeSuite instead.

alcortesm · 2017-04-12T13:24:20Z

utils/merkletrie/filesystem/node_test.go

+
+func IsEquals(a, b noder.Hasher) bool {
+	if bytes.Equal(a.Hash(), empty) || bytes.Equal(b.Hash(), empty) {
+		return false


This is very weird, as it is defeating the whole purpose of using the merkletrie package and implementing Noder.

Maybe it is ok for the tests, but in that case I would mention it in a comment.
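
The quoted snippet is truncated, so the following is only one plausible reading of it, with the final comparison added as an assumption. `Hasher`, `fixedHash`, the lowercase `isEquals` and the 4-byte `empty` sentinel are hypothetical names and sizes chosen for the demo.

```go
package main

import (
	"bytes"
	"fmt"
)

// Hasher mirrors the shape of the interface used in the snippet.
type Hasher interface {
	Hash() []byte
}

// fixedHash is a hypothetical Hasher that returns its own bytes.
type fixedHash []byte

func (f fixedHash) Hash() []byte { return f }

// empty is an assumed all-zero sentinel meaning "hash unknown".
var empty = make([]byte, 4)

// isEquals never reports equality when either hash is the sentinel;
// otherwise it compares the hashes byte by byte (assumed completion).
func isEquals(a, b Hasher) bool {
	if bytes.Equal(a.Hash(), empty) || bytes.Equal(b.Hash(), empty) {
		return false
	}
	return bytes.Equal(a.Hash(), b.Hash())
}

func main() {
	x := fixedHash{1, 2, 3, 4}
	y := fixedHash{1, 2, 3, 4}
	unknown := fixedHash(empty)

	fmt.Println("same hashes:", isEquals(x, y))
	fmt.Println("unknown vs unknown:", isEquals(unknown, unknown))
}
```

Under this reading, two nodes with unknown hashes are always treated as different, so a comparison can never prune them by hash alone, which may be what the comment above is objecting to.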

alcortesm · 2017-04-12T13:29:37Z

utils/merkletrie/filesystem/node_test.go

+
+var _ = Suite(&NoderSuite{})
+
+func (s *NoderSuite) TestDiff(c *C) {


I think this way of testing node will make debugging harder than if we do unit testing of its public methods instead: it would be easier to detect errors with the unit tests than having to trace back difftree's results mismatches in the tests to see what errors in node caused them.

I would write unit tests for node and maybe leave a couple of these as integration tests.

alcortesm · 2017-04-12T13:31:58Z

utils/merkletrie/index/node.go

@@ -0,0 +1,113 @@
+package index


I would probably place it in an internal package under the index package.

alcortesm · 2017-04-12T13:48:53Z

I am exhausted, I will continue reviewing this the next day.

alcortesm · 2017-04-12T13:50:43Z

I think you merged before my review, you should consider reopening for my comments today.

I will probably add a few more in the next days.

mcuadros added 5 commits April 11, 2017 04:34

merkletrie: node support for index file

6a00b30

merkletrie: node support for billy filesystems

9e0ae96

plumbing: format, index stringer

af4f25d

plumbing: object, public Tree.FindEntry and minor diff changes

aa818a3

worktree, status implementation based on merkletrie and reset and che…

116fed7

…ckout implementations

mcuadros requested review from alcortesm and smola April 11, 2017 02:47

smola suggested changes Apr 11, 2017

alcortesm suggested changes Apr 11, 2017

mcuadros added 3 commits April 11, 2017 23:16

merge, Repository.Log

7a428a9

merkletrie: filesystem and index speedup and documentation

e14ee7a

worktree, reset implementation and status improvements

63f2348

smola approved these changes Apr 12, 2017

alcortesm mentioned this pull request Apr 12, 2017

Review the rationale and naming of the validate methods in options.go #341

Closed

examples: checkout example update

5bcf802

mcuadros merged commit 932ced9 into src-d:master Apr 12, 2017

alcortesm reviewed Apr 12, 2017
