-
Notifications
You must be signed in to change notification settings - Fork 467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ticket checking for expected consensus #482
Conversation
754fc99
to
b745608
Compare
@@ -52,17 +55,6 @@ func (b *Block) Cid() *cid.Cid { | |||
return b.ToNode().Cid() | |||
} | |||
|
|||
// AddParent sets the parent pointer of the receiver to the argument if it | |||
// is a valid assignment, else returns an error. | |||
func (b *Block) AddParent(p Block) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to move this because it's not convenient to validate parent height here anymore. Also, not really sure it makes sense to be doing this validation at such a low-level on such a general data structure.
|
||
// Mine does the actual work. It's the implementation of worker.mine. | ||
func Mine(ctx context.Context, input Input, blockGenerator BlockGenerator, doSomeWork DoSomeWorkFunc, outCh chan<- Output) { | ||
func Mine(ctx context.Context, input Input, nullBlockTimer NullBlockTimerFunc, blockGenerator BlockGenerator, createPoST DoSomeWorkFunc, outCh chan<- Output) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not thrilled about the proliferation of these callback functions... I have had some ideas since getting my hands a little dirtier, but they seem out of scope for this PR:
1/ I think just naming better helps a lot. I renamed doSomeWork
to createPoST
because I think that is the role it is actually playing. If we complete the rename of the type (in a separate PR) and do other similar work to clean up the descriptiveness of all these names and organize them better, that will help a bit.
2/ A much bigger sledgehammer we could use is to make at least some of these functions global variables, and then replace them temporarily during tests. There's a nice pattern for this using defer
. That would get rid of all the parameters passed everywhere and type definitions, and isolate the remaining complexity in tests. A bonus would be that jump-to-definition in editors would work again. This has the disadvantage, though, that we won't be able to parallelize tests as aggressively as if we hadn't done it. I'm not sure to what extent we care about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like both ideas, better naming is always good IMO (assuming of course you can come up with better names in a reasonable amount of time). And your idea 2 is great, seems like it would improve code readability and debugability pretty nicely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 && 2 SGTM. Re 2 I love that pattern because it's easy to follow, but have been avoiding it due to previous feedback about sullying global namespace and public interfaces. Will bring it back when it looks like the right tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested it out in the lastest PR. See changes to worker_test.go
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Swap
is cool. If we start using it, we'll need to watch out for goroutines escaping the dynamic scope of the swap. Mentioning this mostly as a note to self since this will be a new Go pattern for me.
} | ||
|
||
// How long the node's mining Worker should sleep to simulate mining. | ||
const mineSleepTime = time.Millisecond * 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reduced this significantly because (a) I needed to move it to in front of when the block is mined to better adhere to the design/spec, and also because (b) now with EC we tend to attempt mining a lot more times in each test (thus making tests painfully slow without changing this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. FWIW the fact that it was sleeping that long in tests was basically a bug, we should have been using a nop function in tests when constructing the mining worker here: https://github.com/filecoin-project/go-filecoin/blob/master/node/node.go#L198. Given trajectory though def simpler to just lower the sleep time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But don't we want the sleep in place under normal circumstances, when you run like go-filecoin daemon
? I think that codepath is used in both places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct. I was assuming that our trajectory was to have a real mechanism that slows us down -- either waiting for the next epoch or actually doing proofs -- some time soon. I guess I don't know when that is actually going to happen so yes you are right, continuing to sleep on the normal code path is probably good.
mining/block_generator.go
Outdated
@@ -23,7 +23,7 @@ type GetStateTree func(context.Context, *cid.Cid) (state.Tree, error) | |||
|
|||
// BlockGenerator is the primary interface for blockGenerator. | |||
type BlockGenerator interface { | |||
Generate(context.Context, *types.Block, types.Address) (*types.Block, error) | |||
Generate(ctx context.Context, block *types.Block, ticket types.Signature, nullBlockCount uint64, addr types.Address) (*types.Block, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would either make these names better, or leave them off. block
is unclear, should maybe be parent
or basis
// Find the smallest ticket from parent set | ||
var smallest types.Signature | ||
for _, v := range parents { | ||
if smallest == nil || bytes.Compare(v.Ticket, smallest) < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mental note: We should have a set of test vectors for this to make sure all implementations are sorting correctly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you noticed this, but there's a small set of them in the unit test we could start with: https://github.com/filecoin-project/go-filecoin/pull/482/files#diff-a0c70e0ee68f1661f66db8fcbbabacf4R213. I was able to (ab)use the official NIST test vectors for this purpose. But generally speaking yeah, we're going to want/need a pretty extensive conformance test that covers the entire protocol.
break | ||
} | ||
|
||
nullBlockTimer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still need a TODO to check if we've gotten a better block from the network in the meantime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as written, this will make every miner keep trying an increasing number of null blocks until they win, implying every miner will submit a block at every epoch. Which is fine for now, but obviously not optimal in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, I think there's actually a minor bug there. When a better block comes in, Start()
will cancel the current goroutine executing Mine()
and spawn a new one. So we will start mining on the new block. But, I neglect to check for cancellation in this loop in the case where there's no winning ticket, so this instance of Mine()
will also keep going until it finds a number of null blocks it wins for.
I'll write tests that exercise this -- it's the last TODO I had.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all LGTM
// doSomeWork is a function like Sleep() that we call to simulate mining. | ||
doSomeWork DoSomeWorkFunc | ||
mine mineFunc | ||
createPoST DoSomeWorkFunc // TODO: rename createPoSTFunc? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, you should rename as you see fit here and anywhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR already too big.
mining/worker.go
Outdated
} | ||
|
||
buf := make([]byte, 4) | ||
n := binary.PutUvarint(buf, uint64(nullBlockCount)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make it easy for the person doing the varint implementation to know that they should make a change here:
// TODO replace varint encoding as part of #340
mining/block_generator.go
Outdated
@@ -104,15 +104,17 @@ func (b blockGenerator) Generate(ctx context.Context, baseBlock *types.Block, re | |||
return nil, err | |||
} | |||
|
|||
if blockHeight != baseBlock.Height+nullBlockCount+1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a little more clear if blockHeight was set right before this check instead of a dozen lines above. When you do that though seems like the utility of the check comes into question, so maybe worth a note about what we're trying to accomplish (a backstop bc the relationship isn't enforced anywhere else).
472e539
to
892c8e7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Esteemed reviewers, PTAL, I believe this is ready to land.
@@ -138,7 +138,7 @@ func TestPaymentChannelRedeemSuccess(t *testing.T) { | |||
func TestPaymentChannelReclaimSuccess(t *testing.T) { | |||
payer := &address.TestAddress | |||
target := &address.TestAddress2 | |||
eol := types.NewBlockHeight(5) | |||
eol := types.NewBlockHeight(20) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is suuuper ghetto. We're going to have to think about how to write this kind of test now that mine once
means (and I think we want it to continue to mean) mine until you get a block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand the scope of the issue but we should feel empowered to modify mine once's behavior to hit a winning ticket immediately in the case of tests like this if it is useful. Or have a new command that does. If all we want is a block and we don't care about the actual consensus, we can circumvent consensus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That you are having to futz with the numbers here points to a potential issue with the design. It's not entirely clear to me how a client and a miner will negotiate the eol
such that the miner is guaranteed enough time to redeem their vouchers without the client having to tie up funds for an excessive amount of time before they can be reclaimed. There will be a point in time (point in block height?) where the client can create and deliver a voucher, but the miner will have, at best, a probabilistic chance of being able to redeem it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, in tests like this we can make it so mining always succeeds.
9138c72
to
559ec2f
Compare
@@ -138,7 +138,7 @@ func TestPaymentChannelRedeemSuccess(t *testing.T) { | |||
func TestPaymentChannelReclaimSuccess(t *testing.T) { | |||
payer := &address.TestAddress | |||
target := &address.TestAddress2 | |||
eol := types.NewBlockHeight(5) | |||
eol := types.NewBlockHeight(20) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand the scope of the issue but we should feel empowered to modify mine once's behavior to hit a winning ticket immediately in the case of tests like this if it is useful. Or have a new command that does. If all we want is a block and we don't care about the actual consensus, we can circumvent consensus.
Yeah, that would be easy to implement technically, but I was concerned about how it would interact with consensus. What would it mean to have this feature in the deployed client? I guess it would mean you mine a block and just nobody agrees? It fails consensus? If that is OK, I'm happy to implement it. |
Can you say more a out that?
…On Fri, May 25, 2018, 6:19 PM clkunzang ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In mining/worker.go
<#482 (comment)>
:
>
// Mine does the actual work. It's the implementation of worker.mine.
-func Mine(ctx context.Context, input Input, blockGenerator BlockGenerator, doSomeWork DoSomeWorkFunc, outCh chan<- Output) {
+func Mine(ctx context.Context, input Input, nullBlockTimer NullBlockTimerFunc, blockGenerator BlockGenerator, createPoST DoSomeWorkFunc, outCh chan<- Output) {
Swap is cool. If we start using it, we'll need to watch out for
goroutines escaping the dynamic scope of the swap. Mentioning this mostly
as a note to self since this will be a new Go pattern for me.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#482 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAE6BHCn7fm-5Guz-Tihln0lVm__8364ks5t2K2mgaJpZM4UJ7Ri>
.
|
@aboodman, Sure -- as an example (for shape, not substance), consider your usage:
In this case, it happens to be fine because the relevant work of However, knowing that this is the case requires knowing what's going on. Since the test involves many more actions, it's conceivable that we are relying on Or, maybe This is manageable, but it does require reasoning a lot more about the concurrency. It also means that changes to any function which spawns a goroutine need to be considered carefully lest they upset a test somewhere (which indirectly brings a call to that function into dynamic scope). Because of all these issues, I might suggest that we split the difference. We could implement a test wrapper abstraction providing a syntactically convenient way to run a single self-contained test within the dynamic scope of the swapped re-bindings. This would reduce the discipline required in trying to use very fine-grained re-bindings like in your example. Not directly related, but on the same topic and going back to your comment about parallel tests. I'm actually not sure of exactly when Go's Basically, I think I'm quite interested to hear about how you've used this in the past and what kind of patterns/harnesses you adopted to stay safe. Maybe this warrants a discussion issue of its own. |
On Fri, May 25, 2018 at 7:16 PM, clkunzang ***@***.***> wrote:
@aboodman <https://github.com/aboodman>, Sure -- as an example (for
shape, not substance), consider your usage:
(func() {
defer swap.Swap(&isWinningTicket, everyThirdWinningTicket())()
go Mine(ctx, input, nullBlockImmediately, mockBg, func() { workCount++ }, outCh)
r = <-outCh
})()
In this case, it happens to be fine because the relevant work of Mine has
definitely completed by the time the function scope ends. (This is because
of the blocking receive into r.)
However, *knowing* that this is the case requires knowing what's going
on. Since the test involves many more actions, it's conceivable that we are
relying on Mine to do something else after sending to outCh.
Or, maybe Mine *does* continue to do something else that we wouldn't care
about if isWinningTicket were still swapped. But after it sends to outCh
the first time, the inner function will return and isWinningTicket will
*not* be swapped anymore. So it's now possible that the still-running (if
it is -- this is just an example) goroutine will call isWinningTicket
again, but this time get the original/default/snapshot value of the
function and call it.
This is manageable, but it does require reasoning a lot more about the
concurrency. It also means that changes to any function which spawns a
goroutine need to be considered carefully lest they upset a test somewhere
(which indirectly brings a call to that function into dynamic scope).
Because of all these issues, I might suggest that we split the difference.
We could implement a test wrapper abstraction providing a syntactically
convenient way to run a single self-contained test within the dynamic scope
of the swapped re-bindings. This would reduce the discipline required in
trying to use very fine-grained re-bindings like in your example.
Not directly related, but on the same topic and going back to your comment
about parallel tests. I'm actually not sure of *exactly* when Go's testing
framework believes it has the right to run tests in parallel. If we begin
adopting this, someone should find that out and document it in the usage
rules.
Basically, I think Swap is cool (and I like the black reflect magic in
its implementation) -- but I think it comes with a lot of potential
pitfalls. We can mitigate that by going all-in and formalizing some
patterns which might be a bit more limited than all we can dream up but
might also prevent us from getting into some nightmarish scenarios
involving non-deterministic tests. If we end up in that quicksand, we'll
find it hard to get out and wish we hadn't let it happen.
Agree with all of this. I'm not sure how useful Swap() will be in practice
anyway because it really only helps for replacing global variables which
aren't that common. And besides the issues you brought up, what we have
here is basically monkey-patching, with all the complexity that entails.
I'm quite interested to hear about how you've used this in the past and
what kind of patterns/harnesses you adopted to stay safe. Maybe this
warrants a discussion issue of its own.
I haven't used it in the past. I frequently like to "try things on for
size" to see how they feel. It's sometimes hard to speculate about how
useful something will or wont' be, or how the tradeoffs will feel, without
hands on experience - at least for me. Plus it's just fun to try new
approaches. In this case, this has the one dramatic advantage of letting
jump-to-definition continue to work, so I wanted to try it out.
After using this here and gaining more experience with our tests, I'm
learning more toward just a traditional software engineering solution of
re-organizing the dependencies into interfaces with consistent names and
code organization, and I bet that'll be enough.
But now at least the tool is there and we can all play with it to gain
experience.
… —
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#482 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAE6BDp9W0qKYwBD6rqkt5pWHxQgYj2nks5t2Lr1gaJpZM4UJ7Ri>
.
|
mining/worker.go
Outdated
// let the successful run proceed unless the context is explicitly canceled. | ||
if ctx.Err() == nil { | ||
outCh <- NewOutput(next, err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should definitely log something if a context cancellation causes us to throw away a good block. I can't think of too many cases where we would want to actually throw away a correctly mined block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some logging and made the existing TODO more assertive.
util/chk/check.go
Outdated
// True panics if a boolean condition is not true. | ||
func True(cond bool, formatAndArgs ...interface{}) { | ||
var msg string | ||
if len(formatAndArgs) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not put this inside the if !cond {
? That way it only allocates when its going to immediately cease to exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. As a result of this comment, I went looking for ways to test this and found testing.AllocsPerRun
. I also decided that it made more sense to remove chk.Equals()
and chk.Nil()
as those would also implicitly allocate and it's not burdensome to just inline those checks in the calls to chk.True()
.
// | ||
// The destination value must be a pointer and the new value must be a value | ||
// that can be assigned to the destination pointer. | ||
func Swap(dst, new interface{}) (unswap func()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hot 🔥
util/swap/swap_test.go
Outdated
|
||
age := 25 | ||
fn := func() int { return 1 } | ||
(func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the extra parens mean... what exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops. I don't know why gofmt didn't call me out for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@whyrusleeping ptal. Assuming you are good, can you also please click the 'approve' button.
util/chk/check.go
Outdated
// True panics if a boolean condition is not true. | ||
func True(cond bool, formatAndArgs ...interface{}) { | ||
var msg string | ||
if len(formatAndArgs) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. As a result of this comment, I went looking for ways to test this and found testing.AllocsPerRun
. I also decided that it made more sense to remove chk.Equals()
and chk.Nil()
as those would also implicitly allocate and it's not burdensome to just inline those checks in the calls to chk.True()
.
util/swap/swap_test.go
Outdated
|
||
age := 25 | ||
fn := func() int { return 1 } | ||
(func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops. I don't know why gofmt didn't call me out for that.
is approval from @whyrusleeping the last thing missing to get this merged? |
…bing global vars in tests)
Fixes #384