[WIP] remove waitpub, export publish #47

schomatis · 2018-12-21T20:57:04Z

Fixes #38.

schomatis · 2018-12-21T21:14:20Z

@Stebalien This is a rough sketch of my proposal in #38. Could you please take a look and tell me what you think?

This patch doesn't guarantee that different PublishNow calls will actually run in the order they were made (but the current code doesn't guarantee this either, since the order of Update calls isn't enforced).

nitishm

/LGTM

nitishm · 2018-12-25T18:56:01Z

repub.go


 	valueLock          sync.Mutex
-	valueToPublish     cid.Cid
-	lastValuePublished cid.Cid
+	valueToPublish     *cid.Cid


Why is this a pointer? Can the CID be modified (elsewhere) while we are waiting to publish?

Ah, I think I see how this is handled in PublishNow() with the extractedValue == nil check.

schomatis · 2018-12-25

> Can the CID be modified (elsewhere) while we are waiting to publish?

It shouldn't be. The Update API hasn't changed; we still make a local copy.

Stebalien · 2018-12-27

This should just be a cid.Cid and should be set to cid.Undef when "nil" (it probably should have been cid.Nil, but I didn't win that argument).

schomatis · 2018-12-30

Yes, I thought of that at first, but we use cid.Undef to issue publish orders, at least in some tests:

go-mfs/repub_test.go, lines 43 to 45 in 4fb6dc4

	go func() {
		for {
			rp.Update(cid.Undef)

So I think it violates the semantics I would expect of a nil value.

I agree that we should fix that and provide a cid.Nil, but in the meantime I don't see the harm in implementing the nil with a pointer for an internal variable.

Stebalien · 2019-01-05

The pointer is fine but shouldn't be necessary. Really, we should fix that test: passing the "Undef" CID should be equivalent to passing a "nil" CID (and passing a nil CID to rp.Update doesn't make sense).
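The difference between the two representations discussed in this thread can be sketched as follows. This is only an illustration: `Cid`, `Undef`, `sentinelSlot`, and `pointerSlot` are hypothetical stand-ins (the real code uses `cid.Cid` and `cid.Undef` from go-cid), and locking is omitted for brevity.

```go
package main

import "fmt"

// Cid is a hypothetical stand-in for cid.Cid; its zero value plays the
// role of cid.Undef.
type Cid string

const Undef Cid = ""

// sentinelSlot marks "nothing pending" with the Undef value itself.
type sentinelSlot struct{ pending Cid }

func (s *sentinelSlot) Update(c Cid) { s.pending = c }

// Take returns the pending value and whether there was one.
func (s *sentinelSlot) Take() (Cid, bool) {
	c := s.pending
	s.pending = Undef
	return c, c != Undef
}

// pointerSlot marks "nothing pending" with a nil pointer, so Undef is
// still a value that can be queued.
type pointerSlot struct{ pending *Cid }

func (s *pointerSlot) Update(c Cid) { s.pending = &c }

// Take returns the pending value and whether there was one.
func (s *pointerSlot) Take() (Cid, bool) {
	if s.pending == nil {
		return Undef, false
	}
	c := *s.pending
	s.pending = nil
	return c, true
}

func main() {
	var a sentinelSlot
	var b pointerSlot
	a.Update(Undef) // mirrors what the quoted test loop does
	b.Update(Undef)
	_, okA := a.Take()
	_, okB := b.Take()
	fmt.Println(okA, okB) // prints "false true"
}
```

With the sentinel, an `Update(Undef)` is indistinguishable from "no update", which is why the quoted test would have to change before the pointer could be dropped.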

Stebalien

This looks like the right approach. However, I'd expect PublishNow to only return after the latest value has been published (even if there's a concurrent publish).

Stebalien · 2018-12-27T23:15:44Z

repub.go


 	valueLock          sync.Mutex
-	valueToPublish     cid.Cid
-	lastValuePublished cid.Cid
+	valueToPublish     *cid.Cid


This should just be a cid.Cid and should be set to cid.Undef when "nil" (it probably should have been cid.Nil, but I didn't win that argument).

Stebalien · 2018-12-27T23:17:25Z

repub.go

-	if err != nil {
-		return err
+	if extractedValue == nil {
+		return nil


A concurrent call won't actually wait. We may need an RWMutex here.

schomatis · 2018-12-30T20:37:04Z

> However, I'd expect PublishNow to only return after the latest value has been published (even if there's a concurrent publish).

Yes, that seems fair, but WaitPub didn't do that, and my main objective here is to simplify the code, not redefine behavior. Maybe PublishNow is a misleading name; would PublishCurrentValue be better?

Stebalien · 2019-01-05T01:58:00Z

> Yes, that seems fair, but WaitPub didn't do that, and my main objective here is to simplify the code, not redefine behavior. Maybe PublishNow is a misleading name; would PublishCurrentValue be better?

If one changes a file and then calls WaitPub, WaitPub is guaranteed to not return until that change has been published. Of course, given multiple calls to WaitPub, one (or more) of these calls may block waiting for yet another change (that's the issue you're fixing).

Here, if two commands call PublishNow at the same time, one of them will return early (before the publish happens). That means PublishNow, as implemented, isn't useful as a replacement for WaitPub.
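The early return described here can be reproduced deterministically. The sketch below is a guess at the shape of the patch under review (extract and clear the pending value under the lock, then publish outside it), not the actual go-mfs code. `Cid`, `republisher`, and `demoEarlyReturn` are hypothetical names, and the channels exist only to hold the first publish in flight.

```go
package main

import (
	"fmt"
	"sync"
)

// Cid is a hypothetical stand-in for cid.Cid.
type Cid string

type republisher struct {
	mu      sync.Mutex
	pending *Cid
	pubfunc func(Cid)
}

func (r *republisher) Update(c Cid) {
	r.mu.Lock()
	r.pending = &c
	r.mu.Unlock()
}

// PublishNow extracts and clears the pending value under the lock, then
// publishes outside it. It reports whether this call published anything.
func (r *republisher) PublishNow() bool {
	r.mu.Lock()
	v := r.pending
	r.pending = nil
	r.mu.Unlock()
	if v == nil {
		return false // returns at once, even if a publish is in flight
	}
	r.pubfunc(*v)
	return true
}

// demoEarlyReturn starts one publish, keeps it in flight, and then makes
// a second PublishNow call from another goroutine.
func demoEarlyReturn() (secondPublished, firstStillInFlight bool) {
	started := make(chan struct{})
	release := make(chan struct{})
	done := make(chan struct{})

	r := &republisher{pubfunc: func(Cid) {
		close(started) // the publish has begun
		<-release      // and stays in flight until released
	}}
	r.Update("x")

	go func() {
		r.PublishNow() // first caller: extracts "x", blocks in pubfunc
		close(done)
	}()

	<-started
	secondPublished = r.PublishNow() // second caller: nothing pending
	select {
	case <-done:
	default:
		firstStillInFlight = true // "x" has not finished publishing
	}
	close(release)
	<-done
	return secondPublished, firstStillInFlight
}

func main() {
	second, inFlight := demoEarlyReturn()
	fmt.Println(second, inFlight) // prints "false true"
}
```

The second caller returns while "x" is still being published, which is the gap relative to the WaitPub guarantee.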

Stebalien · 2019-01-05T01:58:32Z

repub.go

@@ -139,17 +113,20 @@ func (rp *Republisher) Run() {

 // Wrapper function around the user-defined `pubfunc`. It publishes
 // the (last) `valueToPublish` set and registers it in `lastValuePublished`.
-func (rp *Republisher) publish(ctx context.Context) error {
+// TODO: Allow passing a value to `PublishNow` which supersedes the
+// internal `valueToPublish`.


I'm not sure if we want to allow this. Users shouldn't swap out the MFS root using the republisher.

schomatis · 2019-01-05

I'm not sure I understand the comment. What do you mean by "swap out"?

The TODO (which I don't think I worded correctly) was aimed at adding an optional argument that would replace the Update(newCid); PublishNow() call pair with a single PublishNow(newCid) call.

Stebalien · 2019-01-05

Got it. Yeah, that makes sense.

(context: I keep thinking we're exposing the republisher to the user)
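One way the TODO could look is sketched below. The thread doesn't say what form the optional argument would take, so the variadic parameter is an assumption. `Cid`, `republisher`, and the `published` slice are hypothetical stand-ins, and appending to the slice stands in for calling the real pubfunc.

```go
package main

import (
	"fmt"
	"sync"
)

// Cid is a hypothetical stand-in for cid.Cid.
type Cid string

type republisher struct {
	mu        sync.Mutex
	pending   *Cid
	published []Cid // stands in for the effect of pubfunc
}

func (r *republisher) Update(c Cid) {
	r.mu.Lock()
	r.pending = &c
	r.mu.Unlock()
}

// PublishNow publishes the pending value. If an override is passed, it
// supersedes whatever Update stored, so PublishNow(c) behaves like
// Update(c) followed by PublishNow().
func (r *republisher) PublishNow(override ...Cid) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(override) > 0 {
		v := override[len(override)-1]
		r.pending = &v
	}
	if r.pending == nil {
		return // nothing to publish
	}
	r.published = append(r.published, *r.pending)
	r.pending = nil
}

func main() {
	r := &republisher{}
	r.PublishNow("a") // single call replaces the Update/PublishNow pair
	r.Update("b")
	r.PublishNow()
	fmt.Println(r.published) // prints "[a b]"
}
```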

schomatis · 2019-01-05T03:23:50Z

> Here, if two commands call PublishNow at the same time, one of them will return early (before the publish happens). That means PublishNow, as implemented, isn't useful as a replacement for WaitPub.

Good point. This is actually just a simple (but important) mistake on my part: I should have followed the pubfunc structure of taking the lock twice, first to extract the value to publish and then again to mark it as published (valueToPublish = nil) only after pubfunc is called. Does that sound fair to you?

Stebalien · 2019-01-05T19:00:06Z

> Does that sound fair to you?

Are you referring to your latest change? That's going to overwrite potentially unpublished values. Also, we really shouldn't be allowing the user to invoke pubfunc multiple times in parallel (both for thread safety and to prevent logical races where we publish values out of order). In the past, this was protected by the loop.

schomatis · 2019-01-06T19:26:17Z

> That's going to overwrite potentially unpublished values. Also, we really shouldn't be allowing the user to invoke pubfunc multiple times in parallel (both for thread safety and to prevent logical races where we publish values out of order). In the past, this was protected by the loop.

If I understand you correctly (please correct me if not), the two issues to study are:

Is the user-supplied PubFunc function thread-safe? This is what I was wondering in #38 (comment), but from your following comments I assumed this wasn't a problem. If I misunderstood that (sorry!) and we can't risk calling it in parallel, then we should just close this PR, which then has no way of working.

> That's going to overwrite potentially unpublished values.

> prevent logical races where we publish values out of order

This is actually what motivated this PR in the first place. I think the previous examples you mentioned focus on multiple WaitPub calls, but from what I understand of the MFS API, the user doesn't call WaitPub in isolation but only as a complement to Update (to ensure pubfunc was actually called when WaitPub returns). In that setup I don't see this logical race being prevented: two simultaneous calls to Update(); WaitPub() don't guarantee that pubfunc is called for both updated values. Most likely one will overwrite the other (because of our short timer logic) and only one publish operation will happen. PublishNow does not fix that, but it makes it (IMO) much more explicit (giving visibility to otherwise easy-to-overlook bugs like #38). Putting the pubfunc in a loop gives the impression that we publish updated values in order when I think in fact we don't.

Anyway, if my assessment of either of those points is wrong, let's close this PR. Since I don't have much time left for proper fixes and redesigns, my only objective at the moment is to simplify the code so that the next person to come along (hopefully not you :) gets a clearer perspective of what the code does (and doesn't do) and how it could be improved. I'll try to implement any suggestions you have towards that end during the next week.

Stebalien · 2019-01-08T00:08:33Z

> This is what I was wondering in #38 (comment), but from your following comments I assumed this wasn't a problem. If I misunderstood that (sorry!) and we can't risk calling it in parallel, then we should just close this PR, which then has no way of working.

It's probably thread-safe (although we shouldn't assume that); however, that doesn't matter. If the user calls Update(x) and then Update(y) in a single thread, y should win. With the new code, x could win (and could even prevent y from being published) given a concurrent PublishNow() call.

> This is actually what motivated this PR in the first place.

Given two simultaneous calls to Update, one will always trump the other. However, given two sequential calls to Update with interleaved calls to WaitPub and/or PublishNow, the last published value should always win. That's the real issue here (the end-user can't currently call Update anyways).
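One way to get the ordering property described here is to serialize the whole publish behind a second lock, so that values are extracted and published in the same order and pubfunc never runs in parallel. This is a sketch of that option only, not code from the PR. `Cid`, `republisher`, `pubLock`, and `runDemo` are hypothetical names, and it uses a plain `sync.Mutex` rather than the RWMutex suggested earlier in the review. The real Run loop would have to take the same lock for the guarantee to hold.

```go
package main

import (
	"fmt"
	"sync"
)

// Cid is a hypothetical stand-in for cid.Cid.
type Cid string

type republisher struct {
	pubLock   sync.Mutex // held for a whole publish; serializes pubfunc
	valueLock sync.Mutex // guards pending only, so Update never blocks long
	pending   *Cid
	pubfunc   func(Cid) error
}

func (r *republisher) Update(c Cid) {
	r.valueLock.Lock()
	r.pending = &c
	r.valueLock.Unlock()
}

// PublishNow returns only after every value stored before the call has
// been handed to pubfunc, either by this call or by an earlier one.
func (r *republisher) PublishNow() error {
	r.pubLock.Lock()
	defer r.pubLock.Unlock()

	r.valueLock.Lock()
	v := r.pending
	r.pending = nil
	r.valueLock.Unlock()
	if v == nil {
		return nil // an earlier pubLock holder already published it
	}
	if err := r.pubfunc(*v); err != nil {
		// Put the value back unless a newer one arrived meanwhile.
		r.valueLock.Lock()
		if r.pending == nil {
			r.pending = v
		}
		r.valueLock.Unlock()
		return err
	}
	return nil
}

// runDemo issues concurrent Update/PublishNow pairs and reports what was
// published and whether anything was left unpublished at the end.
func runDemo(workers int) (published []Cid, leftover bool) {
	r := &republisher{}
	r.pubfunc = func(c Cid) error {
		published = append(published, c) // safe: pubLock is held
		return nil
	}
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			r.Update(Cid(fmt.Sprintf("v%d", i)))
			r.PublishNow()
		}(i)
	}
	wg.Wait()
	r.valueLock.Lock()
	leftover = r.pending != nil
	r.valueLock.Unlock()
	return published, leftover
}

func main() {
	published, leftover := runDemo(8)
	fmt.Println(len(published), leftover)
}
```

The cost is the one raised at the end of the thread: every PublishNow caller waits for the full duration of any publish already in progress.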

> Anyway, if my assessment of either of those points is wrong, let's close this PR. Since I don't have much time left for proper fixes and redesigns, my only objective at the moment is to simplify the code so that the next person to come along (hopefully not you :) gets a clearer perspective of what the code does (and doesn't do) and how it could be improved. I'll try to implement any suggestions you have towards that end during the next week.

Fixing #38 shouldn't require a redesign.

schomatis · 2019-01-08T00:26:55Z

> However, given two sequential calls to Update with interleaved calls to WaitPub and/or PublishNow, the last published value should always win.

> If the user calls Update(x) and then Update(y) in a single thread, y should win. With the new code, x could win (and could even prevent y from being published) given a concurrent PublishNow() call.

I'm not sure I'm following. With this patch, would sequential calls to Update(x); PublishNow(); Update(y); PublishNow(); not respect the order? (If so, could you expand on why?)

> Fixing #38 shouldn't require a redesign.

Agreed. What I meant is that I want to reduce the technical debt that I think helps bugs like #38 go unnoticed, and that, I think, does require a redesign.

schomatis · 2019-01-08T00:33:42Z

> (If so, could you expand on why?)

Ok, I think I get it: the loop in Run doesn't play nice with independent PublishNow() calls from the user.

schomatis · 2019-01-08T00:45:18Z

Holding the lock throughout pubfunc seems like too much, so I'm closing this.

ghost assigned schomatis Dec 21, 2018

ghost added the status/in-progress In progress label Dec 21, 2018

schomatis force-pushed the fix/republisher/remove-waitpub branch 2 times, most recently from 5739f11 to 50b6144 Compare December 21, 2018 21:12

schomatis requested a review from Stebalien December 21, 2018 21:12

schomatis added the needs review label Dec 21, 2018

schomatis mentioned this pull request Dec 23, 2018

Review go-ipfs PR 4517 and see what can be extracted here. #32

Closed

4 tasks

nitishm mentioned this pull request Dec 25, 2018

Fix/32/pr ports from go-ipfs to go-mfs #49

Merged

nitishm reviewed Dec 25, 2018

Stebalien reviewed Dec 27, 2018

Stebalien reviewed Jan 5, 2019

remove waitpub, export publish

89de69a

schomatis force-pushed the fix/republisher/remove-waitpub branch from 50b6144 to 89de69a Compare January 5, 2019 03:28

schomatis closed this Jan 8, 2019

ghost removed the status/in-progress In progress label Jan 8, 2019

schomatis deleted the fix/republisher/remove-waitpub branch January 8, 2019 00:45
