-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(txpool) modify txpool guard to be for pipeline syncs only #4075
Conversation
// assert!(!pool.is_empty()); this keeps failing - indicating that there's a bug with these | ||
// tests | ||
|
||
// if we attempt to modify the initial test above with SyncState::Idle, the same assert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @mattsse re these tests
tl;dr - assert pool.is_empty()
and assert !pool.is_empty()
don't seem to be doing what we want it to in these tests (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do we want here?
not following
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uncommented assert!(!pool.is_empty());
and added another test to highlight what I mean.
Codecov Report
... and 13 files with indirect coverage changes
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool, I first envisioned that the initial_sync_done
flag will be tracked in the transactionmanager, but I think we can also just move it to the Syncstate impl, perhaps this is useful somewhere else.
ATM this seems perfectly reasonable but perhaps the txpool needs additional logic around this, but I can't think of anything else to check and just using the initial sync as the guard is sufficient atm.
so I'm supportive of adding it to syncstate.
tracking another atomic bool is inexpensive so this is fine, but we could put all of them in one atomicu8
crates/net/network-api/src/lib.rs
Outdated
@@ -49,6 +49,9 @@ pub trait NetworkInfo: Send + Sync { | |||
|
|||
/// Returns `true` if the network is undergoing sync. | |||
fn is_syncing(&self) -> bool; | |||
|
|||
/// Returns `true` if the network is undergoing an initial sync. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs some additional context what this exactly represents wrt pipeline sync:
if this is a fresh node sync then this is expected to track the first complete pipeline run.
crates/net/network/src/network.rs
Outdated
let future_state = state.is_syncing(); | ||
let curr_state = self.inner.is_syncing.load(Ordering::Relaxed); | ||
let init_sync_done = self.inner.initial_sync_done.load(Ordering::Relaxed); | ||
self.inner.is_syncing.store(future_state, Ordering::Relaxed); | ||
if !init_sync_done { | ||
// we've either been in Idle or pipeline Syncing here, check to see if we're | ||
// moving from a pipeline sync to idle | ||
if curr_state && !future_state { | ||
// set initial_sync_done flag to true here, every sync after this will be a livesync | ||
self.inner.initial_sync_done.store(true, Ordering::Relaxed); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can simplify this,
basically initial_sync_done
will be true when is_syncing
goes from false to true
I think we can use .swap
here and compare the replaced value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mm let me think about this - I think I see what you're saying, but I need think through the implementation
I think you're right in that we know when is_syncing
goes from false-> true (the first time, but is effectively "idempotent" every time after), initial_sync_done
-> true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I think this will replace the block that I have now:
self.inner.initial_sync_done.compare_exchange(false, curr_state && !future_state, Ordering::Relaxed, Ordering::Relaxed);
basically a CAS (compare_and_swap
seems deprecated in favor of this new compare_exchange
api) that only sets initial_sync_done
if its false
I do want to mention that initial_sync_done
should only be true when !init_sync_done
&& is_syncing
goes from true->false
what you mentioned may not work (when is_syncing
goes from false->true) because I think we need to capture when the initial_sync completes and not just when it starts.
otherwise in the network manager code we won't be able to differentiate between a live_sync and an ongoing pipeline sync (i.e is_syncing() && is_initial_syncing()
will be true false**** even during the pipeline sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what we want to capture is the state transition from false->true->false with is_syncing
false->true showcases that we've kicked off a pipeline sync, but we can't mark is_initial_done
until it has actually completed
b0d7eca
to
d28e5a8
Compare
crates/net/network-api/src/lib.rs
Outdated
@@ -49,6 +49,9 @@ pub trait NetworkInfo: Send + Sync { | |||
|
|||
/// Returns `true` if the network is undergoing sync. | |||
fn is_syncing(&self) -> bool; | |||
|
|||
/// Returns `true` when a fresh node is initially being synced, known as a Pipeline sync |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not quite accurate because this also returns true on restart if the first pipeline sync is active.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like fresh node being initially synced
is what's incorrect/misleading. let me reword this to say Returns true when a node is undergoing a pipeline sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because during a pipeline sync, state of the sync "job" is stored externally I think, and it'll be loaded in/continued from upon restart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good progress almost there
crates/net/network/src/network.rs
Outdated
let future_state = state.is_syncing(); | ||
let curr_syncing = self.inner.is_syncing.load(Ordering::Relaxed); | ||
self.inner.is_syncing.store(future_state, Ordering::Relaxed); | ||
// explicitly throw this error away since we expect it to error every time after it's been | ||
// set to true, i.e after every live sync | ||
let _ = self.inner.initial_sync_done.compare_exchange( | ||
false, | ||
// on first move back from Syncing->Idle - swap and set init_sync_done to true | ||
curr_syncing && !future_state, | ||
Ordering::Relaxed, | ||
Ordering::Relaxed, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not quite exactly what I had in mind, you can do a swap on line 272 and then update is_initial_done depending on the values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah.. ok yeah. I misunderstood the initial comment oops, but Ill keep the CAS bc it looks cleaner (and setting init_sync_done
it in one atomic step/without a load and a store is nice)
basically what you're saying is instead of the load/store on is_syncing
I just need one swap and the return value is what is being set as curr_syncing
(which I'm going to rename to prev_state
to make a bit clearer)
d28e5a8
to
1eca103
Compare
crates/net/network/src/network.rs
Outdated
let _ = self.inner.initial_sync_done.compare_exchange( | ||
false, | ||
// on first move back from Syncing->Idle - swap and set init_sync_done to true | ||
prev_state && !future_state, | ||
Ordering::Relaxed, | ||
Ordering::Relaxed, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like that we don't handle the result,
I would prefer to update the initial_sync_done
only if prev_state was false and future_state is true, this would be easier to read
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, done
crates/net/network-api/src/lib.rs
Outdated
@@ -49,6 +49,9 @@ pub trait NetworkInfo: Send + Sync { | |||
|
|||
/// Returns `true` if the network is undergoing sync. | |||
fn is_syncing(&self) -> bool; | |||
|
|||
/// Returns `true` when a fresh node being bootstrapped is undergoing a Pipeline sync |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Returns `true` when a fresh node being bootstrapped is undergoing a Pipeline sync | |
/// Returns `true` when the node is undergoing the very first Pipeline sync. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// assert!(!pool.is_empty()); this keeps failing - indicating that there's a bug with these | ||
// tests | ||
|
||
// if we attempt to modify the initial test above with SyncState::Idle, the same assert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do we want here?
not following
@mattsse - I added broken tests to the latest commit to showcase the behavior im calling out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome,
I made two changes
also the failing tests test_tx_broadcasts_through_two_syncs
can be fixed by mocking a session established, so there's a new peer in the set.
this is the last thing to do
crates/net/network/src/network.rs
Outdated
let init_sync_done = self.inner.initial_sync_done.load(Ordering::Relaxed); | ||
let is_currently_syncing = self.inner.is_syncing.load(Ordering::Relaxed); | ||
is_currently_syncing && !init_sync_done | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I flattened the checks here, because we only need to load is_syncing
if initial_sync_done is false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome makes sense, good catch
// we penalize the peer that sent the bad transaction with | ||
// the assumption that the peer should have known that this | ||
// transaction is bad. (e.g. consensus rules) | ||
if err.is_bad_transaction() && !this.network.is_initially_syncing() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this to currently syncing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I think we need this for every sync, good catch thanks
…doc comment for flag
…itial_syncing flag
af3c6de
to
dc6d636
Compare
should close #4066
the tests in
network/src/transactions.rs
need a bit of work - I think there's some bugs hiding in them rn.