Add VC metric to track validator statuses #5158

Open
nflaig opened this issue Feb 16, 2023 · 14 comments
Labels
good first issue Issues that are suitable for first-time contributors. help wanted The author indicates that additional help is wanted. meta-feature-request Issues to track feature requests. prio-low This is nice to have. scope-metrics All issues with regards to the exposed metrics. scope-ux Issues for CLI UX or general consumer UX.

Comments

@nflaig
Member

nflaig commented Feb 16, 2023

Is your feature request related to a problem? Please describe.

As a node operator, I want to know the status of my validators without relying on 3rd-party solutions such as beaconcha.in.

The following questions should be answered:

  • how many of my validators are active vs just imported without deposit?
  • how many validators have a pending deposit?
  • how many of my validators have exited, and was the exit voluntary or forced?
  • what is the state of my withdrawal if exited?

Describe the solution you'd like

Add a VC metric to track validator statuses based on getValidatorStatus.

There is already a metric to track the total count of validators (vc_indices_count); we might just need to add a status label.

indices: register.gauge({
  name: "vc_indices_count",
  help: "Current count of indices in IndicesService",
}),
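
As a rough illustration, a labeled version could look something like the sketch below. This is a standalone example using prom-client (which Lodestar's metrics are built on); the metric name, label, and update logic are placeholders for illustration, not the final implementation.

```ts
import {Gauge, Registry} from "prom-client";

const registry = new Registry();

// Sketch only: count of tracked validators per status, e.g. active_ongoing,
// pending_queued, exited_unslashed. Metric name and label are placeholders.
const validatorStatusesCount = new Gauge({
  name: "vc_validator_statuses_count",
  help: "Current count of tracked validators by status",
  labelNames: ["status"],
  registers: [registry],
});

// Example update after fetching statuses from the beacon node.
function onStatusesFetched(statuses: string[]): void {
  // Reset first so a status that no longer applies (e.g. pending -> active) doesn't linger.
  validatorStatusesCount.reset();
  for (const status of statuses) {
    validatorStatusesCount.inc({status});
  }
}
```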

This metric should then be visualized on a dashboard; the Lodestar validator client dashboard looks like a good candidate as it already displays the VC indices.

Finally, we also need to update the metric value used for validator_active in the client monitoring so that it only counts active validators instead of the total.
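
For the validator_active value, a minimal sketch of the intended counting logic could look like this; the status strings follow the standard beacon API validator status enum, and the helper name is made up for illustration.

```ts
// Beacon API validator statuses (standard enum); only the "active_*" variants
// should count towards validator_active.
type ValidatorStatus =
  | "pending_initialized"
  | "pending_queued"
  | "active_ongoing"
  | "active_exiting"
  | "active_slashed"
  | "exited_unslashed"
  | "exited_slashed"
  | "withdrawal_possible"
  | "withdrawal_done";

// Hypothetical helper: count only active validators instead of all tracked keys.
function countActiveValidators(statuses: ValidatorStatus[]): number {
  return statuses.filter((s) => s.startsWith("active_")).length;
}
```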

@philknows philknows added meta-discussion Indicates a topic that requires input from various developers. scope-metrics All issues with regards to the exposed metrics. scope-ux Issues for CLI UX or general consumer UX. meta-feature-request Issues to track feature requests. labels Feb 16, 2023
@philknows
Member

This ties in a bit with rethinking how the validator UX should look (#4785) for the end user and how we can improve it on our side.

@maschad
Contributor

maschad commented Mar 2, 2023

Agreed @philknows, and I think this can inform how we approach #5192.

Along with the status label, we may want to track other metadata (see the sketch after this list), such as:

  • fee_recipient set for that validator
  • pubkey
  • validator index
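
A sketch of how that metadata could be exposed, using an info-style gauge with a constant value of 1. The metric and label names are placeholders; note that per-validator labels such as pubkey and index make the metric high-cardinality for large key counts.

```ts
import {Gauge, Registry} from "prom-client";

const registry = new Registry();

// Placeholder name: one series per tracked validator, value always 1.
const validatorInfo = new Gauge({
  name: "vc_validator_info",
  help: "Per-validator metadata exposed as labels (value is always 1)",
  labelNames: ["pubkey", "index", "status", "fee_recipient"],
  registers: [registry],
});

// Example usage with made-up values:
validatorInfo.set(
  {pubkey: "0xabc...", index: "123", status: "active_ongoing", fee_recipient: "0xdef..."},
  1
);
```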

@maschad
Contributor

maschad commented Mar 2, 2023

Lighthouse's Validator Monitor Metrics track a few things we may be interested in.

I think this is pretty comprehensive, wdyt @nflaig?

@maschad
Contributor

maschad commented Mar 2, 2023

Following up from #5161, we would need to do some refactoring to get all this data in an accessible place, since we want to ensure that we log every epoch, i.e. not conditionally. The main issue: given that validator status should be logged and tracked every epoch, what is the cost of calling getStateValidators every epoch? @tuyennhv @dapplion @wemeetagain

@dapplion
Contributor

dapplion commented Mar 2, 2023

Why does it have to be logged every epoch? Where does this requirement come from? getStateValidators is somewhat expensive to keep polling regularly

@maschad
Contributor

maschad commented Mar 3, 2023

Why does it have to be logged every epoch? Where does this requirement come from?

#4785 (comment)

That was the impetus behind #5161.

Perhaps we should cache this response then.

@nflaig
Member Author

nflaig commented Mar 3, 2023

The Lighthouse validator monitor dashboard looks pretty complete to me and should answer most of the questions mentioned in the issue.

The question is where those metrics should be gathered from and displayed. If we use the validator_monitor metrics, we need to get them from the BN and display them on the validator monitor dashboard.

The initial idea of the issue was to get the metrics from the VC and display them on the validator client dashboard. In the end, I think we want both as we should see the VC and BN as independent services.

@maschad
Contributor

maschad commented Mar 3, 2023

I would say that if we are using Lighthouse as an example, it would be more appropriate in the validator monitor dashboard. I think we can delegate metrics related to attestations, publishing, block proposals, proofs, and overall consensus between nodes to the validator client dashboard.

@nflaig
Member Author

nflaig commented Mar 3, 2023

They still have a few metrics in their validator client dashboard, such as total validators and enabled validators.

@wemeetagain
Member

Perhaps we should cache this response then.

This won't do if we're trying to track statuses: they can change every epoch, so using cached statuses would just return the old statuses and not the (possibly) changed ones.

getStateValidators is somewhat expensive to keep polling regularly

@dapplion what do you mean? we poll it regularly right now (as long as we haven't discovered all attached validators)

RE validator monitor vs validator client metrics, my opinion is that additional logs / metrics are a nice-to-have that should be weighed against additional validator / beacon node load. We also need to be very thoughtful and explicit about which additional logs and metrics are actually helpful for users, and which are just us doing make-work.

@dapplion
Contributor

dapplion commented Mar 7, 2023

@dapplion what do you mean? we poll it regularly right now (as long as we haven't discovered all attached validators)

Right, but once everyone is discovered we stop. If I understand correctly, the plan is to check the status of each connected validator every epoch.

maschad added a commit to maschad/lodestar that referenced this issue Mar 7, 2023
maschad added a commit to maschad/lodestar that referenced this issue Mar 13, 2023
maschad added a commit to maschad/lodestar that referenced this issue Mar 14, 2023
maschad added a commit to maschad/lodestar that referenced this issue Mar 20, 2023
maschad added a commit to maschad/lodestar that referenced this issue Mar 24, 2023
maschad added a commit to maschad/lodestar that referenced this issue Mar 24, 2023
maschad added a commit to maschad/lodestar that referenced this issue Mar 24, 2023
@nflaig
Member Author

nflaig commented Apr 6, 2023

Why does it have to be logged every epoch? Where does this requirement come from? getStateValidators is somewhat expensive to keep polling regularly

Looks like all other VCs poll getStateValidators every epoch. I agree that we should be conservative where possible, but if another VC is used with a Lodestar BN this will happen either way.
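
For reference, a rough sketch of what once-per-epoch polling could look like; BeaconApiClient and its getStateValidators signature here are hypothetical stand-ins, not Lodestar's actual API.

```ts
// Hypothetical client interface, for illustration only.
interface BeaconApiClient {
  getStateValidators(stateId: string, pubkeys: string[]): Promise<{status: string}[]>;
}

const SECONDS_PER_SLOT = 12;
const SLOTS_PER_EPOCH = 32;

// Poll the statuses of the tracked keys once per epoch and hand them to a callback
// (e.g. to update the status gauge sketched earlier).
async function pollStatusesEveryEpoch(
  api: BeaconApiClient,
  pubkeys: string[],
  onStatuses: (statuses: string[]) => void
): Promise<void> {
  for (;;) {
    const validators = await api.getStateValidators("head", pubkeys);
    onStatuses(validators.map((v) => v.status));
    await new Promise((resolve) => setTimeout(resolve, SECONDS_PER_SLOT * SLOTS_PER_EPOCH * 1000));
  }
}
```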

@philknows
Member

Benefits: While the VC is running, if validators get exited, you can remove the keys. This prevents inaccuracies in the active validator key count.

Edge case: If a validator is in a sync committee as it is exiting.

@philknows philknows added good first issue Issues that are suitable for first-time contributors. help wanted The author indicates that additional help is wanted. and removed meta-discussion Indicates a topic that requires input from various developers. labels Nov 6, 2024
@philknows
Member

Can we measure the performance impact of the UX improvement to justify polling every epoch? There have been improvements to the interaction since this issue was created, so we can make a case to test this with Vero, since they already poll every epoch.
