multitenant: expose consumption metrics #67873
Conversation
Force-pushed from d246e9b to 5a51d2f.
I think this is a "fine" approach because it's pragmatic and doesn't necessarily increase the cardinality of metrics in the system by a ton. The counterargument is that most tenants are likely to not have many ranges (at least today; they may have many ranges eventually), which means that only a small number of nodes in the host cluster are likely to house the data for a tenant. However, every tenant is going to round-robin over the host cluster nodes, so most host cluster nodes will at some point have this usage data for each tenant. At some point this time series cardinality is going to hurt.
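Concretely (illustrative numbers, not from this thread): with 50 host nodes, 200 tenants, and 6 consumption metrics, every node eventually exporting every tenant's usage yields up to 50 * 200 * 6 = 60,000 time series, whereas a single exporting component would produce 200 * 6 = 1,200.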
Another approach is to not read this directly from this service but rather to put up a separate service in the operator which serves this data by polling or creating a changefeed over the table. I think that approach would be better.
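A minimal sketch of that alternative, assuming a hypothetical standalone sidecar that polls a consumption table (the table and column names here are invented for illustration) and re-exports it as Prometheus gauges:

```go
package main

import (
	"database/sql"
	"log"
	"net/http"
	"time"

	_ "github.com/lib/pq" // crdb speaks pgwire, so a Postgres driver works
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// requestUnits is labeled per tenant; the sidecar is the only process
// exporting it, so cardinality is O(tenants), not O(nodes * tenants).
var requestUnits = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{Name: "tenant_consumption_request_units"},
	[]string{"tenant_id"},
)

func main() {
	prometheus.MustRegister(requestUnits)
	db, err := sql.Open("postgres", "postgresql://...") // connection string elided
	if err != nil {
		log.Fatal(err)
	}
	go func() {
		for range time.Tick(30 * time.Second) {
			// Hypothetical table/columns, for illustration only.
			rows, err := db.Query("SELECT tenant_id, request_units FROM system.tenant_usage")
			if err != nil {
				log.Print(err)
				continue
			}
			for rows.Next() {
				var id string
				var ru float64
				if err := rows.Scan(&id, &ru); err == nil {
					requestUnits.WithLabelValues(id).Set(ru)
				}
			}
			rows.Close()
		}
	}()
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9090", nil))
}
```

A changefeed over the table could replace the polling loop; the shape of the exporter stays the same.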
Reviewed 3 of 34 files at r1, 3 of 4 files at r3, 8 of 21 files at r4.
pkg/ccl/multitenantccl/tenantcostserver/metrics.go, line 40 at r4 (raw file):
syncutil.Mutex

// tenantMetrics stores the tenantMetrics for all tenants that have
// send TokenBucketRequests to this node.
nit: s/send/sent/
pkg/ccl/multitenantccl/tenantcostserver/metrics.go, line 42 at r4 (raw file):
// send TokenBucketRequests to this node.
// TODO(radu): add garbage collection to remove inactive tenants.
tenantMetrics map[roachpb.TenantID]tenantMetrics
What's your vision here? Timestamps? Have we considered a radically different approach for these metrics: what if we built a separate service that serves a prometheus endpoint and polls the table or creates a changefeed? That way we'd dramatically reduce the cardinality of the data set from every node in the cluster * tenants to just a constant * tenants.
I like your high-level idea but I don't like the operational overhead of adding an extra component that is deployed separately just to do this one thing. I'd also want to avoid some external component being dependent on the particulars of the system table.
Could we achieve the same with just code in crdb? Perhaps we could use the jobs infrastructure to have one forever-running job that is executed by a single node and gets picked up by another node on failures? (similar to what we do for auto table statistics).
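Not the real jobs API — just a sketch of the shape, with a hypothetical `isLeaseHolder` standing in for whatever mechanism guarantees a single node runs the loop at a time:

```go
package main

import "time"

// Hypothetical stand-ins for lease/leader checking and the actual work.
func isLeaseHolder() bool       { return true }
func exportConsumptionMetrics() {}

// runConsumptionExporter sketches a forever-running task executed by one
// node at a time; on failure another node's jobs machinery would adopt
// it (similar to auto table statistics).
func runConsumptionExporter(stop <-chan struct{}) {
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			if !isLeaseHolder() {
				continue // another node owns the job right now
			}
			exportConsumptionMetrics()
		}
	}
}
```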
Yeah, we could do something like that. I thought about it some more and AndyK indicates that he doesn't expect to make host clusters larger than ~50 nodes. Given that, let's just move forward with this until it's a problem. I think one new histogram in crdb will create more time series in prometheus than this.
Reviewed 1 of 34 files at r1.
pkg/ccl/multitenantccl/tenantcostserver/metrics.go, line 42 at r4 (raw file):
Previously, ajwerner wrote…
What's your vision here? Timestamps? Have we considered a radically different approach for these metrics: what if we built a separate service that serves a prometheus endpoint and polls the table or creates a changefeed? That way we'd dramatically reduce the cardinality of the data set from every node in the cluster * tenants to just a constant * tenants.
Still curious to understand the vision here. I don't think it matters all that much though. I think just a periodic loop would be fine.
pkg/ccl/multitenantccl/tenantcostserver/metrics.go, line 42 at r4 (raw file):
Previously, ajwerner wrote…
Still curious to understand the vision here. I don't think it matters all that much though. I think just a periodic loop would be fine.
Yeah, timestamps are what I had in mind. I'm still thinking about it. In practice, I'd think it wouldn't be the end of the world if we don't GC at all, or only GC deleted tenants.
One issue is if we want the system to always export the metrics for all tenants, even currently inactive ones. I believe that is not currently a requirement (because Prometheus will "remember" the last seen value), but it feels shoddy. A dedicated component like you suggested would fix that. But yeah I want to merge this basic thing for now so CC can do some dev on their side.
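To make the timestamp idea concrete, a minimal sketch (field and method names invented; not the real metrics.go) of GC keyed on last activity:

```go
package tenantcostserver

import (
	"time"

	"github.com/cockroachdb/cockroach/pkg/roachpb"
	"github.com/cockroachdb/cockroach/pkg/util/syncutil"
)

// tenantMetrics is a stand-in for the real per-tenant metrics struct;
// lastUpdate is the hypothetical timestamp discussed above.
type tenantMetrics struct {
	lastUpdate time.Time
}

type metrics struct {
	mu            syncutil.Mutex
	tenantMetrics map[roachpb.TenantID]tenantMetrics
}

// gcInactiveTenants drops entries for tenants that have not sent a
// TokenBucketRequest within ttl. Sketch only -- as discussed, dropping
// a tenant's series mid-billing-period has consequences for anything
// that assumes missing data is zero.
func (m *metrics) gcInactiveTenants(now time.Time, ttl time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for id, tm := range m.tenantMetrics {
		if now.Sub(tm.lastUpdate) > ttl {
			delete(m.tenantMetrics, id)
		}
	}
}
```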
FWIW I can't think of a reason this would be bad on the CC side. Operationally we care about these values to limit the tenant's usage, but if the tenant isn't active then there will be no usage to limit. We'll feed the data into billing, but it seems pretty safe to assume that missing data is all 0's.
My concern here is about a tenant that consumed some stuff at the beginning of the month and then went inactive for a week. If the metric for this tenant is not in the crdb system, will Prometheus still remember the last consumption value that it saw a week ago? Because assuming 0 would be incorrect.
Cool, thanks. I buy 1, and to some extent 2 (there is a certain fragility if we only relied on it). Regarding 3, the tenant's usage before the inactivity period could be arbitrarily high; and they would still eventually be billed for it, just not in the right billing period.
Oof, good point. I didn't think of that.
Force-pushed from 5a51d2f to ea52aff.
Force-pushed from ea52aff to ac7ff37.
bors r+
Build failed:
bors r+
Build succeeded:
This PR is for the top two commits. The rest is #67768.
util: add AggGaugeFloat64
This commit adds a float aggregate gauge. We also add support for
incrementing a GaugeFloat64 (required for the aggregate gauge).
Release note: None
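Not the crdb implementation, but incrementing a float64 gauge atomically typically means a compare-and-swap loop over the value's raw bits, since there is no atomic add for float64. A sketch:

```go
package metric

import (
	"math"
	"sync/atomic"
)

// GaugeFloat64 stores its value as raw IEEE-754 bits so it can be
// updated atomically. Sketch only; the real crdb type differs.
type GaugeFloat64 struct {
	bits uint64
}

// Inc adds delta to the current value using a CAS loop.
func (g *GaugeFloat64) Inc(delta float64) {
	for {
		oldBits := atomic.LoadUint64(&g.bits)
		newBits := math.Float64bits(math.Float64frombits(oldBits) + delta)
		if atomic.CompareAndSwapUint64(&g.bits, oldBits, newBits) {
			return
		}
	}
}

// Value returns the current gauge value.
func (g *GaugeFloat64) Value() float64 {
	return math.Float64frombits(atomic.LoadUint64(&g.bits))
}
```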
multitenant: expose consumption metrics
This commit adds per-tenant consumption metrics. Each node presents
the latest consumption value that it knows about, for each tenant. The
higher-level logic will take the Max value across all nodes.
The metrics are:
- tenant.consumption.request_units
- tenant.consumption.read_requests
- tenant.consumption.read_bytes
- tenant.consumption.write_requests
- tenant.consumption.write_bytes
- tenant.consumption.sql_pods_cpu_seconds
Note that these are aggregate metrics, meaning that we export a
separate value for each tenant, and also a sum across all tenants. The
sums are not meaningful in this case and should not be used.
Release note: None
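Since every node exports only its own last-known value, the consumer has to reduce per-node samples to one value per tenant; taking the max is safe because consumption counters only grow. In Prometheus terms this is roughly a `max by (tenant_id)` query; a hypothetical helper showing the same reduction in Go:

```go
package main

// maxAcrossNodes collapses per-node readings (node -> tenant -> value)
// into one value per tenant by taking the maximum, mirroring the
// "Max across all nodes" aggregation described above. For illustration
// only; in practice this would be a Prometheus query.
func maxAcrossNodes(perNode map[string]map[string]float64) map[string]float64 {
	out := make(map[string]float64)
	for _, tenants := range perNode {
		for tenant, v := range tenants {
			if cur, ok := out[tenant]; !ok || v > cur {
				out[tenant] = v
			}
		}
	}
	return out
}
```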