From f1bb82ae8fc4ae98cfb9c6d4180aef9dc034798a Mon Sep 17 00:00:00 2001 From: Chris Sinjakli Date: Tue, 1 Feb 2022 00:49:39 +0000 Subject: [PATCH 1/3] Use framework-specific route info in collector when available While `PATH_INFO` is framework agnostic, and works for any Rack app, some Ruby web frameworks pass a more useful piece of information into the request env - the route that the request matched. This means that rather than using our generic `:id` and `:uuid` replacements in the `path` label for any path segments that look like dynamic IDs, we can put the actual route that matched in there, with correctly named parameters. For example, if a Sinatra app defined a route like: get "/foo/:bar" do ... end instead of containing `/foo/:id`, the `path` label would contain `/foo/:bar`. Sadly, Rails is a notable exception, and (as far as I can tell at the time of writing) doesn't provide this info in the request env. Signed-off-by: Chris Sinjakli --- lib/prometheus/middleware/collector.rb | 50 +++++++++++++++++++- spec/prometheus/middleware/collector_spec.rb | 48 +++++++++++++++++++ 2 files changed, 97 insertions(+), 1 deletion(-) diff --git a/lib/prometheus/middleware/collector.rb b/lib/prometheus/middleware/collector.rb index 48e3ddd1..c3ff58b7 100644 --- a/lib/prometheus/middleware/collector.rb +++ b/lib/prometheus/middleware/collector.rb @@ -67,7 +67,7 @@ def trace(env) end def record(env, code, duration) - path = [env["SCRIPT_NAME"], env['PATH_INFO']].join + path = generate_path(env) counter_labels = { code: code, @@ -87,6 +87,54 @@ def record(env, code, duration) nil end + # While `PATH_INFO` is framework agnostic, and works for any Rack app, some Ruby web + # frameworks pass a more useful piece of information into the request env - the + # route that the request matched. + # + # This means that rather than using our generic `:id` and `:uuid` replacements in + # the `path` label for any path segments that look like dynamic IDs, we can put the + # actual route that matched in there, with correctly named parameters. For example, + # if a Sinatra app defined a route like: + # + # get "/foo/:bar" do + # ... + # end + # + # instead of containing `/foo/:id`, the `path` label would contain `/foo/:bar`. + # + # Sadly, Rails is a notable exception, and (as far as I can tell at the time of + # writing) doesn't provide this info in the request env. + def generate_path(env) + if env['sinatra.route'] + route = env['sinatra.route'].partition(' ').last + elsif env['grape.routing_args'] + # We are deep in the weeds of an object that Grape passes into the request env, + # but don't document any explicit guarantees about. Let's have a fallback in + # case they change it down the line. + # + # This code would be neater with the safe navigation operator (`&.`) here rather + # than the much more verbose `respond_to?` calls, but unlike Rails' `try` + # method, it still raises an error if the object is non-nil, but doesn't respond + # to the method being called on it. + route = nil + + route_info = env.dig('grape.routing_args', :route_info) + if route_info.respond_to?(:pattern) + pattern = route_info.pattern + if pattern.respond_to?(:origin) + route = pattern.origin + end + end + + # Fall back to PATH_INFO if Grape change the structure of `grape.routing_args` + route ||= env['PATH_INFO'] + else + route = env['PATH_INFO'] + end + + [env['SCRIPT_NAME'], route].join + end + def strip_ids_from_path(path) path .gsub(%r{/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(/|$)}, '/:uuid\\1') diff --git a/spec/prometheus/middleware/collector_spec.rb b/spec/prometheus/middleware/collector_spec.rb index 7809bcf9..49664a3a 100644 --- a/spec/prometheus/middleware/collector_spec.rb +++ b/spec/prometheus/middleware/collector_spec.rb @@ -95,6 +95,54 @@ expect(registry.get(metric).get(labels: labels)).to include("0.1" => 0, "0.5" => 1) end + it 'prefers sinatra.route to PATH_INFO' do + metric = :http_server_requests_total + + env('sinatra.route', 'GET /foo/:named_param') + get '/foo/7' + env('sinatra.route', nil) + expect(registry.get(metric).values.keys.last[:path]).to eql("/foo/:named_param") + end + + it 'prefers grape.routing_args to PATH_INFO' do + metric = :http_server_requests_total + + # request.env["grape.routing_args"][:route_info].pattern.origin + # + # Yes, this is the object you have to traverse to get the path. + # + # No, I'm not happy about it either. + grape_routing_args = { + route_info: double(:route_info, + pattern: double(:pattern, + origin: '/foo/:named_param' + ) + ) + } + + env('grape.routing_args', grape_routing_args) + get '/foo/7' + env('grape.routing_args', nil) + expect(registry.get(metric).values.keys.last[:path]).to eql("/foo/:named_param") + end + + it "falls back to PATH_INFO if the structure of grape.routing_args changes" do + metric = :http_server_requests_total + + grape_routing_args = { + route_info: double(:route_info, + pattern: double(:pattern, + origin_but_different: '/foo/:named_param' + ) + ) + } + + env('grape.routing_args', grape_routing_args) + get '/foo/7' + env('grape.routing_args', nil) + expect(registry.get(metric).values.keys.last[:path]).to eql("/foo/:id") + end + context 'when the app raises an exception' do let(:original_app) do lambda do |env| From 1b159e5faeaa040786642ec1d4bd82bfa0dc661b Mon Sep 17 00:00:00 2001 From: Chris Sinjakli Date: Tue, 1 Feb 2022 02:20:19 +0000 Subject: [PATCH 2/3] Handle consecutive path segments containing IDs in collector Previously, we would fail to strip IDs from paths if they appeared in consecutive path segments. This is because our regex that looked for IDs and UUIDs would consume the `/` character that followed them, causing the next one not to match. This commit uses a lookahead to match against the `/` without consuming it. Signed-off-by: Chris Sinjakli --- lib/prometheus/middleware/collector.rb | 4 +-- spec/prometheus/middleware/collector_spec.rb | 28 ++++++++++++++++++++ 2 files changed, 30 insertions(+), 2 deletions(-) diff --git a/lib/prometheus/middleware/collector.rb b/lib/prometheus/middleware/collector.rb index c3ff58b7..26893176 100644 --- a/lib/prometheus/middleware/collector.rb +++ b/lib/prometheus/middleware/collector.rb @@ -137,8 +137,8 @@ def generate_path(env) def strip_ids_from_path(path) path - .gsub(%r{/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(/|$)}, '/:uuid\\1') - .gsub(%r{/\d+(/|$)}, '/:id\\1') + .gsub(%r{/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(?=/|$)}, '/:uuid\\1') + .gsub(%r{/\d+(?=/|$)}, '/:id\\1') end end end diff --git a/spec/prometheus/middleware/collector_spec.rb b/spec/prometheus/middleware/collector_spec.rb index 49664a3a..06034289 100644 --- a/spec/prometheus/middleware/collector_spec.rb +++ b/spec/prometheus/middleware/collector_spec.rb @@ -95,6 +95,34 @@ expect(registry.get(metric).get(labels: labels)).to include("0.1" => 0, "0.5" => 1) end + it 'handles consecutive path segments containing IDs' do + expect(Benchmark).to receive(:realtime).and_yield.and_return(0.3) + + get '/foo/42/24' + + metric = :http_server_requests_total + labels = { method: 'get', path: '/foo/:id/:id', code: '200' } + expect(registry.get(metric).get(labels: labels)).to eql(1.0) + + metric = :http_server_request_duration_seconds + labels = { method: 'get', path: '/foo/:id/:id' } + expect(registry.get(metric).get(labels: labels)).to include("0.1" => 0, "0.5" => 1) + end + + it 'handles consecutive path segments containing UUIDs' do + expect(Benchmark).to receive(:realtime).and_yield.and_return(0.3) + + get '/foo/5180349d-a491-4d73-af30-4194a46bdff3/5180349d-a491-4d73-af30-4194a46bdff2' + + metric = :http_server_requests_total + labels = { method: 'get', path: '/foo/:uuid/:uuid', code: '200' } + expect(registry.get(metric).get(labels: labels)).to eql(1.0) + + metric = :http_server_request_duration_seconds + labels = { method: 'get', path: '/foo/:uuid/:uuid' } + expect(registry.get(metric).get(labels: labels)).to include("0.1" => 0, "0.5" => 1) + end + it 'prefers sinatra.route to PATH_INFO' do metric = :http_server_requests_total From 9d8ba56b4f3cb8eaec1f49fe26e497686311f796 Mon Sep 17 00:00:00 2001 From: Chris Sinjakli Date: Tue, 1 Feb 2022 21:16:37 +0000 Subject: [PATCH 3/3] Update UPGRADING.md Add note about improvements to path label generation in `Prometheus::Middleware::Collector`. Signed-off-by: Chris Sinjakli --- UPGRADING.md | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/UPGRADING.md b/UPGRADING.md index 1167de2e..75b1d0cc 100644 --- a/UPGRADING.md +++ b/UPGRADING.md @@ -140,7 +140,7 @@ recommends exporting a default value for any time series you know will exist in For series with no labels, other Prometheus clients (including Go, Java, and Python) do this automatically, so we have matched that behaviour in the 3.x.x series. -## Path generation fix in Collector middleware +## Added `SCRIPT_NAME` to path labels in Collector middleware Previously, we did not include `Rack::Request`'s `SCRIPT_NAME` when building paths in `Prometheus::Middleware::Collector`. We have now added this, which means that any @@ -151,6 +151,22 @@ This will most typically be present when mounting several Rack applications in t server process, such as when using [Rails Engines](https://guides.rubyonrails.org/engines.html). +## Improved stripping of IDs/UUIDs from paths in Collector middleware + +Where available (currently for applications written in the Sinatra and Grape frameworks), +we now use framework-specific equivalents to `PATH_INFO` in +`Prometheus::Middleware::Collector`, which means that rather than having path segments +replaced with the generic `:id` and `:uuid` placeholders, you'll see the route as you +defined it in your framework. + +For frameworks where that information isn't available to us (most notably Rails), we still +fall back to using `PATH_INFO`, though we have also improved how we strip IDs/UUIDs from +it. Previously, we would only strip them from alternating path segments due to the way we +were matching them. We have improved that matching so it works even when there are +IDs/UUIDs in consecutive path segments. + +You may notice the path label change for some of your endpoints. + ## Improved validation of label names Earlier versions of the Ruby Prometheus client performed limited validation of label names