feat(flags): add quota limiting for feature flags served up with /decide and to local_evaluation #28564
base: master
Conversation
PR Summary
This PR adds quota limiting to the /decide endpoint to restrict feature flag access for teams exceeding usage limits, with controlled rollout capabilities.
- Added the DECIDE_FEATURE_FLAG_QUOTA_CHECK setting in /posthog/settings/web.py to control the quota limiting rollout
- Extracted flag computation into compute_feature_flags() in /posthog/api/decide.py for better maintainability
- Added FEATURE_FLAGS to the QuotaResource enum in /ee/billing/quota_limiting.py with a zero overage buffer
- Returns a structured response with empty flags when quota limited: {"quotaLimited": ["feature_flags"], "featureFlags": {}}
- Added comprehensive test coverage in /posthog/api/test/test_decide.py for quota limiting scenarios
5 files reviewed, 1 comment
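Taken together, the gating described in the summary looks roughly like the sketch below. compute_feature_flags and the setting name come from this PR's summary; list_limited_team_attributes, its signature, and the wrapper function are assumptions about the existing quota-limiting helpers, not a verbatim excerpt from the diff.

```python
# Hedged sketch only: names outside this PR's summary are assumptions.
from django.conf import settings

from ee.billing.quota_limiting import (
    QuotaLimitingCaches,
    QuotaResource,
    list_limited_team_attributes,  # assumed existing helper returning limited tokens
)
from posthog.api.decide import compute_feature_flags  # extracted in this PR; signature assumed


def feature_flags_for_decide(team, data, api_version):
    if settings.DECIDE_FEATURE_FLAG_QUOTA_CHECK:
        limited_tokens = list_limited_team_attributes(
            QuotaResource.FEATURE_FLAGS, QuotaLimitingCaches.QUOTA_LIMITER_CACHE_KEY
        )
        if team.api_token in limited_tokens:
            # Over quota: serve no flags and tell the SDK why.
            return {"quotaLimited": ["feature_flags"], "featureFlags": {}}
    # Under quota (or rollout variable off): evaluate flags as usual.
    return compute_feature_flags(team, data, api_version)
```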
ee/billing/quota_limiting.py (outdated)
```diff
@@ -56,6 +57,7 @@ class QuotaLimitingCaches(Enum):
     QuotaResource.EVENTS: 0,
     QuotaResource.RECORDINGS: 1000,
     QuotaResource.ROWS_SYNCED: 0,
+    QuotaResource.FEATURE_FLAGS: 0,  # TODO: should we have a buffer here?
```
I'm open to calling this feature_flags_evaluated instead; that might be clearer (I like how rows_synced is more obvious than just events).
I think FEATURE_FLAGS_EVALUATED makes sense. That way it's not confused with how many feature flags a team has, which I assume is NOT quota limited?
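For context on the buffer question, here is a toy illustration of what the overage buffer does; the dict and function names are made up for the example and are not the real limiter code, though the values mirror the diff above.

```python
# Toy example only: names are hypothetical, values mirror the diff above.
OVERAGE_BUFFER = {
    "events": 0,
    "recordings": 1000,
    "rows_synced": 0,
    "feature_flags": 0,  # zero buffer: limiting starts as soon as the quota is exceeded
}


def is_over_quota(resource: str, usage: int, limit: int) -> bool:
    # Recordings get 1000 units of slack beyond the paid limit; flags get none.
    return usage > limit + OVERAGE_BUFFER[resource]
```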
```python
    active_flags = {key: value for key, value in feature_flags.items() if value}

    if api_version == 2:
        flags_response["featureFlags"] = active_flags
    elif api_version >= 3:
        # v3 returns all flags, not just active ones, as well as if there was an error computing all flags
        flags_response["featureFlags"] = feature_flags
        flags_response["errorsWhileComputingFlags"] = errors
        flags_response["featureFlagPayloads"] = feature_flag_payloads
    else:
        # default v1
        flags_response["featureFlags"] = list(active_flags.keys())

    # metrics for feature flags
    team_id_label = label_for_team_id_to_track(team.pk)
    FLAG_EVALUATION_COUNTER.labels(
        team_id=team_id_label,
        errors_computing=errors,
        has_hash_key_override=bool(data.get("$anon_distinct_id")),
    ).inc()

    if is_request_sampled_for_logging:
        logger.warn(
            "DECIDE_REQUEST_SUCCEEDED",
            team_id=team.id,
            distinct_id=distinct_id,
            errors_while_computing=errors or False,
            has_hash_key_override=bool(data.get("$anon_distinct_id")),
        )
else:
    flags_response["featureFlags"] = {}
```
this code was all moved to get_feature_flags_response
I wish GitHub supported semantic diffs to make it easier to review changes like this. 😆
yeah, that'd be so nice. Sometimes I find the side-by-side view to help with it.
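For readers skimming the moved block above, the per-version response shapes look roughly like this; the flag names, variants, and payload values are invented for illustration.

```python
# Illustrative only: flag names, variants, and payloads are made up.
v1_response = {"featureFlags": ["beta-feature", "pricing-test"]}  # active flag keys only
v2_response = {"featureFlags": {"beta-feature": True, "pricing-test": "control"}}  # active flags with variants
v3_response = {
    "featureFlags": {"beta-feature": True, "pricing-test": "control", "inactive-flag": False},
    "errorsWhileComputingFlags": False,
    "featureFlagPayloads": {"beta-feature": {"accent": "blue"}},
}
```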
posthog/api/decide.py (outdated)
```python
# NOTE: Whenever you add something to decide response, update this test:
# `test_decide_doesnt_error_out_when_database_is_down`
# which ensures that decide doesn't error out when the database is down

if feature_flags:
    # Billing analytics for decide requests with feature flags
    # Don't count if all requests are for survey targeting flags only.
    if not all(flag.startswith(SURVEY_TARGETING_FLAG_PREFIX) for flag in feature_flags.keys()):
        # Sample no. of decide requests with feature flags
        if settings.DECIDE_BILLING_SAMPLING_RATE and random() < settings.DECIDE_BILLING_SAMPLING_RATE:
            count = int(1 / settings.DECIDE_BILLING_SAMPLING_RATE)
            increment_request_count(team.pk, count)
```
this code was also moved to compute_feature_flags, since it's only relevant if we actually evaluated flags.
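As a worked example of the sampling math in that block (the 0.1 rate below is an assumption, not the production setting):

```python
from random import random

DECIDE_BILLING_SAMPLING_RATE = 0.1  # assumed value for illustration


def billing_increment_for_one_decide_request() -> int:
    # Roughly 1 in 10 requests passes the random() check, and each sampled
    # request counts as int(1 / 0.1) == 10, so the expected total tracks the
    # true request volume without incrementing on every request.
    if DECIDE_BILLING_SAMPLING_RATE and random() < DECIDE_BILLING_SAMPLING_RATE:
        return int(1 / DECIDE_BILLING_SAMPLING_RATE)
    return 0
```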
```diff
@@ -60,6 +60,9 @@
 # if `true` we disable session replay if over quota
 DECIDE_SESSION_REPLAY_QUOTA_CHECK = get_from_env("DECIDE_SESSION_REPLAY_QUOTA_CHECK", False, type_cast=str_to_bool)

+# if `true` we disable feature flags if over quota
+DECIDE_FEATURE_FLAG_QUOTA_CHECK = get_from_env("DECIDE_FEATURE_FLAG_QUOTA_CHECK", False, type_cast=str_to_bool)
```
this could be an array of team IDs to either explicitly limit or explicitly not limit.
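A possible shape for that suggestion, sketched with a hypothetical variable name and helper (neither is part of this PR):

```python
import os


def _team_ids_from_env(var_name: str) -> set[int]:
    # "2,42,1337" -> {2, 42, 1337}; empty or unset -> empty set
    raw = os.environ.get(var_name, "")
    return {int(part) for part in raw.split(",") if part.strip()}


# Hypothetical setting: only these teams would be quota checked (or excluded,
# depending on which direction we pick).
DECIDE_FEATURE_FLAG_QUOTA_CHECK_TEAMS = _team_ids_from_env("DECIDE_FEATURE_FLAG_QUOTA_CHECK_TEAMS")
```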
Just curious, why wouldn't we control this with our own feature flag?
I guess we could! I was just following the pattern of the other environment variable controls. We've generally avoided using feature flags that are evaluated within /decide itself, though (this variable is passed into get_decide).
> We've generally avoided using feature flags that are evaluated within /decide itself

Ha! I guess that could get recursive. I'm not suggesting making this change. Just curious.
ee/billing/quota_limiting.py (outdated)
```diff
@@ -623,8 +646,9 @@ def update_all_orgs_billing_quotas(
     for org_id in orgs_with_changes:
         properties = {
             "quota_limited_events": quota_limited_orgs["events"].get(org_id, None),
-            "quota_limited_recordings": quota_limited_orgs["events"].get(org_id, None),
+            "quota_limited_recordings": quota_limited_orgs["recordings"].get(org_id, None),
```
Is this a bug fix? Has the recordings quota limit been wrong this whole time?
oh wow, I didn't even realize I did this; it was a Cursor tab-complete while I was adding the quota_limited_flags stuff. I'm gonna tag the replay team so they know about this, since I don't want to change it without them signing off, but it's also possible they weren't even aware this was broken. @PostHog/team-replay you folks want to take a look at this?
🙈 that wouldn't surprise me.
I've never touched quota limiting and don't know how I'd check if it was broken 🤔
Scarier still, there were no failing tests before or after the change 😱
turns out it's been that way since being introduced. That PR talks about tracking, so is this just what we're reporting to ourselves, or the actual value we use for limiting?
lol, I wonder if it's because in the one test where we check the value of that entry, we also assert that it's the same value as the quota_limited_events field:

```python
assert org_action_call.kwargs.get("properties") == {
    "quota_limited_events": 1612137599,
    "quota_limited_recordings": 1612137599,
    "quota_limited_rows_synced": None,
}
```
🙈
you nailed it though; that object is just used in report_organization_action, so it's just for internal reporting:

```python
for org_id in orgs_with_changes:
    properties = {
        "quota_limited_events": quota_limited_orgs["events"].get(org_id, None),
        "quota_limited_recordings": quota_limited_orgs["recordings"].get(org_id, None),
        "quota_limited_rows_synced": quota_limited_orgs["rows_synced"].get(org_id, None),
        "quota_limited_feature_flags": quota_limited_orgs["feature_flags"].get(org_id, None),
    }
    report_organization_action(
        orgs_by_id[org_id],
        "organization quota limits changed",
        properties=properties,
        group_properties=properties,
    )
```
So, no actual quota limiting bug, but a quota limit reporting bug!
are you ok to fix the test as well here? I don't understand the impact, but I'm guessing quota limiting on the correct count is better than not, regardless of whether it's just reporting.
oh yeah, I'm on it; there's a bunch of other tests I want to fix too. I need to assert the presence of this new quota limit in way more places lol
awesome... thanks!
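For illustration, the updated assertion might simply grow the new key; the values below are placeholders mirroring the existing test, and using distinct per-resource values in the test setup would also have caught the events/recordings mix-up.

```python
# Placeholder values only; the quota_limited_feature_flags key is from this PR.
assert org_action_call.kwargs.get("properties") == {
    "quota_limited_events": 1612137599,
    "quota_limited_recordings": 1612137599,
    "quota_limited_rows_synced": None,
    "quota_limited_feature_flags": None,
}
```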
Whoops, I meant to add some comments to my review. It looks good to me, but obviously I'm still learning the system. This gives me an idea for a possible future feature: a "rate-limited" feature flag. Basically a feature flag with a counter per group, and when it's hit, the flag returns …
I don't have a lot of context on the decide/flag stuff, but the quota limiting stuff is good 👍. Nice fix on the recordings issue.
```python
    QuotaResource.FEATURE_FLAGS, QuotaLimitingCaches.QUOTA_LIMITER_CACHE_KEY
)
if self.team.api_token in limited_tokens_flags:
    return Response(
```
Should we be responding with an error here so they know they are being limited?
yeah, I was on the fence about this too. I know we can't fail decide because of quota limiting and instead have to return a different permutation of the decide response, but maybe with local eval I can be more fail-y. I haven't really looked at the SDKs to see how they handle it, but we're already pretty happy to fail local eval with a 429 if we hit the customer rate limits, so perhaps it's worth failing for a quota limit, too. Good take. Let me try this and see how it feels.
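A hypothetical sketch of that "more fail-y" option for local_evaluation; the payload shape and the choice of 429 are assumptions about what this could look like, not the merged behavior.

```python
from rest_framework import status
from rest_framework.response import Response


def quota_limited_local_evaluation_response() -> Response:
    # Hypothetical: mirror the existing customer rate-limit behavior (429) so
    # SDKs that already back off on 429s handle quota limiting the same way.
    return Response(
        {
            "type": "quota_limited",
            "code": "feature_flags_quota_exceeded",
            "detail": "Feature flag evaluation is over the billing quota for this project.",
        },
        status=status.HTTP_429_TOO_MANY_REQUESTS,
    )
```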
```python
if feature_flags:
    # Billing analytics for decide requests with feature flags
    # Don't count if all requests are for survey targeting flags only.
    if not all(flag.startswith(SURVEY_TARGETING_FLAG_PREFIX) for flag in feature_flags.keys()):
        # Sample no. of decide requests with feature flags
        if settings.DECIDE_BILLING_SAMPLING_RATE and random() < settings.DECIDE_BILLING_SAMPLING_RATE:
            count = int(1 / settings.DECIDE_BILLING_SAMPLING_RATE)
            increment_request_count(team.pk, count)
```
this, and other logging, is moved into _record_feature_flag_metrics
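Based on the counter and logging code shown earlier in the diff, the helper might look roughly like this; the exact signature is an assumption.

```python
def _record_feature_flag_metrics(team, data, errors, distinct_id, is_request_sampled_for_logging):
    # Prometheus counter for flag evaluations (moved from the inline decide code).
    team_id_label = label_for_team_id_to_track(team.pk)
    FLAG_EVALUATION_COUNTER.labels(
        team_id=team_id_label,
        errors_computing=errors,
        has_hash_key_override=bool(data.get("$anon_distinct_id")),
    ).inc()

    # Sampled structured log for successful decide requests.
    if is_request_sampled_for_logging:
        logger.warn(
            "DECIDE_REQUEST_SUCCEEDED",
            team_id=team.id,
            distinct_id=distinct_id,
            errors_while_computing=errors or False,
            has_hash_key_override=bool(data.get("$anon_distinct_id")),
        )
```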
Problem
Part 1 of 2 of issue #28223: this change adds quota limiting to the /decide and /local_evaluation endpoints so that we stop serving flags to users who are not paying for them. A few major things to note here before I explain more:

Changes
This change uses a lot of the existing infrastructure for quota limiting recordings, so not a lot of code needed to be changed. I just did the following:
- FEATURE_FLAGS_EVALUATED as a new quota-limited resource
- A check in /decide that, if we pass in an environment variable that turns on quota limiting, returns a quota-limited flag response

I also refactored the logical flow in /decide a bit (which mostly consisted of pulling the flag computation logic out into a separate function). This makes it IMO a bit easier to read, and it'll also make it easier for me to redirect the flag computation to a separate service when I'm testing the new service in production.