-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(experiments): Init new experiment query runner #28347
base: master
Are you sure you want to change the base?
Conversation
posthog/hogql_queries/experiments/test/test_experiment_query_runner.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great. Especially love that we're now unifying the data warehouse integration with the standard use case. This makes so much sense in hindsight.
I know this is very much a WIP, so feel free to disregard some of my comments :)
case _: | ||
# Else, we default to count | ||
# We then just emit 1 so we can easily sum it up | ||
metric_value = "1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious how the funnel/binomial comes into play here. Would we also simply count "1" for every recorded result returned from the funnel's last step?
*test_accounts_filter, | ||
] | ||
), | ||
group_by=[ast.Field(chain=["variant"]), ast.Field(chain=["distinct_id"])], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a new problem, but I’m just noting that a single user can have two different variant values. This exposure query groups by [variant, distinct id], so if a user receives two different flag values, they'll appear in both groups. This is mainly an SDK/feature flag implementation issue, but the question is if we want to handle it on our end (e.g. counting each user only once).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think what we want to do here is to exclude users with multiple variant exposure and include the count in the "health checks" we display to the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exclude them entirely? Not simply use the most recent exposure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's my position, as:
- Users with multiple variant exposure screws up results
- it's an error we should warn the user about
- it complicates our queries if we have to handle this
I think we should start with that at least, as that's the most sane default behavior. More advanced handling could be implemented in the future if requested.
expr=ast.Field(chain=[series_node.table_name, series_node.distinct_id_field]), | ||
), | ||
ast.Field(chain=["exposure", "variant"]), | ||
parse_expr(f"{metric_value} as value"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a minor nit/more of a curiosity thing (I haven't written any AST myself), but is there a convention for when to use parse_expr
versus assembling the query programmatically? i.e:
This:
parse_expr(f"{metric_value} as value"),
vs this:
ast.Alias(
alias="value",
expr=parse_expr(metric_value)
)
The latter seems more consistent and less error-prone.
ast.Field(chain=["events", "timestamp"]), | ||
ast.Field(chain=["events", "distinct_id"]), | ||
ast.Field(chain=["exposure", "variant"]), | ||
ast.Field(chain=["events", "event"]), | ||
parse_expr(f"{metric_value} as value"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of manually written strings here and we'll be writing many more similar queries. It would be good to abstract the common ones into string constants to reduce the risk of typos.
next_join=ast.JoinExpr( | ||
table=events_after_exposure_query, | ||
join_type="LEFT JOIN", | ||
alias="eae", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer a bit more descriptive alias here :)
posthog/schema.py
Outdated
@@ -7196,6 +7196,21 @@ class ExperimentTrendsQuery(BaseModel): | |||
response: Optional[ExperimentTrendsQueryResponse] = None | |||
|
|||
|
|||
class ExperimentQuery(BaseModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this generated from the frontend's schema-general.ts
(as it should be)? Let's commit that file too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this generated from the frontend's
schema-general.ts
(as it should be)? Let's commit that file too?
Nope, added it solely as a talking point for now.
Good! 👍
Yeah, I have consciously taken many shortcuts to identify and resolve the complex issues as quickly as possible. So I'll wait with any polishing until the core stuff feels right. It's beneficial to keep the code changes minimal until interfaces stabilizes, so iterating and refactoring is easier. |
2587bf4
to
f89e988
Compare
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
f89e988
to
a2e2e27
Compare
📸 UI snapshots have been updated2 snapshot changes in total. 0 added, 2 modified, 0 deleted:
Triggered by this commit. |
frontend/src/scenes/experiments/ExperimentView/ExperimentView.tsx
Outdated
Show resolved
Hide resolved
📸 UI snapshots have been updated2 snapshot changes in total. 0 added, 2 modified, 0 deleted:
Triggered by this commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Summary
Based on the provided files and context, I'll provide a concise summary of the key changes in this PR:
Initial setup of a new experiment query runner with unified data warehouse integration. Key changes include:
- Added new
ExperimentMetric
type to replaceExperimentQuery
, simplifying the metrics structure - Introduced
ExperimentActionMetricConfig
for handling action-based metrics alongside event metrics - Created new
ExperimentQueryRunner
class in Python for handling experiment analysis through HogQL queries - Added feature flag
EXPERIMENTS_NEW_QUERY_RUNNER
to control rollout - Updated frontend components to support both legacy and new metric types with proper type safety
The changes appear well-structured and maintain backward compatibility while introducing the new query runner infrastructure. The implementation is currently WIP and not yet being called from anywhere.
Potential concerns:
- Missing error handling for database queries in
ExperimentQueryRunner
- Type safety improvements needed in
filterToMetricConfig
utility - Validation needed for statistical assumptions in query runner
34 file(s) reviewed, 33 comment(s)
Edit PR Review Bot Settings | Greptile
{result && | ||
(result.kind === 'ExperimentTrendsQuery' || result.kind === 'ExperimentFunnelsQuery') && ( | ||
<ExploreButton result={result} /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: string literals used for type comparison instead of NodeKind enum that's already imported and used elsewhere in the file
{result && | |
(result.kind === 'ExperimentTrendsQuery' || result.kind === 'ExperimentFunnelsQuery') && ( | |
<ExploreButton result={result} /> | |
{result && | |
(result.kind === NodeKind.ExperimentTrendsQuery || result.kind === NodeKind.ExperimentFunnelsQuery) && ( | |
<ExploreButton result={result} /> |
if (!metricIdx && metricIdx !== 0) { | ||
return <></> | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: returning an empty fragment when metricIdx is undefined could lead to confusing UI state - consider showing a message or error state instead
# If the metric type is continuous, we need to extract the value from the event property | ||
metric_property = self.metric.metric_config.math_property | ||
if is_data_warehouse_query: | ||
metric_value = f"toFloat('{metric_property}')" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: potential SQL injection vulnerability - metric_property value should be escaped/sanitized before interpolation
self.experiment = Experiment.objects.get(id=self.query.experiment_id) | ||
self.feature_flag = self.experiment.feature_flag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: missing error handling for Experiment.objects.get() which could raise DoesNotExist
self.experiment = Experiment.objects.get(id=self.query.experiment_id) | |
self.feature_flag = self.experiment.feature_flag | |
try: | |
self.experiment = Experiment.objects.get(id=self.query.experiment_id) | |
self.feature_flag = self.experiment.feature_flag | |
except Experiment.DoesNotExist: | |
raise ValidationError(f"Experiment with id {self.query.experiment_id} does not exist") |
date_range=self.date_range, | ||
team=self.team, | ||
interval=IntervalType.DAY, | ||
now=datetime.now(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: using datetime.now() without timezone info could cause inconsistencies - should use datetime.now(UTC)
now=datetime.now(), | |
now=datetime.now(UTC), |
elif end_date is not None: | ||
end_date = timezone.make_aware(end_date) # Make naive datetime timezone-aware |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Redundant condition - elif end_date is not None
will always be true since it's the else branch of if end_date is None
elif end_date is not None: | |
end_date = timezone.make_aware(end_date) # Make naive datetime timezone-aware | |
else: | |
end_date = timezone.make_aware(end_date) # Make naive datetime timezone-aware |
if kind == "ExperimentQuery": | ||
from .experiments.experiment_query_runner import ExperimentQueryRunner | ||
|
||
return ExperimentQueryRunner( | ||
query=query, | ||
team=team, | ||
timings=timings, | ||
modifiers=modifiers, | ||
limit_context=limit_context, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: The ExperimentQuery case follows the same pattern as other runners but doesn't use cast() like most other cases do. Consider adding cast for consistency.
📸 UI snapshots have been updated4 snapshot changes in total. 0 added, 4 modified, 0 deleted:
Triggered by this commit. |
📸 UI snapshots have been updated4 snapshot changes in total. 0 added, 4 modified, 0 deleted:
Triggered by this commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍 I did a final round of testing...
With the feature flag disabled:
- An existing experiments displays results as expected.
- I could add and remove metrics from the experiment as expected.
- I could create and delete shared metrics as expected.
With the feature flag enabled:
- I could edit an existing shared metric with a query.
- Creating a new shared metric used the experiment metric form. The event and property both saved as expected.
- Loading an existing experiment displayed results as expected.
- I could add new query metrics and remove existing metrics from an existing experiment.
- Creating a new experiment used the new experiment metric UX as expected.
@@ -1331,13 +1363,22 @@ export const experimentLogic = kea<experimentLogicType>([ | |||
getMetricType: [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to rename this to getExperimentQueryType
in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, noted it down
raise ValueError("Control variant not found in experiment results") | ||
|
||
# Statistical analysis | ||
if self.stats_version == 2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we only support stats_version == 2
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, shouldn't be possible with stats_version == 1 here. Noted it down to clean this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go!
📸 UI snapshots have been updated2 snapshot changes in total. 0 added, 2 modified, 0 deleted:
Triggered by this commit. |
Problem
We are building a new query runner for experiments, see https://github.com/PostHog/product-internal/pull/712
Changes
Initial setup of a new query runner. Currently behind a feature flag. Will only be tested internally for now.
Contains:
How did you test this code?