Sort events and properties by volume first, A-Z second #7426
lgtm! Fixed merge conflicts and moved to query volume as discussed on Slack. The only thing I'm not 100% sure about is whether creating these indices will stall instances with a ton of events/properties (e.g. this customer).
posthog/models/event_definition.py
Outdated
GinIndex(name="index_event_definition_name", fields=["name"], opclasses=["gin_trgm_ops"]),
] # To speed up DB-based fuzzy searching
# Improve ordering performance
models.Index(fields=["team_id", "-volume_30_day"]),
I'm curious how indexing works with the `-`? Do you know any good literature on this?
It should just add `desc` in the index as well, and should actually make no difference.
At the Postgres level, does that just mean better performance because we would order desc too? What's the impact of not having the `desc` in there?
Sorry, forgot a link: https://www.postgresql.org/docs/14/indexes-ordering.html
I copied this from a different model in the system and left the `-` in place since we mostly order DESC here. If I read the doc correctly, there shouldn't be any real difference between the two. The `-` would only be truly useful if we sorted by multiple columns. We don't here, and Postgres can read indexes backwards, so it's probably better to skip the `-` here for clarity.
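To make the "indexes can be read backwards" point concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in because it runs anywhere; SQLite is not Postgres, so treat this only as an illustration of the general B-tree behaviour the linked doc describes. The table layout and the index name `idx_team_volume` are assumptions made for the example, not the actual PostHog schema or migration.

```python
import sqlite3


def plan_for(conn, sql, params=()):
    """Return the human-readable steps of EXPLAIN QUERY PLAN for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The description is the last column in every SQLite version.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; column names borrowed from the diff above.
conn.execute(
    "CREATE TABLE event_definition ("
    "id INTEGER PRIMARY KEY, team_id INTEGER, name TEXT, volume_30_day INTEGER)"
)
# Plain ascending index: no DESC on volume_30_day.
conn.execute(
    "CREATE INDEX idx_team_volume ON event_definition (team_id, volume_30_day)"
)
conn.executemany(
    "INSERT INTO event_definition (team_id, name, volume_30_day) VALUES (?, ?, ?)",
    [(1, f"event_{i}", i * 10) for i in range(100)],
)

query = (
    "SELECT name, volume_30_day FROM event_definition "
    "WHERE team_id = ? ORDER BY volume_30_day DESC"
)
plan = plan_for(conn, query, (1,))
# A separate sort step would show up as a "TEMP B-TREE" entry in the plan.
needs_sort = any("TEMP B-TREE" in step for step in plan)
top = conn.execute(query + " LIMIT 3", (1,)).fetchall()

print(plan)
print("extra sort needed:", needs_sort)
print(top)
```

On Postgres the comparable check would be running `EXPLAIN` on the real query against a copy of the table, with and without the `-` in the index definition.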
At least I should run a few tests on it to make sure it works well, but now it's gone, so :).
You are right there. I'll remove the migration. These indices (indexes?) are just for performance. Users with a sensible number of event names probably won't notice much difference, and the user you linked will certainly experience problems. Currently this user can't even use the event stats page because of the sheer number of events, since we still need to load all of them into memory before displaying anything 🤯 Perhaps this should be one of the first uses for the "Special Migrations API"? @yakkomajuri
To add, even without the indexes we can't merge this, as with a few dozen million event names, the taxonomic filter will get really slow. So we're blocked.
Well, we didn't have an index for the event name, yet we already ordered by it. Will performance really be that bad with these changes?
We have an index for the event name:
... but seems like it's indeed not used.
lgtm! let's try this out, won't merge cause we'll all be asleep if something goes wrong, but feel free to merge once you see this
"Get in line" lol. On a more serious note, I was thinking of running a change to Replicated dead letter queue tables, and ofc we have the events migration too. However, I'd be happy to have you rather than me as a "Beta" tester for writing a special migration, if you're willing. Will probably be good for feedback.
Changes
Moves events and properties in the taxonomic filter to order by popularity
Closes #5309 (Sort infinite list events & properties by popularity)
I think this makes the experience nicer, as things you're more likely to use are higher up.
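As a rough illustration of the "volume first, A-Z second" ordering from the title, here is a small sketch in plain Python. The field name `volume_30_day` comes from the diff snippet in the review thread (the thread also mentions switching to query volume, so the real field may differ), and treating a missing volume as 0 is an assumption of this example, not something the PR states.

```python
from typing import List, Optional, TypedDict


class Definition(TypedDict):
    """Hypothetical shape of an event or property definition."""

    name: str
    volume_30_day: Optional[int]


def sort_by_popularity(definitions: List[Definition]) -> List[Definition]:
    """Order by volume descending, then by name A-Z (case-insensitive)."""
    return sorted(
        definitions,
        # Assumption: a missing volume counts as 0 and therefore sorts last.
        key=lambda d: (-(d["volume_30_day"] or 0), d["name"].lower()),
    )


events: List[Definition] = [
    {"name": "signup", "volume_30_day": 120},
    {"name": "$pageview", "volume_30_day": 9000},
    {"name": "checkout", "volume_30_day": 120},
    {"name": "legacy_event", "volume_30_day": None},
]

ordered_names = [d["name"] for d in sort_by_popularity(events)]
print(ordered_names)
```

In a Django queryset, a comparable ordering could be written as `order_by("-volume_30_day", "name")`; whether the PR does exactly that is not visible in this thread.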
Events before & after:
[Two screenshots: events list before and after]
Properties before & after:
[Two screenshots: properties list before and after]
How did you test this code?