Change Markup queries to order by pct first #2538

Merged
tunetheweb merged 3 commits into main from markup-queries-order-by-pct
Nov 17, 2021

Conversation

tunetheweb
Member

Makes progress on #2142

Change to order by percentage descending so mobile is not cut off

Also added a total column, and a couple of "(not set)" values to avoid having to manipulate the data too much in Sheets.

All queries have been rerun and the Sheet updated.

@tunetheweb tunetheweb added the analysis Querying the dataset label Nov 17, 2021
@tunetheweb tunetheweb added this to the 2021 Analysis milestone Nov 17, 2021
@@ -30,6 +31,7 @@ GROUP BY
client,
almanac_attribute_info.name
ORDER BY
pct_ratio DESC,
Contributor

@kevinfarrugia kevinfarrugia Nov 17, 2021

Is sorting by pct_ratio preferable to using ROW_NUMBER() and then filtering on it? The latter will give you 100 rows for each partition (_TABLE_SUFFIX in this example)

SELECT
  ..
  -- Within one partition the denominator is constant, so ordering by SUM(freq) ranks rows the same way as ordering by the ratio.
  ROW_NUMBER() OVER (PARTITION BY _TABLE_SUFFIX ORDER BY SUM(almanac_attribute_info.freq) DESC) AS pos
FROM
  ..
QUALIFY pos <= 100

Member Author

Using ROW_NUMBER() OVER... does have the advantage of returning exactly 100 mobile and 100 desktop results. In practice that rarely matters, as we normally only report on the top 5, 10, or 15. As long as the limit is large enough (100 or 200 is usually enough) and usage is high enough on both mobile and desktop, both clients end up well represented in the results.

And just ordering by pct is a heck of a lot simpler 😁
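
To make the trade-off above concrete, here is a minimal sketch that runs the three orderings side by side. It is not the Almanac query: it uses Python's built-in sqlite3 as a stand-in for BigQuery, a hypothetical `attrs` table with made-up frequencies, and a limit of 4 rows (or 2 per client) instead of 100. The "client first" ordering is an assumption about how the queries behaved before this change, inferred from the PR title.

```python
import sqlite3
from collections import Counter

# Hypothetical toy data (client, attribute name, frequency); not real Almanac figures.
ROWS = [
    ("desktop", "class", 90), ("desktop", "href", 60),
    ("desktop", "id", 30), ("desktop", "style", 20),
    ("mobile", "class", 80), ("mobile", "href", 11),
    ("mobile", "id", 5), ("mobile", "style", 4),
]

# Share of each attribute within its own client, as a reusable CTE prefix.
PCTS = """
WITH totals AS (
  SELECT client, name, SUM(freq) AS freq
  FROM attrs
  GROUP BY client, name
),
pcts AS (
  SELECT client, name,
         freq * 1.0 / SUM(freq) OVER (PARTITION BY client) AS pct
  FROM totals
)
"""

QUERIES = {
    # Assumed previous behaviour: client sorts first, so the limit is used up by desktop.
    "client_first": PCTS + "SELECT client, name FROM pcts ORDER BY client, pct DESC LIMIT 4",
    # This PR's approach: highest percentages first, whichever client they belong to.
    "pct_first": PCTS + "SELECT client, name FROM pcts ORDER BY pct DESC, client LIMIT 4",
    # The reviewer's alternative: rank within each client and keep the top N of each.
    "row_number": PCTS + """
        SELECT client, name FROM (
          SELECT client, name, pct,
                 ROW_NUMBER() OVER (PARTITION BY client ORDER BY pct DESC) AS pos
          FROM pcts
        )
        WHERE pos <= 2
        ORDER BY client, pos
    """,
}


def rows_per_client(conn, sql):
    """Count how many result rows each client gets for the given query."""
    return Counter(client for client, _name in conn.execute(sql))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attrs (client TEXT, name TEXT, freq INTEGER)")
conn.executemany("INSERT INTO attrs VALUES (?, ?, ?)", ROWS)

counts = {label: rows_per_client(conn, sql) for label, sql in QUERIES.items()}
for label, per_client in counts.items():
    print(label, dict(per_client))
```

With these toy numbers, `client_first` returns only desktop rows, `pct_first` returns three desktop rows and one mobile row, and `row_number` returns two of each. That matches the discussion: ordering by pct keeps both clients in the output without guaranteeing an even split, while ROW_NUMBER() guarantees the split at the cost of a more involved query.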

@tunetheweb tunetheweb merged commit f8c97f5 into main Nov 17, 2021
@tunetheweb tunetheweb deleted the markup-queries-order-by-pct branch November 17, 2021 18:36