sql/stats: add simple linear regression over float64s and quantiles
Statistics forecasting will initially use simple linear regression over
time to predict all table statistics (only keeping predictions that fit
the linear model well). Most table statistics are scalar float64s, for
which we can use a textbook ordinary least squares (OLS) method with x
as time and y as the table statistic.
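
For illustration, the textbook OLS fit described above can be sketched as follows. This is a minimal sketch, not the code added by this commit: the function name `simpleLinearRegression`, its signature, and the use of R² as the goodness-of-fit measure are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// simpleLinearRegression is a hypothetical helper (not the function from this
// commit). It fits y = alpha + beta*x by ordinary least squares and returns
// the coefficient of determination R² as a measure of fit.
func simpleLinearRegression(x, y []float64) (alpha, beta, r2 float64, err error) {
	n := len(x)
	if n < 2 || n != len(y) {
		return 0, 0, 0, errors.New("need at least two (x, y) pairs of equal length")
	}
	// Means of x and y.
	var xMean, yMean float64
	for i := range x {
		xMean += x[i]
		yMean += y[i]
	}
	xMean /= float64(n)
	yMean /= float64(n)
	// Sums of squares and cross products about the means.
	var sxx, sxy, syy float64
	for i := range x {
		dx, dy := x[i]-xMean, y[i]-yMean
		sxx += dx * dx
		sxy += dx * dy
		syy += dy * dy
	}
	if sxx == 0 {
		return 0, 0, 0, errors.New("all x values are identical")
	}
	beta = sxy / sxx
	alpha = yMean - beta*xMean
	if syy == 0 {
		// y is constant, so the fitted horizontal line matches it exactly.
		return alpha, beta, 1, nil
	}
	r2 = (sxy * sxy) / (sxx * syy)
	return alpha, beta, r2, nil
}

func main() {
	// x is time (arbitrary units), y is a scalar table statistic such as row count.
	x := []float64{1, 2, 3, 4}
	y := []float64{3, 5, 7, 9}
	alpha, beta, r2, err := simpleLinearRegression(x, y)
	if err != nil {
		panic(err)
	}
	fmt.Printf("alpha=%v beta=%v r2=%v forecast(5)=%v\n", alpha, beta, r2, alpha+beta*5)
}
```

A caller could discard the forecast when R² falls below some threshold, which is one way to keep only predictions that fit the linear model well; the threshold itself is not specified here.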

For histograms, we use a variant of OLS based on the 2010 paper
"Ordinary Least Squares for Histogram Data Based on Wasserstein
Distance" by Verde and Irpino. The paper describes an OLS method for the
case where both x and y are histograms. In our case x is a scalar (time),
so we adjust the math slightly.
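
To give a feel for the scalar-x case, here is a rough sketch under simplifying assumptions, not the method implemented in this commit. It relies on the fact that the squared 2-Wasserstein distance between two one-dimensional distributions is the integral of the squared difference of their quantile functions. If each observed histogram is represented by its quantile function sampled on a shared grid of probabilities (an assumption of this sketch), then minimizing the summed squared distance with scalar x decouples into an independent textbook OLS fit at each grid point. The names `quantileRegression` and `predict` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// quantileRegression is a hypothetical helper. y[i][j] is the value of the
// i-th observed quantile function at the j-th probability of a shared grid.
// It fits yhat(x)[j] = alpha[j] + beta[j]*x independently for each j.
func quantileRegression(x []float64, y [][]float64) (alpha, beta []float64, err error) {
	n := len(x)
	if n < 2 || n != len(y) {
		return nil, nil, errors.New("need at least two observations of equal count")
	}
	m := len(y[0])
	for _, q := range y {
		if len(q) != m {
			return nil, nil, errors.New("all quantile samples must share one grid")
		}
	}
	// x statistics are shared by every grid point.
	var xMean float64
	for _, xi := range x {
		xMean += xi
	}
	xMean /= float64(n)
	var sxx float64
	for _, xi := range x {
		d := xi - xMean
		sxx += d * d
	}
	if sxx == 0 {
		return nil, nil, errors.New("all x values are identical")
	}
	alpha = make([]float64, m)
	beta = make([]float64, m)
	for j := 0; j < m; j++ {
		var yMean, sxy float64
		for i := range x {
			yMean += y[i][j]
		}
		yMean /= float64(n)
		for i := range x {
			sxy += (x[i] - xMean) * (y[i][j] - yMean)
		}
		beta[j] = sxy / sxx
		alpha[j] = yMean - beta[j]*xMean
	}
	return alpha, beta, nil
}

// predict evaluates the fitted model at x, returning sampled quantile values.
func predict(alpha, beta []float64, x float64) []float64 {
	out := make([]float64, len(alpha))
	for j := range alpha {
		out[j] = alpha[j] + beta[j]*x
	}
	return out
}

func main() {
	// Three observations over time; each row holds quantile values at
	// probabilities 0, 0.5, and 1 (an illustrative grid).
	x := []float64{1, 2, 3}
	y := [][]float64{{0, 10, 20}, {0, 20, 40}, {0, 30, 60}}
	alpha, beta, err := quantileRegression(x, y)
	if err != nil {
		panic(err)
	}
	fmt.Println(predict(alpha, beta, 4)) // prints "[0 40 80]"
}
```

Note that a prediction produced this way is not guaranteed to be non-decreasing across the grid, so it may not be a valid quantile function; handling that case is outside the scope of this sketch.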

Assists: #79872

Release note: None
michae2 committed Jul 25, 2022
1 parent e63324c commit f1c7c68
Showing 5 changed files with 1,109 additions and 7 deletions.
2 changes: 2 additions & 0 deletions pkg/sql/stats/BUILD.bazel
@@ -13,6 +13,7 @@ go_library(
"new_stat.go",
"quantile.go",
"row_sampling.go",
"simple_linear_regression.go",
"stats_cache.go",
],
embed = [":stats_go_proto"],
@@ -71,6 +72,7 @@ go_test(
"main_test.go",
"quantile_test.go",
"row_sampling_test.go",
"simple_linear_regression_test.go",
"stats_cache_test.go",
],
embed = [":stats"],