Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

default formula argument to NULL in geom_smooth() #3307

Merged
merged 14 commits into from
Sep 30, 2019
Merged
4 changes: 2 additions & 2 deletions R/geom-smooth.r
Original file line number Diff line number Diff line change
Expand Up @@ -78,8 +78,8 @@
geom_smooth <- function(mapping = NULL, data = NULL,
stat = "smooth", position = "identity",
...,
method = "auto",
formula = y ~ x,
method = NULL,
formula = NULL,
se = TRUE,
na.rm = FALSE,
show.legend = NA,
Expand Down
42 changes: 29 additions & 13 deletions R/stat-smooth.r
Original file line number Diff line number Diff line change
@@ -1,19 +1,23 @@
#' @param method Smoothing method (function) to use, accepts either a character vector,
#' e.g. `"auto"`, `"lm"`, `"glm"`, `"gam"`, `"loess"` or a function, e.g.
#' `MASS::rlm` or `mgcv::gam`, `stats::lm`, or `stats::loess`.
#' @param method Smoothing method (function) to use, accepts either
#' `NULL` or a character vector, e.g. `"lm"`, `"glm"`, `"gam"`, `"loess"`
#' or a function, e.g. `MASS::rlm` or `mgcv::gam`, `stats::lm`, or `stats::loess`.
#' `"auto"` is also accepted for backwards compatibility. It is equivalent to
#' `NULL`.
#'
#' For `method = "auto"` the smoothing method is chosen based on the
#' For `method = NULL` the smoothing method is chosen based on the
#' size of the largest group (across all panels). [stats::loess()] is
#' used for less than 1,000 observations; otherwise [mgcv::gam()] is
#' used with `formula = y ~ s(x, bs = "cs")`. Somewhat anecdotally,
#' `loess` gives a better appearance, but is \eqn{O(N^{2})}{O(N^2)} in memory,
#' so does not work for larger datasets.
#'
#' If you have fewer than 1,000 observations but want to use the same `gam()`
#' model that `method = "auto"` would use, then set
#' model that `method = NULL` would use, then set
#' `method = "gam", formula = y ~ s(x, bs = "cs")`.
#' @param formula Formula to use in smoothing function, eg. `y ~ x`,
#' `y ~ poly(x, 2)`, `y ~ log(x)`
#' `y ~ poly(x, 2)`, `y ~ log(x)`. `NULL` by default, in which case
#' `method = NULL` implies `formula = y ~ x` when there are fewer than 1,000
#' observations and `formula = y ~ s(x, bs = "cs")` otherwise.
#' @param se Display confidence interval around smooth? (`TRUE` by default, see
#' `level` to control.)
#' @param fullrange Should the fit span the full range of the plot, or just
Expand All @@ -37,8 +41,8 @@
stat_smooth <- function(mapping = NULL, data = NULL,
geom = "smooth", position = "identity",
...,
method = "auto",
formula = y ~ x,
method = NULL,
formula = NULL,
se = TRUE,
n = 80,
span = 0.75,
Expand Down Expand Up @@ -78,7 +82,8 @@ stat_smooth <- function(mapping = NULL, data = NULL,
StatSmooth <- ggproto("StatSmooth", Stat,

setup_params = function(data, params) {
if (identical(params$method, "auto")) {
msg <- character()
if (is.null(params$method) || identical(params$method, "auto")) {
# Use loess for small datasets, gam with a cubic regression basis for
# larger. Based on size of the _largest_ group to avoid bad memory
# behaviour of loess
Expand All @@ -88,19 +93,30 @@ StatSmooth <- ggproto("StatSmooth", Stat,
params$method <- "loess"
} else {
params$method <- "gam"
params$formula <- y ~ s(x, bs = "cs")
}
message("`geom_smooth()` using method = '", params$method,
"' and formula '", deparse(params$formula), "'")
msg <- c(msg, paste0("method = '", params$method, "'"))
}

if (is.null(params$formula)) {
params$formula <- if (identical(params$method, "gam")) {
y ~ s(x, bs = "cs")
} else {
y ~ x
}
msg <- c(msg, paste0("formula '", deparse(params$formula), "'"))
}
if (identical(params$method, "gam")) {
params$method <- mgcv::gam
}

if (length(msg) > 0) {
message("`geom_smooth()` using ", paste0(msg, collapse = " and "))
}

params
},

compute_group = function(data, scales, method = "auto", formula = y~x,
compute_group = function(data, scales, method = NULL, formula = NULL,
se = TRUE, n = 80, span = 0.75, fullrange = FALSE,
xseq = NULL, level = 0.95, method.args = list(),
na.rm = FALSE) {
Expand Down
2 changes: 1 addition & 1 deletion man/draw_key.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

22 changes: 13 additions & 9 deletions man/geom_smooth.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 9 additions & 0 deletions tests/testthat/test-geom-smooth.R
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,15 @@ test_that("default smoothing methods for small and large data sets work", {
"method = 'gam' and formula 'y ~ s\\(x, bs = \"cs\"\\)"
)
expect_equal(plot_data$y, as.numeric(out))

# backwards compatibility of method = "auto"
p <- ggplot(df, aes(x, y)) + geom_smooth(method = "auto")

expect_message(
plot_data <- layer_data(p),
"method = 'gam' and formula 'y ~ s\\(x, bs = \"cs\"\\)"
)
expect_equal(plot_data$y, as.numeric(out))
})


Expand Down