Upgrade worked examples in light of discussion stimulated by @Cerebra…
jennybc committed May 8, 2018
1 parent 557dd28 commit cd78808
Showing 2 changed files with 108 additions and 16 deletions.
36 changes: 35 additions & 1 deletion ex06_runif-via-pmap.R
@@ -143,6 +143,39 @@ set.seed(123)
mutate(data = pmap(., runif)))
#View(df_aug)

+#' What about computing within a data frame, in the presence of the
+#' complications discussed above? Use `list()` in the place of the `.`
+#' placeholder above to select the target variables and, if necessary, map
+#' variable names to argument names. *Thanks @hadley for [sharing this
+#' trick](https://community.rstudio.com/t/dplyr-alternatives-to-rowwise/8071/29).*
+#'
+#' How to address variable names != argument names:
+foofy <- tibble(
+  alpha = 1:3,            ## was: n
+  beta = c(0, 10, 100),   ## was: min
+  gamma = c(1, 100, 1000) ## was: max
+)
+
+set.seed(123)
+foofy %>%
+  mutate(data = pmap(list(n = alpha, min = beta, max = gamma), runif))
+
+#' How to address presence of 'extra variables' with either an inclusion or
+#' exclusion mentality
+df_oops <- tibble(
+  n = 1:3,
+  min = c(0, 10, 100),
+  max = c(1, 100, 1000),
+  oops = c("please", "ignore", "me")
+)
+
+set.seed(123)
+df_oops %>%
+  mutate(data = pmap(list(n, min, max), runif))
+
+df_oops %>%
+  mutate(data = pmap(select(., -oops), runif))
+
#' ## Review
#'
#' What have we done?
@@ -154,4 +187,5 @@ set.seed(123)
 #' * Wrote custom wrappers around `runif()` to deal with:
 #'     - df var names != `.f()` arg names
 #'     - df vars that aren't formal args of `.f()`
-#' * Added generated data as a list-column
+#' * Demonstrated all of the above when working inside a data frame and adding
+#'   generated data as a list-column
88 changes: 73 additions & 15 deletions ex06_runif-via-pmap.md
@@ -1,7 +1,7 @@
Generate data from different distributions via pmap()
================
Jenny Bryan
-2018-04-10
+2018-05-08

## Uniform\[min, max\] via `runif()`

@@ -33,9 +33,9 @@ df
 #> # A tibble: 3 x 3
 #>       n   min   max
 #>   <int> <dbl> <dbl>
-#> 1     1    0.    1.
-#> 2     2   10.  100.
-#> 3     3  100. 1000.
+#> 1     1     0     1
+#> 2     2    10   100
+#> 3     3   100  1000
```

Set seed to make this repeatedly random.
@@ -48,7 +48,7 @@ set.seed(123)
 #> # A tibble: 1 x 3
 #>       n   min   max
 #>   <int> <dbl> <dbl>
-#> 1     1    0.    1.
+#> 1     1     0     1
runif(n = x$n, min = x$min, max = x$max)
#> [1] 0.2875775

@@ -101,9 +101,9 @@ foofy
 #> # A tibble: 3 x 3
 #>   alpha  beta gamma
 #>   <int> <dbl> <dbl>
-#> 1     1    0.    1.
-#> 2     2   10.  100.
-#> 3     3  100. 1000.
+#> 1     1     0     1
+#> 2     2    10   100
+#> 3     3   100  1000
```

A: Rename the variables on-the-fly, on the way in.
@@ -192,9 +192,9 @@ df_oops
 #> # A tibble: 3 x 4
 #>       n   min   max oops
 #>   <int> <dbl> <dbl> <chr>
-#> 1     1    0.    1. please
-#> 2     2   10.  100. ignore
-#> 3     3  100. 1000. me
+#> 1     1     0     1 please
+#> 2     2    10   100 ignore
+#> 3     3   100  1000 me
```

This will not work\!
@@ -261,12 +261,69 @@ set.seed(123)
 #> # A tibble: 3 x 4
 #>       n   min   max data
 #>   <int> <dbl> <dbl> <list>
-#> 1     1    0.    1. <dbl [1]>
-#> 2     2   10.  100. <dbl [2]>
-#> 3     3  100. 1000. <dbl [3]>
+#> 1     1     0     1 <dbl [1]>
+#> 2     2    10   100 <dbl [2]>
+#> 3     3   100  1000 <dbl [3]>
#View(df_aug)
```

+What about computing within a data frame, in the presence of the
+complications discussed above? Use `list()` in the place of the `.`
+placeholder above to select the target variables and, if necessary, map
+variable names to argument names. *Thanks @hadley for [sharing this
+trick](https://community.rstudio.com/t/dplyr-alternatives-to-rowwise/8071/29).*
+
+How to address variable names \!= argument names:
+
+``` r
+foofy <- tibble(
+  alpha = 1:3,            ## was: n
+  beta = c(0, 10, 100),   ## was: min
+  gamma = c(1, 100, 1000) ## was: max
+)
+
+set.seed(123)
+foofy %>%
+  mutate(data = pmap(list(n = alpha, min = beta, max = gamma), runif))
+#> # A tibble: 3 x 4
+#>   alpha  beta gamma data
+#>   <int> <dbl> <dbl> <list>
+#> 1     1     0     1 <dbl [1]>
+#> 2     2    10   100 <dbl [2]>
+#> 3     3   100  1000 <dbl [3]>
+```
+
+How to address presence of ‘extra variables’ with either an inclusion or
+exclusion mentality
+
+``` r
+df_oops <- tibble(
+  n = 1:3,
+  min = c(0, 10, 100),
+  max = c(1, 100, 1000),
+  oops = c("please", "ignore", "me")
+)
+
+set.seed(123)
+df_oops %>%
+  mutate(data = pmap(list(n, min, max), runif))
+#> # A tibble: 3 x 5
+#>       n   min   max oops   data
+#>   <int> <dbl> <dbl> <chr>  <list>
+#> 1     1     0     1 please <dbl [1]>
+#> 2     2    10   100 ignore <dbl [2]>
+#> 3     3   100  1000 me     <dbl [3]>
+
+df_oops %>%
+  mutate(data = pmap(select(., -oops), runif))
+#> # A tibble: 3 x 5
+#>       n   min   max oops   data
+#>   <int> <dbl> <dbl> <chr>  <list>
+#> 1     1     0     1 please <dbl [1]>
+#> 2     2    10   100 ignore <dbl [2]>
+#> 3     3   100  1000 me     <dbl [3]>
+```
+
## Review

What have we done?
@@ -278,4 +335,5 @@ What have we done?
   - Wrote custom wrappers around `runif()` to deal with:
       - df var names \!= `.f()` arg names
       - df vars that aren’t formal args of `.f()`
-  - Added generated data as a list-column
+  - Demonstrated all of the above when working inside a data frame and
+    adding generated data as a list-column
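
The `list()` trick introduced in this commit appears to cover both complications at once: a named `list()` inside `mutate()` picks the target variables and maps them to argument names in one step. The following is a minimal sketch and not part of the commit. `foofy_oops` is a hypothetical tibble that combines the mismatched names of `foofy` with the extra `oops` column of `df_oops`.

``` r
library(tibble)
library(dplyr)
library(purrr)

## Hypothetical data (not in the commit): misnamed variables AND an extra one
foofy_oops <- tibble(
  alpha = 1:3,             ## plays the role of n
  beta = c(0, 10, 100),    ## plays the role of min
  gamma = c(1, 100, 1000), ## plays the role of max
  oops = c("please", "ignore", "me")
)

set.seed(123)
res <- foofy_oops %>%
  ## the named list selects three columns and renames them to runif()'s args;
  ## `oops` is simply never passed along
  mutate(data = pmap(list(n = alpha, min = beta, max = gamma), runif))

## one draw for row 1, two for row 2, three for row 3
lengths(res$data)
```

Under these assumptions no wrapper function and no `select()` call is needed, because only the variables named inside `list()` ever reach `runif()`.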
