-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy path10-sjPlot.Rmd
305 lines (183 loc) · 11.5 KB
/
10-sjPlot.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
# sjPlot Package

```{block type='rmdlink', echo=TRUE}
[Daniel Lüdecke](http://www.danielluedecke.de/) is German researcher that has put together several GREAT packages, including `sjPlot` which we will detail here. Documentation can be found at: http://www.strengejacke.de/sjPlot/index.html
```
```{r, include=FALSE}
knitr::opts_chunk$set(comment = "",
echo = TRUE,
warning = FALSE,
message = FALSE,
fig.align = "center", # center all figures
fig.width = 6, # set default figure width to 4 inches
fig.height = 4) # set default figure height to 3 inches
```
```{r, message=FALSE, error=FALSE}
library(tidyverse) # all things tidy
library(texreg) # Convert Regression Output to LaTeX or HTML Tables
library(lme4) # Linear, generalized linear, & nonlinear mixed modelsts models
library(haven) # read in SPSS dataset
library(sjPlot) # Quick predicitive and diagnostic plots
```
Read the SPSS data in with the `haven` package and prepare it (see previous chapter).
```{r}
data_raw <- haven::read_sav("http://www.mlminr.com/data-sets/Achieve.sav?attredirects=0")
data_achieve_center_scale <- data_raw %>%
dplyr::mutate_at(vars(id, region, corp, school, class), factor) %>%
dplyr::mutate(gender = gender %>%
factor(labels = c("Female", "Male"))) %>%
dplyr::mutate(classize = classize %>%
factor(labels = c("12-17", "18-21",
"22-26", ">26"))) %>%
dplyr::select(id, region, corp, school, class, # Identifiers
gender, age, geread, gevocab, # Pupil-level vars
classize, # Class-Level vars
senroll, ses) %>% # School-level vars
dplyr::mutate(gevocab_c = gevocab - 4.4938) %>%
dplyr::mutate(age_c = age - 107.5290) %>%
dplyr::mutate(senroll_c = senroll - 533.4148) %>%
dplyr::mutate(senroll_ch = senroll_c / 100) %>% # centered AND divided by one hundred
dplyr::mutate(ses_t = ses / 10) # JUST divide by ten
```
Fit the final model (see previous chapter)
```{r}
fit_read_11re_s <- lme4::lmer(geread ~ gevocab_c*age_c + gevocab_c*ses_t + # 2 2-way interactions
(gevocab_c | school),
data = data_achieve_center_scale,
REML = TRUE)
```
```{r, include=FALSE}
texreg::screenreg(list(fit_read_11re_s), single.row = TRUE)
```
```{r, results='asis'}
# Knit to Website: texreg::htmlreg()
# Knit to PDF: texreg::texreg()
# View on Screen: texreg::screenreg()
texreg::htmlreg(list(fit_read_11re_s),
custom.model.names = c("Final"),
caption = "MLM: Final Model",
caption.above = TRUE,
single.row = TRUE)
```
Now we will show some of the things the `sjPlot` package can do!
## Plotting Coefficients
Select `terms` that should be plotted. All other term are removed from the output.
Note that the term names must match the names of the model's coefficients. For factors, this means that the variable name is suffixed with the related factor level, and each category counts as one term. E.g. `rm.terms = "t_name [2,3]"` would remove the terms `t_name2` and `t_name3` (assuming that the variable t_name is categorical and has at least the factor levels 2 and 3).
Another example for the iris-dataset: `terms = "Species"` would not work, instead you would write `terms = "Species [versicolor, virginica]"` to remove these two levels, or `terms = "Speciesversicolor"` if you just want to remove the level versicolor from the plot.
### Fixed Effects
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "est")
```
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "est",
show.values = TRUE) # Logical, whether values should be plotted or not.
```
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "std")
```
Determines in which way estimates are sorted in the plot with the option: `sort.est = `
* If `NULL` (default), no sorting is done and estimates are sorted in the same order as they appear in the model formula.
* If `TRUE`, estimates are sorted in descending order, with highest estimate at the top.
* If `sort.est = "sort.all"`, estimates are re-sorted for each coefficient (only applies if `type = "re"` and `grid = FALSE`), i.e. the estimates of the random effects for each predictor are sorted and plotted to an own plot.
* If `type = "re"`, specify a predictor's / coefficient's name to sort estimates according to this random effect.
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "std",
sort.est = TRUE)
```
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "std",
sort.est = TRUE,
show.values = TRUE) # Logical, whether values should be plotted or not.
```
Plots standardized beta values, however, standardization follows @gelman2008 suggestion, rescaling the estimates by dividing them by two standard deviations instead of just one. Resulting coefficients are then directly comparable for untransformed **binary predictors**.
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "std2")
```
### Random Effects
```{r, fig.width=3, fig.height=8}
sjPlot::plot_model(fit_read_11re_s,
type = "re")
```
```{r, fig.width=3, fig.height=8}
sjPlot::plot_model(fit_read_11re_s,
type = "re",
grid = FALSE,
sort.est = TRUE)
```
## Plotting Marginal Effects
Here `terms` indicates for which terms marginal effects should be displayed. At least one term is required to calculate effects, maximum length is three terms, where the second and third term indicate the groups, i.e. predictions of first term are grouped by the levels of the second (and third) term.
`terms` may also indicate higher order terms (e.g. interaction terms).
Indicating levels in square brackets allows for selecting only specific groups. Term name and levels in brackets must be separated by a whitespace character, e.g. `terms = c("age", "education [1,3]")`.
It is also possible to specify a range of numeric values for the predictions with a colon, for instance `terms = c("education [1,3]", "age [30:50]")`.
Furthermore, it is possible to specify a function name. Values for predictions will then be transformed, e.g. `terms = "income [exp]"`. This is useful when model predictors were transformed for fitting the model and should be **back-transformed** to the original scale for predictions.
Finally, using pretty for numeric variables (e.g. `terms = "age [pretty]"`) calculates a pretty range of values for the term, roughly of proportional length to the term's value range. For more details, see the documentation for the `ggpredict` package.
### Predicted Values
Based on (i.e. is a wrapper for): `ggeffects::ggpredict()`
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "pred",
terms = c("gevocab_c", "ses_t", "age_c"))
```
The `pred.type = ` option only applies for Marginal Effects plots with **mixed effects models**. Indicates whether predicted values should be **conditioned on random effects** (`pred.type = "re"`) or fixed effects only (`pred.type = "fe"`, the default). For details, see documentation of the type-argument in `ggpredict()` function.
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "pred",
pred.type = "re",
terms = c("gevocab_c", "ses_t", "age_c"))
```
### Effect Plots
Based on (i.e. is a wrapper for): `ggeffects::ggeffect()`
Similar to `type = "pred"`, however, discrete predictors are held constant at their proportions (not reference level). See the `ggeffect` package documentation for details.
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "eff",
terms = c("gevocab_c", "ses_t", "age_c"))
```
### Interaction Plots
A shortcut for marginal effects plots, where *interaction terms are automatically detected* and used as terms-argument.
Furthermore, if the moderator variable (the second - and third - term in an interaction) is continuous, `type = "int"` automatically chooses useful values based on the mdrt.values-argument, which are passed to terms. Then, `ggpredict` is called.
`type = "int"` plots the interaction term that appears:
* **first** in the formula along the x-axis, while the
* **second** (and possibly third) variable in an interaction is used as grouping factor(s) (moderating variable).
Use `type = "pred"` or `type = "eff"` and specify a certain order in the terms-argument to indicate which variable(s) should be used as moderator.
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "int")
```
The `mdrt.values = ` option indicates which values of the **moderator variable** should be used when plotting interaction terms (i.e. `type = "int"`).
* `minmax` (default) minimum and maximum values (lower and upper bounds) of the moderator are used to plot the interaction between independent variable and moderator(s).
* `meansd` uses the mean value of the moderator as well as one standard deviation below and above mean value to plot the effect of the moderator on the independent variable (following the convention suggested by Cohen and Cohen and popularized by Aiken and West (1991), i.e. using the mean, the value one standard deviation above, and the value one standard deviation below the mean as values of the moderator, see Grace-Martin K: 3 Tips to Make Interpreting Moderation Effects Easier).
* `zeromax` is similar to the `minmax` option, however, $0$ is always used as minimum value for the moderator. This may be useful for predictors that don't have an empirical zero-value, but absence of moderation should be simulated by using $0$ as minimum.
* `quart` calculates and uses the quartiles (lower, median and upper) of the moderator value.
* `all` uses all values of the moderator variable.
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "int",
mdrt.values = "meansd")
```
## Model Diagnostics
Note: For mixed models, the diagnostic plots like linear relationship or check for Homoscedasticity, do not take the uncertainty of random effects into account, but is only based on the fixed effects part of the model.
### Slope of Coefficentes
Slope of coefficients for each single predictor, against the response (linear relationship between each model term and response).
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "slope")
```
### Residuals
Slope of coefficients for each single predictor, against the residuals (linear relationship between each model term and residuals).
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "resid")
```
### Diagnostics
Check model assumptions.
```{r}
sjPlot::plot_model(fit_read_11re_s,
type = "diag")
```