Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with difference_smooths #143

Open
mbolyanatz opened this issue Mar 10, 2022 · 7 comments
Open

Problem with difference_smooths #143

mbolyanatz opened this issue Mar 10, 2022 · 7 comments

Comments

@mbolyanatz
Copy link

Hi Gavin,

I tried to use the difference_smooths function to plot a difference in smooths in a GAMM with random smooths, and got an error. Does the function work for such models, or only for models without random smooths? My model syntax is something like this (I'm not able to share the data themselves but this is a reproduction):

model <- bam(y~ fac + 
          s(x, by=fac, bs="tp")+ 
          s(x, participant, bs="fs", m=2)+
          s(word, bs="re"),
          data=data, discrete=FALSE, method="ML")

Thank you so much for any insights!
Marishka

@gavinsimpson
Copy link
Owner

Can you try it without any bs = "re" smooths and let me know if that still fails? It will help narrow down the problem for if you can try that for me.

@gavinsimpson
Copy link
Owner

Also, what are you hoping to get out of this? Differences for

  1. s(x, by=fac, bs="tp"), or
  2. s(x, participant, bs="fs", m=2), or
  3. both sets of smooths

?

I'm not sure it knows internally about the fs smooths, but the differences would be of a different sort for those as the basis includes the group (participant) means while the by factor smooth version does differences excluding the group means.

@mbolyanatz
Copy link
Author

Sorry for the delay!

difference_smooths() works on a model which does not include any bs="re" smooths, only random smooths of the form s(x, participant, bs="fs", m=2). In other words, the model looks like

model <- bam(y~ fac + 
          s(x, by=fac, bs="tp")+ 
          s(x, participant, bs="fs", m=2),
          data=data, discrete=FALSE, method="ML")

For the original model,

model <- bam(y~ fac + 
          s(x, by=fac, bs="tp")+ 
          s(x, participant, bs="fs", m=2)+
          s(word, bs="re"),
          data=data, discrete=FALSE, method="ML")

I was hoping to get the overall effect of x on y for each level of fac first, and then to do comparisons between overall effects of x on y among all possible pairings of levels of fac. What is the best way to achieve this? Is it best to re-fit the original model with fac converted to an ordered factor?

Any clarifications would be much appreciated! Thanks!

@mbolyanatz
Copy link
Author

Just to clarify, here is a set of plots produced by gratia's draw() for one of my models with bs="re" terms.

What I really want is to:

  1. Report the overall smooth effect of PERC on the response variable for each level of gSES;
  2. Report the differences in pairs of overall smooth effects of PERC across levels of gSES.

Rplot

I thought that difference_smooths() would allow me to get the differences between the overall smooth effects of PERC across pairs of gSES levels, similar to what plot_diff() from the itsadug package does.

Thank you!

@gavinsimpson
Copy link
Owner

I think there's a missing word in

difference_smooths() works on a model which does not include any bs="re" smooths, only random smooths of the form s(x, participant, bs="fs", m=2)

which leaves me a little confused as to what doesn't work (models without bs = "re" work? but you have a ranef smooth so want difference_smooths() to work for models with ranef smooths too?)

{itsadug} is doing something different; it is producing conditional differences of estimated values of the response. I haven't implemented this in {gratia} within difference_smooths() because of the problem of what to do with models with two or more factors? As difference_smooths() works on the partial effects of the smooths, it literally just compares pairs of estimated smooth functions. When you start including additional covariates effects into the model, like the group means, you have to be careful with what those imply; typically, the intercept is the mean of Y for the reference levels of the combination of factors in the model. While it is easy to see what should be done for your model — there is but one factor and we can marginalise over the ranef smooths — it becomes trickier to decide what to do in the presence of multiple factors as the user has to specify what level(s) of the other factor(s) to condition the differences on and doing this automatically runs the risk of giving the user something that they don't understand; it's no longer possible to return the differences for the stated pairs of smooths only, as these are now the differences of the stated pair of smooth conditional upon some other level(s) of those other factor term(s) in the model.

{itsadug} does this by getting the user to specify cond and defaults to the modal (or reference?) level for other factors and the closest data value to the median for continuous.

As other people have asked for this, I will look into adding a new function conditional_differences() which will aim to do what you want and which would work similarly to {itsadug}

@mbolyanatz
Copy link
Author

Thank you again for your quick response!

Sorry for any confusion. My original model was:

model_original <- bam(y~ fac + 
          s(x, by=fac, bs="tp")+ 
          s(x, participant, bs="fs", m=2)+
          s(word, bs="re"),
          data=data, discrete=FALSE, method="ML")

When I tried difference_smooths for this model_original, it did not work.

I thought you wanted me to try to run this model but exclude "any random effects of the specific type bs="re"", so in model_simplified I took out the bs="re" smooth but kept the bs="fs" smooth:

model_simplified <- bam(y~ fac + 
          s(x, by=fac, bs="tp")+ 
          s(x, participant, bs="fs", m=2),
          data=data, discrete=FALSE, method="ML")

When I tried difference_smooths for model_simplified, it worked. So my conclusion was that the bs="re" smooth was causing difference_smooths not to work in the original model. Is this conclusion incorrect?

Ideally, I would want difference_smooths to work for the original model, not just the modified model, and in fact, with additional fixed factors:

fac12<-interaction(fac1,fac2)

model_ideal <- bam(y~ fac12 + fac3 + fac4+ 
          s(x, by=fac12, bs="tp")+ 
          s(x, participant, bs="fs", m=2)+
          s(word, bs="re"),
          data=data, discrete=FALSE, method="ML")

I would like to compare those partial "global" effects of x on y across levels of fac12 controlling for fac3 and fac4 and that is why I was hoping difference_smooths was the right function to use, but having the bs="re" smooth in the model prevents me from using the function in the model. Am I wrong in thinking that a function like conditional_differences that you're planning on writing would not help me directly compare levels of fac12 by x, the individual curves? Is it best to re-fit the original model with fac12 converted to an ordered factor so as to get differences in smooths directly?

Regardless, I really appreciate your explanation of how the plot_diff function is different, and the possible pitfalls inherent in what I'm asking for. Thank you!!

@gavinsimpson
Copy link
Owner

Turns out I was massively over-thinking this, at least for the case of including group means (the complexity where there are multiple factors just cancels for any reasonable comparison because the level of the second factor you condition on is the same for both groups you are differencing.)

From 0.7.3.12 with difference_smooths() you can include the group means in the comparison.

In figuring all this out I have a function that does the same thing in a much more complex way which is entirely redundant for the simple case of including group means, but it will be useful for reproducing the behaviour of plot_diff() (which is differencing model fitted values, possibly with some terms excluded from the model, for a continuous covariate and a specified pair of levels of a factor.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants