Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for generating quantities for the population level #296

Closed
gowerc opened this issue Mar 28, 2024 · 7 comments · Fixed by #309
Closed

Add support for generating quantities for the population level #296

gowerc opened this issue Mar 28, 2024 · 7 comments · Fixed by #309
Labels
enhancement New feature or request

Comments

@gowerc
Copy link
Collaborator

gowerc commented Mar 28, 2024

Currently we only support generated quantities for individual subjects who exist within the dataset. Francois has asked that we also have functionality to produce estimates for the population level (e.g. a hypothetical patient with 0 random effects).

@gowerc gowerc added this to the Initial Internal Release milestone Mar 28, 2024
@gowerc
Copy link
Collaborator Author

gowerc commented Mar 28, 2024

Actually @danielinteractive / @mercifr1 - Apologies I'm not entirely understanding what the expectation is here.

Take for example the phi parameter current GSF model. Currently this is formulated as:

patient_phi ~ beta(population_a,  population_b)

In this case what would you expect the "Population Phi" value to be? My naive guess would be to just take the mean of the parent distribution e.g. population_a / (population_a + population_b) but just wanted to double check that this is what you were expecting in this instance.

For reference the same issue also comes up for the other parameters if the user uses the "centered" parameterisation where the individual patient parameters are sampled as:

    lm_gsf_psi_bsld ~ lognormal(lm_gsf_mu_bsld[pt_study_index], lm_gsf_omega_bsld);
    lm_gsf_psi_ks ~ lognormal(lm_gsf_mu_ks[pt_arm_index], lm_gsf_omega_ks);
    lm_gsf_psi_kg ~ lognormal(lm_gsf_mu_kg[pt_arm_index], lm_gsf_omega_kg);

@gowerc
Copy link
Collaborator Author

gowerc commented Mar 28, 2024

@mercifr1 / @danielinteractive - Sorry also what does this mean in the case of the survival distribution where the user is allowed to use arbitrary covariates when fitting the model? How should the design matrix for the hypothetical "population patient" be constructed ?

@gowerc gowerc added the enhancement New feature or request label Mar 28, 2024
@danielinteractive
Copy link
Collaborator

Thanks @gowerc , yeah agree we have to specify this in detail first before proceeding to the implementation. A general question here will be how much and what should be easily available by options/methods and what the user need to do themselves based on samples of all model parameters.

@mercifr1
Copy link
Collaborator

mercifr1 commented Apr 2, 2024

Hi @gowerc
The phi parameter is indeed a special case, as - unlike with the other parameters - it is not parameterized hierarchically in the current jmpost implementation of GSF. However, you could probably parameterize it hierarchically, and it would solve this issue.
For instance, you could consider:

parameters{
    real mean_mu_phi;
    ...
    real<lower=0, upper=3> sd_mu_phi;
    ...
    vector<lower=0, upper=1>[n_arms] mu_phi;
    ...
    real<lower=0> omega_phi;
    ...
    vector[Nind] eta_tilde_phi;
    ...
}

transformed parameters{
    ...
    psi_phi=inv_logit(logit(mu_phi[arm_index])+eta_tilde_phi*omega_phi);
    ...
}

The hierarchical parameterization of the 'biomarker longitudinal submodel' gives direct access to the typical value of the parameter estimates, in the considered sample (ie. arm or study). In the nomenclature used in jmpost, it means that the mu_param is the typical value of the parameter estimate.
There is no hierarchical structure in the survival models implemented in jmpost. So, all we produce for this submodel are population-level or typical-value estimates (with or without covariate, it does not change this statement).

Hope this helps.

@gowerc
Copy link
Collaborator Author

gowerc commented Apr 2, 2024

@mercifr1,

Sorry just to be clear we currently do have a hierarchical parameterisation on the Phi parameter, its just our current implementation uses the "centered" parameterisation whereas what you've shown above is the "non-centred" parameterisation.

This is not limited to just the Phi parameter though, we currently support both "centred" and "non-centred" for all the parameters. As an example in the current GSF model we have currently have the following for the "k" parameter:

If the user selects "centred" parameterisation:

lm_gsf_psi_ks ~ lognormal(lm_gsf_mu_ks[pt_arm_index], lm_gsf_omega_ks);

If the user selects "non-centred":

lm_gsf_psi_ks  = exp( lm_gsf_mu_ks[pt_arm_index] + (lm_gsf_eta_tilde_ks * lm_gsf_omega_ks) )

My understanding though from the Stan documentation is that these are 100% equivalent. The purpose of the "non-centred" parameterisation is just to be more computational stable when dealing with low numbers of observations per patient whereas the "centered" parameterisation is more stable when dealing with larger numbers of observations per patient.

Unfortunately I'm not sure how to unify these two cases in terms of sampling from the population as simply setting lm_gsf_eta_tilde_ks=0 is not equivalent to taking the mean of the distribution which for the log-normal would be:

exp (lm_gsf_mu_ks[pt_arm_index] +  lm_gsf_omega_ks / 2)

With regards to the phi parameter in particular I'm a bit hesitant to convert it to the non-centred parameterisation because there is no equivalent parameteric distribution to use in the centred case thus switching between the 2 would result in different distributions being used which doesn't seem right to me.

@gowerc
Copy link
Collaborator Author

gowerc commented Apr 3, 2024

@mercifr1

Just writing up what we spoke about in our chat this morning:

For the GSF model for parameters that support the non-centered parameterisation we will just use the corresponding $\mu$ value from the LogNormal distribution e.g. lm_gsf_mu_ks in the case of the shrinkage parameter.

For distributions that don't support the non-centered parameterisation (e.g. the phi parameter) we will just take the mean on the distribution.

Sorry @mercifr1 - Just to be clear though is this only required for the longitudinal model ? As in supporting this in the survival model would be more challenging as we get into the realms of emmeans as we'd need to define the covariates for the design matrix of a hypothetical patient.

@gowerc gowerc mentioned this issue Apr 8, 2024
@gowerc
Copy link
Collaborator Author

gowerc commented Apr 8, 2024

Regarding the population survival quantities I think this was on pause for now for @mercifr1 to consider it some more due to the complications of how to actually meaningfully calculate it due to the need of TGI parameters as well as covariate specification for a "hypothetical patient".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants