Possible scenario comparison functionality #221

pratikunterwegs · 2024-04-24T11:56:06Z

This issue logs discussions and prototypes for proposed scenario comparison functionality. Please feel free to add to the conversation below.

Scenario comparison requirements

User-friendly; should be possible to compare scenarios in 1 - 2 functions and < 10 lines of code;
Inputs and outputs should be inter-operable with non-{epidemics} outputs; may have better functionality for {epidemics} outputs (?);
Should make like-for-like comparisons (matching parameter sets and replicates across scenarios);
Flexibly select outcomes: recoveries or cases, and deaths where available (?);
Return raw differences or a summary of differences.
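As a rough illustration of the like-for-like requirement, the sketch below matches two toy outputs on explicit keys rather than on row order. The column names `param_set`, `replicate`, and `value` are assumptions made for this example, not the {epidemics} output format.

```r
# toy outputs: one row per parameter set and replicate (hypothetical columns)
baseline <- data.frame(
  param_set = rep(1:2, each = 3),
  replicate = rep(1:3, times = 2),
  value = c(100, 120, 110, 200, 210, 190)
)
response <- data.frame(
  param_set = rep(1:2, each = 3),
  replicate = rep(1:3, times = 2),
  value = c(80, 90, 100, 150, 170, 160)
)
# deliberately shuffle rows: matching must not rely on row order
response <- response[c(6, 2, 4, 1, 5, 3), ]

# join on the matching keys so that each difference is like-for-like
matched <- merge(
  baseline, response,
  by = c("param_set", "replicate"),
  suffixes = c("_baseline", "_response")
)

# negative values mean fewer outcomes than in the baseline
matched$difference <- matched$value_response - matched$value_baseline
matched
```

Joining on keys costs a little more than subtracting aligned vectors, but it cannot silently pair replicate 1 of one scenario with replicate 3 of another.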

Discussion points

Should there be a way to prevent outputs from different model structures from being compared?
How should scenarios be identified? e.g. as a simple numeric identifier based on position in a list, or is a unique identifier necessary?
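One possible answer to both points, sketched under the assumption that each output is a data.frame with a `compartment` column: treat two outputs as comparable only if they share the same set of compartments, and use list names as scenario identifiers. The helper `same_structure()` is hypothetical and not part of {epidemics}.

```r
# hypothetical check: outputs are comparable only if they share compartments
same_structure <- function(x, y) {
  setequal(unique(x$compartment), unique(y$compartment))
}

seir <- data.frame(compartment = c("S", "E", "I", "R"), value = c(90, 5, 3, 2))
sir <- data.frame(compartment = c("S", "I", "R"), value = c(92, 5, 3))

same_structure(seir, seir) # TRUE
same_structure(seir, sir) # FALSE

# list names act as unique identifiers; list position remains a fallback
scenarios <- list(baseline = seir, response = seir)
stopifnot(anyDuplicated(names(scenarios)) == 0)
```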

A small reprex shows my thinking on this so far, please feel free to suggest changes.

library(epidemics)

population_size <- 14e6

# prepare initial conditions as proportions
initial_conditions <- c(
  S = population_size - 11, E = 10, I = 1, H = 0, F = 0, R = 0
) / population_size

# prepare a <population> object
guinea_population <- population(
  name = "Guinea",
  contact_matrix = matrix(1), # note dummy value
  demography_vector = 14e6, # 14 million, no age groups
  initial_conditions = matrix(
    initial_conditions,
    nrow = 1
  )
)

population = guinea_population

replicates = 50

# generate set of intervention + baseline scenarios to compare
intervention_set <- list(
  baseline = NULL,
  response = list(
    transmission_rate = intervention(
      "reduce_beta", "rate", 50, 100, 0.5
    )
  ),
  response2 = list(
    transmission_rate = intervention(
      "reduce_beta", "rate", 1, 100, 0.5
    )
  )
)

data = model_ebola(
  population,
  # transmission_rate = transmission_rate,
  intervention = intervention_set,
  replicates = replicates
)

outcomes_averted = function(baseline,
                            scenarios,
                            summarise = TRUE,
                            by_group = TRUE,
                            ...) {
  
  # collect arguments to `epidemic_size()`
  args = list(...) # not used currently
  
  # get epidemic size for baseline and response scenarios
  baseline_outcomes = epidemic_size(baseline, simplify = FALSE, by_group = by_group)
  scenario_outcomes = lapply(scenarios, epidemic_size, simplify = FALSE, by_group = by_group)
  
  # get differences between response and baseline in terms of epidemic size
  averted = vapply(scenario_outcomes, function(df) {
    df$value - baseline_outcomes$value
  }, FUN.VALUE = numeric(nrow(baseline_outcomes)))
  
  # return summarised values or raw differences
  if (summarise) {
    averted_summary = apply(averted, 2, function(x) {
      averted_median = median(x)
      averted_lims = quantile(x, probs = c(0.05, 0.95))
      names(averted_lims) = sprintf("quantile_%s", c("05", "95"))
      c(median = averted_median, averted_lims)
    })
    
    averted_summary = as.data.frame(t(averted_summary))
    averted_summary$scenario = seq_len(nrow(averted_summary))
    
    # return difference summary
    averted_summary
  } else {
    averted = as.data.frame(averted)
    colnames(averted) = sprintf("scenario_%s", seq_along(averted))
    if (nrow(averted) > 1) {
      averted$replicate = seq_len(nrow(averted))
    }
    
    # return raw difference data
    averted
  }
}

outcomes_averted(
  baseline = data[scenario == 1]$data[[1]],
  scenarios = data[scenario != 1]$data
)
#>    median quantile_05 quantile_95 scenario
#> 1 -228.78     -436.95       47.95        1
#> 2 -359.30     -527.05     -151.70        2

Created on 2024-04-24 with reprex v2.0.2

adamkucharski · 2024-04-24T17:26:04Z

Thanks for sharing. Some quick initial thoughts:

Reflecting on past projects comparing outputs, the big bottlenecks where a lot of code is wasted have been book-keeping (e.g. iterating over scenarios and keeping track of which one is which), storage and retrieval (e.g. recording relevant model outputs so they can be analysed later), and trajectory matching and summarising (e.g. pulling the case incidence out of each stored object then comparing total cases averted, and in the case of a stochastic model, or simulations from a posterior with parameter uncertainty, making sure that the code is doing like-for-like comparisons of the same parameter/seed in each sample).
There may be a lot of different things a user wants to compare (too many to predict), so although outcomes_averted() is a useful starting point, making data[scenario == 1]$data[[1]] etc. much easier to handle would be useful, e.g. maybe with something intuitive like get_outcome(data,"cases",1)
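A minimal sketch of what such an accessor might look like is below. It assumes the output is a table with a `scenario` column and a list-column `data`, and that each nested table has `replicate`, `compartment`, and `value` columns. `get_outcome()` and `make_run()` are hypothetical names, and the toy data stands in for real model output.

```r
# toy stand-in for nested model output: one row per scenario, with a
# list-column `data` holding each scenario's results (assumed layout)
make_run <- function(cases) {
  data.frame(
    replicate = rep(1:2, each = 2),
    compartment = rep(c("cases", "deaths"), times = 2),
    value = c(cases, cases / 10, cases + 5, (cases + 5) / 10)
  )
}
output <- data.frame(scenario = 1:2)
output$data <- list(make_run(100), make_run(60))

# hypothetical accessor: pull one outcome for one scenario
get_outcome <- function(data, outcome, scenario) {
  idx <- which(data$scenario == scenario)
  stopifnot(length(idx) == 1L) # scenario must exist and be unique
  run <- data$data[[idx]]
  stopifnot(outcome %in% run$compartment)
  run[run$compartment == outcome, c("replicate", "value")]
}

get_outcome(output, "cases", 1)
```

Keeping the `replicate` column in the result means the extracted outcomes can still be matched like-for-like across scenarios afterwards.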

pratikunterwegs · 2024-04-25T08:40:38Z

Thanks, would definitely be good to discuss this more before implementing something as it might take some restructuring of inputs etc.

(e.g. iterating over scenarios and keeping track of which one is which)

What would users expect in terms of keeping track of scenarios? The model output currently makes scenarios from the input NPI + vax combinations automatically. Is there a call for changing how that's done - e.g. passing a full intervention set (NPIs + vaccinations) as a single scenario?

making sure that the code is doing like-for-like comparisons of the same parameter/seed in each sample

A question here is whether discrete values of infection parameters, e.g. $R_0$ = 1.5, 2.0, 2.5, should count as separate scenarios, because currently they are treated similarly to parameter uncertainty around single values. Might need to pass the parameters, and some way of specifying uncertainty around them, to the 'scenario' object from above.
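To make the distinction concrete, here is one way it could look if discrete $R_0$ values were treated as scenarios in their own right, with uncertainty sampled within each scenario. This is only a sketch: the `scenario_grid` layout, the identifier format, and the spread of 0.1 are all assumptions for illustration.

```r
r0_values <- c(1.5, 2.0, 2.5)
interventions <- c("baseline", "response", "response2")

# one row per (R0, intervention) combination: each is its own scenario
scenario_grid <- expand.grid(
  r0 = r0_values,
  intervention = interventions,
  stringsAsFactors = FALSE
)
scenario_grid$scenario_id <- sprintf(
  "%s_r0_%.1f", scenario_grid$intervention, scenario_grid$r0
)

# uncertainty is sampled within a scenario; reusing the same standardised
# draws everywhere keeps draw i comparable across scenarios
set.seed(1)
n_draws <- 5
z <- rnorm(n_draws)
scenario_grid$r0_draws <- lapply(
  scenario_grid$r0, function(m) m + 0.1 * z # 0.1 is an arbitrary spread
)

scenario_grid[, c("scenario_id", "r0", "intervention")]
```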

pratikunterwegs · 2024-06-10T14:16:44Z

Partially addressed in the v0.4.0 release. Moving to Discussions page.

pratikunterwegs self-assigned this Apr 24, 2024

pratikunterwegs added this to v0.4.0: Features for vectorised {epidemics} Apr 24, 2024

pratikunterwegs added the Scenarios Related to scenario comparison functionality label Apr 24, 2024

pratikunterwegs added a commit that referenced this issue Apr 29, 2024

Scenario comparison for deterministic models, WIP #221

58f429b

pratikunterwegs mentioned this issue Apr 30, 2024

Scenario comparison for deterministic models #225

Merged

pratikunterwegs moved this to Discussion in v0.4.0: Features for vectorised {epidemics} Apr 30, 2024

pratikunterwegs added a commit that referenced this issue May 7, 2024

Scenario comparison for deterministic models, WIP #221

2ed4a03

pratikunterwegs mentioned this issue May 8, 2024

Compare ebola scenarios #230

Merged

epiverse-trace locked and limited conversation to collaborators Jun 10, 2024

pratikunterwegs converted this issue into discussion #245 Jun 10, 2024

github-project-automation bot moved this from Discussion to Done in v0.4.0: Features for vectorised {epidemics} Jun 10, 2024
