Design of the scenario class #3

Open
pratikunterwegs opened this issue Jan 9, 2023 · 0 comments
Labels
discussion For package design discussions

@pratikunterwegs
Collaborator

This issue is an ongoing discussion about the design of the {scenario} class.

A scenario is intended to be an S3 class object that holds the specifications of an epidemic simulation run, i.e., the function and its arguments (simulation parameters), and optionally, the data obtained from executing these simulation runs. S3 is preferred over S4 and R6 because it is somewhat better documented and more widely used, and, unlike R6, it does not require an additional dependency.

Attributes

  • Simulation function: Which function returns the scenario output, e.g. final_size() from finalsize.
  • Parameters: The simulation parameters, stored as a named list, e.g. the parameters passed to a single run of finalsize::final_size().
  • Replicates: The number of replicates of a simulation to run, using the function specified as the simulation function, with parameters specified above. Specifying more than one replicate only makes sense for stochastic simulations where epidemic outputs vary in each replicate.
    For analytical models such as final_size(), which usually converge on the same value in each run, there will need to be a way to specify that one or more simulation parameters should be drawn from distributions (e.g. R0 or a social contact matrix); alternatively, each draw of a parameter from a distribution could be a single replicate of a unique scenario. 
  • Data availability tag: A boolean tag (TRUE/FALSE) that indicates whether the scenario object has any simulation output data. The idea is to allow scenarios to exist as simulation run specifications without data (i.e., an intent to run N replicates of this epidemic simulation with these parameters). This avoids using memory and processing time until required, such as at the comparison stage. This tag should be updated after any epidemic simulations are run (see Methods/Functions). The idea is for all replicates to be run simultaneously, and this could be parallelised to improve speed.
    For discussion: Whether it should be possible to remove data after extracting summary statistics, to save working memory. 
  • Data list: A named list of epidemic scenario outputs, e.g., a list of outputs from final_size(), or from future epidemics functions [placeholder name epi_demic()]. List names are added to make list indexing easier to understand; i.e., it is easier to see what a function is doing when it selects an object as data[["finalsize_UK_full_susceptibility"]], rather than data[[1]].
  • Summary statistics: A named list of summary statistics on the epidemic simulation outcomes, e.g., the mean and 95% CI of the final sizes of an epidemic by age group.
    For discussion: Which summary statistics to return, and how to represent them. Note: See also how epiparameter represents delay distributions (drawing on implementation in distributional). 
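
The attributes above could be laid out as a plain list with a class attribute. This is only a minimal sketch, not a settled design: the field names (`model_function`, `parameters`, `replicates`, `data_prepared`, `data`, `summary_stats`) are hypothetical, and the parameter and output values are placeholders rather than real simulation results.

```r
# Hypothetical layout of a scenario object. All field names are assumptions
# made for illustration; none are taken from an existing package.
scenario_spec <- structure(
  list(
    model_function = "final_size",  # name of the simulation function
    parameters = list(r0 = 1.5),    # named list of simulation parameters
    replicates = 1L,                # number of simulation runs intended
    data_prepared = FALSE,          # data availability tag
    data = list(),                  # named list of simulation outputs
    summary_stats = list()          # named list of summary statistics
  ),
  class = "scenario"
)

# Named elements make indexing self-explanatory once data are added.
# The stored vector is a placeholder, not real simulation output.
scenario_spec$data[["finalsize_UK_full_susceptibility"]] <- c(0.4, 0.6)

# The tag is updated to reflect that the data list is no longer empty.
scenario_spec$data_prepared <- length(scenario_spec$data) > 0

# Access by name rather than by position.
scenario_spec$data[["finalsize_UK_full_susceptibility"]]
```

Keeping the specification and the data in one object means a scenario can exist with `data_prepared = FALSE` and an empty `data` list until its output is actually needed.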

Methods/functions

  • Constructor: Initialise a new epidemic scenario, with a function name and parameter list. Data are not initially prepared. Alternatively, convert a list of data objects, a parameter list, and a function name into a scenario object whose data availability tag is set to TRUE.
  • Print (and Summary): Print a representation of the scenario object to screen. This should include important details including the function that was used to run the epidemic simulation, the parameter list (truncated for readability if necessary), the number of replicates, and the data availability tag.
  • Run scenario: A function to populate the scenario data list with output from N replicates of the specified function, using the parameter list. Calls e.g. final_size() or in future, epi_demic().
  • Summarise scenario: A function to get summary statistics from the N simulation outputs and populate the summary statistics field.
  • Access functions: Functions to access class elements, allowing users to avoid accessing them directly (e.g. using scenario$…).
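
The methods listed above might look roughly like the following. This is a sketch under stated assumptions rather than a proposed implementation: every function and field name here (`scenario()`, `run_scenario()`, `summarise_scenario()`, `get_parameters()`, `toy_model()`) is hypothetical, `toy_model()` is a stand-in that draws random numbers so the example runs without finalsize, and the 95% interval is taken as empirical quantiles across replicates, which is only one possible choice.

```r
# Stand-in for a stochastic simulation function (hypothetical, NOT an
# epidemic model): returns one random value per group on each call.
toy_model <- function(r0, n_groups = 2) {
  runif(n_groups, min = 0, max = 1 - 1 / r0)
}

# Constructor: stores the specification only; no data are generated yet.
scenario <- function(model_function, parameters, replicates = 1L) {
  stopifnot(
    is.character(model_function), length(model_function) == 1,
    is.list(parameters), !is.null(names(parameters)),
    replicates >= 1
  )
  structure(
    list(
      model_function = model_function,
      parameters = parameters,
      replicates = as.integer(replicates),
      data_prepared = FALSE,
      data = list(),
      summary_stats = list()
    ),
    class = "scenario"
  )
}

# Run scenario: populate the data list with N replicates, update the tag.
run_scenario <- function(x) {
  stopifnot(inherits(x, "scenario"))
  x$data <- lapply(
    seq_len(x$replicates),
    function(i) do.call(x$model_function, x$parameters)
  )
  names(x$data) <- paste0("replicate_", seq_len(x$replicates))
  x$data_prepared <- TRUE
  x
}

# Summarise scenario: mean and empirical 95% interval per group.
summarise_scenario <- function(x) {
  stopifnot(inherits(x, "scenario"), x$data_prepared)
  outputs <- do.call(rbind, x$data)  # rows = replicates, columns = groups
  x$summary_stats <- list(
    mean = colMeans(outputs),
    interval_95 = apply(outputs, 2, quantile, probs = c(0.025, 0.975))
  )
  x
}

# Print method: show the specification and the data availability tag.
print.scenario <- function(x, ...) {
  cat("Epidemic scenario\n")
  cat(" Model function:", x$model_function, "\n")
  cat(" Parameters:", paste(names(x$parameters), collapse = ", "), "\n")
  cat(" Replicates:", x$replicates, "\n")
  cat(" Data prepared:", x$data_prepared, "\n")
  invisible(x)
}

# Access function: avoids reaching into the object with scenario$...
get_parameters <- function(x) {
  stopifnot(inherits(x, "scenario"))
  x$parameters
}

# Usage: specify, run, summarise, then inspect.
set.seed(1)
sc <- scenario("toy_model", list(r0 = 1.5), replicates = 10)
sc <- run_scenario(sc)
sc <- summarise_scenario(sc)
print(sc)
```

Because each step returns a modified copy of the object, the specification can be created cheaply and `run_scenario()` deferred until the data are needed, such as at the comparison stage.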