Authoring model algorithms

The ready4fun R package supports standardised approaches to code authoring that partially automate the documentation of model algorithms.

The section below renders a vignette article from the ready4fun library.

Motivation

The ready4 youth mental health systems model is implemented using an object-oriented programming (OOP) approach. One motivation for using OOP is the concept of “abstraction” - making things as simple as possible for end-users of ready4 modules by exposing the minimal amount of code required to implement each method.

However, some users of the ready4 modules will want to “look under the hood” and examine the code that implements module algorithms in much more detail. They may wish to:

  • gain detailed insight into how methods are implemented;
  • test individual sub-components (“functions”) of methods as part of code verification and model validation checks;
  • re-use functions when authoring new methods.

Therefore, when authoring ready4 code libraries, it is important to ensure that “under the hood” code can be readily understood. Two ways of achieving this goal are to ensure that all functions (even those not intended for use by modeller end-users) are adequately documented and to adopt a consistent house style (e.g. naming conventions). ready4fun provides workflow tools (classes, methods, functions and datasets) to achieve these goals.

ready4fun function authoring taxonomies, abbreviations and workflow

The ready4fun package uses a dataset of taxonomies and abbreviations to ensure standardised function code style and documentation. A copy of this dataset (dataset_ls) can be downloaded from a repository associated with the ready4 package using tools from the ready4use package.

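# Assumes the pipe operator (%>%) and the ingest method are already available in the session.
dataset_ls <- ready4use::Ready4useRepos(gh_repo_1L_chr = "ready4-dev/ready4",
                                        gh_tag_1L_chr = "Documentation_0.0") %>%
  ingest(metadata_1L_lgl = FALSE) # FALSE spelled out in place of the reassignable shorthand F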

Function names begin with a meaningful verb

Consistent with a naming convention popular in the R development community, all ready4 framework functions begin with a verb. Furthermore, the choice of verb is meaningful - it communicates something about the type of task a function implements. For example, all functions beginning with the word “fit” will fit a model of a specified type to a dataset. The definitions of all meaningful verbs currently used by ready4 functions (excluding methods) are stored in element fn_types_lup of dataset_ls, the key features of which are reproduced below.

dataset_ls$fn_types_lup %>% 
  ready4fun_functions() %>%
  renew(filter_cdn_1L_chr = "!is_generic_lgl & !stringr::str_detect(fn_type_nm_chr, pattern = ' ')") %>%
  exhibit(select_int = 1:2,
          scroll_box_args_ls = list(width = "100%"))
Meaningful verbs
Verb Description
Add Updates an object by adding data to that object.
Assert Validates that an object conforms to required condition(s). If the object does not meet all required conditions, program execution will be stopped and an error message provided.
Bind Binds two objects together to create a composite object.
Calculate Performs a numeric calculation.
Close Closes specified connections.
Extract Extracts data from an object.
Fit Fits a model of a specified type to a dataset.
Force Checks if a specified local or global environmental condition is met and if not, updates the specified environment to comply with the condition.
Format Modifies the format of an output.
Get Retrieves a pre-existing data object from memory, local file system or online repository.
Import Reads a data object in its native format and converts it to an R object.
Impute Imputes data.
Knit Knits an R Markdown file.
Launch Launches an application.
Make Creates a new R object.
Plot Plots data.
Predict Makes predictions from data using a specified statistical model.
Print Prints output to console.
Randomise Randomly samples from data.
Read Reads an R script into memory.
Remove Edits an object, removing a specified element or elements.
Rename Renames elements of an object based on a pre-specified schema.
Reorder Reorders an object to conform to a pre-specified schema.
Replace Edits an object, replacing a specified element with another specified element.
Reset Edits an object, overwriting the current version with a default version.
Rowbind Performs custom rowbind operations on table objects.
Scramble Randomly reorders an object.
Transform Edits an object in such a way that core object attributes - e.g. shape, dimensions, elements, type - are altered.
Unload Performs a custom detaching of a package from the search path.
Update Edits an object, while preserving core object attributes.
Validate Validates that an object conforms to required criteria.
Write Writes a file to a specified local directory.
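
As a rough illustration of how the verb convention could be checked automatically, the base R sketch below tests whether function names begin with a recognised verb. The helper names (get_verb_chr, validate_fn_nms_lgl), the shortened verb list and the example function names are all hypothetical and are not part of ready4fun; the checks that ready4fun itself performs may work differently.

```r
# Hypothetical subset of the verbs in the table above (lower case, as used in function names).
verbs_chr <- c("add", "assert", "calculate", "fit", "get", "make", "write")

# Hypothetical helper: returns the first underscore-delimited token of each function name.
get_verb_chr <- function(fn_nms_chr) {
  vapply(strsplit(fn_nms_chr, "_", fixed = TRUE),
         function(parts_chr) parts_chr[1],
         character(1))
}

# Hypothetical helper: TRUE where a function name begins with a recognised verb.
validate_fn_nms_lgl <- function(fn_nms_chr, verbs_chr) {
  get_verb_chr(fn_nms_chr) %in% verbs_chr
}

# Made-up function names; the last one does not begin with a verb from the list.
fn_nms_chr <- c("fit_ts_model", "make_report_tb", "summary_stats")
checks_lgl <- validate_fn_nms_lgl(fn_nms_chr, verbs_chr)
print(data.frame(fn_nm = fn_nms_chr, starts_with_verb = checks_lgl))
```

A check of this kind could be used to flag names such as summary_stats for renaming before documentation is generated.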

Function inputs and outputs have meaningful suffixes

The type of input (arguments) required and output (return) produced by a function can be efficiently communicated by using meaningful suffixes. For example, all objects ending in “_chr” are character vectors and all objects ending in “_int” are integer vectors. The meaningful suffixes currently used to describe objects in the ready4 framework are stored in element seed_obj_type_lup of dataset_ls, the key features of which are reproduced below.

dataset_ls$seed_obj_type_lup %>% 
  ready4fun_objects() %>%
  exhibit(select_int = 1:2,
          scroll_box_args_ls = list(width = "100%"))
Meaningful suffixes
Suffix Description
arr array
chr character
dbl double
df data.frame
dtm date
env environment
fct factor
fn function
int integer
lgl logical
ls list
lup lookup table
mat matrix
mdl model
plt plot
prsn person
r3 ready4 S3
r4 ready4 S4
rgx regular expression
s3 S3
s4 S4
sf simple features object
tb tibble
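
The sketch below shows, under stated assumptions, how a suffix can be read off an object name and translated into a type description, and what a function written in this style might look like. The lookup is a hand-typed excerpt of the table above, and get_suffix_chr, describe_obj_nms_chr and calculate_mean_dbl are hypothetical names introduced only for illustration.

```r
# Hand-typed excerpt of the suffix table above, stored as a named character vector.
suffix_lup <- c(chr = "character", dbl = "double", int = "integer",
                lgl = "logical", ls = "list", df = "data.frame")

# Hypothetical helper: extracts the text after the last underscore in each object name.
get_suffix_chr <- function(obj_nms_chr) {
  sub("^.*_", "", obj_nms_chr)
}

# Hypothetical helper: looks up the type description for each object name (NA if unknown).
describe_obj_nms_chr <- function(obj_nms_chr, suffix_lup) {
  unname(suffix_lup[get_suffix_chr(obj_nms_chr)])
}

# A made-up function in this style: argument and return types can be read from the names.
calculate_mean_dbl <- function(values_dbl, na_rm_lgl = TRUE) {
  mean(values_dbl, na.rm = na_rm_lgl)
}

descriptions_chr <- describe_obj_nms_chr(c("values_dbl", "na_rm_lgl", "results_ls"),
                                         suffix_lup)
mean_dbl <- calculate_mean_dbl(c(1, 2, NA, 6))
print(descriptions_chr)
print(mean_dbl)
```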

Consistent use of abbreviations

Further information about the purpose of a function and the nature of its inputs and outputs can be encoded by using naming conventions that make consistent use of abbreviations. A master table of the abbreviations used throughout the ready4 framework is maintained in the abbreviations_lup element of dataset_ls. The list of abbreviations is now quite extensive and continues to grow as the ready4 suite of software expands. The initial few entries of abbreviations_lup are reproduced below.

dataset_ls$abbreviations_lup %>% 
  head() %>%
  exhibit(select_int = 1:2,
          scroll_box_args_ls = list(width = "100%"))
Abbreviations
Abbreviation Description
... additional arguments
1L length one
1L_chr character vector of length one
1L_chr_ls list of character vectors of length one
1L_chr_r4 ready4 S4 collection of character vectors of length one
1L_dbl double vector of length one
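
To make the idea concrete, the following sketch matches the end of an argument name against a small abbreviations table and returns the corresponding description. The table is a hand-typed excerpt of the entries shown above; its column names (short_name_chr, long_name_chr) and the helper get_abbr_desc_1L_chr are assumptions made for this example rather than the documented structure of abbreviations_lup.

```r
# Hand-typed excerpt of the abbreviations shown above (column names are assumptions).
abbreviations_df <- data.frame(
  short_name_chr = c("1L", "1L_chr", "1L_dbl"),
  long_name_chr = c("length one",
                    "character vector of length one",
                    "double vector of length one"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the description of the longest abbreviation that
# ends the supplied object name, or NA if no abbreviation matches.
get_abbr_desc_1L_chr <- function(obj_nm_1L_chr, abbreviations_df) {
  matches_lgl <- endsWith(obj_nm_1L_chr,
                          paste0("_", abbreviations_df$short_name_chr))
  if (!any(matches_lgl)) {
    return(NA_character_)
  }
  matched_df <- abbreviations_df[matches_lgl, ]
  matched_df$long_name_chr[which.max(nchar(matched_df$short_name_chr))]
}

print(get_abbr_desc_1L_chr("pkg_title_1L_chr", abbreviations_df))
print(get_abbr_desc_1L_chr("urls_chr", abbreviations_df))
```

Descriptions retrieved in this way are the kind of text that can be reused when drafting argument documentation, which is why keeping the abbreviations table consistent matters.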

Workflow

Manifest

The main class exported as part of ready4fun is the ready4 sub-module ready4fun_manifest, which is used to specify metadata for the functions being authored and the R package that will contain them. This metadata includes details of the repository in which the fn_types_lup, seed_obj_lup_tb and abbreviations_lup objects are stored.

Typical Usage

A ready4fun_manifest object is most efficiently created with the aid of the make_pkg_desc_ls and make_manifest functions rather than a direct call to the ready4fun_manifest() function.

## Not run
x <- ready4fun::make_pkg_desc_ls(pkg_title_1L_chr = "Your Package Title",
                                 pkg_desc_1L_chr = "Your Package Description.",
                                 authors_prsn = c(utils::person("Author 1 Name",
                                                                role = c("aut", "cre")),
                                                  utils::person("Author 2 Name", role = c("cph"))),
                                 urls_chr = c("Package website url",
                                              "Package source code url",
                                              "Project website")) %>%
  ready4fun::make_manifest(copyright_holders_chr = "Organisation name",
                           custom_dmt_ls = ready4fun::make_custom_dmt_ls(user_manual_fns_chr = c("Functions to be included in main user manual are itemised here")),
                           dev_pkgs_chr = c("Any development package dependencies go here"),
                           path_to_pkg_logo_1L_chr = "Local path to package logo goes here",
                           piggyback_to_1L_chr = "GitHub Release Repository to which supporting files will be uploaded",
                           ready4_type_1L_chr = "authoring",
                           zenodo_badge_1L_chr = "DOI badge details go here")

The main method defined for ready4fun_manifest is author which, assuming the raw undocumented function files are saved in the appropriate directories, will author an R package in which all functions are consistently documented.

## Not run
author(x)

Examples

The ready4fun_manifest sub-module and its methods, along with the make_pkg_desc_ls and make_manifest functions, are designed to be used as part of the ready4pack R package authoring workflow. The ready4pack vignette includes links to two examples of where the ready4pack workflow has been used to author R packages. To illustrate how the ready4fun tools used as part of that workflow document functions, we focus on the program used to create the ready4show package.

That program makes use of ready4fun tools that read all undocumented package functions, perform automated checks to ensure that these functions appropriately use the taxonomies and abbreviations mentioned previously (prompting authors to make specific amendments if they do not) and then rewrite these functions to the package R directory, appending tags (with the aid of the sinew package) that will generate meaningful documentation.

For example, one of the functions to be documented is knit_from_tmpl, which is rewritten as a version with documentation tags. The tags added to all functions are then used to generate the package documentation, including the package manual. Two versions of the ready4show package manual are generated: a slimmed down version for end-users and a more detailed inventory of contents intended for developers.
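
As a loose illustration of what a function with documentation tags can look like, the sketch below pairs a made-up function with roxygen2-style comments whose wording is derived from the verb and suffix conventions described earlier. The function make_greeting_1L_chr is hypothetical and is not the actual knit_from_tmpl; the exact tags and wording that ready4fun writes are not reproduced here and may differ.

```r
#' Make greeting
#' @description make_greeting_1L_chr() is a Make function that creates a new R object.
#' The function returns a greeting (a character vector of length one).
#' @param name_1L_chr Name (a character vector of length one)
#' @return Greeting (a character vector of length one)
#' @rdname make_greeting_1L_chr
#' @export
make_greeting_1L_chr <- function(name_1L_chr) {
  # Hypothetical example body: builds a single greeting string.
  paste0("Hello, ", name_1L_chr, "!")
}

print(make_greeting_1L_chr("ready4"))
```

In this sketch the description repeats the definition of the verb “Make” and the parameter text repeats the description of the “1L_chr” abbreviation, which is the kind of reuse that consistent naming makes possible.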

Future documentation

Detailed guidance for how to apply ready4fun workflow tools has yet to be prepared but will be released in 2022.

Last modified July 18, 2023: orygen monash handover (736051b)