Installation
Install the development version from GitHub:
# install.packages("remotes")
remotes::install_github("dhis2-chap/chap_r_sdk")
Note: If prompted for GitHub authentication, you can skip it by pressing Enter (the repository is public).
Load the package:
library(chapr)
What is the Chap R SDK?
The chapr package provides infrastructure for developing
disease forecasting models compatible with the Chap platform. Chap
(Climate Health Analytics Platform) enables health ministries to run
predictive models for disease surveillance.
This SDK simplifies model development by handling:
- CLI creation: Command-line interfaces for train/predict workflows
- File I/O: Automatic CSV loading, tsibble conversion, output formatting
- Configuration: YAML/JSON config parsing with schema validation
- Validation: Test suites to verify Chap compatibility
Quick Start
The recommended pattern uses create_chap_cli() to create
a complete command-line interface:
library(chapr)
library(dplyr)
# Define training function - receives loaded tsibble, not file paths
train_my_model <- function(training_data, model_configuration = list(), run_info = list()) {
  means <- training_data |>
    group_by(location) |>
    summarise(mean_cases = mean(disease_cases, na.rm = TRUE))
  return(list(means = means))
}
# Define prediction function - all inputs already loaded
predict_my_model <- function(historic_data, future_data, saved_model,
                             model_configuration = list(), run_info = list()) {
  predictions <- future_data |>
    left_join(saved_model$means, by = "location") |>
    mutate(samples = purrr::map(mean_cases, ~c(.x))) |>
    select(-mean_cases)
  return(predictions)
}
# Enable CLI with one function call
if (!interactive()) {
  create_chap_cli(train_my_model, predict_my_model)
}
Model Function Interface
Your model needs two functions:
Training Function
train_fn <- function(training_data, model_configuration = list(), run_info = list()) {
  # training_data: tsibble with time_period index, location key, disease_cases
  # model_configuration: optional list of parameters from config file
  # run_info: runtime info from Chap (prediction_length, additional_continuous_covariates, etc.)
  # Returns: model object (saved as RDS)
}
Prediction Function
predict_fn <- function(historic_data, future_data, saved_model,
                       model_configuration = list(), run_info = list()) {
  # historic_data: tsibble with historical observations
  # future_data: tsibble with time periods to predict (no disease_cases)
  # saved_model: object returned by train_fn
  # model_configuration: optional list of parameters from config file
  # run_info: runtime info from Chap
  # Returns: tibble with samples list-column
}
Important: historic_data may contain more recent observations than the
training data. Time series models should therefore refit to
historic_data before forecasting.
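The refitting advice above can be illustrated with a minimal sketch in base R. It assumes the same column names as the Quick Start (location, disease_cases) and uses plain data frames in place of tsibbles. The function name predict_refit, the n_samples argument, and the Poisson draw are illustrative assumptions, not something the SDK prescribes.

```r
# Hypothetical predict function that refits to historic_data instead of
# relying only on what was learned at training time.
predict_refit <- function(historic_data, future_data, saved_model,
                          model_configuration = list(), run_info = list(),
                          n_samples = 100) {
  # Recompute per-location means from the most recent observations
  means <- aggregate(disease_cases ~ location, data = historic_data, FUN = mean)
  names(means)[names(means) == "disease_cases"] <- "mean_cases"

  # Look up the refitted mean for each future row (match() keeps row order)
  idx <- match(future_data$location, means$location)
  rate <- means$mean_cases[idx]

  # Draw samples per row; the Poisson distribution is an illustrative assumption
  predictions <- future_data
  predictions$samples <- lapply(rate, function(m) rpois(n_samples, lambda = m))
  predictions
}

historic <- data.frame(
  time_period = c("2023-01", "2023-02", "2023-01", "2023-02"),
  location = c("LocationA", "LocationA", "LocationB", "LocationB"),
  disease_cases = c(45, 52, 78, 80)
)
future <- data.frame(
  time_period = c("2023-03", "2023-03"),
  location = c("LocationA", "LocationB")
)

set.seed(1)
result <- predict_refit(historic, future, saved_model = NULL, n_samples = 10)
str(result$samples)
```

Here saved_model is ignored entirely, which is only reasonable for a model this simple. A real time series model would more likely reuse the structure or hyperparameters stored in saved_model and re-estimate its state from historic_data.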
What the SDK Handles
You don’t need to write code for:
| Task | SDK handles it |
|---|---|
| Loading CSV files | readr::read_csv() |
| Converting to tsibbles | tsibble::as_tsibble() |
| Detecting time columns | Finds time_period, date, week, etc. |
| Detecting key columns | Finds location, region, etc. |
| Loading/saving models | readRDS() / saveRDS() |
| Parsing configs | yaml::yaml.load_file() |
Your functions only contain business logic - no file I/O boilerplate.
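To make the division of labour concrete, the sketch below shows roughly the kind of boilerplate that would otherwise surround a training function. It is a simplified assumption written in base R, not the SDK's actual implementation: run_train_sketch and train_mean are hypothetical names, read.csv() stands in for readr::read_csv(), and the tsibble conversion is omitted.

```r
# Hypothetical wrapper: load data, call the user's train function, save the model.
run_train_sketch <- function(train_fn, data_path, model_path) {
  training_data <- read.csv(data_path, stringsAsFactors = FALSE)  # file I/O
  model <- train_fn(training_data)                                # business logic
  saveRDS(model, model_path)                                      # persistence
  invisible(model_path)
}

# A train function containing only business logic
train_mean <- function(training_data, model_configuration = list(), run_info = list()) {
  list(means = tapply(training_data$disease_cases, training_data$location, mean))
}

# Write a small CSV to a temporary file and run the wrapper
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "time_period,location,disease_cases",
  "2023-01,LocationA,45",
  "2023-02,LocationA,52",
  "2023-01,LocationB,78"
), csv_path)
model_path <- tempfile(fileext = ".rds")

run_train_sketch(train_mean, csv_path, model_path)
saved_model <- readRDS(model_path)
print(saved_model$means)
```

Everything inside run_train_sketch is the part you do not write when using the SDK; only the body of train_mean is model-specific.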
Data Format
Training/Historic Data
CSV with time, location, target, and covariates:
time_period,location,disease_cases,population,rainfall
2023-01,LocationA,45,10000,120.5
2023-02,LocationA,52,10000,85.2
2023-01,LocationB,78,15000,130.1
Future Data
Same structure without the target variable:
time_period,location,population,rainfall
2023-05,LocationA,10000,95.0
2023-06,LocationA,10000,110.3
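A quick way to check files against this layout is to parse them and compare column names. The sketch below uses base R and the sample rows above. check_future_columns is a hypothetical helper, not part of the SDK, and it assumes the target column is named disease_cases.

```r
training <- read.csv(text = "time_period,location,disease_cases,population,rainfall
2023-01,LocationA,45,10000,120.5
2023-02,LocationA,52,10000,85.2
2023-01,LocationB,78,15000,130.1", stringsAsFactors = FALSE)

future <- read.csv(text = "time_period,location,population,rainfall
2023-05,LocationA,10000,95.0
2023-06,LocationA,10000,110.3", stringsAsFactors = FALSE)

# TRUE when future data has every training column except the target
check_future_columns <- function(training, future, target = "disease_cases") {
  expected <- setdiff(names(training), target)
  setequal(names(future), expected) && !(target %in% names(future))
}

check_future_columns(training, future)
```

Running a check like this before handing files to a model catches a missing covariate or a leaked target column early.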
Configuration
Safe Parameter Extraction
config <- list(
  model = list(
    params = list(learning_rate = 0.01, epochs = 100)
  )
)
# Extract nested parameters with defaults
lr <- get_config_param(config, "model", "params", "learning_rate", .default = 0.001)
print(lr)
#> [1] 0.01
# Returns default if path not found
missing <- get_config_param(config, "model", "missing", .default = "default")
print(missing)
#> [1] "default"
Configuration Schema
Define a schema for validation and the info
subcommand:
config_schema <- create_config_schema(
  title = "My Model Configuration",
  properties = list(
    n_samples = schema_integer(default = 100L, minimum = 1L),
    learning_rate = schema_number(default = 0.01, minimum = 0, maximum = 1)
  )
)
create_chap_cli(train_fn, predict_fn, model_config_schema = config_schema)
Next Steps
- Building Your First Chap Model: Step-by-step tutorial with validation
- Working with Spatio-Temporal Data: Utilities for aggregation and transformation
- Function Reference: Complete API documentation
Getting Help
- Issues: GitHub Issues
- Chap platform: github.com/dhis2/chap-core