chap_core.models package

Submodules

chap_core.models.chapkit_rest_api_wrapper module

Synchronous REST API wrapper for the CHAP service. Provides synchronous methods for all available API endpoints.

NOTE: Written by AI as a prototype. TODO: refactor and clean up once working.

class chap_core.models.chapkit_rest_api_wrapper.CHAPKitRestAPIWrapper(base_url: str = 'http://localhost:8001', timeout: int = 7200)[source]

Bases: object

Synchronous client for interacting with the CHAP REST API

close()[source]

Close the client connection
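Because the wrapper exposes close(), callers can pair construction with a try/finally. A hedged sketch: the factory argument is a stand-in so the helper stays generic; in practice it would typically construct a CHAPKitRestAPIWrapper (e.g. via functools.partial to fix base_url and timeout).

```python
# Lifecycle helper: build a client, run some work, always close.
# `factory` is any zero-argument callable returning an object with close();
# typically CHAPKitRestAPIWrapper.
def with_client(factory, work):
    client = factory()
    try:
        return work(client)
    finally:
        client.close()
```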

create_config(config: Dict[str, Any]) Dict[str, Any][source]

Create or replace a model configuration

Args:

config: Configuration dictionary

Returns:

Created configuration with ID

delete_artifact(artifact_id: str) None[source]

Delete an artifact by ID

Args:

artifact_id: Artifact ID to delete

delete_config(config_id: str) None[source]

Delete a configuration by ID

Args:

config_id: Configuration ID to delete

delete_job(job_id: str) None[source]

Cancel (if running) and delete a job

Args:

job_id: Job ID to delete

get_artifact(artifact_id: str) Dict[str, Any][source]

Get a specific artifact by ID

Args:

artifact_id: Artifact ID

Returns:

Artifact info object

get_artifact_config(artifact_id: str) Dict[str, Any][source]

Get the configuration associated with an artifact

Args:

artifact_id: Artifact ID

Returns:

Configuration object linked to the artifact

get_artifact_expand(artifact_id: str) Dict[str, Any][source]

Get artifact with expanded data

Args:

artifact_id: Artifact ID

Returns:

Expanded artifact object

get_artifact_tree_by_id(artifact_id: str) Dict[str, Any][source]

Get artifact tree starting from a specific artifact

Args:

artifact_id: Artifact ID

Returns:

Artifact tree with nested children

get_artifacts_for_config(config_id: str) List[Dict[str, Any]][source]

Get all artifacts linked to a configuration

Args:

config_id: Configuration ID

Returns:

List of artifact info objects

get_config(config_id: str) Dict[str, Any][source]

Get a specific configuration by ID

Args:

config_id: Configuration ID

Returns:

Configuration object

get_config_artifacts(config_id: str) List[Dict[str, Any]][source]

Get all artifacts linked to a configuration

Args:

config_id: Configuration ID

Returns:

List of artifact objects linked to the configuration

get_config_schema() Dict[str, Any][source]

Get JSON Schema for model configuration

Returns:

JSON Schema for configuration model

get_job(job_id: str) Dict[str, Any][source]

Get full job record by ID

Args:

job_id: Job ID

Returns:

Job record with status, times, error info, etc.

get_jobs(status: str | None = None) List[Dict[str, Any]][source]

Get all jobs, optionally filtered by status

Args:

status: Optional status filter ('pending', 'running', 'completed', 'failed', 'canceled')

Returns:

List of job records

health() Dict[str, str][source]

Check service health status

Returns:

Dict with status field ('healthy')

info() Dict[str, Any][source]

Get system information

Returns:

System info including name, version, description, etc.

Link an artifact to a configuration

Args:

config_id: Configuration ID

artifact_id: Artifact ID to link

Returns:

Updated configuration or confirmation

list_configs() List[Dict[str, Any]][source]

List all model configurations

Returns:

List of model configuration objects

poll_job(job_id: str, timeout: int | None = None) Dict[str, Any][source]

Simple polling method that waits for a job to complete

Args:

job_id: Job ID to poll

timeout: Maximum seconds to wait (None for no timeout)

Returns:

Final job status when completed

predict(model_artifact_id: str, future_data: DataFrame, historic_data: DataFrame | None = None, geo_features: Dict[str, Any] | None = None) Dict[str, str][source]

Make predictions with a trained model

Args:

model_artifact_id: Trained model artifact ID

future_data: Future covariates as pandas DataFrame

historic_data: Optional historical data as pandas DataFrame

geo_features: Optional GeoJSON FeatureCollection

Returns:

Dict with job_id and prediction_artifact_id

predict_and_wait(model_artifact_id: str, future_data: DataFrame, historic_data: DataFrame | None = None, geo_features: Dict[str, Any] | None = None, timeout: int | None = 7200) Dict[str, Any][source]

Make predictions and wait for completion

Args:

model_artifact_id: Trained model artifact ID

future_data: Future covariates

historic_data: Optional historical data

geo_features: Optional GeoJSON features

timeout: Maximum seconds to wait

Returns:

Dict with job record and prediction_artifact_id

train(config_id: str, data: DataFrame, geo_features: Dict[str, Any] | None = None) Dict[str, str][source]

Train a model with data

Args:

config_id: Configuration ID to use for training

data: Training data as pandas DataFrame

geo_features: Optional GeoJSON FeatureCollection

Returns:

Dict with job_id and model_artifact_id

train_and_wait(config_id: str, data: DataFrame, geo_features: Dict[str, Any] | None = None, timeout: int | None = 300) Dict[str, Any][source]

Train a model and wait for completion

Args:

config_id: Configuration ID

data: Training data

geo_features: Optional GeoJSON features

timeout: Maximum seconds to wait

Returns:

Dict with job record and model_artifact_id
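The *_and_wait helpers compose into an end-to-end workflow. A hedged sketch, assuming create_config returns a dict with an "id" key and the job results carry model_artifact_id / prediction_artifact_id as documented; the shape of the config dict is illustrative (get_config_schema() returns the authoritative JSON Schema), and client is a CHAPKitRestAPIWrapper:

```python
# Train-then-predict workflow using the documented synchronous helpers.
def train_then_predict(client, config, train_df, future_df):
    created = client.create_config(config)                # created config includes an ID
    trained = client.train_and_wait(created["id"], train_df)
    model_id = trained["model_artifact_id"]
    predicted = client.predict_and_wait(model_id, future_df)
    return predicted["prediction_artifact_id"]
```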

Unlink an artifact from a configuration

Args:

config_id: Configuration ID

artifact_id: Artifact ID to unlink

Returns:

Updated configuration or confirmation

update_config(config_id: str, config: Dict[str, Any]) Dict[str, Any][source]

Update a configuration by ID

Args:

config_id: Configuration ID

config: Updated configuration dictionary

Returns:

Updated configuration

wait_for_job(job_id: str, poll_interval: int = 2, timeout: int | None = None) Dict[str, Any][source]

Wait for a job to complete

Args:

job_id: Job ID to monitor

poll_interval: Seconds between status checks

timeout: Maximum seconds to wait (None for no timeout)

Returns:

Final job status

Raises:

TimeoutError: If job doesn’t complete within timeout
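The loop behind wait_for_job can be approximated as follows; a sketch assuming get_job returns a record with a "status" field taking the values listed under get_jobs:

```python
import time

# Poll get_job until the status reaches a terminal state, mirroring the
# documented wait_for_job semantics: poll_interval seconds between checks,
# TimeoutError if the deadline passes.
def wait_until_done(client, job_id, poll_interval=2, timeout=None):
    start = time.monotonic()
    while True:
        job = client.get_job(job_id)
        if job["status"] in ("completed", "failed", "canceled"):
            return job
        if timeout is not None and time.monotonic() - start > timeout:
            raise TimeoutError(f"job {job_id} did not complete within {timeout}s")
        time.sleep(poll_interval)
```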

chap_core.models.configured_model module

class chap_core.models.configured_model.ConfiguredModel[source]

Bases: ABC

A ConfiguredModel is the main interface for all models in the Chap framework. A configured model is different from a model template in that it is configured with specific hyperparameters and/or other choices. While a ModelTemplate is flexible with choices, a ConfiguredModel has fixed choices and parameters. See ExternalModel for an example of a ConfiguredModel.

abstractmethod predict(historic_data: DataSet, future_data: DataSet) DataSet[source]
abstractmethod train(train_data: DataSet, extra_args=None)[source]
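
The contract above can be illustrated with a toy subclass. A stand-in sketch: the base class and MeanModel are hypothetical, and DataSet is replaced by plain dicts purely for illustration; real subclasses work with chap_core's DataSet.

```python
from abc import ABC, abstractmethod

# Minimal stand-in for the ConfiguredModel contract: fixed choices,
# abstract train/predict.
class ConfiguredModelSketch(ABC):
    @abstractmethod
    def train(self, train_data, extra_args=None): ...

    @abstractmethod
    def predict(self, historic_data, future_data): ...

class MeanModel(ConfiguredModelSketch):
    """Toy model with one fixed choice: predict the training mean."""

    def train(self, train_data, extra_args=None):
        values = train_data["cases"]
        self._mean = sum(values) / len(values)

    def predict(self, historic_data, future_data):
        return {"predicted_cases": [self._mean] * len(future_data["time"])}
```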
class chap_core.models.configured_model.ModelConfiguration[source]

Bases: BaseModel

Base class used for the configuration that a ModelTemplate takes when creating specific models

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

chap_core.models.external_chapkit_model module

class chap_core.models.external_chapkit_model.ExternalChapkitModel(model_name: str, rest_api_url: str, configuration_id: str)[source]

Bases: ExternalModelBase

predict(historic_data: DataSet, future_data: DataSet) DataSet[source]
train(train_data: DataSet, extra_args=None)[source]
class chap_core.models.external_chapkit_model.ExternalChapkitModelTemplate(rest_api_url: str)[source]

Bases: object

Wrapper around External models that are based on chapkit.

Note that get_model assumes you have already created a configuration with that specific chapkit model.

This class is meant to be backwards compatible with ExternalModelTemplate

get_model(model_configuration: dict) ExternalChapkitModel[source]

Sends the model configuration to the model's REST API for storage. The API returns a configuration ID that we can use to identify the model.

get_model_template_config() ModelTemplateConfigV2[source]

This method is meant to make things backwards compatible with the old system. An object of type ModelTemplateConfigV2 is needed to store info about a ModelTemplate in the database.

is_healthy() bool[source]
property name

Returns a unique name for the model. In the future, this might be some sort of ID given by the model.

wait_for_healthy(timeout=60)[source]

chap_core.models.external_model module

class chap_core.models.external_model.ExternalModel(runner, name: str = None, adapters=None, working_dir=None, data_type=<class 'bionumpy.bnpdataclass.bnpdataclass.HealthData'>, configuration: ~chap_core.database.model_templates_and_config_tables.ModelConfiguration | None = None)[source]

Bases: ExternalModelBase

An ExternalModel is a specific implementation of a ConfiguredModel that represents a model that is "external" in the sense that it needs to be run through a runner (e.g. a DockerRunner). This class is typically used for external models developed outside of Chap, and gives such models an interface with methods like train and predict so that they are compatible with Chap.

property configuration
property name
property optional_fields
predict(historic_data: DataSet, future_data: DataSet) DataSet[source]
property required_fields
train(train_data: DataSet, extra_args=None)[source]

Trains this model on the given dataset.

Parameters

train_data : DataSet

The data to train the model on

extra_args : str

Extra arguments to pass to the train command

class chap_core.models.external_model.ExternalModelBase[source]

Bases: ConfiguredModel

A base class for external models that provides some utility methods

chap_core.models.external_web_model module

class chap_core.models.external_web_model.ExternalWebModel(api_url: str, name: str = None, timeout: int = 3600, poll_interval: int = 5, configuration: dict | None = None, adapters: dict | None = None, working_dir: str = './')[source]

Bases: ExternalModelBase

Wrapper for a ConfiguredModel that can only be run through a web service, defined by a URL. This web service supports a strict REST API that allows for training and prediction. This class makes such a model available through the ConfiguredModel interface, with train and predict methods.

property configuration
property name
predict(historic_data: DataSet, future_data: DataSet) DataSet[source]

Predicts by starting prediction and waiting for the model to finish prediction.

train(train_data: DataSet, extra_args=None)[source]

Trains the model by starting training and waiting for the model to finish training

chap_core.models.local_configuration module

class chap_core.models.local_configuration.LocalModelTemplateWithConfigurations(*, url: str, uses_chapkit: bool = False, versions: dict[str, str], configurations: dict[str, ModelConfiguration] = {'default': ModelConfiguration(user_option_values={}, additional_continuous_covariates=[])})[source]

Bases: BaseModel

Class only used for parsing ModelTemplate from config/models/*.yaml files.

configurations: dict[str, ModelConfiguration]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

url: str
uses_chapkit: bool
versions: dict[str, str]
chap_core.models.local_configuration.parse_local_model_config_file(file_name) list[LocalModelTemplateWithConfigurations][source]

Reads the local model configuration file and returns a list of LocalModelTemplateWithConfigurations objects. The configuration file is in the config/models directory.
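A sketch of the YAML shape such a file might take, inferred from the fields of LocalModelTemplateWithConfigurations. The template name, URL, version tag, and the file-level layout (whether templates are keyed by name) are assumptions; the actual files in config/models are authoritative.

```yaml
my_model_template:            # hypothetical template name
  url: https://github.com/example/model-repo   # placeholder URL
  uses_chapkit: false
  versions:
    v1: "some-commit-or-tag"  # placeholder
  configurations:
    default:                  # mirrors the documented default ModelConfiguration
      user_option_values: {}
      additional_continuous_covariates: []
```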

chap_core.models.local_configuration.parse_local_model_config_from_directory(directory, search_pattern='*.yaml') list[LocalModelTemplateWithConfigurations][source]

Reads the local model configuration files matching search_pattern from the given directory (typically config/models) and returns a list of LocalModelTemplateWithConfigurations objects.

chap_core.models.model_rest_api_wrapper module

NOTE: not used, just an AI draft for exploration!

Wrapper script for creating a REST API for external models.

Idea is that external models can use this in their environment:

pip install chap-core
export TRAIN_CMD="python train.py"
export PREDICT_CMD="python predict.py"
export PORT=8005
chap-runner  # exposes /train, /predict, /jobs/{id}, …

from chap_core.chap_runner import ModelRunner
import uvicorn

def train_fn(payload, files_dir):
    # payload["training_data"] -> path saved by the runner
    # ... train, save artifacts to files_dir ...
    return {"model_uri": "s3://bucket/modelA:v1", "metrics": {"loss": 0.12}}

def predict_fn(payload, files_dir):
    # may write (files_dir / "out.csv")
    return {"preds_uri": str(files_dir / "out.csv")}

app = ModelRunner(train_fn=train_fn, predict_fn=predict_fn).app

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8005)

chap_core.models.model_template module

class chap_core.models.model_template.ExternalModelTemplate(model_template_config: ModelTemplateConfigV2, working_dir: str, ignore_env=False)[source]

Bases: ModelTemplateInterface

This class is instantiated when a model is to be run. It should not be used for parsing MLflow files or building db/REST API objects.

classmethod fetch_config_from_github_url(github_url) ModelTemplateConfigV2[source]
classmethod from_model_template_config(model_template_config: ModelTemplateConfigV2, working_dir: str, ignore_env=False)[source]
property model_template_info: ModelTemplateConfigV2
class chap_core.models.model_template.ModelTemplate(model_template_config: ModelTemplateConfigV2, working_dir: str, ignore_env=False)[source]

Bases: object

Represents a Model Template that can generate concrete models. A template defines the choices allowed for a model.

classmethod from_directory_or_github_url(model_template_path, base_working_dir=PosixPath('runs'), ignore_env=False, run_dir_type='timestamp', is_chapkit_model: bool = False) ModelTemplate[source]

Gets the model template and initializes a working directory with the code for the model. model_template_path can be a local directory or a GitHub URL.

Parameters

model_template_path : str

Path to the model. Can be a local directory or a GitHub URL

base_working_dir : Path, optional

Base directory to store the working directory, by default Path("runs/")

ignore_env : bool, optional

If True, will ignore the environment specified in the MLproject file, by default False

run_dir_type : Literal["timestamp", "latest", "use_existing"], optional

Type of run directory to create, by default "timestamp", which creates a new directory based on the current timestamp for the run. "latest" will create a new directory based on the model name, but will remove any existing directory with the same name. "use_existing" will use the existing directory specified by the model path if it exists; if it does not exist, "latest" will be used.

get_config_class() type[ModelConfiguration][source]

This will probably not be used

get_default_model() ExternalModel[source]
get_model(model_configuration: ModelConfiguration = None) ExternalModel[source]

Returns a model based on the model configuration. The model configuration is an object of the class returned by get_config_class (i.e. specified by the user). If no model configuration is passed, the default choices are used.

Parameters

model_configurationModelConfiguration, optional

The configuration for the model, by default None

Returns

ExternalModel

The model

get_model_configuration_from_yaml(yaml_file: Path) ModelConfiguration[source]
get_train_predict_runner() TrainPredictRunner[source]
property model_template_config
property name

chap_core.models.model_template_interface module

class chap_core.models.model_template_interface.ConfiguredModel[source]

Bases: ABC

abstractmethod predict(historic_data: DataSet, future_data: DataSet) DataSet[source]
abstractmethod train(train_data: DataSet, extra_args=None)[source]
class chap_core.models.model_template_interface.InternalModelTemplate[source]

Bases: ModelTemplateInterface

This is a practical base class for defining model templates in Python. The goal is that it can be used to define model templates usable directly in Python, while also providing functionality for exposing them through the chap/mlflow API.

get_schema()[source]
model_config_class: type[ModelConfiguration]
model_template_info: ModelTemplateInformation
class chap_core.models.model_template_interface.ModelTemplateInterface[source]

Bases: ABC

get_default_model() ConfiguredModel[source]
abstractmethod get_model(model_configuration: ModelConfiguration | None = None) ConfiguredModel[source]
abstractmethod get_schema() ModelTemplateInformation[source]
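
The template/model split above can be sketched with a toy pair. The names and the single scale option here are illustrative inventions, not part of chap_core: a template holds the allowed choices, get_model fixes them into a configured model, and get_default_model falls back to the defaults.

```python
class ScalingModel:
    """Toy configured model: the scale choice is fixed at construction."""

    def __init__(self, scale):
        self.scale = scale

    def predict_value(self, x):
        return self.scale * x

class ScalingModelTemplate:
    """Toy template whose single configurable choice is a scale factor."""

    default_configuration = {"scale": 1.0}

    def get_default_model(self):
        return self.get_model(None)

    def get_model(self, model_configuration=None):
        cfg = model_configuration or self.default_configuration
        return ScalingModel(cfg["scale"])
```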

chap_core.models.utils module

chap_core.models.utils.get_model_from_directory_or_github_url(model_template_path, base_working_dir=PosixPath('runs'), ignore_env=False, run_dir_type: Literal['timestamp', 'latest', 'use_existing'] = 'timestamp', model_configuration_yaml: str = None) ExternalModel[source]

NOTE: This function is deprecated and can be removed in the future.

Gets the model and initializes a working directory with the code for the model. model_template_path can be a local directory or a GitHub URL.

Parameters

model_template_path : str

Path to the model. Can be a local directory or a GitHub URL

base_working_dir : Path, optional

Base directory to store the working directory, by default Path("runs/")

ignore_env : bool, optional

If True, will ignore the environment specified in the MLproject file, by default False

run_dir_type : Literal["timestamp", "latest", "use_existing"], optional

Type of run directory to create, by default "timestamp", which creates a new directory based on the current timestamp for the run. "latest" will create a new directory based on the model name, but will remove any existing directory with the same name. "use_existing" will use the existing directory specified by the model path if it exists; if it does not exist, "latest" will be used.

model_configuration_yaml : str, optional

Path to the model configuration yaml file, by default None. This has to be a yaml file compatible with the model configuration class given by the ModelTemplate.

chap_core.models.utils.get_model_template_from_directory_or_github_url(model_template_path, base_working_dir=PosixPath('runs'), ignore_env=False, run_dir_type='timestamp', is_chapkit_model: bool = False) ModelTemplate[source]

Note: Preferably use ModelTemplate.from_directory_or_github_url instead of calling this function directly. This function may be deprecated in the future.

Gets the model template and initializes a working directory with the code for the model. model_template_path can be a local directory or a GitHub URL.

Parameters

model_template_path : str

Path to the model. Can be a local directory or a GitHub URL

base_working_dir : Path, optional

Base directory to store the working directory, by default Path("runs/")

ignore_env : bool, optional

If True, will ignore the environment specified in the MLproject file, by default False

run_dir_type : Literal["timestamp", "latest", "use_existing"], optional

Type of run directory to create, by default "timestamp", which creates a new directory based on the current timestamp for the run. "latest" will create a new directory based on the model name, but will remove any existing directory with the same name. "use_existing" will use the existing directory specified by the model path if it exists; if it does not exist, "latest" will be used.

chap_core.models.utils.get_model_template_from_mlproject_file(mlproject_file, ignore_env=False, working_dir=None) ModelTemplate[source]

Module contents