chap_core.models package¶
Submodules¶
chap_core.models.chapkit_rest_api_wrapper module¶
Synchronous REST API wrapper for the CHAP service. Provides synchronous methods for all available API endpoints.
NOTE: Written by AI as a prototype. TODO: refactor and clean up once working.
- class chap_core.models.chapkit_rest_api_wrapper.CHAPKitRestAPIWrapper(base_url: str = 'http://localhost:8001', timeout: int = 7200)[source]¶
Bases: object
Synchronous client for interacting with the CHAP REST API.
- create_config(config: Dict[str, Any]) Dict[str, Any][source]¶
Create or replace a model configuration
- Args:
config: Configuration dictionary
- Returns:
Created configuration with ID
- delete_artifact(artifact_id: str) None[source]¶
Delete an artifact by ID
- Args:
artifact_id: Artifact ID to delete
- delete_config(config_id: str) None[source]¶
Delete a configuration by ID
- Args:
config_id: Configuration ID to delete
- delete_job(job_id: str) None[source]¶
Cancel (if running) and delete a job
- Args:
job_id: Job ID to delete
- get_artifact(artifact_id: str) Dict[str, Any][source]¶
Get a specific artifact by ID
- Args:
artifact_id: Artifact ID
- Returns:
Artifact info object
- get_artifact_config(artifact_id: str) Dict[str, Any][source]¶
Get the configuration associated with an artifact
- Args:
artifact_id: Artifact ID
- Returns:
Configuration object linked to the artifact
- get_artifact_expand(artifact_id: str) Dict[str, Any][source]¶
Get artifact with expanded data
- Args:
artifact_id: Artifact ID
- Returns:
Expanded artifact object
- get_artifact_tree_by_id(artifact_id: str) Dict[str, Any][source]¶
Get artifact tree starting from a specific artifact
- Args:
artifact_id: Artifact ID
- Returns:
Artifact tree with nested children
- get_artifacts_for_config(config_id: str) List[Dict[str, Any]][source]¶
Get all artifacts linked to a configuration
- Args:
config_id: Configuration ID
- Returns:
List of artifact info objects
- get_config(config_id: str) Dict[str, Any][source]¶
Get a specific configuration by ID
- Args:
config_id: Configuration ID
- Returns:
Configuration object
- get_config_artifacts(config_id: str) List[Dict[str, Any]][source]¶
Get all artifacts linked to a configuration
- Args:
config_id: Configuration ID
- Returns:
List of artifact objects linked to the configuration
- get_config_schema() Dict[str, Any][source]¶
Get JSON Schema for model configuration
- Returns:
JSON Schema for configuration model
- get_job(job_id: str) Dict[str, Any][source]¶
Get full job record by ID
- Args:
job_id: Job ID
- Returns:
Job record with status, times, error info, etc.
- get_jobs(status: str | None = None) List[Dict[str, Any]][source]¶
Get all jobs, optionally filtered by status
- Args:
status: Optional status filter (‘pending’, ‘running’, ‘completed’, ‘failed’, ‘canceled’)
- Returns:
List of job records
- health() Dict[str, str][source]¶
Check service health status
- Returns:
Dict with status field (‘healthy’)
- info() Dict[str, Any][source]¶
Get system information
- Returns:
System info including name, version, description, etc.
- link_artifact_to_config(config_id: str, artifact_id: str) Dict[str, Any][source]¶
Link an artifact to a configuration
- Args:
config_id: Configuration ID
artifact_id: Artifact ID to link
- Returns:
Updated configuration or confirmation
- list_configs() List[Dict[str, Any]][source]¶
List all model configurations
- Returns:
List of model configuration objects
- poll_job(job_id: str, timeout: int | None = None) Dict[str, Any][source]¶
Simple polling method that waits for a job to complete
- Args:
job_id: Job ID to poll
timeout: Maximum seconds to wait (None for no timeout)
- Returns:
Final job status when completed
- predict(model_artifact_id: str, future_data: DataFrame, historic_data: DataFrame | None = None, geo_features: Dict[str, Any] | None = None) Dict[str, str][source]¶
Make predictions with a trained model
- Args:
model_artifact_id: Trained model artifact ID
future_data: Future covariates as pandas DataFrame
historic_data: Optional historical data as pandas DataFrame
geo_features: Optional GeoJSON FeatureCollection
- Returns:
Dict with job_id and prediction_artifact_id
- predict_and_wait(model_artifact_id: str, future_data: DataFrame, historic_data: DataFrame | None = None, geo_features: Dict[str, Any] | None = None, timeout: int | None = 7200) Dict[str, Any][source]¶
Make predictions and wait for completion
- Args:
model_artifact_id: Trained model artifact ID
future_data: Future covariates
historic_data: Optional historical data
geo_features: Optional GeoJSON features
timeout: Maximum seconds to wait
- Returns:
Dict with job record and prediction_artifact_id
- train(config_id: str, data: DataFrame, geo_features: Dict[str, Any] | None = None) Dict[str, str][source]¶
Train a model with data
- Args:
config_id: Configuration ID to use for training
data: Training data as pandas DataFrame
geo_features: Optional GeoJSON FeatureCollection
- Returns:
Dict with job_id and model_artifact_id
- train_and_wait(config_id: str, data: DataFrame, geo_features: Dict[str, Any] | None = None, timeout: int | None = 300) Dict[str, Any][source]¶
Train a model and wait for completion
- Args:
config_id: Configuration ID
data: Training data
geo_features: Optional GeoJSON features
timeout: Maximum seconds to wait
- Returns:
Dict with job record and model_artifact_id
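Taken together, train_and_wait and predict_and_wait support a simple end-to-end workflow. A minimal sketch of that flow, where run_forecast is a hypothetical helper and client is assumed to be any object exposing the two methods documented above (such as a CHAPKitRestAPIWrapper instance):

```python
# Hypothetical end-to-end helper: train a model, then forecast with it.
# `client` is assumed to expose the train_and_wait / predict_and_wait
# methods documented above; the key names follow the documented returns.
def run_forecast(client, config_id, training_df, future_df):
    train_result = client.train_and_wait(config_id, training_df)
    model_id = train_result["model_artifact_id"]
    predict_result = client.predict_and_wait(model_id, future_df)
    return predict_result["prediction_artifact_id"]
```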
- unlink_artifact_from_config(config_id: str, artifact_id: str) Dict[str, Any][source]¶
Unlink an artifact from a configuration
- Args:
config_id: Configuration ID
artifact_id: Artifact ID to unlink
- Returns:
Updated configuration or confirmation
- update_config(config_id: str, config: Dict[str, Any]) Dict[str, Any][source]¶
Update a configuration by ID
- Args:
config_id: Configuration ID
config: Updated configuration dictionary
- Returns:
Updated configuration
- wait_for_job(job_id: str, poll_interval: int = 2, timeout: int | None = None) Dict[str, Any][source]¶
Wait for a job to complete
- Args:
job_id: Job ID to monitor
poll_interval: Seconds between status checks
timeout: Maximum seconds to wait (None for no timeout)
- Returns:
Final job status
- Raises:
TimeoutError: If job doesn’t complete within timeout
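The polling contract described above can be sketched generically. This is a hypothetical stand-in, not the actual implementation: get_status plays the role of self.get_job, and the terminal status names are taken from the get_jobs documentation above.

```python
import time

# Generic sketch of the wait_for_job polling contract (hypothetical
# stand-in; `get_status` plays the role of self.get_job).
def wait_for(get_status, poll_interval=2, timeout=None):
    start = time.monotonic()
    while True:
        job = get_status()
        # Stop as soon as the job reaches a terminal status.
        if job.get("status") in ("completed", "failed", "canceled"):
            return job
        # Give up once the deadline has passed (None means wait forever).
        if timeout is not None and time.monotonic() - start > timeout:
            raise TimeoutError("job did not complete within timeout")
        time.sleep(poll_interval)
```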
chap_core.models.configured_model module¶
- class chap_core.models.configured_model.ConfiguredModel[source]¶
Bases: ABC
A ConfiguredModel is the main interface for all models in the Chap framework. A configured model is different from a model template in that it is configured with specific hyperparameters and/or other choices. While a ModelTemplate is flexible with choices, a ConfiguredModel has fixed choices and parameters. See ExternalModel for an example of a ConfiguredModel.
- class chap_core.models.configured_model.ModelConfiguration[source]¶
Bases: BaseModel
Base class used for configuration that a ModelTemplate takes for creating specific Models.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
chap_core.models.external_chapkit_model module¶
- class chap_core.models.external_chapkit_model.ExternalChapkitModel(model_name: str, rest_api_url: str, configuration_id: str)[source]¶
Bases: ExternalModelBase
- class chap_core.models.external_chapkit_model.ExternalChapkitModelTemplate(rest_api_url: str)[source]¶
Bases: object
Wrapper around external models that are based on chapkit.
Note that get_model assumes you have already created a configuration with that specific chapkit model.
This class is meant to be backwards compatible with ExternalModelTemplate.
- get_model(model_configuration: dict) ExternalChapkitModel[source]¶
Sends the model configuration to the model's REST API for storage. The API returns a configuration ID that can be used to identify the model.
- get_model_template_config() ModelTemplateConfigV2[source]¶
This method is meant to keep things backwards compatible with the old system. An object of type ModelTemplateConfigV2 is needed to store info about a ModelTemplate in the database.
- property name¶
This returns a unique name for the model. In the future, this might be some sort of ID given by the model.
chap_core.models.external_model module¶
- class chap_core.models.external_model.ExternalModel(runner, name: str = None, adapters=None, working_dir=None, data_type=<class 'bionumpy.bnpdataclass.bnpdataclass.HealthData'>, configuration: ~chap_core.database.model_templates_and_config_tables.ModelConfiguration | None = None)[source]¶
Bases: ExternalModelBase
An ExternalModel is a specific implementation of a ConfiguredModel that represents a model that is "external" in the sense that it needs to be run through a runner (e.g. a DockerRunner). This class is typically used for external models developed outside of Chap, and gives such models an interface with methods like train and predict so that they are compatible with Chap.
- property configuration¶
- property name¶
- property optional_fields¶
- property required_fields¶
- class chap_core.models.external_model.ExternalModelBase[source]¶
Bases: ConfiguredModel
A base class for external models that provides some utility methods.
chap_core.models.external_web_model module¶
- class chap_core.models.external_web_model.ExternalWebModel(api_url: str, name: str = None, timeout: int = 3600, poll_interval: int = 5, configuration: dict | None = None, adapters: dict | None = None, working_dir: str = './')[source]¶
Bases: ExternalModelBase
Wrapper for a ConfiguredModel that can only be run through a web service, defined by a URL. This web service supports a strict REST API that allows for training and prediction. This class makes such a model available through the ConfiguredModel interface, with train and predict methods.
- property configuration¶
- property name¶
chap_core.models.local_configuration module¶
- class chap_core.models.local_configuration.LocalModelTemplateWithConfigurations(*, url: str, uses_chapkit: bool = False, versions: dict[str, str], configurations: dict[str, ModelConfiguration] = {'default': ModelConfiguration(user_option_values={}, additional_continuous_covariates=[])})[source]¶
Bases: BaseModel
Class used only for parsing ModelTemplate definitions from config/models/*.yaml files.
- configurations: dict[str, ModelConfiguration]¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- url: str¶
- uses_chapkit: bool¶
- versions: dict[str, str]¶
- chap_core.models.local_configuration.parse_local_model_config_file(file_name) list[LocalModelTemplateWithConfigurations][source]¶
Reads a local model configuration file and returns a list of LocalModelTemplateWithConfigurations objects. The configuration file is typically in the config/models directory.
- chap_core.models.local_configuration.parse_local_model_config_from_directory(directory, search_pattern='*.yaml') list[LocalModelTemplateWithConfigurations][source]¶
Reads the local model configuration files matching search_pattern from the given directory (typically config/models) and returns a list of LocalModelTemplateWithConfigurations objects.
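Based on the fields of LocalModelTemplateWithConfigurations above, a config/models YAML entry might look roughly like this. The structure and every value here are illustrative assumptions derived from the field names, not a documented schema:

```yaml
# Hypothetical config/models entry; field names follow
# LocalModelTemplateWithConfigurations, all values are made up.
- url: https://github.com/example-org/example_model
  uses_chapkit: false
  versions:
    stable: v1.0.0
  configurations:
    default:
      user_option_values: {}
      additional_continuous_covariates: []
```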
chap_core.models.model_rest_api_wrapper module¶
NOTE: not used, just an AI draft for exploration!
Wrapper script for creating a REST API for external models.
The idea is that external models can use this in their environment:

pip install chap-core
export TRAIN_CMD="python train.py"
export PREDICT_CMD="python predict.py"
export PORT=8005
chap-runner  # exposes /train, /predict, /jobs/{id}, …

from chap_core.chap_runner import ModelRunner
import uvicorn

def train_fn(payload, files_dir):
    # payload["training_data"] -> path saved by the runner
    # … train, save artifacts to files_dir …
    return {"model_uri": "s3://bucket/modelA:v1", "metrics": {"loss": 0.12}}

def predict_fn(payload, files_dir):
    # may write (files_dir / "out.csv")
    return {"preds_uri": str(files_dir / "out.csv")}

app = ModelRunner(train_fn=train_fn, predict_fn=predict_fn).app

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8005)
chap_core.models.model_template module¶
- class chap_core.models.model_template.ExternalModelTemplate(model_template_config: ModelTemplateConfigV2, working_dir: str, ignore_env=False)[source]¶
Bases: ModelTemplateInterface
This class is instantiated when a model is to be run. For parsing mlflow files and putting them into db/rest-api objects, this class should not be used.
- classmethod fetch_config_from_github_url(github_url) ModelTemplateConfigV2[source]¶
- classmethod from_model_template_config(model_template_config: ModelTemplateConfigV2, working_dir: str, ignore_env=False)[source]¶
- property model_template_info: ModelTemplateConfigV2¶
- class chap_core.models.model_template.ModelTemplate(model_template_config: ModelTemplateConfigV2, working_dir: str, ignore_env=False)[source]¶
Bases: object
Represents a Model Template that can generate concrete models. A template defines the choices allowed for a model.
- classmethod from_directory_or_github_url(model_template_path, base_working_dir=PosixPath('runs'), ignore_env=False, run_dir_type='timestamp', is_chapkit_model: bool = False) ModelTemplate[source]¶
Gets the model template and initializes a working directory with the code for the model. model_template_path can be a local directory or a GitHub URL.
Parameters¶
- model_template_path : str
Path to the model. Can be a local directory or a GitHub URL
- base_working_dir : Path, optional
Base directory to store the working directory, by default Path("runs/")
- ignore_env : bool, optional
If True, will ignore the environment specified in the MLproject file, by default False
- run_dir_type : Literal["timestamp", "latest", "use_existing"], optional
Type of run directory to create, by default "timestamp", which creates a new directory based on the current timestamp for the run. "latest" will create a new directory based on the model name, but will remove any existing directory with the same name. "use_existing" will use the existing directory specified by the model path if that exists. If that does not exist, "latest" will be used.
- get_config_class() type[ModelConfiguration][source]¶
This will probably not be used
- get_default_model() ExternalModel[source]¶
- get_model(model_configuration: ModelConfiguration = None) ExternalModel[source]¶
Returns a model based on the model configuration. The model configuration is an object of the class returned by get_config_class (i.e. specified by the user). If no model configuration is passed, the default choices are used.
Parameters¶
- model_configuration : ModelConfiguration, optional
The configuration for the model, by default None
Returns¶
- ExternalModel
The model
- get_model_configuration_from_yaml(yaml_file: Path) ModelConfiguration[source]¶
- get_train_predict_runner() TrainPredictRunner[source]¶
- property model_template_config¶
- property name¶
chap_core.models.model_template_interface module¶
- class chap_core.models.model_template_interface.InternalModelTemplate[source]¶
Bases: ModelTemplateInterface
This is a practical base class for defining model templates in Python. The goal is that this can be used to define model templates that can be used directly in Python, but also provide functionality for exposing them through the chap/mlflow API.
- model_config_class: type[ModelConfiguration]¶
- model_template_info: ModelTemplateInformation¶
- class chap_core.models.model_template_interface.ModelTemplateInterface[source]¶
Bases: ABC
- get_default_model() ConfiguredModel[source]¶
- abstractmethod get_model(model_configuration: ModelConfiguration | None = None) ConfiguredModel[source]¶
- abstractmethod get_schema() ModelTemplateInformation[source]¶
chap_core.models.utils module¶
- chap_core.models.utils.get_model_from_directory_or_github_url(model_template_path, base_working_dir=PosixPath('runs'), ignore_env=False, run_dir_type: Literal['timestamp', 'latest', 'use_existing'] = 'timestamp', model_configuration_yaml: str = None) ExternalModel[source]¶
NOTE: This function is deprecated and can be removed in the future.
Gets the model and initializes a working directory with the code for the model. model_template_path can be a local directory or a GitHub URL.
Parameters¶
- model_template_path : str
Path to the model. Can be a local directory or a GitHub URL
- base_working_dir : Path, optional
Base directory to store the working directory, by default Path("runs/")
- ignore_env : bool, optional
If True, will ignore the environment specified in the MLproject file, by default False
- run_dir_type : Literal["timestamp", "latest", "use_existing"], optional
Type of run directory to create, by default "timestamp", which creates a new directory based on the current timestamp for the run. "latest" will create a new directory based on the model name, but will remove any existing directory with the same name. "use_existing" will use the existing directory specified by the model path if that exists. If that does not exist, "latest" will be used.
- model_configuration_yaml : str, optional
Path to the model configuration yaml file, by default None. This has to be a yaml that is compatible with the model configuration class given by the ModelTemplate.
- chap_core.models.utils.get_model_template_from_directory_or_github_url(model_template_path, base_working_dir=PosixPath('runs'), ignore_env=False, run_dir_type='timestamp', is_chapkit_model: bool = False) ModelTemplate[source]¶
Note: Preferably use ModelTemplate.from_directory_or_github_url instead of using this function directly. This function may be deprecated in the future.
Gets the model template and initializes a working directory with the code for the model. model_template_path can be a local directory or a GitHub URL.
Parameters¶
- model_template_path : str
Path to the model. Can be a local directory or a GitHub URL
- base_working_dir : Path, optional
Base directory to store the working directory, by default Path("runs/")
- ignore_env : bool, optional
If True, will ignore the environment specified in the MLproject file, by default False
- run_dir_type : Literal["timestamp", "latest", "use_existing"], optional
Type of run directory to create, by default "timestamp", which creates a new directory based on the current timestamp for the run. "latest" will create a new directory based on the model name, but will remove any existing directory with the same name. "use_existing" will use the existing directory specified by the model path if that exists. If that does not exist, "latest" will be used.
- chap_core.models.utils.get_model_template_from_mlproject_file(mlproject_file, ignore_env=False, working_dir=None) ModelTemplate[source]¶