chap_core.spatio_temporal_data package¶
Submodules¶
chap_core.spatio_temporal_data.converters module¶
chap_core.spatio_temporal_data.multi_country_dataset module¶
- class chap_core.spatio_temporal_data.multi_country_dataset.LazyMultiCountryDataSet(url, dataclass=<class 'bionumpy.bnpdataclass.bnpdataclass.FullData'>)[source]¶
Bases:
object
chap_core.spatio_temporal_data.temporal_dataclass module¶
- class chap_core.spatio_temporal_data.temporal_dataclass.DataSet(data_dict: dict[str, FeaturesT], polygons=None, metadata=DataSetMetaData(name='dataset', filename=None, db_id=None))[source]¶
Bases:
Generic[FeaturesT]
Class representing several time series at different locations.
- classmethod df_from_pydantic_observations(observations: list[PeriodObservation]) TimeSeriesData[source]¶
- property end_timestamp: Timestamp¶
- classmethod from_csv(file_name: str, dataclass: Type[FeaturesT] | None = None) DataSet[FeaturesT][source]¶
- classmethod from_dict(data: dict, dataclass: type[TemporalDataclass])[source]¶
- classmethod from_fields(dataclass: type[TimeSeriesData], fields: dict[str, DataSet[TimeSeriesArray]])[source]¶
- classmethod from_pandas(df: DataFrame, dataclass: Type[FeaturesT] = None, fill_missing=False) DataSet[FeaturesT][source]¶
Create a SpatioTemporalDict from a pandas dataframe. The dataframe needs to have a ‘location’ column and a ‘time_period’ column. The time_period column needs to contain strings that can be parsed into a period. All fields in the dataclass need to be present in the dataframe. If ‘fill_missing’ is True, missing values will be filled with np.nan. Otherwise, every time series needs to be consecutive.
Parameters¶
- df : pd.DataFrame
The dataframe
- dataclass : Type[FeaturesT]
The dataclass to use for the time series
- fill_missing : bool, optional
If missing values should be filled, by default False
Returns¶
- DataSet[FeaturesT]
The SpatioTemporalDict
Examples¶
>>> import pandas as pd
>>> from chap_core.spatio_temporal_data.temporal_dataclass import DataSet
>>> from chap_core.datatypes import HealthData
>>> df = pd.DataFrame(
...     {
...         "location": ["Oslo", "Oslo", "Bergen", "Bergen"],
...         "time_period": ["2020-01", "2020-02", "2020-01", "2020-02"],
...         "disease_cases": [10, 20, 30, 40],
...     }
... )
>>> DataSet.from_pandas(df, HealthData)
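The difference between fill_missing=True and fill_missing=False can be illustrated without chap_core or pandas. The following is a minimal sketch in plain Python, assuming monthly ‘YYYY-MM’ periods and a single value column; group_rows, parse_month and format_month are hypothetical helpers written for this illustration, not part of the library, and the sketch only fills gaps between each location's first and last period, which may differ from what from_pandas does.

```python
import math
from collections import defaultdict


def parse_month(period: str) -> int:
    """Turn 'YYYY-MM' into a month index so consecutive months differ by 1."""
    year, month = period.split("-")
    return int(year) * 12 + (int(month) - 1)


def format_month(index: int) -> str:
    """Inverse of parse_month."""
    return f"{index // 12:04d}-{index % 12 + 1:02d}"


def group_rows(rows, value_field, fill_missing=False):
    """Group long-format rows by location and check that periods are consecutive.

    Hypothetical helper, not part of chap_core. With fill_missing=True, gaps
    are filled with NaN; otherwise a gap raises ValueError.
    """
    by_location = defaultdict(dict)
    for row in rows:
        by_location[row["location"]][parse_month(row["time_period"])] = row[value_field]

    result = {}
    for location, series in by_location.items():
        first, last = min(series), max(series)
        missing = [m for m in range(first, last + 1) if m not in series]
        if missing and not fill_missing:
            raise ValueError(
                f"{location}: missing periods {[format_month(m) for m in missing]}"
            )
        result[location] = [
            (format_month(m), series.get(m, math.nan)) for m in range(first, last + 1)
        ]
    return result


# Oslo has no row for 2020-02, so its series is not consecutive.
rows = [
    {"location": "Oslo", "time_period": "2020-01", "disease_cases": 10},
    {"location": "Oslo", "time_period": "2020-03", "disease_cases": 30},
    {"location": "Bergen", "time_period": "2020-01", "disease_cases": 30},
    {"location": "Bergen", "time_period": "2020-02", "disease_cases": 40},
]

filled = group_rows(rows, "disease_cases", fill_missing=True)
```

Calling group_rows(rows, "disease_cases") with the default fill_missing=False raises ValueError for the same input, which mirrors the requirement that the time series be consecutive.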
- classmethod from_period_observations(observation_dict: dict[str, list[PeriodObservation]]) DataSet[TimeSeriesData][source]¶
Create a SpatioTemporalDict from a dictionary of PeriodObservations. The keys are the location names, and the values are lists of PeriodObservations.
Parameters¶
- observation_dict : dict[str, list[PeriodObservation]]
The dictionary of observations
Returns¶
- DataSet[TimeSeriesData]
The SpatioTemporalDict
Examples¶
>>> from chap_core.spatio_temporal_data.temporal_dataclass import DataSet
>>> from chap_core.api_types import PeriodObservation
>>> class HealthObservation(PeriodObservation):
...     disease_cases: int
>>> observations = {
...     "Oslo": [
...         HealthObservation(time_period="2020-01", disease_cases=10),
...         HealthObservation(time_period="2020-02", disease_cases=20),
...     ]
... }
>>> dataset = DataSet.from_period_observations(observations)
>>> dataset.to_pandas()
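The input shape described for from_period_observations (location names as keys, lists of observations as values) corresponds to a long table with one row per location and period. A minimal stdlib sketch of that flattening follows; Observation and to_long_rows are hypothetical names used only for illustration, and whether the library builds such rows internally is not stated in this reference.

```python
from dataclasses import dataclass, asdict


@dataclass
class Observation:
    """Hypothetical stand-in for a PeriodObservation subclass."""
    time_period: str
    disease_cases: int


def to_long_rows(observation_dict):
    """Flatten {location: [observations]} into one row per location and period."""
    rows = []
    for location, observations in observation_dict.items():
        for obs in observations:
            # Each row carries the location plus every field of the observation.
            rows.append({"location": location, **asdict(obs)})
    return rows


observations = {
    "Oslo": [
        Observation(time_period="2020-01", disease_cases=10),
        Observation(time_period="2020-02", disease_cases=20),
    ]
}
rows = to_long_rows(observations)
```

The resulting rows have the ‘location’ and ‘time_period’ keys that the from_pandas description requires as columns.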
- join_on_time(other: DataSet[FeaturesT]) DataSet[Tuple[FeaturesT, FeaturesT]][source]¶
Join two SpatioTemporalDicts on time. Returns a new SpatioTemporalDict. Assumes other is later in time.
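The reference does not spell out what joining on time means. One plausible reading, given the assumption that other is later in time, is a per-location concatenation along the time axis. The sketch below shows that reading on plain dicts; the function and its data layout are hypothetical and are not the library's implementation.

```python
def join_on_time(first, other):
    """Concatenate per-location series, assuming `other` starts after `first` ends.

    Hypothetical sketch on plain dicts of {location: [(period, value), ...]}.
    Periods are zero-padded 'YYYY-MM' strings, so string order is time order.
    """
    joined = {}
    for location, series in first.items():
        later = other[location]
        if series and later and later[0][0] <= series[-1][0]:
            raise ValueError(f"{location}: other must be later in time")
        joined[location] = series + later
    return joined


train = {"Oslo": [("2020-01", 10), ("2020-02", 20)]}
future = {"Oslo": [("2020-03", 30)]}
combined = join_on_time(train, future)
```

Swapping the arguments makes the check fail, since the second series would then start before the first one ends.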
- property period_range: PeriodRange¶
- property polygons¶
- set_polygons(polygons: FeatureCollectionModel, ignore_validation=False) list[str][source]¶
- property start_timestamp: Timestamp¶
- class chap_core.spatio_temporal_data.temporal_dataclass.DataSetMetaData(*, name: str = 'dataset', filename: str | None = None, db_id: int | None = None)[source]¶
Bases:
BaseModel
- db_id: int | None¶
- filename: str | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- name: str¶
- class chap_core.spatio_temporal_data.temporal_dataclass.TemporalDataclass(data: FeaturesT)[source]¶
Bases:
Generic[FeaturesT]
Wraps a dataclass in an object that can be sliced by time period. Call .data() to get the data back.
- property end_timestamp: Timestamp¶
- fill_to_endpoint(end_time_stamp: TimeStamp) TemporalDataclass[FeaturesT][source]¶
- restrict_time_period(period_range: slice) TemporalDataclass[FeaturesT][source]¶
- property start_timestamp: Timestamp¶
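The idea behind TemporalDataclass, a wrapper that can be restricted to a time window and then unwrapped with .data(), can be sketched with the standard library alone. TimeSlicedSeries below is a hypothetical class for illustration only; it assumes zero-padded ‘YYYY-MM’ period strings and treats both slice endpoints as inclusive, neither of which is confirmed by this reference.

```python
from dataclasses import dataclass
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class TimeSlicedSeries(Generic[T]):
    """Hypothetical wrapper: parallel lists of periods and values, sliceable by period."""

    periods: List[str]  # zero-padded 'YYYY-MM' strings, sorted ascending
    values: List[T]

    def restrict_time_period(self, period_range: slice) -> "TimeSlicedSeries[T]":
        """Keep entries with start <= period <= stop; None leaves that end open."""
        start: Optional[str] = period_range.start
        stop: Optional[str] = period_range.stop
        keep = [
            i
            for i, p in enumerate(self.periods)
            if (start is None or p >= start) and (stop is None or p <= stop)
        ]
        return TimeSlicedSeries(
            [self.periods[i] for i in keep], [self.values[i] for i in keep]
        )

    def data(self) -> List[T]:
        """Return the wrapped values."""
        return self.values

    @property
    def start_timestamp(self) -> str:
        return self.periods[0]

    @property
    def end_timestamp(self) -> str:
        return self.periods[-1]


series = TimeSlicedSeries(["2020-01", "2020-02", "2020-03", "2020-04"], [10, 20, 30, 40])
window = series.restrict_time_period(slice("2020-02", "2020-03"))
```

Passing a slice object keeps the call shape of restrict_time_period(period_range: slice) shown in the signature, and an open end such as slice(None, "2020-01") selects everything up to that period.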