chap_core.spatio_temporal_data package

Submodules

chap_core.spatio_temporal_data.converters module

chap_core.spatio_temporal_data.converters.dataset_model_to_dataset(dataset: DataSet)[source]
chap_core.spatio_temporal_data.converters.observations_to_dataset(dataclass, observations, fill_missing=False)[source]

chap_core.spatio_temporal_data.multi_country_dataset module

class chap_core.spatio_temporal_data.multi_country_dataset.LazyMultiCountryDataSet(url, dataclass=<class 'bionumpy.bnpdataclass.bnpdataclass.FullData'>)[source]

Bases: object

items()[source]
class chap_core.spatio_temporal_data.multi_country_dataset.MultiCountryDataSet(data: dict[str, DataSet])[source]

Bases: object

property countries
classmethod from_folder(folder_path, dataclass=<class 'bionumpy.bnpdataclass.bnpdataclass.FullData'>)[source]
classmethod from_tar(url, dataclass=<class 'bionumpy.bnpdataclass.bnpdataclass.FullData'>)[source]
items()[source]
keys()[source]
property period_range
restrict_time_period(time_period)[source]

chap_core.spatio_temporal_data.temporal_dataclass module

class chap_core.spatio_temporal_data.temporal_dataclass.DataSet(data_dict: dict[str, FeaturesT], polygons=None, metadata=DataSetMetaData(name='dataset', filename=None, db_id=None))[source]

Bases: Generic[FeaturesT]

Class representing several time series at different locations.

add_fields(new_type, **kwargs: dict[str, Callable])[source]
aggregate_to_parent(field_name: str = 'disease_cases', nan_indicator='disease_cases')[source]
data() Iterable[FeaturesT][source]
classmethod df_from_pydantic_observations(observations: list[PeriodObservation]) TimeSeriesData[source]
property end_timestamp: Timestamp
field_names()[source]
filter_locations(locations: Iterable[str]) DataSet[FeaturesT][source]
classmethod from_csv(file_name: str, dataclass: Type[FeaturesT] | None = None) DataSet[FeaturesT][source]
classmethod from_dict(data: dict, dataclass: type[TemporalDataclass])[source]
classmethod from_fields(dataclass: type[TimeSeriesData], fields: dict[str, DataSet[TimeSeriesArray]])[source]
classmethod from_file(file_name: str, dataclass: Type[FeaturesT]) DataSet[FeaturesT][source]
classmethod from_pandas(df: DataFrame, dataclass: Type[FeaturesT] = None, fill_missing=False) DataSet[FeaturesT][source]

Create a SpatioTemporalDict from a pandas dataframe. The dataframe needs to have a ‘location’ column and a ‘time_period’ column. The time_period column needs to contain strings that can be parsed into a period. All fields in the dataclass need to be present in the dataframe. If ‘fill_missing’ is True, missing values will be filled with np.nan. Otherwise, every time series needs to be consecutive.

Parameters

df : pd.DataFrame

The dataframe

dataclass : Type[FeaturesT]

The dataclass to use for the time series

fill_missing : bool, optional

If missing values should be filled, by default False

Returns

DataSet[FeaturesT]

The SpatioTemporalDict

Examples

>>> import pandas as pd
>>> from chap_core.spatio_temporal_data.temporal_dataclass import DataSet
>>> from chap_core.datatypes import HealthData
>>> df = pd.DataFrame(
...     {
...         "location": ["Oslo", "Oslo", "Bergen", "Bergen"],
...         "time_period": ["2020-01", "2020-02", "2020-01", "2020-02"],
...         "disease_cases": [10, 20, 30, 40],
...     }
... )
>>> DataSet.from_pandas(df, HealthData)
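
The difference between a consecutive series and one with gaps filled by NaN can be illustrated without the library. The following is a minimal standard-library sketch of the idea behind ‘fill_missing’, assuming monthly periods written as ‘YYYY-MM’ strings. The helper names `month_range` and `fill_missing_periods` are hypothetical and are not part of chap_core; the library's own handling may differ.

```python
import math


def month_range(start: str, end: str) -> list[str]:
    """Return every 'YYYY-MM' period from start to end, inclusive."""
    year, month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    periods = []
    while (year, month) <= (end_year, end_month):
        periods.append(f"{year:04d}-{month:02d}")
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return periods


def fill_missing_periods(observed: dict[str, float]) -> dict[str, float]:
    """Return a consecutive monthly series, using NaN for absent periods."""
    periods = sorted(observed)  # 'YYYY-MM' strings sort chronologically
    full = month_range(periods[0], periods[-1])
    return {period: observed.get(period, math.nan) for period in full}


# Hypothetical input: 2020-02 is missing for this location.
oslo = {"2020-01": 10.0, "2020-03": 30.0}
filled = fill_missing_periods(oslo)
print(filled)
```

With ‘fill_missing’ left at False, input like `oslo` above would not qualify as consecutive, which is the case the docstring warns about.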
classmethod from_period_observations(observation_dict: dict[str, list[PeriodObservation]]) DataSet[TimeSeriesData][source]

Create a SpatioTemporalDict from a dictionary of PeriodObservations. The keys are the location names, and the values are lists of PeriodObservations.

Parameters

observation_dict : dict[str, list[PeriodObservation]]

The dictionary of observations

Returns

DataSet[TimeSeriesData]

The SpatioTemporalDict

Examples

>>> from chap_core.spatio_temporal_data.temporal_dataclass import DataSet
>>> from chap_core.api_types import PeriodObservation
>>> class HealthObservation(PeriodObservation):
...     disease_cases: int
>>> observations = {
...     "Oslo": [
...         HealthObservation(time_period="2020-01", disease_cases=10),
...         HealthObservation(time_period="2020-02", disease_cases=20),
...     ]
... }
>>> dataset = DataSet.from_period_observations(observations)
>>> dataset.to_pandas()
classmethod from_pickle(file_name: str, dataclass: Type[FeaturesT]) DataSet[FeaturesT][source]
get_location(location: Location) FeaturesT[source]
get_locations(location: Iterable[Location]) DataSet[FeaturesT][source]
get_parent_dict() dict[str, str] | None[source]
interpolate(field_names=None)[source]
items() Iterable[Tuple[str, FeaturesT]][source]
join_on_time(other: DataSet[FeaturesT]) DataSet[Tuple[FeaturesT, FeaturesT]][source]

Join two SpatioTemporalDicts on time. Returns a new SpatioTemporalDict. Assumes other is later in time.
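
One plausible reading of "assumes other is later in time" is that each location's series is extended with the later series for the same location. The sketch below illustrates that reading with plain dictionaries. It is not chap_core's implementation: the function name `join_series_on_time`, the restriction to shared locations, and the ordering check are all assumptions made for illustration.

```python
Series = list[tuple[str, float]]  # (period, value) pairs in time order


def join_series_on_time(first: dict[str, Series],
                        second: dict[str, Series]) -> dict[str, Series]:
    """Append each location's later series to its earlier one."""
    joined = {}
    # Assumption: only locations present in both inputs are joined.
    for location in first:
        if location not in second:
            continue
        earlier, later = first[location], second[location]
        # Assumption: reject input where the second series does not
        # start strictly after the first one ends.
        if earlier and later and later[0][0] <= earlier[-1][0]:
            raise ValueError(f"{location}: second series must start after the first ends")
        joined[location] = earlier + later
    return joined


history = {"Oslo": [("2020-01", 10.0), ("2020-02", 20.0)],
           "Bergen": [("2020-01", 30.0)]}
future = {"Oslo": [("2020-03", 30.0)]}
joined = join_series_on_time(history, future)
print(joined)
```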

keys() Iterable[str][source]
locations() Iterable[Location][source]
merge(other_dataset: DataSet, result_dataclass: type[TimeSeriesData]) DataSet[source]
model_dump()[source]
property period_range: PeriodRange
plot()[source]
plot_aggregate()[source]
property polygons
remove_field(field_name, new_class=None)[source]
resample(freq)[source]
restrict_time_period(period_range: slice) DataSet[FeaturesT][source]
set_polygons(polygons: FeatureCollectionModel, ignore_validation=False) list[str][source]
property start_timestamp: Timestamp
to_csv(file_name: str, mode='w')[source]
to_pandas() DataFrame[source]

Join the pandas frames for all locations into a single dataframe, with the location as a column.
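
The resulting layout is a long-format table: one row per location and time period, with the location stored in its own column. The sketch below shows that reshaping using plain Python records instead of pandas, so it runs without any dependencies. The function name `to_long_rows` and the sample values are hypothetical; the column names follow the ‘location’ and ‘time_period’ columns described for from_pandas.

```python
def to_long_rows(series_by_location: dict[str, list[dict]]) -> list[dict]:
    """Flatten per-location records into rows that carry a 'location' key."""
    rows = []
    for location, records in series_by_location.items():
        for record in records:
            # Copy the record and tag it with the location it came from.
            rows.append({**record, "location": location})
    return rows


data = {
    "Oslo": [{"time_period": "2020-01", "disease_cases": 10},
             {"time_period": "2020-02", "disease_cases": 20}],
    "Bergen": [{"time_period": "2020-01", "disease_cases": 30},
               {"time_period": "2020-02", "disease_cases": 40}],
}
rows = to_long_rows(data)
for row in rows:
    print(row)
```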

to_pickle(file_name: str)[source]
to_report(pdf_filename: str)[source]
values() Iterable[FeaturesT][source]
class chap_core.spatio_temporal_data.temporal_dataclass.DataSetMetaData(*, name: str = 'dataset', filename: str | None = None, db_id: int | None = None)[source]

Bases: BaseModel

db_id: int | None
filename: str | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

name: str
class chap_core.spatio_temporal_data.temporal_dataclass.Polygon[source]

Bases: object

class chap_core.spatio_temporal_data.temporal_dataclass.TemporalDataclass(data: FeaturesT)[source]

Bases: Generic[FeaturesT]

Wraps a dataclass in an object that can be sliced by time period. Call .data() to get the data back.
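
To make "sliced by time period" concrete, here is a small standalone sketch of a wrapper whose restrict_time_period method takes a `slice` of period bounds, mirroring the signature listed for this class. The class `MonthlySeries` is hypothetical. Treating both bounds as inclusive and comparing ‘YYYY-MM’ strings directly are assumptions of this sketch, not documented behaviour of TemporalDataclass.

```python
from dataclasses import dataclass


@dataclass
class MonthlySeries:
    periods: list[str]   # 'YYYY-MM' strings in chronological order
    values: list[float]

    def restrict_time_period(self, period_range: slice) -> "MonthlySeries":
        """Keep entries whose period lies within the slice bounds."""
        start, stop = period_range.start, period_range.stop
        # Assumption: an open bound (None) means no limit on that side,
        # and both bounds are inclusive.
        keep = [
            i for i, period in enumerate(self.periods)
            if (start is None or period >= start)
            and (stop is None or period <= stop)
        ]
        return MonthlySeries([self.periods[i] for i in keep],
                             [self.values[i] for i in keep])


series = MonthlySeries(["2020-01", "2020-02", "2020-03"], [10.0, 20.0, 30.0])
recent = series.restrict_time_period(slice("2020-02", None))
print(recent)
```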

data() Iterable[FeaturesT][source]
property end_timestamp: Timestamp
fill_to_endpoint(end_time_stamp: TimeStamp) TemporalDataclass[FeaturesT][source]
fill_to_range(start_timestamp, end_timestamp)[source]
join(other)[source]
restrict_time_period(period_range: slice) TemporalDataclass[FeaturesT][source]
property start_timestamp: Timestamp
to_pandas() DataFrame[source]

Module contents