API Reference
- class chelo.CheLoDataset(selected_features: List[str] | None = None, selected_targets: List[str] | None = None)[source]
Bases:
ABC
Abstract base class for datasets.
- __init__(selected_features: List[str] | None = None, selected_targets: List[str] | None = None) None[source]
Initialize the dataset with optional selected features and targets.
- Parameters:
selected_features – List of features to select (default: all).
selected_targets – List of targets to select (default: all).
- abstractmethod get_dataset_info() Dict[str, Any][source]
Provide metadata about the dataset (e.g., source, size, description).
- get_features_shape()[source]
Returns the shape of the dataset’s feature data.
- Returns:
Tuple representing the feature data shape.
- get_target_shape()[source]
Returns the shape of the dataset’s target data.
- Returns:
Tuple representing the target data shape.
- abstractmethod load_data() None[source]
Load the dataset and populate self.raw_features and self.raw_targets.
- preview(n: int = 5) Dict[str, Dict[str, List[Any]]][source]
Preview the first n rows of the dataset.
- Returns:
A dictionary with the first few rows.
- select_features(feature_names: List[str]) None[source]
Dynamically select features from the dataset.
- Parameters:
feature_names – List of feature names to select.
- select_targets(target_names: List[str]) None[source]
Dynamically select targets from the dataset.
- Parameters:
target_names – List of target names to select.
- selected_features()[source]
Returns the list of currently selected feature names.
- Returns:
List of feature names.
- statistics() Dict[str, Dict[str, float]][source]
Compute basic statistics for the features and targets.
- Returns:
A dictionary of statistics (mean, std, min, max) for each feature and target.
- to_numpy() Tuple[ndarray, ndarray][source]
Convert the dataset to numpy arrays.
- Returns:
Tuple of (features, targets) in numpy format.
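The contract described above (two abstract methods plus shared selection, preview, and statistics helpers) can be sketched as follows. This is a minimal stand-in, not chelo's implementation: `SketchDataset` and `ToyDataset` are hypothetical names, storing columns as dictionaries of lists is an assumption, the `"features"`/`"targets"` keys in `preview` are assumed, and the population standard deviation is used only because the reference does not say which variant `statistics()` computes.

```python
from abc import ABC, abstractmethod
from statistics import mean, pstdev
from typing import Any, Dict, List, Optional


class SketchDataset(ABC):
    """Hypothetical stand-in mirroring the documented interface; not chelo's code."""

    def __init__(self, selected_features: Optional[List[str]] = None,
                 selected_targets: Optional[List[str]] = None) -> None:
        # Assumption: columns are stored as {name: list of values}.
        self.raw_features: Dict[str, List[float]] = {}
        self.raw_targets: Dict[str, List[float]] = {}
        self._selected_features = selected_features
        self._selected_targets = selected_targets

    @abstractmethod
    def load_data(self) -> None:
        """Populate self.raw_features and self.raw_targets."""

    @abstractmethod
    def get_dataset_info(self) -> Dict[str, Any]:
        """Return metadata such as source, size, and description."""

    @staticmethod
    def _pick(columns: Dict[str, List[float]],
              names: Optional[List[str]]) -> Dict[str, List[float]]:
        # None means "all columns", matching the documented default.
        keep = list(columns) if names is None else names
        return {name: columns[name] for name in keep}

    def select_features(self, feature_names: List[str]) -> None:
        self._selected_features = list(feature_names)

    def selected_features(self) -> List[str]:
        return list(self._pick(self.raw_features, self._selected_features))

    def preview(self, n: int = 5) -> Dict[str, Dict[str, List[Any]]]:
        features = self._pick(self.raw_features, self._selected_features)
        targets = self._pick(self.raw_targets, self._selected_targets)
        return {
            "features": {name: values[:n] for name, values in features.items()},
            "targets": {name: values[:n] for name, values in targets.items()},
        }

    def statistics(self) -> Dict[str, Dict[str, float]]:
        columns = {
            **self._pick(self.raw_features, self._selected_features),
            **self._pick(self.raw_targets, self._selected_targets),
        }
        return {
            name: {"mean": mean(values), "std": pstdev(values),
                   "min": min(values), "max": max(values)}
            for name, values in columns.items()
        }


class ToyDataset(SketchDataset):
    """Synthetic three-row dataset used only for illustration."""

    def load_data(self) -> None:
        self.raw_features = {"temperature": [300.0, 310.0, 320.0],
                             "pressure": [1.0, 2.0, 3.0]}
        self.raw_targets = {"yield": [0.5, 0.6, 0.7]}

    def get_dataset_info(self) -> Dict[str, Any]:
        return {"source": "synthetic", "size": 3, "description": "toy example"}


dataset = ToyDataset(selected_features=["temperature"])
dataset.load_data()
stats = dataset.statistics()
first_rows = dataset.preview(2)
```

Keeping `load_data` and `get_dataset_info` abstract forces every concrete dataset to supply its own loading and metadata, while the selection and summary helpers stay in one place.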
- class chelo.DatasetRegistry[source]
Bases:
object
A registry to manage available datasets in CheLo.
- classmethod get_dataset(name: str, **kwargs: Any) Any[source]
Retrieve an instance of the specified dataset by name.
- Parameters:
name – Name of the dataset to retrieve.
kwargs – Additional arguments to pass to the dataset constructor.
- Returns:
An instance of the dataset.
- Raises:
ValueError – If the dataset is not found.
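The lookup behaviour documented for `get_dataset` (construct by name, forward keyword arguments, raise `ValueError` for unknown names) can be illustrated with a small stand-in. `SketchRegistry`, its `register` helper, and `ToyWine` are hypothetical; the reference above documents only `get_dataset`, so how datasets are actually registered in CheLo is not shown here.

```python
from typing import Any, Callable, Dict


class SketchRegistry:
    """Hypothetical name-to-class registry; not chelo's code."""

    _datasets: Dict[str, Callable[..., Any]] = {}

    @classmethod
    def register(cls, name: str, factory: Callable[..., Any]) -> None:
        # Assumed helper: the reference documents only get_dataset.
        cls._datasets[name] = factory

    @classmethod
    def get_dataset(cls, name: str, **kwargs: Any) -> Any:
        if name not in cls._datasets:
            available = ", ".join(sorted(cls._datasets)) or "none"
            raise ValueError(f"Dataset '{name}' not found. Available: {available}")
        # Keyword arguments are forwarded to the dataset constructor.
        return cls._datasets[name](**kwargs)


class ToyWine:
    """Placeholder dataset class used only for this sketch."""

    def __init__(self, wine_type: str = "red") -> None:
        self.wine_type = wine_type


SketchRegistry.register("ToyWine", ToyWine)
wine = SketchRegistry.get_dataset("ToyWine", wine_type="white")

try:
    SketchRegistry.get_dataset("Missing")
except ValueError as exc:
    error_message = str(exc)
```

Listing the known names in the error message is a design choice of this sketch; it makes a mistyped dataset name easier to diagnose.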
- class chelo.datasets.AmesMutagenicityDataset(selected_features: Sequence[str] | None = None, selected_targets: Sequence[str] | None = None)[source]
Bases:
CheLoDataset
- __init__(selected_features: Sequence[str] | None = None, selected_targets: Sequence[str] | None = None) None[source]
Initialize the Ames Mutagenicity dataset.
- Parameters:
selected_features – Features to select (default: all features).
selected_targets – Targets to select (default: all targets).
- class chelo.datasets.BCFactorDataset(selected_features: Sequence[str] | None = None, selected_targets: Sequence[str] | None = None)[source]
Bases:
CheLoDataset
- __init__(selected_features: Sequence[str] | None = None, selected_targets: Sequence[str] | None = None) None[source]
Initialize the Bioconcentration Factor (BCF) dataset.
- Parameters:
selected_features – Features to select (default: all features).
selected_targets – Targets to select (default: all targets).
- class chelo.datasets.CSTRDataset(selected_features: Sequence[str] | None = None, selected_targets: Sequence[str] | None = None, window: int | None = None)[source]
Bases:
CheLoDataset
- __init__(selected_features: Sequence[str] | None = None, selected_targets: Sequence[str] | None = None, window: int | None = None) None[source]
Initialize the CSTR Dataset.
The dataset contains the concentrations of three species (A, B, and X) over time. The inlet concentrations are fixed.
- Parameters:
selected_features – Features to select (default: all features).
selected_targets – Targets to select (default: all targets).
window – Number of previous time-steps to include in each feature (documented default: 1; the signature default is None).
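One way to read the `window` parameter is as a sliding window over the concentration time series. The sketch below assumes that each feature row holds the current time-step together with its predecessors, `window` values in total; whether CheLo counts the current step as part of the window is not stated in this reference. `make_windows` and the sample values are hypothetical.

```python
from typing import List, Sequence


def make_windows(series: Sequence[float], window: int) -> List[List[float]]:
    """Return one row of `window` consecutive values per usable time-step."""
    if window < 1:
        raise ValueError("window must be at least 1")
    # The first window - 1 time-steps lack enough history and are skipped.
    return [list(series[i - window + 1 : i + 1])
            for i in range(window - 1, len(series))]


# Synthetic concentration of species A over five time-steps.
concentration_a = [1.0, 0.8, 0.6, 0.5, 0.45]
rows = make_windows(concentration_a, window=3)
single = make_windows(concentration_a, window=1)
```

With `window=1` every row contains a single time-step, so the data is unchanged apart from the extra nesting.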
- class chelo.datasets.CoalFiredPlantDataset(selected_features: List[str] | None = None, selected_targets: List[str] | None = None)[source]
Bases:
CheLoDataset
Dataset class for Coal Fired Power Plant Thermal Performance.
Provides utilities to load, process, and interact with the dataset.
- __init__(selected_features: List[str] | None = None, selected_targets: List[str] | None = None) None[source]
Initialize the Coal Fired Power Plant Thermal Performance Dataset.
- Parameters:
selected_features – List of features to select (default: all features).
selected_targets – List of targets to select (default: all targets).
- class chelo.datasets.OPSDPVDataset(country: str = 'GR', start_date: datetime | None = None, end_date: datetime | None = None, historical_window: int = 48, prediction_horizon: int = 12, prediction_window: int = 24, prediction_step: int = 6, use_future_weather: bool = False, selected_features: List[str] | None = None, selected_targets: List[str] | None = None)[source]
Bases:
CheLoDataset
A dataset class for the Open Power System Data (OPSD) PV dataset. Provides functionality to download, process, and prepare the dataset for forecasting tasks.
- __init__(country: str = 'GR', start_date: datetime | None = None, end_date: datetime | None = None, historical_window: int = 48, prediction_horizon: int = 12, prediction_window: int = 24, prediction_step: int = 6, use_future_weather: bool = False, selected_features: List[str] | None = None, selected_targets: List[str] | None = None) None[source]
Initialize the OPSD PV Dataset.
- Parameters:
country – The country to use. Must be one of the available countries.
start_date – The start date of the dataset. Defaults to the earliest available data if not provided. Format: YYYY-MM-DD HH:MM:SS
end_date – The end date of the dataset. Defaults to the latest available data if not provided. Format: YYYY-MM-DD HH:MM:SS
historical_window – Number of time steps in the historical window for feature processing.
prediction_horizon – Time steps into the future for prediction targets.
prediction_window – The length of the prediction window.
prediction_step – The step size for prediction data.
use_future_weather – Whether to use future weather as a feature (e.g., as a forecast).
selected_features – List of selected features to include.
selected_targets – List of selected targets to include.
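The four windowing parameters above can be combined in more than one way, and this reference does not spell out the exact slicing. The sketch below shows one plausible reading, under these assumptions: `historical_window` past values form the features, the targets start `prediction_horizon` steps after the history ends, span `prediction_window` steps, and are subsampled every `prediction_step` steps. `make_forecast_samples` is a hypothetical helper, not part of chelo.

```python
from typing import List, Sequence, Tuple


def make_forecast_samples(
    series: Sequence[float],
    historical_window: int,
    prediction_horizon: int,
    prediction_window: int,
    prediction_step: int,
) -> List[Tuple[List[float], List[float]]]:
    """Build (history, targets) pairs under the assumptions stated above."""
    samples = []
    # Last start index that still leaves room for a full target window.
    last_start = (len(series) - historical_window
                  - prediction_horizon - prediction_window)
    for start in range(0, last_start + 1):
        split = start + historical_window
        target_start = split + prediction_horizon
        history = list(series[start:split])
        # Assumption: prediction_step subsamples inside the target window.
        targets = list(series[target_start : target_start + prediction_window
                              : prediction_step])
        samples.append((history, targets))
    return samples


# Synthetic series 0.0 .. 19.0 standing in for PV generation values.
pv_series = [float(i) for i in range(20)]
samples = make_forecast_samples(
    pv_series,
    historical_window=4,
    prediction_horizon=2,
    prediction_window=6,
    prediction_step=3,
)
```

Under a different reading, `prediction_step` could instead be the stride between consecutive samples; the behaviour of the actual class should be checked against its source.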
- class chelo.datasets.VLEDataset(selected_features: List[str] | None = None, selected_targets: List[str] | None = None)[source]
Bases:
CheLoDataset
- __init__(selected_features: List[str] | None = None, selected_targets: List[str] | None = None) None[source]
Initialize the VLEDataset.
- Parameters:
selected_features – Features to select (default: all).
selected_targets – Targets to select (default: all).
- class chelo.datasets.VaporPressureDataset(selected_features: List[str] | None = None, selected_targets: List[str] | None = None)[source]
Bases:
CheLoDataset
- __init__(selected_features: List[str] | None = None, selected_targets: List[str] | None = None) None[source]
Initialize the VaporPressureDataset.
- Parameters:
selected_features – Features to select (default: all).
selected_targets – Targets to select (default: all).
- class chelo.datasets.WineQualityDataset(wine_type: str = 'red', selected_features: List[str] | None = None, selected_targets: List[str] | None = None)[source]
Bases:
CheLoDataset
- __init__(wine_type: str = 'red', selected_features: List[str] | None = None, selected_targets: List[str] | None = None) None[source]
Initialize the Wine Quality Dataset.
- Parameters:
wine_type – Type of wine (‘red’ or ‘white’).
selected_features – Features to select (default: all).
selected_targets – Targets to select (default: all).