Datasets

This page lists datasets included in CheLo.

Wine Quality Dataset

The Wine Quality Dataset contains two datasets related to red and white vinho verde wine samples, from the north of Portugal. The goal is to model wine quality based on physicochemical tests. The dataset was introduced in [CCA+09]. Please refer to the API reference for options supported by this dataset (chelo.datasets.WineQualityDataset).
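The options each loader accepts are documented in the API reference. As a complement, the following minimal sketch lists a class's constructor parameters at runtime using only the standard library. The helper name `constructor_options` is hypothetical and not part of CheLo. The sketch assumes that CheLo is installed and that the class is importable under the dotted path given above, and it makes no assumption about which parameters the class actually defines.

```python
import importlib
import inspect


def constructor_options(dotted_path):
    """Return {parameter name: default value} for the class at dotted_path.

    Hypothetical helper for illustration; it is not part of CheLo.
    Parameters without a default map to inspect.Parameter.empty.
    """
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    params = inspect.signature(cls).parameters
    return {
        name: p.default
        for name, p in params.items()
        # Skip *args / **kwargs catch-alls; they carry no named option.
        if p.kind not in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
    }


if __name__ == "__main__":
    try:
        # Dotted path taken from this page; requires CheLo to be installed.
        print(constructor_options("chelo.datasets.WineQualityDataset"))
    except (ImportError, AttributeError) as exc:
        print(f"CheLo class not available: {exc}")
```

The same call works for any other dataset class listed on this page by substituting its dotted path.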

Coal Fired Power Plant Dataset

The Coal Fired Power Plant Dataset contains data related to the thermal performance of coal-fired power plants. This dataset includes various features, such as load factors, steam flow rates, and boiler efficiency, that are crucial for analyzing and predicting the thermal performance of power plants. Please refer to the API reference for options supported by this dataset (chelo.datasets.CoalFiredPlantDataset).

Ames Mutagenicity Dataset

The Ames Mutagenicity Dataset contains mutagenicity data for various chemicals tested on Salmonella typhimurium using the Ames test. It includes features that can be used to model and predict mutagenicity. For more information regarding the dataset, refer to the MLCE book. Please refer to the API reference for options supported by this dataset (chelo.datasets.AmesMutagenicityDataset).

Bioconcentration Factor (BCF) Dataset

The Bioconcentration Factor (BCF) Dataset contains chemical properties and experimental bioconcentration factor (BCF) values. This dataset is useful for understanding the accumulation of chemicals in organisms. For more information regarding the dataset, refer to the MLCE book. Please refer to the API reference for options supported by this dataset (chelo.datasets.BCFactorDataset).

CSTR Dataset

The CSTR Dataset provides time-series data of concentrations of three species (A, B, and X) in a continuous stirred-tank reactor (CSTR). It is suitable for dynamic modeling of chemical processes. For more information regarding the dataset, refer to the MLCE book. Please refer to the API reference for options supported by this dataset (chelo.datasets.CSTRDataset).

Vapor Pressure Dataset

The Vapor Pressure Dataset contains the phase envelope of various compounds. This dataset supports research in thermodynamics and phase equilibrium. For more information regarding the dataset, refer to the MLCE book. Please refer to the API reference for options supported by this dataset (chelo.datasets.VaporPressureDataset).

VLE Dataset

The VLE Dataset focuses on vapor-liquid equilibria (VLE) of CO2. It provides essential information for thermodynamic modeling and process design. For more information regarding the dataset, refer to the MLCE book. Please refer to the API reference for options supported by this dataset (chelo.datasets.VLEDataset).

OPSD Dataset

The OPSD Dataset (https://open-power-system-data.org/) focuses on power forecasting. This loader includes photovoltaic (PV) solar power generation data, as well as critical features such as temperature and radiation (direct and diffuse).

This dataset is particularly useful for applications in energy management, solar power optimization, and predictive modeling of renewable energy resources.

For more information about the dataset, refer to the OPSD documentation (https://open-power-system-data.org/). Please refer to the API reference for the supported options and usage of this dataset (chelo.datasets.OPSDPVDataset).

[CCA+09]

Paulo Cortez, António Cerdeira, Fernando Almeida, Telmo Matos, and José Reis. Modeling wine preferences by data mining from physicochemical properties. Decision Support Systems, 47(4):547–553, 2009.