TY - JOUR
T1 - ChemTastesPredictor
T2 - An ensemble of machine learning classifiers to predict the taste of molecular tastants
AU - Rojas Villa, Cristian
AU - Abril Gonzalez, Monica Fernanda
AU - Ballabio, Davide
AU - García, Fernando
N1 - Publisher Copyright: © 2025 Elsevier B.V.
PY - 2025/6/15
Y1 - 2025/6/15
N2 - The sense of taste plays a critical role in food science because it directly affects food consumption, human nutrition, and overall health. Computational models that use machine learning classifiers to predict the taste of molecular tastants from their chemical structure serve as powerful tools in the advancing field of foodinformatics. This study describes the development of ChemTastesPredictor, an ensemble of machine learning classifiers designed to predict the taste of the 4075 molecular tastants included in the extended version of ChemTastesDB (https://zenodo.org/records/14963136). To the best of our knowledge, this represents the largest dataset with a broad-based chemical space used to calibrate machine learning (ML) models for taste prediction based on molecular descriptors and fingerprints. For validation, datasets were randomly split into training and test sets in a 75:25 ratio, ensuring balanced class distributions. In binary classification tasks, the Random Forest classifier demonstrated the highest predictive performance for sweet/bitter (NER = 0.928 and F-score = 0.927) and bitter/non-bitter (NER = 0.902 and F-score = 0.903) classification. Adaptive Boosting excelled in the prediction of sweet/non-sweet (NER = 0.861 and F-score = 0.862). The N-Nearest Neighbors classifier emerged as the optimal classifier for umami/non-umami (NER = 0.957 and F-score = 0.860) and sweet/bitter/umami (NER = 0.870 and F-score = 0.843). These models may be useful in the development and analysis of new chemical tastants.
KW - ChemTastesDB
KW - ChemTastesPredictor
KW - Machine learning classifiers
KW - Molecular tastant
KW - QSPR
UR - https://www.scopus.com/pages/publications/105000237282
UR - https://www.sciencedirect.com/science/article/abs/pii/S0169743925000656
U2 - 10.1016/j.chemolab.2025.105380
DO - 10.1016/j.chemolab.2025.105380
M3 - Article
AN - SCOPUS:105000237282
SN - 0169-7439
VL - 261
JO - Chemometrics and Intelligent Laboratory Systems
JF - Chemometrics and Intelligent Laboratory Systems
M1 - 105380
ER -