TY - GEN
T1 - Comparison of classical and machine-learning methods on spatio-temporal modeling of daily Ozone concentrations
AU - Gualan, Ronald
AU - Saquicela, Victor
AU - Tran-Thanh, Long
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10
Y1 - 2020/10
N2 - Effective actions to mitigate air pollution require the availability of high-resolution observations. Low-cost sensor technologies have emerged as an affordable solution to this deficiency. However, since low-cost sensors are built with low-cost materials, they are prone to errors, gaps, bias, and noise. These problems need to be solved before the data can be used to support research or decision making. Addressing the lack of reliability in low-cost sensor data is a complex challenge that is still being researched along several lines (e.g., accuracy estimation of low-cost sensor data). Current approaches along this line involve modeling, bias correction, and, more recently, data fusion methods relying on high-resolution air quality computational models. Overall, accuracy estimation can be reduced to a modeling problem. The focus of this work is studying, testing, and comparing suitable approaches for handling point-referenced spatio-temporal sensor data, particularly classical spatial models, spatio-temporal models, and popular machine learning methods. Among these approaches, Bayesian hierarchical models receive special consideration given the attention they have drawn during the last fifteen years. The benchmark supporting this comparison study is a real-life dataset made up of daily ozone observations taken from the US Environmental Protection Agency (EPA) and meteorological variables extracted from the NCEP/NCAR Reanalysis Project (NNRP). The main contributions of this work are: (1) a systematic comparison of three kinds of models, using 10-fold cross-validation; and (2) a feature engineering method to create covariates meant to harness spatially correlated observations of point-referenced sensor data.
AB - Effective actions to mitigate air pollution require the availability of high-resolution observations. Low-cost sensor technologies have emerged as an affordable solution to this deficiency. However, since low-cost sensors are built with low-cost materials, they are prone to errors, gaps, bias, and noise. These problems need to be solved before the data can be used to support research or decision making. Addressing the lack of reliability in low-cost sensor data is a complex challenge that is still being researched along several lines (e.g., accuracy estimation of low-cost sensor data). Current approaches along this line involve modeling, bias correction, and, more recently, data fusion methods relying on high-resolution air quality computational models. Overall, accuracy estimation can be reduced to a modeling problem. The focus of this work is studying, testing, and comparing suitable approaches for handling point-referenced spatio-temporal sensor data, particularly classical spatial models, spatio-temporal models, and popular machine learning methods. Among these approaches, Bayesian hierarchical models receive special consideration given the attention they have drawn during the last fifteen years. The benchmark supporting this comparison study is a real-life dataset made up of daily ozone observations taken from the US Environmental Protection Agency (EPA) and meteorological variables extracted from the NCEP/NCAR Reanalysis Project (NNRP). The main contributions of this work are: (1) a systematic comparison of three kinds of models, using 10-fold cross-validation; and (2) a feature engineering method to create covariates meant to harness spatially correlated observations of point-referenced sensor data.
KW - Air pollution
KW - Gaussian process
KW - machine-learning
KW - sensor data
KW - spatio-temporal models
UR - https://www.scopus.com/pages/publications/85113619756
U2 - 10.1109/CLEI52000.2020.00014
DO - 10.1109/CLEI52000.2020.00014
M3 - Conference contribution
AN - SCOPUS:85113619756
T3 - Proceedings - 2020 46th Latin American Computing Conference, CLEI 2020
SP - 56
EP - 65
BT - Proceedings - 2020 46th Latin American Computing Conference, CLEI 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 46th Latin American Computing Conference, CLEI 2020
Y2 - 19 October 2020 through 23 October 2020
ER -