TY - GEN
T1 - EDA and a Tailored Data Imputation Algorithm for Daily Ozone Concentrations
AU - Gualán, Ronald
AU - Saquicela, Víctor
AU - Tran-Thanh, Long
N1 - Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
N2 - Air pollution is a critical environmental problem with detrimental effects on human health. It affects all regions of the world, especially low-income cities, where critical levels have been reached. Air pollution plays a direct role in public health, climate change, and the worldwide economy. Effective actions to mitigate air pollution, e.g. research and decision making, require the availability of high-resolution observations. This has motivated the emergence of new low-cost sensor technologies, which have the potential to provide high-resolution data thanks to their accessible prices. However, since low-cost sensors are built with relatively low-cost materials, they tend to be unreliable; that is, their measurements are prone to errors, gaps, bias, and noise. All these problems need to be solved before the data can be used to support research or decision making. In this paper, we address the problem of data imputation on a daily air pollution data set with relatively small gaps. Our main contributions are: (1) an air pollution data set composed of several air pollution concentrations, including criteria gases, and thirteen meteorological covariates; and (2) a custom algorithm for data imputation of daily ozone concentrations based on a trend surface and a Gaussian process. Data visualization techniques were used extensively throughout this work, as they are useful tools for understanding the multi-dimensionality of point-referenced sensor data.
AB - Air pollution is a critical environmental problem with detrimental effects on human health. It affects all regions of the world, especially low-income cities, where critical levels have been reached. Air pollution plays a direct role in public health, climate change, and the worldwide economy. Effective actions to mitigate air pollution, e.g. research and decision making, require the availability of high-resolution observations. This has motivated the emergence of new low-cost sensor technologies, which have the potential to provide high-resolution data thanks to their accessible prices. However, since low-cost sensors are built with relatively low-cost materials, they tend to be unreliable; that is, their measurements are prone to errors, gaps, bias, and noise. All these problems need to be solved before the data can be used to support research or decision making. In this paper, we address the problem of data imputation on a daily air pollution data set with relatively small gaps. Our main contributions are: (1) an air pollution data set composed of several air pollution concentrations, including criteria gases, and thirteen meteorological covariates; and (2) a custom algorithm for data imputation of daily ozone concentrations based on a trend surface and a Gaussian process. Data visualization techniques were used extensively throughout this work, as they are useful tools for understanding the multi-dimensionality of point-referenced sensor data.
KW - Air pollution
KW - Data imputation
KW - Gaussian process
KW - Sensor data
UR - https://www.scopus.com/pages/publications/85055644030
U2 - 10.1007/978-3-030-02828-2_27
DO - 10.1007/978-3-030-02828-2_27
M3 - Conference contribution
AN - SCOPUS:85055644030
SN - 9783030028275
T3 - Advances in Intelligent Systems and Computing
SP - 372
EP - 386
BT - Information and Communication Technologies of Ecuador (TIC.EC)
A2 - Botto-Tobar, Miguel
A2 - Barba-Maggi, Lida
A2 - Villacrés-Cevallos, Patricio
A2 - Uvidia-Fassler, María I.
A2 - González-Huerta, Javier
A2 - S. Gómez, Omar
PB - Springer Verlag
T2 - 6th Conference on Information Technologies and Communication of Ecuador, TIC-EC 2018
Y2 - 21 November 2018 through 23 November 2018
ER -