TY - JOUR
T1 - Outlier detection with data mining techniques and statistical methods
AU - Cedillo Orellana, Irene Priscila
AU - Orellana Cordero, Marcos Patricio
PY - 2020/1/1
Y1 - 2020/1/1
N2 - Outlier detection in data mining (DM) and knowledge discovery in databases (KDD) is of great interest in areas that rely on decision support systems. A straightforward application is found in finance, where DM can potentially detect fraud or find errors introduced by users. It is therefore essential to evaluate the veracity of the information using methods that detect unusual behavior in the data. This paper proposes a method for detecting outliers in a database of nominal data. The method uses a global k-nearest neighbors (KNN) algorithm, the k-means clustering algorithm, and the chi-square statistical method. These techniques were applied to a database of clients who had applied for credit. The experiment was performed on a data set of 1180 tuples into which outliers were deliberately introduced. The results showed that the proposed method detected all of the introduced outliers.
KW - Outlier
KW - Data mining
KW - KNN
KW - Chi-square
KW - Financial fraud
UR - http://scielo.senescyt.gob.ec/scielo.php?script=sci_arttext&pid=S1390-65422020000100056
M3 - Article
JO - Enfoque UTE. Revista de ingeniería científica
JF - Enfoque UTE. Revista de ingeniería científica
ER -