A Comparative Evaluation of Preprocessing Techniques for Short Texts in Spanish

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Natural Language Processing (NLP) is used to identify key information, generate predictive models, and explain global events or trends. NLP also supports the process of creating knowledge. It is therefore important to apply refinement techniques in major stages such as preprocessing, where data is frequently produced and processed with poor results. This paper analyzes and measures the impact of combinations of preprocessing techniques and libraries on short texts written in Spanish. The techniques were applied to tweets for sentiment analysis, and the evaluation considered the classification metrics, the processing time, and the characteristics of the techniques offered by each library. The experiments give readers insight into choosing an appropriate combination of techniques during preprocessing. The results show improvements of 5% to 9% in classification performance.
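The abstract does not list the individual techniques or libraries that were compared, so the following is only a minimal sketch of the general idea: apply different combinations of preprocessing steps to Spanish tweets and record the processing time of each combination. The four steps, the tiny stopword list, and the function names (`apply_pipeline`, `time_combinations`) are illustrative assumptions rather than the authors' setup, and only the Python standard library is used.

```python
import re
import time
import unicodedata
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not the paper's list).
STOPWORDS = {"de", "la", "el", "que", "y", "en", "a", "los", "un", "una", "es", "muy"}


def lowercase(text):
    return text.lower()


def remove_urls_mentions(text):
    # Drop URLs and @mentions; keep hashtag words but strip the '#'.
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    return text.replace("#", "")


def strip_accents(text):
    # Decompose characters and drop combining marks (e.g. "canción" -> "cancion").
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def remove_stopwords(text):
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)


# Hypothetical set of techniques to combine.
TECHNIQUES = {
    "lowercase": lowercase,
    "remove_urls_mentions": remove_urls_mentions,
    "strip_accents": strip_accents,
    "remove_stopwords": remove_stopwords,
}


def apply_pipeline(text, names):
    # Apply the named techniques in order, then collapse repeated whitespace.
    for name in names:
        text = TECHNIQUES[name](text)
    return " ".join(text.split())


def time_combinations(tweets, max_size=2):
    # Map each combination of techniques to (elapsed seconds, cleaned tweets).
    results = {}
    for size in range(1, max_size + 1):
        for combo in combinations(TECHNIQUES, size):
            start = time.perf_counter()
            cleaned = [apply_pipeline(t, combo) for t in tweets]
            results[combo] = (time.perf_counter() - start, cleaned)
    return results


if __name__ == "__main__":
    sample = ["@ana El servicio es MUY bueno https://t.co/x #Feliz"]
    for combo, (seconds, cleaned) in time_combinations(sample).items():
        print(combo, f"{seconds:.6f}s", cleaned)
```

In a fuller experiment, each cleaned corpus would be passed to the same sentiment classifier so that accuracy and processing time can be compared per combination. Note that the order of steps matters: here stopword removal lowercases each word before the lookup, so it behaves the same with or without the `lowercase` step.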

Original language: English
Title of host publication: Advances in Information and Communication - Proceedings of the 2020 Future of Information and Communication Conference FICC
Editors: Kohei Arai, Supriya Kapoor, Rahul Bhatia
Publisher: Springer
Pages: 111-124
Number of pages: 14
ISBN (Print): 9783030394417
State: Published - 2020
Event: Future of Information and Communication Conference, FICC 2020 - San Francisco, United States
Duration: 5 Mar 2020 to 6 Mar 2020

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 1130 AISC
ISSN (Print): 2194-5357
ISSN (Electronic): 2194-5365

Conference

Conference: Future of Information and Communication Conference, FICC 2020
Country/Territory: United States
City: San Francisco
Period: 5/03/20 to 6/03/20

Keywords

  • Natural Language Processing
  • Preprocessing
  • Sentiment analysis
  • Text mining
  • Twitter