TY - JOUR
T1 - Systematic Review of Chatbot Development in Health and Education
T2 - 11th International Conference on eDemocracy and eGovernment, ICEDEG 2025
AU - Morocho, Villie
AU - Ordoñez-Crespo, Christian
AU - Viloria-Ramírez, Verónica
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Chatbots have emerged as valuable tools in public service domains such as healthcare and education, supporting information delivery, training, and user engagement. Despite growing adoption, development practices remain fragmented. This paper presents a Systematic Literature Review (SLR) of 27 studies to identify trends, challenges, and research gaps in chatbot development, focusing on technological approaches, frameworks, evaluation methodologies, and language distribution. Findings reveal a strong preference for open-source frameworks, particularly Rasa, valued for flexibility and scalability, while TensorFlow is often used for custom ML-based implementations. English dominates as the primary development language, with limited representation of Spanish, despite its global importance. Most studies apply quantitative (performance-based) metrics such as F1-score and accuracy; only one combines quantitative and qualitative evaluations, reflecting a lack of standardized assessment approaches. Additional gaps include the absence of hybrid evaluation frameworks, limited attention to Responsible AI principles, and scarce discussion on deployment requirements. Future research should prioritize multilingual chatbot models, integrate hybrid evaluation frameworks, and promote transparency to ensure scalability, ethical AI practices, and user-centered design.
KW - chatbots
KW - educational technology
KW - evaluation metrics
KW - health applications
KW - natural language processing
KW - responsible AI
KW - systematic literature review
UR - https://www.scopus.com/pages/publications/105012856381
U2 - 10.1109/ICEDEG65568.2025.11081666
DO - 10.1109/ICEDEG65568.2025.11081666
M3 - Conference article
AN - SCOPUS:105012856381
SN - 2573-2005
SP - 231
EP - 238
JO - International Conference on eDemocracy and eGovernment, ICEDEG
JF - International Conference on eDemocracy and eGovernment, ICEDEG
IS - 2025
Y2 - 18 June 2025 through 20 June 2025
ER -