TY - GEN
T1 - Combining statistical and semantic approaches to the translation of ontologies and taxonomies
AU - McCrae, John
AU - Espinoza, Mauricio
AU - Montiel-Ponsoda, Elena
AU - Aguado-De-Cea, Guadalupe
AU - Cimiano, Philipp
N1 - Publisher Copyright: © 2011 Association for Computational Linguistics
PY - 2011
Y1 - 2011
N2 - Ontologies and taxonomies are widely used to organize concepts providing the basis for activities such as indexing, and as background knowledge for NLP tasks. As such, translation of these resources would prove useful to adapt these systems to new languages. However, we show that the nature of these resources is significantly different from the “free-text” paradigm used to train most statistical machine translation systems. In particular, we see significant differences in the linguistic nature of these resources and such resources have rich additional semantics. We demonstrate that as a result of these linguistic differences, standard SMT methods, in particular evaluation metrics, can produce poor performance. We then look to the task of leveraging these semantics for translation, which we approach in three ways: by adapting the translation system to the domain of the resource; by examining if semantics can help to predict the syntactic structure used in translation; and by evaluating if we can use existing translated taxonomies to disambiguate translations. We present some early results from these experiments, which shed light on the degree of success we may have with each approach.
UR - https://www.scopus.com/pages/publications/85025585621
M3 - Conference contribution
AN - SCOPUS:85025585621
T3 - 5th Workshop on Syntax, Semantics and Structure in Statistical Translation, SSST 2011 at the Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL HLT 2011 - Proceedings of the Workshop
SP - 116
EP - 125
BT - 5th Workshop on Syntax, Semantics and Structure in Statistical Translation, SSST 2011 at the Annual Meeting of the Association for Computational Linguistics
A2 - Wu, Dekai
A2 - Apidianaki, Marianna
A2 - Carpuat, Marine
A2 - Specia, Lucia
PB - Association for Computational Linguistics (ACL)
T2 - 5th Workshop on Syntax, Semantics and Structure in Statistical Translation, SSST 2011 at the Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL HLT 2011
Y2 - 23 June 2011
ER -