TY - JOUR
T1 - Automatically finding actors in texts: A performance review of multilingual named entity recognition tools for news texts
AU - Balluff, Paul
AU - Boomgaarden, Hajo
AU - Waldherr, Annie
N1 - Publisher Copyright: © 2024 The Author(s). Published with license by Taylor & Francis Group, LLC.
PY - 2024/3/19
Y1 - 2024/3/19
N2 - Named Entity Recognition (NER) is a crucial task in natural language processing and has a wide range of applications in communication science. However, there is a lack of systematic evaluations of available NER tools in the field. In this study, we evaluate the performance of various multilingual NER tools, including rule-based and transformer-based models. We conducted experiments on corpora containing texts in multiple languages and evaluated the F1-score, speed, and features of each tool. Our results show that transformer-based language models outperform rule-based models and other NER tools in most languages. However, we found that the performance of the transformer-based models varies depending on the language and the corpus. Our study provides insights into the strengths and weaknesses of NER tools and their suitability for specific languages, which can inform the selection of appropriate tools for future studies and applications in communication science.
AB - Named Entity Recognition (NER) is a crucial task in natural language processing and has a wide range of applications in communication science. However, there is a lack of systematic evaluations of available NER tools in the field. In this study, we evaluate the performance of various multilingual NER tools, including rule-based and transformer-based models. We conducted experiments on corpora containing texts in multiple languages and evaluated the F1-score, speed, and features of each tool. Our results show that transformer-based language models outperform rule-based models and other NER tools in most languages. However, we found that the performance of the transformer-based models varies depending on the language and the corpus. Our study provides insights into the strengths and weaknesses of NER tools and their suitability for specific languages, which can inform the selection of appropriate tools for future studies and applications in communication science.
UR - http://www.scopus.com/inward/record.url?scp=85188524285&partnerID=8YFLogxK
U2 - 10.1080/19312458.2024.2324789
DO - 10.1080/19312458.2024.2324789
M3 - Article
SN - 1931-2458
VL - 18
SP - 371
EP - 389
JO - Communication Methods & Measures
JF - Communication Methods & Measures
IS - 4
ER -