Extracting Terminological Concept Systems from Natural Language Text

Dagmar Gromann (corresponding author), Lennart Wachowiak, Christian Lang, Barbara Heinisch

Publications: Contribution to book/anthology, peer-reviewed

Abstract

Terminology denotes a language resource that structures domain-specific knowledge by means of conceptual grouping of terms and their interrelations. Such structured domain knowledge is vital to various specialised communication settings, from corporate language to crisis communication. However, manually curating a terminology is both labour- and time-intensive. Approaches to automatically extract terminology have focused on detecting domain-specific single- and multi-word terms without taking terminological relations into consideration, while knowledge extraction has specialised in named entities and their relations. We present the Text2TCS method to extract single- and multi-word terms, group them by synonymy, and interrelate these groupings by means of a pre-specified relation typology to generate a Terminological Concept System (TCS) from domain-specific text in multiple languages. To this end, the method relies on pre-trained neural language models.
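The structure described in the abstract (terms grouped by synonymy into concepts, with typed relations between those concepts) can be pictured with a small data model. The following is only an illustrative sketch and not the Text2TCS implementation: the class names, the three relation types, and the sample terms are all assumptions made up for this example, and the actual relation typology used by the method is not specified in this record.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class RelationType(Enum):
    # Hypothetical typology for illustration only; the record does not list
    # the relation types that Text2TCS actually uses.
    GENERIC = "generic"
    PARTITIVE = "partitive"
    CAUSAL = "causal"


@dataclass
class Concept:
    # One concept groups synonymous terms, keyed by language code.
    concept_id: str
    terms: dict[str, list[str]] = field(default_factory=dict)


@dataclass(frozen=True)
class Relation:
    source: str  # concept_id of the source concept
    target: str  # concept_id of the target concept
    type: RelationType


@dataclass
class ConceptSystem:
    concepts: dict[str, Concept] = field(default_factory=dict)
    relations: list[Relation] = field(default_factory=list)

    def add_concept(self, concept_id: str, language: str, terms: list[str]) -> None:
        # Create the concept if needed, then append the synonyms for one language.
        concept = self.concepts.setdefault(concept_id, Concept(concept_id))
        concept.terms.setdefault(language, []).extend(terms)

    def relate(self, source: str, target: str, relation_type: RelationType) -> None:
        # Relations may only connect concepts that already exist.
        for concept_id in (source, target):
            if concept_id not in self.concepts:
                raise KeyError(f"unknown concept: {concept_id}")
        self.relations.append(Relation(source, target, relation_type))

    def outgoing(self, concept_id: str) -> list[tuple[RelationType, str]]:
        # All (relation type, target concept) pairs that start at concept_id.
        return [(r.type, r.target) for r in self.relations if r.source == concept_id]


# Invented sample data: two concepts, one with synonyms in two languages.
tcs = ConceptSystem()
tcs.add_concept("c1", "en", ["vaccine"])
tcs.add_concept("c2", "en", ["mRNA vaccine", "messenger RNA vaccine"])
tcs.add_concept("c2", "de", ["mRNA-Impfstoff"])
tcs.relate("c2", "c1", RelationType.GENERIC)
```

Keeping relations at the concept level rather than the term level mirrors the abstract's wording, in which the synonym groupings, not the individual terms, are interrelated.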
Original language: English
Title: European Language Grid
Subtitle: A language technology platform for multilingual Europe
Editors: G. Rehm
Place of publication: Cham
Publisher: Springer
Pages: 289–294
Number of pages: 6
ISBN (electronic): 978-3-031-17258-8
ISBN (print): 978-3-031-17257-1
Publication status: Published - 2023

Publication series

Series: Cognitive Technologies
ISSN: 1611-2482

ÖFOS 2012

  • 602011 Computational linguistics
  • 602049 Terminology science
