TY - JOUR
T1 - Enabling semantics-aware process mining through the automatic annotation of event logs
AU - Rebmann, Adrian
AU - van der Aa, Han
N1 - Publisher Copyright: © 2022 Elsevier Ltd
PY - 2022/12
Y1 - 2022/12
AB - Process mining is concerned with the analysis of organizational processes based on event data recorded during their execution. Foundational process mining techniques analyze such data in an abstract manner, without taking the meaning of these events or their payload into consideration. By contrast, other techniques may exploit specific kinds of information contained in event data, such as resources in organizational mining and business objects in object-centric analysis, to gain more specific insights into an organization's operations. However, the information required for such analyses is typically not readily available. Rather, the meaning of events is often captured in an ad hoc manner, commonly through unstructured textual attributes, such as an event's label, or in unclearly named attributes. In this work, we address this gap by proposing an approach for the automatic annotation of semantic components in event logs. To achieve this, we combine the analysis of textual attribute values, based on a state-of-the-art language model, with novel attribute classification and component categorization techniques. In this manner, our approach first identifies up to eight semantic components per event, revealing information on the actions, business objects, and resources recorded in an event log. Afterwards, our approach further categorizes the identified actions and actors, allowing for a more in-depth analysis of key process perspectives. We demonstrate our approach's efficacy through an evaluation using a broad range of event logs and highlight its usefulness through four application scenarios enabled by our approach.
KW - Natural language processing
KW - Process mining
KW - Semantic analysis
UR - http://www.scopus.com/inward/record.url?scp=85135870725&partnerID=8YFLogxK
U2 - 10.1016/j.is.2022.102111
DO - 10.1016/j.is.2022.102111
M3 - Article
AN - SCOPUS:85135870725
SN - 0306-4379
VL - 110
JO - Information Systems
JF - Information Systems
M1 - 102111
ER -