Iranian Journal of Information Processing and Management

Quality Metrics for Business Process Event Logs Based on High Frequency Traces

Document Type: Exploring the Relationship between Data Quality and Business Process Management

Author
Assistant Professor; Computer Department; Esfarayen University of Technology; Esfarayen,
Abstract
In today’s data-centric business landscape, where Business Intelligence and Data Science technologies are widespread, Process Mining plays a central role in Business Process Management. This study addresses the challenge of ensuring the quality of event logs, the foundational data source for Process Mining. Event logs, derived from interactions between process participants and information systems, reveal the actual behavior of business processes and reflect organizational rules, procedures, norms, and culture. However, because many actors and systems contribute to these logs, their quality is often compromised. In response, we introduce a systematic approach implemented in Python with the pm4py library: traces are filtered from the log, and the resulting process models are represented as Petri nets. Comparative analyses of the original and filtered logs show that trace filtering improves the quality metrics of the extracted subprocesses, namely fitness, precision, generalization, and simplicity, which underlines its practical value for refining complex process models. The findings offer practical guidance to practitioners and researchers in process mining and modeling, and highlight the importance of data quality for obtaining precise and dependable business process insights.
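To make the idea of frequency-based trace filtering concrete, the following is a minimal sketch in plain Python. It is not the author's implementation and does not use pm4py. The function names, the `(case_id, activity)` event format, the `min_coverage` threshold, and the toy log are all assumptions made for illustration. The sketch groups events into traces, counts how often each trace variant occurs, and keeps only the cases whose variant is among the most frequent ones.

```python
from collections import Counter


def variants_by_case(events):
    """Group (case_id, activity) pairs into one trace per case.

    Assumption: events arrive already ordered by timestamp within each case.
    """
    traces = {}
    for case_id, activity in events:
        traces.setdefault(case_id, []).append(activity)
    return {case_id: tuple(acts) for case_id, acts in traces.items()}


def keep_frequent_traces(events, min_coverage=0.8):
    """Keep the cases whose variant is among the most frequent variants.

    Variants are added in order of decreasing frequency until they cover
    at least `min_coverage` (a hypothetical threshold) of all cases.
    """
    traces = variants_by_case(events)
    counts = Counter(traces.values())
    total = len(traces)
    kept, covered = set(), 0
    for variant, n in counts.most_common():
        if covered / total >= min_coverage:
            break
        kept.add(variant)
        covered += n
    return {case: trace for case, trace in traces.items() if trace in kept}


# Toy event log (illustrative data only): three cases follow a-b-c,
# two cases follow rarer orderings.
events = [
    (1, "a"), (1, "b"), (1, "c"),
    (2, "a"), (2, "b"), (2, "c"),
    (3, "a"), (3, "b"), (3, "c"),
    (4, "a"), (4, "c"), (4, "b"),
    (5, "a"), (5, "b"), (5, "b"), (5, "c"),
]
filtered = keep_frequent_traces(events, min_coverage=0.5)
print(sorted(filtered))
```

In a workflow like the one described in the abstract, a process model would then be discovered from both the original and the filtered log, and the four quality metrics compared between the two models. Those steps rely on a process mining library and are not shown here.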


  • Receive Date 14 October 2023
  • Revise Date 24 December 2023
  • Accept Date 27 December 2023