Iranian Journal of Information Processing and Management

Improving the Quality of Business Process Event Logs Using Unsupervised Method

Document Type: Exploring the Relationship between Data Quality and Business Process Management

Author
Assistant Professor; Computer Department; Esfarayen University of Technology; Esfarayen
Abstract
In today's dynamic business environment, the reliability of process mining algorithms depends on the quality of event logs, which often suffer from data problems that stem from human involvement in business processes. This study introduces an approach that combines insights from prior work with unsupervised techniques, specifically Principal Component Analysis (PCA), to improve the precision and reliability of event log representations. The method is implemented in Python with the pm4py library and applied to real event logs. Petri nets are used for process representation, in line with the systematic approaches advocated by earlier studies, which improves transparency and interpretability. The results show improved Fitness, Precision, and F-Measure values and are accompanied by visualizations that indicate the optimal number of principal components. The study offers a practical solution that addresses gaps in existing methodologies, and its integration of multiple strategies, particularly PCA, shows how process mining analyses can be tuned in different settings. The consistent improvements observed suggest that the method can be applied across diverse business contexts and is accessible to practitioners who work with real-world business processes. Overall, this research contributes an approach to improving event log quality, with practical implications for organizational decision-making and process optimization.
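The abstract does not state how PCA is applied to the event log, so the following is only a minimal sketch of one plausible reading, not the author's implementation. It assumes that each trace is encoded as a vector of activity counts, that the number of principal components is chosen by a cumulative explained-variance threshold, and that traces with a large PCA reconstruction error are candidates for noise. It also assumes the common definition of F-Measure as the harmonic mean of Fitness and Precision. All function names and the 0.95 threshold are hypothetical; in practice Fitness and Precision would come from a conformance-checking tool such as pm4py, whereas here they are plain numbers.

```python
import numpy as np


def encode_traces(traces):
    """Encode each trace (a list of activity names) as an activity-count vector.

    Assumption: a bag-of-activities encoding; the abstract does not specify one.
    """
    activities = sorted({a for trace in traces for a in trace})
    index = {a: i for i, a in enumerate(activities)}
    X = np.zeros((len(traces), len(activities)))
    for row, trace in enumerate(traces):
        for a in trace:
            X[row, index[a]] += 1
    return X, activities


def pca_fit(X):
    """Fit PCA through an SVD of the mean-centered matrix.

    Returns the column means, the principal axes (rows of Vt),
    and the explained-variance ratio of each component.
    """
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    var = S ** 2
    total = var.sum()
    ratio = var / total if total > 0 else np.zeros_like(var)
    return mean, Vt, ratio


def choose_k(ratio, threshold=0.95):
    """Smallest number of components whose cumulative explained variance
    reaches the threshold (hypothetical default: 0.95)."""
    cumulative = np.cumsum(ratio)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(ratio))


def reconstruction_error(X, mean, Vt, k):
    """Per-trace Euclidean distance between the encoding and its
    reconstruction from the first k principal components."""
    Z = (X - mean) @ Vt[:k].T          # project onto the first k axes
    X_hat = Z @ Vt[:k] + mean          # map back to the original space
    return np.linalg.norm(X - X_hat, axis=1)


def f_measure(fitness, precision):
    """Harmonic mean of fitness and precision (assumed definition)."""
    if fitness + precision == 0:
        return 0.0
    return 2 * fitness * precision / (fitness + precision)
```

A typical use would encode the log, fit PCA, pick k with choose_k, and then inspect the traces with the largest reconstruction error before rediscovering the model and comparing F-Measure on the original and the filtered log. Whether such traces should be removed, repaired, or kept is a judgment the sketch leaves open.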
Keywords
Subjects

References
Andrews, R., Suriadi, S., Ouyang, C., & Poppe, E. (2018). Towards event log querying for data quality: Let’s start with detecting log imperfections. In On the Move to Meaningful Internet Systems. OTM 2018 Conferences: Confederated International Conferences: CoopIS, C&TC, and ODBASE 2018, Valletta, Malta, October 22-26, 2018, Proceedings, Part I (pp. 116-134). Springer International Publishing. DOI: 10.1007/978-3-030-02610-3_7
Bayomie, D., Di Ciccio, C., & Mendling, J. (2023). Event-case correlation for process mining using probabilistic optimization. Information Systems, 114, 102167. DOI: 10.1016/j.is.2023.102167
Berti, A., Van Zelst, S. J., & van der Aalst, W. (2019). Process mining for python (PM4Py): bridging the gap between process-and data science. arXiv preprint arXiv:1905.06169. DOI: 10.48550/arXiv.1905.06169
Boltenhagen, M., Chatain, T., & Carmona, J. (2019). Generalized alignment-based trace clustering of process behavior. In Application and Theory of Petri Nets and Concurrency: 40th International Conference, PETRI NETS 2019, Aachen, Germany, June 23–28, 2019, Proceedings 40 (pp. 237-257). Springer International Publishing. DOI: 10.1007/978-3-030-21571-2_
Bose, R. J. C., Mans, R. S., & Van Der Aalst, W. M. (2013, April). Wanna improve process mining results? In 2013 IEEE symposium on computational intelligence and data mining (CIDM) (pp. 127-134). IEEE. DOI: 10.1109/CIDM.2013.6597227
Buijs, J. C., van Dongen, B. F., & van der Aalst, W. M. (2014). Quality dimensions in process discovery: The importance of fitness, precision, generalization and simplicity. International Journal of Cooperative Information Systems, 23(01), 1440001. DOI: 10.1142/S0218843014400012
Dumas, M., La Rosa, M., Mendling, J., & Reijers, H. A. (2018). Fundamentals of business process management. Springer-Verlag. DOI: 10.1007/978-3-662-56509-4
Ferreira, D. R. (2017). A primer on process mining: Practical skills with python and graphviz. Cham: Springer International Publishing. DOI: 10.1007/978-3-030-41819-9
Goel, K., Leemans, S. J., Martin, N., & Wynn, M. T. (2022). Quality-informed process mining: A case for standardised data quality annotations. ACM Transactions on Knowledge Discovery from Data (TKDD), 16(5), 1-47. DOI: 10.1145/3511707
Janssenswillen, G., Donders, N., Jouck, T., & Depaire, B. (2017). A comparative study of existing quality measures for process discovery. Information Systems, 71, 1-15. DOI: 10.1016/j.is.2017.06.002
Ko, J., & Comuzzi, M. (2021). Detecting anomalies in business process event logs using statistical leverage. Information Sciences, 549, 53-67. DOI: 10.1016/j.ins.2020.11.017
Koschmider, A., Kaczmarek, K., Krause, M., & van Zelst, S. J. (2021, September). Demystifying noise and outliers in event logs: Review and future directions. In International Conference on Business Process Management (pp. 123-135). Cham: Springer International Publishing. DOI: 10.1007/978-3-030-94343-1_10
Kurita, T. (2021). Principal component analysis (PCA). In Computer vision: a reference guide (pp. 1013-1016). Cham: Springer International Publishing. DOI: 10.1007/978-3-030-63416-2_649
Martin, N., Van Houdt, G., & Janssenswillen, G. (2022). DaQAPO: supporting flexible and fine-grained event log quality assessment. Expert Systems with Applications, 191, 116274. DOI: 10.1016/j.eswa.2021.116274            
Marin-Castro, H. M., & Tello-Leal, E. (2021). Event log preprocessing for process mining: a review. Applied Sciences, 11(22), 10556. DOI: 10.3390/app112210556
Mohammadi, M. (2017). A Review of influencing factors on the quality of business process models. Journal of Economic & Management Perspectives, 11(3), 1833-1840.
Mohammadi, M. (2019, September). Discovering business process map of frequent running case in event log. In 2019 international conference on information technologies (InfoTech) (pp. 1-4). IEEE. DOI: 10.1109/InfoTech.2019.8860877
Nolle, T., Seeliger, A., & Mühlhäuser, M. (2016). Unsupervised anomaly detection in noisy business process event logs using denoising autoencoders. In Discovery Science: 19th International Conference, DS 2016, Bari, Italy, October 19–21, 2016, Proceedings 19 (pp. 442-456). Springer International Publishing. DOI: 10.1007/978-3-319-46307-0_28
Nguyen, H. T. C., Lee, S., Kim, J., Ko, J., & Comuzzi, M. (2019). Autoencoders for improving quality of process event logs. Expert Systems with Applications, 131, 132-147. DOI: 10.1016/j.eswa.2019.04.052
Post, R., Beerepoot, I., Lu, X., Kas, S., Wiewel, S., Koopman, A., & Reijers, H. (2021, October). Active anomaly detection for key item selection in process auditing. In International Conference on Process Mining (pp. 167-179). Cham: Springer International Publishing. DOI: 10.1007/978-3-030-98581-3_13     
Fani Sani, M., van Zelst, S. J., & van der Aalst, W. M. (2018). Applying sequence mining for outlier detection in process mining. In On the Move to Meaningful Internet Systems. OTM 2018 Conferences: Confederated International Conferences: CoopIS, C&TC, and ODBASE 2018, Valletta, Malta, October 22-26, 2018, Proceedings, Part II (pp. 98-116). Springer International Publishing. DOI: 10.1007/978-3-030-02671-4_6
Sani, M. F. (2020, June). Preprocessing event data in process mining. In CAiSE (Doctoral Consortium) (pp. 1-10).
Suriadi, S., Andrews, R., ter Hofstede, A. H., & Wynn, M. T. (2017). Event log imperfection patterns for process mining: Towards a systematic approach to cleaning event logs. Information systems, 64, 132-150. DOI: 10.1016/j.is.2016.07.011
Van der Aalst, W., Adriansyah, A., & Van Dongen, B. (2012). Replaying history on process models for conformance checking and performance analysis. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(2), 182-192. DOI: 10.1002/widm.1045
Van der Aalst, W. (2016). Data science in action. In Process mining: Data science in action (pp. 3-23). Springer Berlin Heidelberg. DOI: 10.1007/978-3-662-49851-4_1
van der Aalst, W. M. (2013, May). Mediating between modeled and observed behavior: The quest for the “right” process: keynote. In IEEE 7th International Conference on Research Challenges in Information Science (RCIS) (pp. 1-12). IEEE. DOI: 10.1109/RCIS.2013.6577675
Sani, M. F., van Zelst, S. J., & Van Der Aalst, W. M. (2018). Improving process discovery results by filtering outliers using conditional behavioural probabilities. In Business Process Management Workshops: BPM 2017 International Workshops, Barcelona, Spain, September 10-11, 2017, Revised Papers 15 (pp. 216-229). Springer International Publishing. DOI: 10.1007/978-3-319-74030-0_16
Van Der Aalst, W., Adriansyah, A., De Medeiros, A. K. A., Arcieri, F., Baier, T., Blickle, T., ... & Wynn, M. (2012). Process mining manifesto. In Business Process Management Workshops: BPM 2011 International Workshops, Clermont-Ferrand, France, August 29, 2011, Revised Selected Papers, Part I 9 (pp. 169-194). Springer Berlin Heidelberg. DOI: 10.1007/978-3-642-28108-2_19

  • Receive Date 16 December 2023
  • Revise Date 30 January 2024
  • Accept Date 15 March 2025