Iranian Journal of Information Processing and Management

Review of Rule Extraction Methods in Data Mining: A Systematic Review of Studies

Document Type: Review Article

Authors
Department of Industrial and Systems Engineering, Isfahan University of Technology, Isfahan, Iran
Abstract
Rules extracted from data are among the most important and practical forms of knowledge representation, with applications in domains including expert systems, decision support, and automated control. Rule extraction methods, valued for their transparency and interpretability, enable researchers to understand the underlying patterns and structural relationships within datasets. Consequently, numerous approaches and algorithms for rule extraction have been developed, each with its own strengths and weaknesses. This paper systematically reviews rule extraction methods in data mining. A total of 678 articles were initially collected from reputable scientific databases, and 19 were selected for final analysis after screening. The analysis is conducted in three parts: 1) a critical evaluation of rule extraction strategies and performance metrics, 2) identification of research gaps, based on that evaluation, to propose directions for future research, and 3) textual content analysis of the studies using tools such as VOSviewer to identify key concepts and their relationships. The findings of this study can contribute to the development of more efficient rule extraction algorithms across diverse domains and pave the way for future research in this field.

Volume 40, Issue 4 - Serial Number 124
Summer 2025
Pages 1147-1178

  • Receive Date 20 August 2024
  • Revise Date 26 May 2025
  • Accept Date 27 May 2025