Iranian Journal of Information Processing and Management (پژوهشنامه پردازش و مدیریت اطلاعات)

Review of Rule Extraction Methods in Data Mining: A Systematic Review of Studies

Article type: Review article

Authors
Department of Industrial and Systems Engineering, Isfahan University of Technology, Isfahan, Iran
Abstract
Rules extracted from data are among the most important and practical forms of knowledge representation, with applications in areas such as expert systems, decision support, and automated control. Because of their high transparency and interpretability, rule extraction methods allow researchers to understand the underlying patterns and relationships within data. Accordingly, numerous approaches and algorithms for extracting rules from data have been developed, each with its own strengths and weaknesses. This paper uses a systematic review to examine rule extraction methods in data mining. In this study, 678 articles were collected from reputable scientific databases and, after screening, 19 articles were selected for final analysis. The analysis was carried out in three parts: 1) a critical analysis of rule extraction strategies and the criteria used to evaluate their performance, 2) identification of research gaps based on that analysis, with the aim of suggesting areas of study for future research, and 3) analysis of the textual content of the studies using tools such as VOSviewer to identify key concepts and the relationships among them. The results of this study can help in developing more efficient rule extraction algorithms in various domains and lay the groundwork for future research in this area.

Article title (English)

Review of Rule Extraction Methods in Data Mining: A Systematic Review of Studies

Authors (English)

Sara Ansari
Saba Sareminia
Department of Industrial and Systems Engineering, Isfahan University of Technology, Isfahan, Iran
Abstract (English)

Rules extracted from data are among the most important and practical forms of knowledge representation, with applications across domains such as expert systems, decision support, and automated control. Rule extraction methods, valued for their transparency and interpretability, enable researchers to understand the underlying patterns and relationships within data. Consequently, numerous approaches and algorithms for rule extraction have been developed, each with its own strengths and weaknesses. This paper systematically reviews rule extraction methods in data mining. A total of 678 articles were initially collected from reputable scientific databases, and 19 were selected for final analysis after screening. The analysis is conducted in three parts: 1) a critical evaluation of rule extraction strategies and the metrics used to assess their performance, 2) identification of research gaps, based on that evaluation, to suggest directions for future research, and 3) textual content analysis of the studies using tools such as VOSviewer to identify key concepts and the relationships among them. The findings of this study can contribute to the development of more efficient rule extraction algorithms across diverse domains and pave the way for future research in this field.

Keywords (English)

systematic review
rule extraction
data mining
rule optimization
machine learning
Volume 40, Issue 4 - Serial Number 124
Summer 1404 (Solar Hijri)
Pages 1147-1178

  • Received: 30 Mordad 1403
  • Revised: 05 Khordad 1404
  • Accepted: 06 Khordad 1404