Space Constrained Fast Association Rule Mining with Optimal Support and Confidence Threshold Using Grammatical Evolution: An Effective Nudge in Policymaking

Authors

  • Tina Susan Thomas
  • Balaji V
KCG College of Technology Chennai

Abstract

In an era of big data and social-media-driven governance and policymaking, data analysis is judged on the speed and accuracy of its execution. This study modifies existing Association Rule Mining (ARM) techniques to reduce their space requirements. Although most ARM research has focused on computational efficiency, it has not addressed the identification of optimal support or confidence values. Selecting appropriate support and confidence values is vital to the quality of the mined rules. With large datasets now widely available, however, space has become the latest challenge in processing, and identifying the optimal parameters adapted to the space model is non-deterministic in nature. This research therefore proposes a Grammatical Evolution (GE) Association Rule Miner (GE-ARM) to identify the optimal threshold parameters for mining quality rules. Simulations are carried out on the FoodMart2000 dataset, and the proposed method is compared against Apriori, Frequent Pattern (FP) growth, and Genetic Algorithms (GA). The simulation results show substantial improvements in space usage and in the rules generated, together with time complexity. Compared to the Apriori and FP-tree methods, the proposed GE-ARM reduces runtime by around 20%. Such an improvement could change the dynamics of social media analytics by easing space constraints and could have significant ramifications for policymaking. It is therefore an effective nudge in policymaking.
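As an illustration of the two thresholds discussed in the abstract, the following minimal sketch computes the support and confidence of a single rule and checks it against a pair of cut-offs. It is not the GE-ARM implementation: the toy transactions, the function names, and the threshold values `min_support` and `min_confidence` are assumptions chosen only for the example. In GE-ARM such thresholds would be searched for by Grammatical Evolution rather than fixed by hand.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent) / support(antecedent)."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    sup_a = support(antecedent, transactions)
    return support(both, transactions) / sup_a if sup_a else 0.0


# Toy market-basket data (hypothetical, not FoodMart2000).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

# Hypothetical thresholds; a threshold search would tune these values.
min_support, min_confidence = 0.5, 0.6

s = support({"bread", "milk"}, transactions)       # 2 of 4 transactions
c = confidence({"bread"}, {"milk"}, transactions)  # (2/4) / (3/4)
rule_kept = s >= min_support and c >= min_confidence
```

Raising either threshold prunes more candidate rules, which lowers memory use but risks discarding useful rules; this is the trade-off that motivates searching for the thresholds instead of guessing them.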

Keywords

  • Association Rule Mining (ARM)
  • Social Media
  • Apriori Algorithm
  • Frequent Pattern (FP) Growth
  • Genetic Algorithms (GA)
  • Grammatical Evolution (GE)