Iranian Journal of Information Processing and Management

Text Recognition in Printed Persian Documents Based on Recurrent Neural Networks

Document Type : Original Article

Authors
1 PhD in Digital Image Processing, Uppsala University; Assistant Professor, Information Technology Research Department, Iranian Research Institute for Information Science & Technology (IranDoc); Tehran, Iran
2 PhD in Industrial Engineering, Amirkabir University of Technology; Assistant Professor, Information Technology Research Department, Iranian Research Institute for Information Science & Technology (IranDoc); Tehran, Iran
3 B.Sc. in Computer Engineering, Amirkabir University of Technology; Research Assistant, Information Technology Research Department, Iranian Research Institute for Information Science & Technology (IranDoc); Tehran, Iran
4 M.Sc. in Software Engineering, Isfahan University of Technology (IUT); Research Assistant, Information Technology Research Department, Iranian Research Institute for Information Science & Technology (IranDoc); Tehran, Iran
Abstract
Automatic Persian text recognition has long been challenging because of the characteristics of the Persian script: its connected (cursive) structure, the high visual similarity between letters, and the significant variation in letter shape depending on position within a word. The aim of this research is to develop an optical character recognition (OCR) model capable of converting printed Persian scientific documents, including theses, articles, and books, into editable text. Such a model is essential for tasks like labeling, indexing, and information retrieval in databases. This paper proposes a hybrid approach based on deep learning architectures for Persian text recognition. In this method, convolutional neural networks (CNNs) are used for feature extraction and recurrent neural networks (RNNs) for word recognition. The main advantage of this model is its ability to recognize printed Persian text directly, without relying on complex preprocessing steps such as letter segmentation. The proposed model is trained on a large, dedicated dataset comprising over two million samples generated in five common Persian fonts. The model achieves an accuracy of 81 per cent in recognizing Persian letters and 60 per cent in recognizing words. The most common errors occur in words that contain semi-spaces or signs.
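The abstract reports separate letter-level and word-level accuracies but does not state how they are computed. The following is a minimal sketch of one common convention, assumed here for illustration only: letter accuracy as one minus the total edit distance divided by the total number of reference characters, and word accuracy as the fraction of exact matches. The function names `edit_distance` and `ocr_accuracy` and the two sample words are hypothetical and are not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def ocr_accuracy(references, predictions):
    """Return (letter_accuracy, word_accuracy) for paired word samples.

    Assumed definitions (not confirmed by the paper):
      letter accuracy = 1 - total edit distance / total reference characters
      word accuracy   = share of predictions identical to their reference
    """
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have equal length")
    if not references:
        return 0.0, 0.0
    total_chars = sum(len(r) for r in references)
    total_errors = sum(edit_distance(r, p)
                       for r, p in zip(references, predictions))
    exact = sum(r == p for r, p in zip(references, predictions))
    letter_acc = max(0.0, 1.0 - total_errors / total_chars) if total_chars else 0.0
    return letter_acc, exact / len(references)


# Illustrative samples: the second word contains a semi-space
# (ZERO WIDTH NON-JOINER, U+200C) that the hypothetical prediction drops.
ZWNJ = "\u200c"
references = ["\u06a9\u062a\u0627\u0628",                       # 4 characters
              "\u0645\u06cc" + ZWNJ + "\u0634\u0648\u062f"]     # 6 characters
predictions = ["\u06a9\u062a\u0627\u0628",
               "\u0645\u06cc\u0634\u0648\u062f"]                # ZWNJ missing

letter_acc, word_acc = ocr_accuracy(references, predictions)
```

Under these assumed definitions, a single dropped semi-space costs one character out of ten but a whole word out of two, so the sample yields a letter accuracy of 0.9 and a word accuracy of 0.5. This is one plausible reason word-level figures sit well below letter-level figures.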

Volume 40, Issue 4 - Serial Number 124
Summer 2025
Pages 1283-1305

  • Received: 02 February 2025
  • Revised: 11 May 2025
  • Accepted: 19 May 2025