Categorization of Various Essential Datasets and Methods for Textual Spelling Detection and Normalization
Hadi Abdi Ghavidel, Molouk Sadat Hosseini Beheshti
Assistant Professor, Iranian Research Institute for Information Science and Technology (IranDoc)
Abstract:

Spelling error detection and grapheme normalization are among the earliest phases of automatic text processing. Text documents that are stored without passing through this phase are difficult to retrieve automatically. Specialists in natural language processing and computational linguistics therefore sample a variety of data and propose methods and algorithms for producing normalized text. Considerable research has been conducted on English and some other languages, followed by a smaller body of work on Farsi. Some of this work has remained purely academic, while some has been released as a product. This paper categorizes the methods and essential datasets used in these studies, describes each category individually, and gives a general account of the evaluation measures. It also describes the performance of monolingual Farsi systems and how they address the challenges specific to Farsi.
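
As a rough illustration of the two tasks named in the abstract, the sketch below first normalizes graphemes and then flags tokens that are missing from a word list. It is not a method taken from the paper: the two-entry mapping table (Arabic yeh and kaf replaced by their Farsi counterparts), the whitespace tokenizer, and the names `normalize` and `detect_errors` are all assumptions made for the example.

```python
import unicodedata

# Assumed, deliberately tiny mapping: Arabic code points that are often
# replaced by their Farsi counterparts. A real system would need more rules.
ARABIC_TO_FARSI = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def normalize(text):
    """Apply Unicode NFC, then map the Arabic variants to Farsi letters."""
    return unicodedata.normalize("NFC", text).translate(ARABIC_TO_FARSI)


def detect_errors(text, lexicon):
    """Return the normalized tokens that do not appear in the lexicon.

    Tokens are split on whitespace only, which is a simplification.
    """
    return [token for token in normalize(text).split() if token not in lexicon]


if __name__ == "__main__":
    # Hypothetical one-word lexicon: "ketab" (book) spelled with Farsi keheh.
    lexicon = {"\u06A9\u062A\u0627\u0628"}
    # Input uses Arabic kaf; the second token is not in the lexicon.
    sample = "\u0643\u062A\u0627\u0628 \u0643\u062A\u0628"
    print(detect_errors(sample, lexicon))
```

Normalizing before the lookup matters here: without it, a correctly spelled word typed with an Arabic kaf would be reported as an error simply because its code points differ from the lexicon entry.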

Keywords: spelling error detection, grapheme normalization, categorization of the methods, monolingual Farsi systems, Farsi language challenges
Type of Study: Review | Subject: Information Technology
Received: 2016/01/26 | Accepted: 2016/08/02 | Published: 2016/08/21
Abdi Ghavidel H, Hosseini Beheshti M S. Categorization of Various Essential Datasets and Methods for Textual Spelling Detection and Normalization. Journal of Information Processing and Management. 2009;
URL: http://jipm.irandoc.ac.ir/article-1-3088-en.html
Journal of Information Processing and Management