:: Volume 33, Issue 2 (Winter 2018) ::
2018, 33(2): 885-914
A new Persian Text Summarization Approach based on Natural Language Processing and Graph Similarity
Tayyebeh Hosseinikhah, Abbas Ahmadi, Azadeh Mohebi
M.Sc. in Industrial Engineering, Amirkabir University of Technology
Abstract:

A significant amount of available information is stored in textual databases, which contain large collections of documents from different sources (such as news, articles, books, emails, and web pages). The growing visibility and importance of this class of information motivates the development of better automatic evaluation tools for textual resources.

Automatic text summarization is one way to save users’ time. Extractive summarization selects the most important sentences of an input text in order to shorten it while preserving the topics and subjects it covers.

In this paper, we aim to improve the accuracy of extracted summaries by combining natural language processing and text mining techniques. By modifying the algorithms involved and the sentence scoring measures, we increase accuracy compared with previously used techniques.

Part-of-speech tagging is used to calculate an importance coefficient for each word. This in turn helps pick the more meaningful words and phrases, which improves the accuracy of the system.
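
As an illustration of the idea only, the following minimal sketch scores words by frequency weighted with a part-of-speech coefficient. The tag names, the coefficient values in POS_WEIGHTS, and the helper names are assumptions made for this example; the abstract does not state the coefficients or the tagger the authors used.

```python
from collections import Counter

# Hypothetical importance coefficients per part-of-speech tag (assumed values,
# not taken from the paper).
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 0.5, "ADJ": 0.25}
DEFAULT_WEIGHT = 0.125  # assumed weight for all other tags


def word_importance(tagged_sentences):
    """Map each word to the sum of the POS coefficients of its occurrences."""
    scores = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            scores[word] += POS_WEIGHTS.get(tag, DEFAULT_WEIGHT)
    return dict(scores)


def sentence_score(sentence, importance):
    """Average importance of the words in one tagged sentence."""
    if not sentence:
        return 0.0
    return sum(importance.get(word, 0.0) for word, _ in sentence) / len(sentence)


# Input is assumed to be already tokenized and tagged as (word, tag) pairs.
tagged = [
    [("summary", "NOUN"), ("is", "VERB")],
    [("summary", "NOUN"), ("short", "ADJ")],
]
importance = word_importance(tagged)
scores = [sentence_score(s, importance) for s in tagged]
```

In this sketch a noun that recurs across sentences raises the score of every sentence containing it, which is one plausible way a tag-based coefficient could favor more meaningful words.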

Graph similarity methods are used to select sentences. Changing the weights of the selected sentences at each step addresses the redundancy problem.
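
The sketch below shows one way such a selection loop could look, under stated assumptions: sentences are bags of words, edges are cosine similarities, a sentence's initial weight is the sum of its edge weights, and after each pick every weight is scaled down by its similarity to the picked sentence. These choices are illustrative and are not the authors' exact similarity measure or re-weighting rule, which the abstract does not specify.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def summarize(sentences, k):
    """Greedily pick up to k sentence indices from a similarity graph."""
    vectors = [Counter(tokens) for tokens in sentences]
    n = len(vectors)
    # Adjacency matrix of the similarity graph (no self-loops).
    sim = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Initial weight: total similarity to all other sentences.
    weights = [sum(row) for row in sim]
    selected = []
    for _ in range(min(k, n)):
        best = max(
            (i for i in range(n) if i not in selected), key=lambda i: weights[i]
        )
        selected.append(best)
        # Assumed re-weighting rule: penalize sentences similar to the pick.
        for i in range(n):
            weights[i] *= max(0.0, 1.0 - sim[best][i])
    return sorted(selected)


sentences = [
    ["cat", "sat", "mat"],
    ["cat", "sat", "mat"],  # exact duplicate of the first sentence
    ["cat", "dog"],
    ["dog", "ran"],
]
picked = summarize(sentences, 2)
```

Because the duplicate sentence has similarity close to 1 with the first pick, its weight drops to about zero and the second pick goes to a different sentence, which is the redundancy effect the re-weighting is meant to achieve.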

Standard evaluation measures such as precision and recall are used to evaluate the results on a Persian corpus.
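
For reference, precision and recall for an extractive summary can be computed as in the minimal sketch below. It assumes that both the system summary and the reference summary are given as sets of sentence indices; the function name and the sample indices are hypothetical, and the abstract does not say at which granularity the authors compare summaries.

```python
def precision_recall(selected, reference):
    """Return (precision, recall) of selected sentences against a reference."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    # Precision: share of selected sentences that are in the reference.
    precision = overlap / len(selected) if selected else 0.0
    # Recall: share of reference sentences that were selected.
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical indices: the system picked 3 sentences, the reference has 4.
precision, recall = precision_recall([0, 2, 5], [0, 2, 3, 4])
```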

Keywords: Extractive Summarization, Natural Language Processing, Text Mining, Part of Speech Tagging, Similarity Graph.
Type of Study: Research | Subject: Information Technology
Received: 2015/04/23 | Accepted: 2017/01/08 | Published: 2017/02/01
Hosseinikhah T, Ahmadi A, Mohebi A. A new Persian Text Summarization Approach based on Natural Language Processing and Graph Similarity. Journal of Information Processing and Management. 2018; 33(2): 885-914
URL: http://jipm.irandoc.ac.ir/article-1-2842-en.html

Journal of Information Processing and Management (پژوهشنامه پردازش و مدیریت اطلاعات)