Abstract
Document summarization produces an abridged version of a document or a set of documents. The problem has a long history, beginning with the work of Luhn in 1958. With the growing volume of information on the Internet, especially web pages in English and Vietnamese, researchers have become interested in developing summarization techniques that condense the content of web pages or documents. In this paper, we present the results of building a system that summarizes the content of Vietnamese web pages by extracting salient sentences from the original document. We use natural language processing techniques such as word segmentation, POS tagging, and compound noun extraction to increase the efficiency of document summarization and to open a path toward semantic text summarization.
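To illustrate what extracting salient sentences can look like in practice, the following is a minimal sketch and not the system described in the paper. It assumes a simple word-frequency score (a simplification loosely in the spirit of Luhn's frequency-based approach), a naive punctuation-based sentence splitter, and a regular-expression tokenizer standing in for a real Vietnamese word segmenter. The function names `split_sentences`, `tokenize`, and `summarize` are hypothetical and chosen only for this example.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # Placeholder tokenizer (assumption): a real Vietnamese pipeline would use
    # a word segmenter here, since Vietnamese words can span several syllables.
    return re.findall(r"\w+", sentence.lower())


def summarize(text, k=2, stopwords=frozenset()):
    """Return the k highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    tokens = [[w for w in tokenize(s) if w not in stopwords] for s in sentences]

    # Document-wide word frequencies serve as a crude salience signal.
    freq = Counter(w for toks in tokens for w in toks)

    def score(toks):
        # Average frequency of the sentence's words; empty sentences score 0.
        return sum(freq[w] for w in toks) / len(toks) if toks else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokens[i]), reverse=True)
    chosen = sorted(ranked[:k])  # restore original order for readability
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sample = "Cats chase mice. Cats sleep. Dogs bark loudly at night."
    print(summarize(sample, k=2))
```

Steps such as POS tagging or compound noun extraction, which the abstract mentions, could plausibly feed into the scoring function (for example by weighting nouns more heavily), but how the paper combines them is not stated here.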
Issue: Vol 8 No 10 (2005)
Page No.: 13-22
Published: Oct 31, 2005
Section: Article
DOI: https://doi.org/10.32508/stdj.v8i10.3076