Open Access

Abstract

Document summarization produces an abridged version of a document or a set of documents. The problem has a long history, beginning with the work of Luhn in 1958. As the volume of information on the Internet grows, especially Web pages in English and Vietnamese, techniques for summarizing the content of Web pages and documents have attracted increasing interest from researchers. In this paper, we present the results of building a system that summarizes the content of Vietnamese Web pages by extracting salient sentences from the original document. We use natural language processing techniques such as word segmentation, POS tagging, and compound noun extraction to increase the efficiency of document summarization and to open a path toward semantic text summarization.
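To make the idea of extracting salient sentences concrete, the following is a minimal sketch of frequency-based sentence scoring, loosely in the spirit of Luhn's approach. It is not the system described in the paper: the stop-word list, the function names, and the regular-expression tokenizer are all assumptions made for illustration, and real Vietnamese text would need a proper word segmenter in place of `tokenize`.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "is", "and", "on"}


def split_sentences(text):
    """Split after '.', '!' or '?' followed by whitespace (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercased word tokens with stop words removed.

    Assumption: words are separated by non-word characters. Vietnamese
    compound words span several syllables, so a word segmenter would be
    needed here for real input.
    """
    words = re.findall(r"\w+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize(text, k=1):
    """Return the k highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    # Document-wide term frequencies.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average document frequency of the sentence's words.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Calling `summarize(text, k=2)` returns the two sentences whose words are, on average, most frequent in the document. Averaging rather than summing keeps long sentences from winning on length alone; POS tags or extracted compound nouns could be used to weight terms, but that extension is not shown here.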



Article Details

Issue: Vol 8 No 10 (2005)
Page No.: 13-22
Published: Oct 31, 2005
Section: Article
DOI: https://doi.org/10.32508/stdj.v8i10.3076

 Copyright Info

Copyright: The Authors. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

 How to Cite
Phuc, D., & Anh Thu, H. (2005). EXTRACTING AND SUMMARIZING THE CONTENT OF VIETNAMESE WEB PAGES. Science and Technology Development Journal, 8(10), 13-22. https://doi.org/10.32508/stdj.v8i10.3076
