Open Access

Abstract

Accurate part-of-speech (POS) tagging of words in Vietnamese texts is an important problem. It supports text parsing, helps resolve polysemy, and assists semantic information extraction systems, among other uses. This paper therefore presents an approach to POS tagging for Vietnamese texts. The method uses a probability model and relies on a lexicon listing the possible POS tags for each word, a manually labelled corpus, and the syntax and context of the texts. In parallel, we built a corpus with 75,000 entries and a lexicon with 80,000 entries for Vietnamese language processing research and application development.
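
The abstract does not specify the probability model, so the following is only a minimal sketch of one common way to combine a tag lexicon with a manually labelled corpus: a bigram hidden Markov model with add-one smoothing, decoded with the Viterbi algorithm, where each word may only receive tags listed in its lexicon entry. It is not the authors' method and does not model text style. The tag names (P, V, N, A), the toy corpus, and all function names are hypothetical.

```python
import math
from collections import defaultdict

def train(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit

def log_prob(counts, given, event, n_outcomes):
    """Add-one smoothed log P(event | given)."""
    total = sum(counts[given].values())
    return math.log((counts[given][event] + 1) / (total + n_outcomes))

def tag(words, lexicon, trans, emit, all_tags, vocab_size):
    """Viterbi decoding restricted to the tags the lexicon allows per word."""
    if not words:
        return []
    n_tags = len(all_tags)
    layers = []  # layers[i][t] = (best log score ending in t, previous tag)
    for i, word in enumerate(words):
        layer = {}
        # Assumption: a word missing from the lexicon may take any tag.
        for t in lexicon.get(word, all_tags):
            e = log_prob(emit, t, word, vocab_size)
            if i == 0:
                layer[t] = (log_prob(trans, "<s>", t, n_tags) + e, None)
            else:
                score, prev = max(
                    (s + log_prob(trans, p, t, n_tags), p)
                    for p, (s, _) in layers[-1].items()
                )
                layer[t] = (score + e, prev)
        layers.append(layer)
    # Follow back-pointers from the best final tag.
    t = max(layers[-1], key=lambda k: layers[-1][k][0])
    path = [t]
    for layer in reversed(layers[1:]):
        t = layer[t][1]
        path.append(t)
    return path[::-1]

# Hypothetical toy data: "đá" is ambiguous (noun "stone" / verb "kick").
corpus = [
    [("tôi", "P"), ("đá", "V"), ("bóng", "N")],
    [("tôi", "P"), ("học", "V"), ("bài", "N")],
    [("đá", "N"), ("cứng", "A")],
]
lexicon = {
    "tôi": ["P"], "đá": ["N", "V"], "bóng": ["N"],
    "học": ["V"], "bài": ["N"], "cứng": ["A"],
}
all_tags = ["A", "N", "P", "V"]
trans, emit = train(corpus)

print(tag(["tôi", "đá", "bóng"], lexicon, trans, emit, all_tags, len(lexicon)))
# ['P', 'V', 'N']
print(tag(["đá", "cứng"], lexicon, trans, emit, all_tags, len(lexicon)))
# ['N', 'A']
```

In this sketch the lexicon prunes the search space, while the corpus counts decide between the remaining candidates, so the ambiguous word "đá" is tagged as a verb after a pronoun and as a noun at the start of a sentence.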



Article Details

Issue: Vol 9 No 2 (2006)
Page No.: 11-22
Published: Feb 28, 2006
Section: Article
DOI: https://doi.org/10.32508/stdj.v9i2.2879

 Copyright Info

Copyright: The Authors. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

 How to Cite
Nguyen, C., Phan, T., & Cao, T. (2006). VIETNAMESE PART-OF-SPEECH TAGGING BASED ON STYLE OF TEXTS AND PROBABILITY MODEL. Science and Technology Development Journal, 9(2), 11-22. https://doi.org/10.32508/stdj.v9i2.2879
