
VIETNAMESE-ENGLISH CROSS-LANGUAGE INFORMATION RETRIEVAL (CLIR) USING BILINGUAL DICTIONARY

Nguyen Han Doan
Volume & Issue: Vol. 10 No. 13 (2007) | Page No.: 18-30 | DOI: 10.32508/stdj.v10i13.2861
Published: 2007-12-31

This article is published with open access by Viet Nam National University, Ho Chi Minh City, Viet Nam. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0) which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Abstract

Web content is growing explosively every day. A case study performed in 2001 [9] suggested that 70 percent of Internet content is in English, but only about 44 percent of Internet users are native English speakers. These numbers are expected to change, but English is still expected to play a dominant role. To give access to English digital documents from a search query written in Vietnamese, we propose a Cross-Language Information Retrieval (CLIR) technique that takes a query and translates it into phrases, which serve as the query text for retrieving relevant English search results. The technique uses web query logs to derive statistical information about patterns of word usage. This information then helps eliminate translation ambiguities by selecting the proper word sense (meaning) for translation. The proposed work also addresses the structure of the translated query, placing the translated words in a particular order to obtain relevant search results.
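The disambiguation idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the dictionary entries, the co-occurrence counts, and the function names (`cooccurrence`, `translate_query`) are all hypothetical, and the scoring rule (sum of pairwise co-occurrence counts) is only one simple way to use query-log statistics. The sketch assumes the Vietnamese query has already been segmented into terms and does not cover the word-ordering step.

```python
from itertools import combinations, product

# Toy bilingual dictionary (hypothetical entries):
# Vietnamese term -> candidate English translations.
DICTIONARY = {
    "giá": ["price", "shelf"],
    "đường": ["road", "sugar"],
}

# Hypothetical co-occurrence counts, standing in for statistics
# that would be mined from English web query logs.
COOCCURRENCE = {
    frozenset(["price", "sugar"]): 40,
    frozenset(["price", "road"]): 3,
    frozenset(["shelf", "road"]): 1,
}


def cooccurrence(a, b):
    """Return how often two English words appear together (0 if unseen)."""
    return COOCCURRENCE.get(frozenset([a, b]), 0)


def translate_query(terms):
    """Pick one translation per term so that the chosen words
    co-occur most often with each other in the (toy) query logs."""
    # Terms missing from the dictionary are passed through unchanged.
    candidate_lists = [DICTIONARY.get(term, [term]) for term in terms]
    best, best_score = None, -1
    for combo in product(*candidate_lists):
        score = sum(cooccurrence(a, b) for a, b in combinations(combo, 2))
        if score > best_score:
            best, best_score = combo, score
    return list(best)


print(translate_query(["giá", "đường"]))  # ['price', 'sugar']
```

With these toy counts, the sense pair "price" and "sugar" wins over alternatives such as "shelf" and "road". Enumerating every combination is exponential in query length, which is tolerable for short web queries but would need pruning for longer ones.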