Abstract
Discovering similar sub-sequences in a set of DNA sequences is an important problem in biotechnology. From the positions of similar sub-sequences, we can identify features shared by a group of genes with similar function, or locate the positions of point mutations. In this paper, we analyze two problems, repeated sub-sequence discovery and approximate sub-sequence discovery, and employ a large-set discovery algorithm for the former and a genetic algorithm for the latter in a set of DNA sequences. In the repeated sub-sequence discovery algorithm, we treat a repeated sub-sequence as a large set and employ the Apriori-TiD algorithm (Agrawal, 1994) to discover the maximal large sets. In the approximate sub-sequence discovery algorithm, we treat each chromosome of the genetic algorithm as a potential solution and use the genetic algorithm to select the right solution. Both proposed algorithms work well with large data sets and therefore meet the demands of large gene data sets. In addition, we propose a heuristic for speeding up solution discovery. We apply the proposed algorithms to the promoter data set from the University of California, Irvine, USA, and present the experimental results.
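The idea of treating a repeated sub-sequence as a large set can be illustrated with a minimal Python sketch. It is not the Apriori-TiD implementation used in the paper: it is a simplified level-wise search that assumes "repeated" means a contiguous substring occurring in at least `min_support` of the input sequences. The function names, the `min_support` parameter, and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def frequent_substrings(sequences, min_support):
    """Return {substring: support} for every contiguous substring that
    occurs in at least min_support sequences, found level by level."""
    result = {}

    # Level 1: count in how many sequences each single symbol appears.
    counts = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for ch in seq:
            counts[ch].add(idx)
    current = {s: len(ids) for s, ids in counts.items()
               if len(ids) >= min_support}

    k = 1
    while current:
        result.update(current)
        k += 1
        counts = defaultdict(set)
        for idx, seq in enumerate(sequences):
            for i in range(len(seq) - k + 1):
                cand = seq[i:i + k]
                # Apriori-style pruning: a length-k substring can only be
                # frequent if its length-(k-1) prefix and suffix both are.
                if cand[:-1] in current and cand[1:] in current:
                    counts[cand].add(idx)
        current = {s: len(ids) for s, ids in counts.items()
                   if len(ids) >= min_support}
    return result


def maximal(frequent):
    """Keep only the frequent substrings that are not contained in a
    longer frequent substring (the maximal large sets of this sketch)."""
    return {s: c for s, c in frequent.items()
            if not any(s != t and s in t for t in frequent)}


if __name__ == "__main__":
    # Toy sequences, invented for illustration (not the promoter data).
    seqs = ["ACGTAC", "TTACGT", "GACGTT"]
    print(maximal(frequent_substrings(seqs, min_support=3)))
```

Each pass scans the sequences once and counts only candidates whose shorter parts were already frequent, which is the pruning principle that keeps level-wise methods practical on large data sets.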
Issue: Vol 3 No 7&8 (2000)
Page No.: 5-11
Published: Aug 31, 2000
Section: Article
DOI: https://doi.org/10.32508/stdj.v3i7&8.3573