
DEVELOPING A MOTIF BASED CLUSTERING ALGORITHM FOR SUPPORTING THE QUERY IN DATABASE OF DNA SEQUENCES

Hoang Kiem, Do Phuc
Volume & Issue: Vol. 4 No. 1&2 (2001) | Page No.: 83-89 | DOI: 10.32508/stdj.v4i1&2.3477
Published: 2001-02-28

Copyright The Author(s) 2023. This article is published with open access by Vietnam National University, Ho Chi Minh city, Vietnam. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0) which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited. 

Abstract

We have developed a system for supporting queries on a database of DNA sequences. The system groups similar DNA sequences into clusters based on frequent motifs or motif phrases. Each cluster is represented by a cluster feature vector of maximal frequent motifs (or motif phrases), and a motif tree of the cluster features is built. A similarity search then proceeds in two steps. First, the system searches for the clusters that best match the query pattern. Second, a traditional matching technique (FASTA or BLAST) is used to match the pattern against the small number of DNA sequences in the selected clusters.
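
The two-step search described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy clusters, the names `cluster_score` and `two_step_search`, and the `top_clusters` parameter are all hypothetical. The sketch scans the cluster features in a flat loop rather than through a motif tree, and it uses `difflib.SequenceMatcher` from the standard library purely as a stand-in for FASTA or BLAST.

```python
from difflib import SequenceMatcher

# Hypothetical toy data: each cluster keeps a feature set of frequent motifs
# and the DNA sequences that were grouped into it.
clusters = {
    "c1": {"motifs": {"ACGT", "GGCC"},
           "sequences": ["TTACGTAAGGCCTT", "ACGTGGCCAA"]},
    "c2": {"motifs": {"TATA", "CAAT"},
           "sequences": ["GGTATAAACAATGG", "TATACAATCC"]},
}


def cluster_score(query, motifs):
    """Count how many of the cluster's motifs occur in the query pattern."""
    return sum(1 for motif in motifs if motif in query)


def two_step_search(query, clusters, top_clusters=1):
    """Return (similarity, cluster_id, sequence) tuples, best match first."""
    # Step 1: rank clusters by how well their motif features match the query.
    # Assumption: a flat scan replaces the motif tree used in the paper.
    ranked = sorted(
        clusters,
        key=lambda cid: cluster_score(query, clusters[cid]["motifs"]),
        reverse=True,
    )
    selected = ranked[:top_clusters]

    # Step 2: compare the query only with sequences in the selected clusters.
    # Assumption: SequenceMatcher stands in for FASTA/BLAST here.
    hits = []
    for cid in selected:
        for seq in clusters[cid]["sequences"]:
            similarity = SequenceMatcher(None, query, seq).ratio()
            hits.append((similarity, cid, seq))
    hits.sort(reverse=True)
    return hits


if __name__ == "__main__":
    for similarity, cid, seq in two_step_search("CCTATAAACAATCC", clusters):
        print(f"{cid}\t{seq}\t{similarity:.2f}")
```

The point of the design is that the expensive pairwise comparison in step 2 runs only over the members of the selected clusters, not over the whole database.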
