CLASSIFYING THE BIOLOGICAL SEQUENCES USING THE ORDERED SET OF FREQUENT MOTIFS
Abstract
This paper develops algorithms for discovering frequent motifs, and ordered co-occurrence sets of frequent motifs, to support the classification of families of biological sequences. Our proposed algorithm, AprioriBioSequence, follows a data mining approach and discovers frequent motifs without requiring the motif length to be specified in advance. The paper also presents an algorithm for discovering ordered co-occurrence sets of frequent motifs, which are then used to classify biological sequences. The proposed algorithms are evaluated experimentally on E. coli promoter sequences, and the results are reported.
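To make the idea of length-free motif discovery concrete, the following is a minimal Python sketch of a generic Apriori-style, level-wise search. It is not the AprioriBioSequence algorithm itself, whose details are not given in this abstract. The sketch assumes that a motif is a contiguous substring, that its support is the fraction of sequences containing it, and that an ordered co-occurrence set can be simplified to an ordered pair of motifs in which the second occurs after the first without overlapping it. The function names frequent_motifs and ordered_pairs, the min_support parameter, and the default DNA alphabet are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


def frequent_motifs(sequences: List[str], min_support: float,
                    alphabet: str = "ACGT") -> Dict[str, float]:
    """Level-wise search for frequent contiguous motifs (illustrative only).

    Assumption: support(m) = fraction of sequences containing m.
    No motif length is given; the search stops when a level is empty.
    """
    if not sequences:
        return {}
    n = len(sequences)

    def support(motif: str) -> float:
        return sum(motif in s for s in sequences) / n

    found: Dict[str, float] = {}
    # Level 1: single letters that meet the support threshold.
    level = {a: support(a) for a in alphabet}
    level = {m: sup for m, sup in level.items() if sup >= min_support}
    while level:
        found.update(level)
        next_level: Dict[str, float] = {}
        for motif in level:
            for a in alphabet:
                cand = motif + a
                # Apriori pruning: the suffix of a frequent motif
                # must itself be frequent at the current level.
                if cand[1:] not in level:
                    continue
                sup = support(cand)
                if sup >= min_support:
                    next_level[cand] = sup
        level = next_level
    return found


def ordered_pairs(sequences: List[str], motifs: List[str],
                  min_support: float) -> Dict[Tuple[str, str], float]:
    """Ordered pairs (a, b) where b occurs after a in enough sequences.

    Assumption: pairs stand in for larger ordered co-occurrence sets.
    """
    if not sequences:
        return {}
    n = len(sequences)
    result: Dict[Tuple[str, str], float] = {}
    for a in motifs:
        for b in motifs:
            if a == b:
                continue
            count = 0
            for s in sequences:
                i = s.find(a)  # earliest occurrence of a
                if i != -1 and s.find(b, i + len(a)) != -1:
                    count += 1
            if count / n >= min_support:
                result[(a, b)] = count / n
    return result


if __name__ == "__main__":
    toy = ["ACGTAC", "TACGGA", "GGACGT"]  # made-up sequences, not real data
    print(sorted(frequent_motifs(toy, min_support=1.0), key=len))
```

In this sketch the ordered pairs that pass the threshold could serve as binary features for a classifier, which is one plausible reading of how ordered co-occurrence sets support classification; the method actually used in the paper may differ.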