Open Access

Abstract

This paper describes a method for building a neural-network-based continuous Vietnamese speech recognizer. Through phonetic analysis of Vietnamese, we determine the context-dependent phonemes for a given vocabulary, meaning that a phoneme is classified differently depending on the phonemes that surround it. Feature extraction uses the widely adopted mel-cepstrum: the short-time spectrum is warped according to the mel scale, and the log power spectrum is then transformed directly to the cepstral domain with an inverse discrete cosine transform. Neural networks are applied to estimate the context-dependent phoneme probabilities, producing output values in the range 0 to 1. The phoneme probabilities for successive frames are arranged in a matrix. We then use the Viterbi algorithm to find the legal string of phonemes through the matrix that gives the highest score; this string corresponds to the target word. The experiments were programmed in Microsoft Visual C++ 6.0. The high recognition accuracy obtained confirms the applicability of neural networks to Vietnamese speech recognition.
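The Viterbi search over the frame-by-phoneme probability matrix can be illustrated with a minimal sketch. It is written in Python for brevity rather than the Visual C++ used in the experiments, and it rests on assumptions not stated in the abstract: each word is modeled as a strict left-to-right phoneme sequence, every phoneme occupies at least one frame, no duration or transition penalties are applied, and scores are summed log probabilities. The function names, the toy `lexicon`, and the toy `frames` matrix are hypothetical.

```python
import math

NEG_INF = float("-inf")

def viterbi_word_score(log_probs, phoneme_ids):
    """Best log score for aligning all frames to one word's phoneme sequence.

    log_probs: list of frames; each frame is a list of log probabilities,
               one per phoneme category (assumed layout).
    phoneme_ids: category indices of the word's phonemes, in order.
    At each frame a path either stays in its phoneme or advances to the next.
    """
    n_states = len(phoneme_ids)
    # score[s] = best log score of any path ending in state s at this frame
    score = [NEG_INF] * n_states
    score[0] = log_probs[0][phoneme_ids[0]]
    for frame in log_probs[1:]:
        new_score = [NEG_INF] * n_states
        for s in range(n_states):
            stay = score[s]
            advance = score[s - 1] if s > 0 else NEG_INF
            best = max(stay, advance)
            if best > NEG_INF:
                new_score[s] = best + frame[phoneme_ids[s]]
        score = new_score
    # A legal path must end in the word's final phoneme.
    return score[-1]

def recognize(prob_matrix, lexicon):
    """Return the lexicon word whose best path scores highest."""
    # Floor probabilities to avoid log(0); the floor value is an assumption.
    log_probs = [[math.log(max(p, 1e-10)) for p in frame] for frame in prob_matrix]
    return max(lexicon, key=lambda w: viterbi_word_score(log_probs, lexicon[w]))

# Toy example: 4 frames, 3 phoneme categories (hypothetical data).
frames = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
]
lexicon = {"ba": [0, 1], "bi": [0, 2]}  # hypothetical word-to-phoneme mapping
print(recognize(frames, lexicon))  # prints "ba"
```

Working in the log domain keeps the scores numerically stable over long utterances, where a product of many values between 0 and 1 would otherwise underflow.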