Voiced-Unvoiced-Silence Detection of Speech Signal using Combined Spectro-Temporal Features
Subject areas: Fuzzy Optimization and Modeling Journal
1 - Department of Electrical Engineering, Qaemshahr Branch, Islamic Azad University, Qaemshahr, Iran
Keywords: Clustering, Weighted Gaussian mixture model, Speech segmentation, Spectro-temporal features
Abstract:
This paper presents a new method for classifying the voiced, unvoiced and silence segments of a speech signal. The proposed method segments speech using a combination of spectro-temporal features, which are extracted by clustering in the spectro-temporal domain: the multi-dimensional output of an auditory model is clustered using a weighted Gaussian mixture model. After the main clusters have been extracted for each frame, combined spectro-temporal features such as the energy of each cluster, the energy difference between clusters and the minimum value of the normalized cross-correlation between clusters are used to detect the voiced, unvoiced and silence regions of speech. Segmentation is performed by comparing each class of features with an appropriate threshold value. The combined spectro-temporal features are also used for speech segmentation in noisy conditions. The results demonstrate the performance of the proposed algorithm compared with other features for speech segmentation.
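The threshold-comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the threshold values, the order of the tests and the direction of each comparison (low energy taken as silence, high minimum cross-correlation taken as voiced) are all assumptions made for illustration, and the per-frame features are assumed to have been computed beforehand from the clusters.

```python
import numpy as np


def normalized_cross_correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-length vectors.

    Returns 0.0 when either vector has zero norm, to avoid dividing by zero.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0
    return float(np.dot(a, b) / denom)


def classify_frame(cluster_energy, min_ncc,
                   energy_threshold=0.01, ncc_threshold=0.5):
    """Label one frame as 'silence', 'voiced' or 'unvoiced'.

    Hypothetical decision rule (not taken from the paper):
      1. cluster energy below energy_threshold -> silence
      2. otherwise, minimum normalized cross-correlation between
         clusters at or above ncc_threshold    -> voiced
      3. otherwise                             -> unvoiced
    Both threshold defaults are placeholders, not published values.
    """
    if cluster_energy < energy_threshold:
        return "silence"
    if min_ncc >= ncc_threshold:
        return "voiced"
    return "unvoiced"


def segment(frame_features, **thresholds):
    """Classify a sequence of (cluster_energy, min_ncc) pairs frame by frame."""
    return [classify_frame(e, c, **thresholds) for e, c in frame_features]
```

In practice the thresholds would have to be tuned on labeled speech, and possibly adapted to the noise level, rather than fixed as in this sketch.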