Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Subject Areas: Image, Speech and Signal Processing
Reza Ashrafidoost 1, Saeed Setayeshi 2, Arash Sharifi 3
1 - Department of Computer Science, Science and Research Branch, Islamic Azad University, Tehran, Iran
2 - Amirkabir University of Technology, Tehran, Iran
3 - Department of Computer Science, Science and Research Branch, Islamic Azad University, Tehran, Iran