Subject Areas : Electrical Engineering
1 - Department of Electrical Engineering, Nowshahr Branch, Islamic Azad University, Nowshahr, Iran
[1] R. Johny Elton, J. Mohanalin and P. Vasuki, “A novel voice activity detection algorithm using modified global thresholding,” International Journal of Speech Technology, vol. 24, pp. 127–142, 2021.
[2] M. Eshaghi and M.R. Karami Mollaei, “Voice activity detection based on using wavelet packet,” Digital Signal Processing, vol. 20, pp. 1102–1115, 2010.
[3] C.T. Hsieh, P.Y. Huang, T.W. Chen and Y. Chen, “Speech enhancement based on sparse representation under color noisy environment,” 2015 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 134–138, 2015.
[4] G. Martin, A. Abeer, E. Dan et al., “All for one: feature combination for highly channel-degraded speech activity detection,” INTERSPEECH, Lyon 2013, pp. 709–713, 2013.
[5] M. Kolbæk, Zh. Tan, S. Jensen and J. Jensen, “On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 825–838, 2020.
[6] M. Eshaghi, F. Razzazi and A. Behrad, “A New VAD Algorithm using Sparse Representation in Spectro-Temporal Domain,” Journal of Information Systems and Telecommunication (JIST), vol. 7, pp. 709–713, 2019.
[7] M. Mirbagheri, N. Mesgarani, and Sh. Shamma, “Nonlinear filtering of spectro-temporal modulation in speech enhancement,” 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5478–5481, 2010.
[8] N. Mesgarani, S. David, and S.A. Shamma, “Representation of phoneme in primary auditory cortex: how the brain analyzes speech,” 2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 765–768, 2007.
[9] M. Eshaghi, F. Razzazi and A. Behrad, “A voice activity detection algorithm in spectro-temporal domain using sparse representation,” International Journal of Machine Learning and Cybernetics, vol. 10, pp. 1791–1803, 2019.
[10] W. Li, Y. Zhou, N. Poh, F. Zhou, and Q. Liao, “Feature Denoising Using Joint Sparse Representation for In-car Speech Recognition,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, pp. 681–684, 2013.
[11] C. Martínez, J. Goddard, D. Milone, and H. Rufiner, “Sparse spectro-temporal representation of speech for robust classification,” Computer Speech and Language, vol. 26, pp. 336–345, 2012.
[12] M. Elad, “Sparse and redundant representations: from theory to applications in signal and image processing,” Springer Science & Business Media, 2010.
[13] R. Rubinstein, A. M. Bruckstein and M. Elad, “Dictionaries for sparse representation modeling,” Proceedings of the IEEE, vol. 98, pp. 1045–1057, 2010.
[14] M. Wei, Zh. Liu, X. Chen and H. Zhao, “Speech enhancement based on sparse representation using joint dictionary,” 2018 International Conference on Computer Science, Electronics and Communication Engineering (CSECE), vol. 80, pp. 500–503, 2018.
[15] K. Kreutz-Delgado, J.F. Murray, B.D. Rao, K. Engan, T. Lee and T.J. Sejnowski, “Dictionary learning algorithms for sparse representation,” Neural Computation, vol. 15, pp. 349–396, 2003.
[16] P. O. Hoyer, “Non-negative matrix factorization with sparseness constraints,” The Journal of Machine Learning Research, vol. 5, pp. 1457–1469, 2004.
[17] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on Signal Processing, vol. 54, pp. 4311–4322, 2006.
[18] R. Zdunek and A. Cichocki, “Non-negative matrix factorization with quadratic programming,” Neural Computation, vol. 71, pp. 2309–2320, 2007.
[19] G. H. Mohimani, M. Babaie-Zadeh and Ch. Jutten, “A fast approach for overcomplete sparse decomposition based on smoothed L0 norm,” IEEE Transactions on Signal Processing, vol. 57, pp. 289–301, 2009.
[20] M. S. Lewicki and T. J. Sejnowski, “Learning overcomplete representations,” Neural Computation, vol. 12, pp. 337–365, 2000.
[21] Z. Jiang, G. Zhang, and L. S. Davis, “Submodular dictionary learning for sparse coding,” 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3418–3425, 2012.
[22] J.F. Gemmeke, H.V. Hamme, B. Cranen and L. Boves, “Compressive Sensing for Missing Data Imputation in Noise Robust Speech Recognition,” IEEE Journal of Selected Topics in Signal Processing, vol. 4, pp. 273–282, 2010.
[23] W. M. Fisher, G. R. Doddington, M. Goudie and M. Kathleen, “The DARPA speech recognition research database: specifications and status,” Proceedings of DARPA Workshop on Speech Recognition, CD-ROMs, 2005.
[24] A. Varga, H. J. M. Steeneken, M. Tomlinson and D. Jones, “The NOISEX-92 study on the effect of additive noise on automatic speech recognition,” Documentation included in the NOISEX-92 CD-ROMs, 1992.
[25] J. McLoughlin, “Super-Audible Voice Activity Detection,” IEEE Transactions on Speech and Audio Processing, vol. 22, pp. 1424–1433, 2014.
[26] P.K. Ghosh, A. Tsiartas and S. Narayanan, “Robust voice activity detection using long-term signal variability,” IEEE Transactions on Audio, Speech and Language Processing, vol. 11, pp. 600–613, 2011.
[27] J. Sohn, N. S. Kim and W. Sung, “A statistical model-based voice activity detection,” IEEE Signal Processing Letters, vol. 6, pp. 1–3, 1999.
[28] A. Benyassine, E. Shlomot, H. Y. Su, D. Massaloux, C. Lamblin and J. P. Petit, “ITU-T Recommendation G.729 Annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications,” IEEE Communications Magazine, vol. 35, pp. 64–73, 1997.
[29] N. Mesgarani and Sh. Shamma, “Denoising in the Domain of Spectro-temporal Modulations,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 12, pp. 1–9, 2007.
[30] L. N. Tan, B. J. Borgstrom, and A. Alwan, “Voice activity detection using harmonic frequency components in likelihood ratio test,” 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 4466–4469, 2010.
[31] J. Ramirez, J. Segura, C. Benitez, A. Torre and A. Rubio, “Voice activity detection with noise reduction and long-term spectral divergence estimation,” 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 271–287, 2004.
[32] M. Yanna and A. Nishihara, “Efficient voice activity detection algorithm using long-term spectral flatness measure,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 87, pp. 1–18, 2013.
[33] X.K. Yang, L. He, D. Qu and W.Q. Zhang, “Voice activity detection algorithm based on long-term pitch information,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 14, pp. 1–9, 2016.