A New Method for Text-Independent Speaker Recognition in Noisy Environments

Heidari Esfahani, Nona; Mahmoudian, Hamid

Article ID: 554892 | Pages: 33-44

20.1001.1.23223871.1393.5.19.4.7

Manuscript type: Research

Nona Heidari Esfahani ¹, Hamid Mahmoudian ²

1 - M.Sc., Persian Foolad Company, Isfahan
2 - Assistant Professor, Faculty of Electrical Engineering, Islamic Azad University, Najafabad Branch

Submitted: Tuesday, 15 Ramadan 1434 | Accepted: Thursday, 23 Rajab 1435 | Published: Monday, 9 Safar 1436

Keywords: MLP, Shannon entropy, speaker recognition, MFCCs, fundamental frequency, formants

Abstract:

This paper addresses noise-robust speaker recognition in the text-independent setting. The proposed method removes silence from the utterances and segments them into smaller units, each containing several phonemes and at least one vowel, from which long-term features such as entropy are extracted. In each speech segment, one high-energy vowel is identified and used to extract the fundamental frequency and the formants. A clustering method is then applied to combine the short-term features, namely the MFCCs, with the long-term features. Experiments with an MLP classifier show that the proposed method achieves an average speaker recognition rate of 97.33% in the noise-free condition and 61.33% at a signal-to-noise ratio of -2 dB, an improvement over conventional methods.
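The abstract does not say how silence removal, entropy, or pitch extraction are implemented, so the following is only a minimal sketch under stated assumptions. It uses a relative frame-energy threshold for silence removal, the Shannon entropy of an amplitude histogram as the long-term feature, and a plain autocorrelation peak for the fundamental frequency. That last choice is a stand-in and may not be the estimator used by the authors. All function names, thresholds, and bin counts are hypothetical.

```python
import math

def frame_energies(signal, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]

def remove_silence(signal, frame_len, rel_threshold=0.1):
    """Keep frames with energy >= rel_threshold * max energy (assumed rule)."""
    energies = frame_energies(signal, frame_len)
    if not energies:
        return []
    limit = rel_threshold * max(energies)
    kept = []
    for k, energy in enumerate(energies):
        if energy >= limit:
            kept.extend(signal[k * frame_len:(k + 1) * frame_len])
    return kept

def shannon_entropy(segment, n_bins=16):
    """Shannon entropy in bits of the segment's amplitude histogram."""
    lo, hi = min(segment), max(segment)
    if hi == lo:
        return 0.0  # a constant segment carries no information
    counts = [0] * n_bins
    for x in segment:
        idx = min(int((x - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(segment)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

def estimate_f0(segment, sample_rate, f_min=60.0, f_max=400.0):
    """Fundamental frequency from the autocorrelation peak in [f_min, f_max]."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(segment) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(segment[i] * segment[i + lag]
                  for i in range(len(segment) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag

# Synthetic check: a 200 Hz tone surrounded by silence, sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
signal = [0.0] * 400 + tone + [0.0] * 400

voiced = remove_silence(signal, frame_len=200)
h = shannon_entropy(voiced)
f0 = estimate_f0(voiced, sample_rate)
print(len(voiced), round(h, 3), round(f0, 1))
```

In a fuller pipeline, these per-segment values would be concatenated with short-term MFCC statistics before classification. The clustering step that the abstract mentions for combining them is not specified there and is not reproduced here.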
