Modeling a Speech Recognition System Using the Deep Learning Technique of Spiking Neural Networks

Subject areas: Computer Engineering and Information Technology

Melika Hamian ¹, Karim Faez ², Soheila Nazari ³, Maliheh Sabeti ⁴

1 - Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran
2 - Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran
3 - Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran
4 - Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran

Received: 1402/09/15 Accepted: 1403/07/02 Published: 1403/10/07 (Solar Hijri calendar)

Keywords: spiking neural networks, gradient-based optimization, wild horse optimization

Abstract:

The spiking neural network (SNN) architecture is inspired by the dynamics of spiking neurons. SNNs have great potential for capturing complex time-dependent patterns through dynamic spiking neurons, and they can process encoded data according to the timing of events. However, training deep SNNs is not straightforward. In this paper, we propose a new layer-wise SNN learning framework for fast and efficient pattern recognition, which uses optimization algorithms to train deep SNNs. In the proposed method, gradient-based optimization (GBO) and wild horse optimization (WHO) algorithms are used to search for and compute the two main parameters of the spiking neurons in the different layers of the deep network. We use SNNs to model a digital speech recognition system, and we compare and evaluate their performance in different scenarios against other deep learning methods. In this comparison, on data extracted from different datasets, the proposed SNN-WHO method achieved accuracies of 95.47% and 92.3% among its counterparts, which shows an improvement in identification and estimation accuracy over previous work.
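The abstract's central idea, using an optimizer to search for two parameters of the spiking neurons, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the neuron is a simple leaky integrate-and-fire model, the two tuned parameters are taken to be a firing threshold and a leak factor (the abstract does not name them), the objective is a toy spike-count target, and plain random search stands in for WHO and GBO.

```python
import random


def lif_spike_count(inputs, threshold, leak):
    """Simulate one leaky integrate-and-fire neuron and count its spikes.

    `threshold` and `leak` are hypothetical stand-ins for the two
    neuron parameters mentioned in the abstract.
    """
    v = 0.0
    spikes = 0
    for x in inputs:
        v = leak * v + x        # leaky integration of the input current
        if v >= threshold:      # fire when the potential reaches the threshold
            spikes += 1
            v = 0.0             # reset the potential after a spike
    return spikes


def loss(params, inputs, target_spikes):
    """Toy objective: distance between produced and desired spike counts."""
    threshold, leak = params
    return abs(lif_spike_count(inputs, threshold, leak) - target_spikes)


def random_search(inputs, target_spikes, n_iter=200, seed=0):
    """Plain random search over (threshold, leak).

    This is NOT wild horse optimization or the gradient-based optimizer;
    it only shows where such an optimizer would plug in.
    """
    rng = random.Random(seed)
    best = (1.0, 0.9)  # initial guess for (threshold, leak)
    best_loss = loss(best, inputs, target_spikes)
    for _ in range(n_iter):
        cand = (rng.uniform(0.1, 5.0), rng.uniform(0.0, 1.0))
        cand_loss = loss(cand, inputs, target_spikes)
        if cand_loss < best_loss:   # keep the candidate only if it improves
            best, best_loss = cand, cand_loss
    return best, best_loss


if __name__ == "__main__":
    toy_input = [0.5] * 100  # constant toy input current
    params, err = random_search(toy_input, target_spikes=10)
    print("best (threshold, leak):", params, "loss:", err)
```

In a layer-wise setting, a search of this kind would be run once per layer, with the loss replaced by a measure of recognition error on the training data.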

