Modeling the speech recognition system using the deep learning technique of spiking neural networks
Subject Areas: Computer Engineering and IT
Melika Hamian 1, Karim Faez 2, Soheila Nazari 3, Maliheh Sabeti 4
1 - Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran
2 - Department of Electronics, Faculty of Electrical Engineering, Amirkabir University of Technology
3 - Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran
4 - Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran
Keywords: Spiking Neural Networks (SNN), Gradient-Based Optimization (GBO), Wild Horse Optimization (WHO)
Abstract:
Spiking neural network (SNN) architectures are inspired by the dynamics of biological spiking neurons. Through these dynamics, SNNs have great potential for capturing time-dependent patterns and can process data encoded as temporal events. However, training deep SNNs is not straightforward. In this paper, we propose a new layered SNN learning framework for fast and efficient pattern recognition, which uses optimization algorithms to train deep SNNs. In this framework, the two main parameters of the spiking neurons are searched and tuned separately for each layer using the gradient-based optimization (GBO) and wild horse optimization (WHO) algorithms. We use the resulting SNN to model a spoken-digit recognition system and compare its performance in different scenarios with other deep learning methods. Training the SNN on data extracted from different datasets shows improved recognition and estimation accuracy compared to prior work; in particular, the proposed SNN-WHO method achieved accuracies of 95.47% and 92.3%, outperforming its counterparts.
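The layer-wise parameter search described in the abstract can be sketched in code. This is a minimal illustrative sketch, not the authors' implementation: it assumes a leaky integrate-and-fire neuron whose two per-layer parameters are the firing threshold and the membrane decay (the abstract does not name the two parameters), and it uses a generic random-perturbation population search as a stand-in for the actual WHO/GBO update rules. All function names and the toy fitness objective are hypothetical.

```python
import random

def lif_layer(spikes, threshold, decay):
    """Leaky integrate-and-fire layer: the membrane potential leaks by
    `decay` each time step, accumulates the input, and emits a spike (1)
    with a reset whenever it reaches `threshold`."""
    v = 0.0
    out = []
    for s in spikes:
        v = v * decay + s
        if v >= threshold:
            out.append(1)
            v = 0.0
        else:
            out.append(0)
    return out

def network(spike_train, params):
    """Feed a spike train through stacked LIF layers; `params` holds one
    (threshold, decay) pair per layer."""
    x = spike_train
    for threshold, decay in params:
        x = lif_layer(x, threshold, decay)
    return x

def fitness(params, data):
    """Toy objective: absolute error between the network's output spike
    count and a target count for each labeled train (lower is better)."""
    return sum(abs(sum(network(spikes, params)) - target)
               for spikes, target in data)

def population_search(data, n_layers=2, pop=20, iters=50, seed=0):
    """Generic population-based search over (threshold, decay) per layer,
    standing in for the WHO/GBO metaheuristics from the abstract."""
    rng = random.Random(seed)
    def rand_params():
        return [(rng.uniform(0.5, 3.0), rng.uniform(0.1, 0.99))
                for _ in range(n_layers)]
    # Start from the best of a random population, then refine by
    # Gaussian perturbation, keeping candidates that do not worsen fitness.
    best = min((rand_params() for _ in range(pop)),
               key=lambda p: fitness(p, data))
    for _ in range(iters):
        cand = [(max(0.1, t + rng.gauss(0, 0.2)),
                 min(0.99, max(0.01, d + rng.gauss(0, 0.05))))
                for t, d in best]
        if fitness(cand, data) <= fitness(best, data):
            best = cand
    return best
```

In the actual method, the perturbation step would be replaced by the WHO or GBO update equations, and the fitness would be classification accuracy on the speech data rather than a spike-count target.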