Design and Implementation of a Security Surveillance System Based on the YOLO Algorithm and Internet of Things Technology over a Cellular Data Network
Subject area: Electrical Engineering (Electronics)
Mohammadreza Masaeli 1, Seyed Mohammadali Zanjani 2
1 - Faculty of Electrical Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Iran
2 - Faculty of Electrical Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Iran
Keywords: Human Activity Recognition (HAR) system, machine vision, violence prevention, exporting and quantization, YOLO algorithm
Abstract:
The design and implementation of a security surveillance system based on the YOLO algorithm and Internet of Things (IoT) technology offers significant advantages in security, efficiency, scalability, rapid response, and reliability compared to traditional methods. This paper discusses the design of a novel security system that raises an alert upon detecting five categories: human, human head, gun, knife, and falls. The system is monitored online and can connect to the internet from any location via a cellular data network, uploading images to the management panel and sending a report to the user when a threat is detected. The YOLOv8 algorithm is used for training, taking advantage of its user-friendly command-line interface and its support for object detection, instance segmentation, and image classification. To increase processing speed while maintaining accuracy, the optimized model is deployed on a Raspberry Pi 4 board. Optimizing processing speed and applying quantization techniques also reduce the system's energy consumption (a green-energy system) and its operational costs. To speed up the model in the object detection process, three techniques are used: exporting, quantizing the trained weights, and increasing the processor frequency (overclocking). Comparing the newly exported weights against the original trained weights on accuracy and speed shows that exporting and quantization increase processing speed at the cost of a decrease in detection accuracy. Finally, with the proposed improvements, the trained model achieves a mean average precision of mAP ≅ 0.67 at a frame rate of FPS ≅ 4.3.
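As a minimal sketch of the export and quantization step described above (not the authors' exact pipeline), the Ultralytics API can convert trained YOLOv8 weights into lighter runtime formats; the file names "best.pt" and "data.yaml" are placeholders for the trained five-class weights and the dataset configuration.

# Sketch: export trained YOLOv8 weights to faster runtime formats.
from ultralytics import YOLO

model = YOLO("best.pt")  # hypothetical path to the trained 5-class weights

# FP16 ONNX export: smaller weights and faster inference,
# typically at a small accuracy cost.
model.export(format="onnx", imgsz=640, half=True)

# INT8 TFLite export: further speed and energy savings on ARM boards such
# as the Raspberry Pi 4; the dataset YAML supplies calibration images.
model.export(format="tflite", imgsz=640, int8=True, data="data.yaml")

The accuracy of each exported variant can then be compared against the original weights with model.val(data="data.yaml"), whose box.map50 metric corresponds to the kind of mAP figure reported above.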
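The reported frame rate can be checked with a simple timing loop; the sketch below assumes a camera at index 0 and an exported model file, both placeholders. The overclocking mentioned above is conventionally applied on a Raspberry Pi 4 by adding settings such as arm_freq=2000 and over_voltage=6 to /boot/config.txt, with adequate cooling.

# Sketch: measure end-to-end detection FPS on the target board.
import time

import cv2
from ultralytics import YOLO

model = YOLO("best.onnx")   # exported, quantized weights (placeholder)
cap = cv2.VideoCapture(0)   # default camera

n_frames, t0 = 0, time.perf_counter()
while n_frames < 100:       # time a fixed batch of frames
    ok, frame = cap.read()
    if not ok:
        break
    model(frame, imgsz=640, verbose=False)  # run detection on one frame
    n_frames += 1

cap.release()
elapsed = time.perf_counter() - t0
print(f"FPS ≈ {n_frames / elapsed:.1f}")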