A Stochastic-Process Methodology for Detecting Anomalies at Runtime in Embedded Systems
Subject areas: Transactions on Fuzzy Sets and Systems
Alfredo Cuzzocrea 1, Enzo Mumolo 2, Islam Belmerabet 3, Abderraouf Hafsaoui 4
1 - iDEA Lab, University of Calabria, Rende, Italy.
2 - Department of Engineering, University of Trieste, Trieste, Italy.
3 - iDEA Lab, University of Calabria, Rende, Italy.
4 - iDEA Lab, University of Calabria, Rende, Italy.
Keywords: Anomaly detection, Embedded systems, Stochastic processes, Inference models.
Abstract:
Embedded computing systems are highly vulnerable to anomalies that occur while deployed software is executing. Such anomalies can be caused, for example, by faults, bugs, or deadlocks, and they can have very dangerous consequences for the systems that embedded computing devices control. Because embedded systems are designed to operate autonomously, i.e., without any human intervention, debugging an application to handle an anomaly is very difficult, if not impossible. Anomaly detection algorithms are therefore the primary means of becoming aware of anomalous conditions. In this paper, we describe a novel approach for detecting an anomaly during the execution of one or more applications. The algorithm exploits differences in the behavior of the memory reference sequences generated during execution. Memory reference sequences are monitored in real time using the PIN tracing tool. Each memory reference sequence is divided into randomly selected blocks, and each block is described spectrally with the Discrete Cosine Transform (DCT) [36]. Experimental analysis performed on various benchmarks shows very low error rates for the anomalies tested.
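The block-and-transform step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the block length, the number of blocks, how many DCT coefficients are kept, or how the blocks are sampled, so `block_len`, `n_blocks`, `n_coeffs`, the uniform sampling of start offsets, and both function names are assumptions made only for illustration. The trace is synthetic rather than real PIN output, and the inference model that would consume the features is not shown.

```python
import math
import random


def dct_ii(block):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi/N * (n + 0.5) * k)."""
    n_samples = len(block)
    return [
        sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(block))
        for k in range(n_samples)
    ]


def random_block_features(addresses, block_len, n_blocks, n_coeffs, seed=0):
    """Pick random block offsets and keep the first n_coeffs DCT terms of each.

    Hypothetical helper: uniform random offsets are an assumption, since the
    abstract only says the blocks are randomly selected.
    """
    rng = random.Random(seed)
    max_start = len(addresses) - block_len
    if max_start < 0:
        raise ValueError("trace is shorter than one block")
    features = []
    for _ in range(n_blocks):
        start = rng.randint(0, max_start)  # inclusive on both ends
        block = addresses[start:start + block_len]
        features.append(dct_ii(block)[:n_coeffs])
    return features


if __name__ == "__main__":
    # Synthetic stand-in for a memory reference trace (not real PIN output).
    trace = [0x1000 + 8 * (i % 16) for i in range(256)]
    feats = random_block_features(trace, block_len=32, n_blocks=4, n_coeffs=8)
    print(len(feats), len(feats[0]))
```

Keeping only the leading coefficients gives each block a short, fixed-length spectral description. A production version would more likely use a fast DCT such as the one in [36] instead of this direct O(N^2) sum.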
[1] Bonifati A, Cuzzocrea A. Storing and retrieving XPath fragments in structured P2P networks. Data & Knowledge Engineering. 2006; 59(2): 247-269. DOI: https://doi.org/10.1016/j.datak.2006.01.011
[2] Bonifati A, Cuzzocrea A. Efficient fragmentation of large XML documents. In: International Conference on Database and Expert Systems Applications, DEXA 2007, 3-7 September, 2007, Regensburg, Germany. 2007. p.539-550. DOI: https://doi.org/10.1007/978-3-540-74469-6_53
[3] Calzarossa M, Marie R, Trivedi KS. System performance with user behavior graphs. Performance Evaluation. 1990; 11(3): 155-164. DOI: https://doi.org/10.1016/0166-5316(90)90008-7
[4] Calzarossa MC, Massari L, Tessera D. Workload characterization: A survey revisited. ACM Computing Surveys. 2016; 48(3): 1-43. DOI: https://doi.org/10.1145/2856127
[5] Chandola V, Banerjee A, Kumar V. Anomaly detection for discrete sequences: A survey. IEEE Transactions on Knowledge and Data Engineering. 2012; 24(5): 823-839. DOI: https://doi.org/10.1109/TKDE.2010.235
[6] Chong F, Chua T, Lim EP, Huberman BA. Detecting flow anomalies in distributed systems. In: 2014 IEEE International Conference on Data Mining, ICDM 2014, 14-17 December, 2014, Shenzhen, China. IEEE; 2014. p.100-109. DOI: https://doi.org/10.1109/ICDM.2014.94
[7] Crovella ME, Bestavros A. Self-similarity in world wide web traffic: Evidence and possible causes. IEEE/ACM Transactions on Networking. 1997; 5(6): 835-846. DOI: https://doi.org/10.1109/90.650143
[8] Cunha C, Bestavros A, Crovella M. Characteristics of WWW Client-based Traces. Boston University; 1995.
[9] Cuzzocrea A, Darmont J, Mahboubi H. Fragmenting very large XML data warehouses via K-means clustering algorithm. International Journal of Business Intelligence and Data Mining. 2009; 4(3-4): 301-328. DOI: https://doi.org/10.1504/IJBIDM.2009.029076
[10] Cuzzocrea A, Furfaro F, Masciari E, Saccà D, Sirangelo C. Approximate Query Answering on Sensor Network Data Streams. In: Stefanidis A, Nittel S. (eds.) GeoSensor Networks. Boca Raton, FL, USA: CRC Press; 2004. p.53-72. DOI: https://doi.org/10.1201/9780203356869.ch4
[11] Cuzzocrea A, Furfaro F, Saccà D. Hand-OLAP: A system for delivering OLAP services on handheld devices. In: The Sixth International Symposium on Autonomous Decentralized Systems, ISADS 2003, 9-11 April 2003, Pisa, Italy. IEEE; 2003. p.80-87. DOI: https://doi.org/10.1109/ISADS.2003.1193935
[12] Cuzzocrea A, Matrangolo U. Analytical synopses for approximate query answering in OLAP environments. In: International Conference on Database and Expert Systems Applications, DEXA 2004, August 30-September 3, 2004, Zaragoza, Spain. 2004. p.359-370. DOI: https://doi.org/10.1007/978-3-540-30075-5_35
[13] Cuzzocrea A, Mumolo E, Cecolin R. Runtime anomaly detection in embedded systems by binary tracing and hidden Markov models. In: 2015 IEEE Annual Computer Software and Applications Conference, COMPSAC 2015, 1-5 July 2015, Taichung, Taiwan. IEEE; 2015. p.15-22. DOI: https://doi.org/10.1109/COMPSAC.2015.89
[14] Cuzzocrea A, Saccà D. Balancing accuracy and privacy of OLAP aggregations on data cubes. In: Proceedings of the ACM 13th International Workshop on Data Warehousing and OLAP, DOLAP 2010, 30 October 2010, Toronto, Ontario, Canada. New York, NY, United States: Association for Computing Machinery; 2010. p.93-98. DOI: https://doi.org/10.1145/1871940.1871960
[15] Cuzzocrea A, Saccà D, Serafino P. A hierarchy-driven compression technique for advanced OLAP visualization of multidimensional data cubes. In: Data Warehousing and Knowledge Discovery: 8th International Conference, DaWaK 2006, September 4-8, 2006, Krakow, Poland. 2006. p.106-119. DOI: https://doi.org/10.1007/11823728_11
[16] Cuzzocrea A, Saccà D, Ullman JD. Big data: A research agenda. In: Proceedings of the 17th International Database Engineering & Applications Symposium, IDEAS 2013, 9-11 October 2013, Barcelona, Spain. New York, NY, United States: Association for Computing Machinery; 2013. p.198-203. DOI: https://doi.org/10.1145/2513591.2527071
[17] Devijver PA. Baum's forward-backward algorithm revisited. Pattern Recognition Letters. 1985; 3(6): 369-373. DOI: https://doi.org/10.1016/0167-8655(85)90023-6
[18] Gao B, Ma H-Y, Yang Y-H. HMMs (Hidden Markov models) based on anomaly intrusion detection method. In: Proceedings. International Conference on Machine Learning and Cybernetics, ICMLC 2002, 4-5 November 2002, Beijing, China. IEEE; 2002. p.381-385. DOI: https://doi.org/10.1109/ICMLC.2002.1176779
[19] González FA, Dasgupta D. Anomaly detection using real-valued negative selection. Genetic Programming and Evolvable Machines. 2003; 4: 383-403. DOI: https://doi.org/10.1023/A:1026195112518
[20] Hansen JP, Tan KMC, Maxion RA. Anomaly detector performance evaluation using a parameterized environment. In: Proceedings of 9th International Symposium on Recent Advances in Intrusion Detection, RAID 2006, 20-22 September 2006, Hamburg, Germany. 2006. p.106-126. DOI: https://doi.org/10.1007/11856214_6
[21] Haring G, Lindemann C, Reiser M. Performance Evaluation: Origins and Directions. Berlin, Germany: Springer; 2000. DOI: https://doi.org/10.1007/3-540-46506-5
[22] Hsu WW, Smith AJ, Young HC. Characteristics of production database workloads and the TPC benchmarks. IBM Systems Journal. 2001; 40(3): 781-802. DOI: https://doi.org/10.1147/sj.403.0781
[23] Intel. Intel PIN Tool. http://www.pintool.org/ [Accessed 15th June 2024].
[24] Jeffrey N, Tan Q, Villar JR. A review of anomaly detection strategies to detect threats to cyber-physical systems. Electronics. 2023; 12(15): 3283. DOI: https://doi.org/10.3390/electronics12153283
[25] KCachegrind. kCacheGrind/CoreGrind. http://kcachegrind.sourceforge.net [Accessed 15th June 2024].
[26] Kuo S, Agazzi OE. Automatic keyword recognition using hidden Markov models. Journal of Visual Communication and Image Representation. 1994; 5(3): 265-272. DOI: https://doi.org/10.1006/jvci.1994.1024
[27] Landauer C, Bellman KL. Detecting anomalies in constructed complex systems. In: Proceedings of the 33rd IEEE Annual Hawaii International Conference on System Sciences, HICSS 2000, 4-7 January, 2000, Maui, Hawaii, USA. IEEE; 2000. p.9. DOI: https://doi.org/10.1109/HICSS.2000.926733
[28] Li X, Han J, Kim S, Gonzalez H. Anomaly detection in moving object. In: Intelligence and Security Informatics. 2008. p.357-381. DOI: https://doi.org/10.1007/978-3-540-69209-6_19
[29] Li Z, Tian JF, Yang XH. Program behavior monitoring based on system call attributes. Journal of Computer Research and Development. 2012; 49(8): 1676-1684.
[30] Li N, Yu SZ. Periodic hidden Markov model-based workload clustering and characterization. In: 2008 8th IEEE International Conference on Computer and Information Technology, CIT 2008, 8-11 July 2008, Sydney, NSW, Australia. IEEE; 2008. p.378-383. DOI: https://doi.org/10.1109/CIT.2008.4594705
[31] Linde Y, Buzo A, Gray R. An algorithm for vector quantizer design. IEEE Transactions on Communications. 1980; 28(1): 84-95. DOI: https://doi.org/10.1109/TCOM.1980.1094577
[32] Lu S, Lysecky R. Time and Sequence Integrated Runtime Anomaly Detection for Embedded Systems. ACM Transactions on Embedded Computing Systems. 2017; 17(2): 1-27. DOI: https://doi.org/10.1145/3122785
[33] Luthi J. Histogram-based characterization of workload parameters and its consequences on model analysis. In: Proceedings of 6th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, MASCOTS 1998, 19-24 July 1998, Montreal, Canada. 1998. p.1.
[34] Ma X, Schonfeld D, Khokhar A. A general two-dimensional hidden Markov model and its application in image classification. In: 2007 IEEE International Conference on Image Processing, ICIP 2007, 16 September - 19 October 2007, San Antonio, TX, USA. IEEE; 2007. p.VI41-VI44. DOI: https://doi.org/10.1109/ICIP.2007.4379516
[35] Madhyastha TM, Reed DA. Learning to classify parallel input/output access patterns. IEEE Transactions on Parallel and Distributed Systems. 2002; 13(8): 802-813. DOI: https://doi.org/10.1109/TPDS.2002.1028437
[36] Makhoul J. A fast cosine transform in one and two dimensions. IEEE Transactions on Acoustics, Speech, and Signal Processing. 1980; 28(1): 27-34. DOI: https://doi.org/10.1109/TASSP.1980.1163351
[37] Maxion RA, Tan KMC. Anomaly detection in embedded systems. IEEE Transactions on Computers. 2002; 51(2): 108-120. DOI: https://doi.org/10.1109/12.980003
[38] McDonell KJ. Benchmark frameworks and tools for modelling the workload profile. Performance Evaluation. 1995; 22(1): 23-41. DOI: https://doi.org/10.1016/0166-5316(94)E0036-I
[39] Moro A, Mumolo E, Nolich M. Ergodic continuous hidden Markov models for workload characterization. In: 2009 Proceedings of 6th International Symposium on Image and Signal Processing and Analysis, ISPA 2009, 16-18 September 2009, Salzburg, Austria. IEEE; 2009. p.99-104. DOI: https://doi.org/10.1109/ISPA.2009.5297771
[40] Moro A, Mumolo E, Nolich M. Workload modeling using pseudo2D-HMM. In: 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems, MASCOTS 2009, 21-23 September 2009, London, UK. IEEE; 2009. p.1-2. DOI: https://doi.org/10.1109/MASCOT.2009.5366721
[41] Nikiforov V. The energy of graphs and matrices. Journal of Mathematical Analysis and Applications. 2007; 326(2): 1472-1475. DOI: https://doi.org/10.1016/j.jmaa.2006.03.072
[42] Nikolaou C, Labrindis A, Bohn V, Ferguson D, Artavanis M, Kloukinas C, Marazakis M. The impact of workload clustering on transaction routing. FORTH, Institute of Computer Science, Technical Report. 1998; p.238.
[43] Ohno Y, Sugaya M, Van Der Zee A, Nakajima T. Anomaly detection system using resource pattern learning. In: 2009 Software Technologies for Future Dependable Distributed Systems, STFSSD 2009, 17 March 2009, Tokyo, Japan. IEEE; 2009. p.38-42. DOI: https://doi.org/10.1109/STFSSD.2009.41
[44] Pavliotis GA. Stochastic Processes and Applications. Springer New York, NY; 2014. DOI: https://doi.org/10.1007/978-1-4939-1323-7
[45] Paxson V, Floyd S. Wide-area traffic: The failure of Poisson modeling. IEEE/ACM Transactions on Networking. 1995; 3(3): 226-244. DOI: https://doi.org/10.1109/90.392383
[46] Peiris M, Hill JH, Thelin J, Bykov S, Kliot G, Konig C. PAD: Performance anomaly detection in multi-server distributed systems. In: 2014 IEEE 7th International Conference on Cloud Computing, CLOUD 2014, 27 June - 2 July, 2014, Anchorage, AK, USA. IEEE; 2014. p.769-776. DOI: https://doi.org/10.1109/CLOUD.2014.107
[47] Pentakalos OI, Menasce DA, Yesha Y. Automated clustering-based workload characterization. In: NASA Conference Publication, Proceedings of the 5th NASA Goddard Mass Storage Systems and Technologies Conference. 1996. p.253-263.
[48] Qiao Y, Xin XW, Bin Y, Ge S. Anomaly intrusion detection method based on HMM. Electronics Letters. 2002; 38(13): 663-664. DOI: https://doi.org/10.1049/el:20020467
[49] Rabiner LR. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE. 1989; 77(2): 257-286. DOI: https://doi.org/10.1109/5.18626
[50] Rabiner LR, Juang BH. Fundamentals of Speech Recognition. Prentice Hall Signal Processing Series; 1993.
[51] Raghavan SV, Vasukiamaiyar D, Haring G. Hierarchical approach to building generative networkload models. Computer Networks and ISDN Systems. 1995; 27(7): 1193-1206. DOI: https://doi.org/10.1016/0169-7552(94)00012-I
[52] Rajasegarar S, Leckie C, Palaniswami M. Hyperspherical cluster based distributed anomaly detection in wireless sensor networks. Journal of Parallel and Distributed Computing. 2014; 74(1): 1833-1847. DOI: https://doi.org/10.1016/j.jpdc.2013.09.005
[53] Sapia C. PROMISE: Predicting query behavior to enable predictive caching strategies for OLAP systems. In: Proceedings of 2nd International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2000, 4-6 September 2000, London, UK. Springer-Verlag; 2000. p.224-233. DOI: https://doi.org/10.5555/646109.679288
[54] Singh JP, Stone HS, Thiebaut DF. A model of workloads and its use in miss-rate prediction for fully associative caches. IEEE Transactions on Computers. 1992; 41(7): 811-825. DOI: https://doi.org/10.1109/12.256450
[55] Song B, Ernemann C, Yahyapour R. Parallel computer workload modeling with Markov chains. In: Proceedings of 10th International Conference on Job Scheduling Strategies for Parallel Processing, JSSPP 2004, 13 June 2004, New York, NY, USA. 2004. p.47-62. DOI: https://doi.org/10.1007/11407522_3
[56] SPEC. SPEC CPU2006 Benchmark. http://www.spec.org/cpu2006/ [Accessed 15th June 2024].
[57] Strang G. The discrete cosine transform. SIAM Review. 1999; 41(1): 135-147. DOI: https://doi.org/10.1137/S0036144598336745
[58] Sugaya M, Ohno Y, Van Der Zee A, Nakajima T. A lightweight anomaly detection system for information appliances. In: 2009 IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2009, 17-20 March, 2009, Tokyo, Japan. 2009. p.257-266. DOI: https://doi.org/10.1109/ISORC.2009.39
[59] Tan X, Wang W, Xi H, Yin B. A Markov model of system calls sequence and its application in anomaly detection. Computer Engineering. 2002; 43: 189-191.
[60] Thiebaut D, Wolf JL, Stone HS. Synthetic traces for trace-driven simulation of cache memories. IEEE Transactions on Computers. 1992; 41(4): 388-410. DOI: https://doi.org/10.1109/12.135552
[61] Thornock NC, Flanagan JK. Using the BACH trace collection mechanism to characterize the SPEC2000 integer benchmarks. Workload Characterization of Emerging Computer Applications. 2001; 610: 121-143. DOI: https://doi.org/10.1007/978-1-4615-1613-2_6
[62] Valgrind. Valgrind Instrumentation Framework. http://valgrind.org/ [Accessed 15th June 2024].
[63] Wang P, Shi L, Wang B, Wu Y, Liu Y. Survey on HMM based anomaly intrusion detection using system calls. In: 2010 5th International Conference on Computer Science & Education, ICCSE 2010, 24-27 August 2010, Hefei, China. IEEE; 2010. p.102-105. DOI: https://doi.org/10.1109/ICCSE.2010.5593839
[64] Wang Z, Wei Z, Gao C, Chen Y, Wang F. A framework for data anomaly detection based on iterative optimization in IoT systems. Computing. 2023; 105: 2337-2362. DOI: https://doi.org/10.1007/s00607-023-01186-6
[65] Wu Z, Zhou X, Xu J. A result fusion based distributed anomaly detection system for android smartphones. Journal of Networks. 2013; 8(2): 273-282. DOI: https://doi.org/10.4304/jnw.8.2.273-282
[66] Yin Q-B, Shen L-R, Zhang R-B, Li X-Y, Wang H-Q. Intrusion detection based on hidden Markov model. In: Proceedings of the 2003 International Conference on Machine Learning and Cybernetics, ICMLC 2003, 5 November 2003, Xi’an, China. IEEE; 2003. p.3115-3118. DOI: https://doi.org/10.1109/ICMLC.2003.1260114
[67] Yousuf S, Kadri MB. A ubiquitous architecture for wheelchair fall anomaly detection using low-cost embedded sensors and isolation forest algorithm. Computers and Electrical Engineering. 2023; 105: 108518. DOI: https://doi.org/10.1016/j.compeleceng.2022.108518
[68] Yu SZ, Liu Z, Squillante MS, Xia C, Zhang L. A hidden semi-Markov model for web workload self-similarity. In: Conference Proceedings of the IEEE International Performance, Computing, and Communications Conference, IPCCC 2002, 3-5 April 2002, Phoenix, AZ, USA. IEEE; 2002. p.65-72. DOI: https://doi.org/10.1109/IPCCC.2002.995137
[69] Wang M, Zhang C, Yu J. Native API based windows anomaly intrusion detection method using SVM. In: Proceedings of the IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing, SUTC’06, 5-7 June 2006. IEEE; 2006; 1: 6. DOI: https://doi.org/10.1109/SUTC11330.2006
[70] Zadeh MMZ, Salem M, Kumar N, Cutulenco G, Fischmeister S. SiPTA: Signal processing for trace-based anomaly detection. In: Proceedings of the 14th International Conference on Embedded Software, EMSOFT 2014, 12-17 October 2014, Uttar Pradesh, India. Association for Computing Machinery; 2014. p.1-10. DOI: https://doi.org/10.1145/2656045.2656071
[71] Zandrahimi M, Zarandi HR, Mottaghi MH. Two effective methods to detect anomalies in embedded systems. Microelectronics Journal. 2012; 43(1): 77-87. DOI: https://doi.org/10.1016/j.mejo.2011.11.003
[72] Zhang X, Fan P, Zhu Z. A new anomaly detection method based on hierarchical HMM. In: Proceedings of the 4th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2003, 29 August 2003, Chengdu, China. IEEE; 2003. p.249-252. DOI: https://doi.org/10.1109/PDCAT.2003.1236299