Real-Time Scalable Task Offloading in Edge Computing Using Semi-Markov Decision Processes and Attention-Based Deep Reinforcement Learning
Subject Areas: Computer Networks
Abbas Mirzaei 1, Nasser Mikaeilvand 2, Babak Nouri Moghadam 3, Sajjad Jahanbakhsh Gudakahriz 4, Ailin Khosravani 5, Fatemeh Tahmasebizade 6, Ali Seifi 7, Hosein Hatami 8
1 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
2 - Department of Computer Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran
3 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
4 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
5 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
6 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
7 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
8 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
Keywords: edge computing, task scheduling, reinforcement learning, system scalability
Abstract:
Edge computing has emerged as a dynamic framework in which computational tasks are offloaded to distributed edge servers (ESs) to deliver efficient, low-latency services. As edge systems grow in scale and complexity, deep reinforcement learning (DRL) has become a prominent approach for optimizing task offloading and resource management. However, traditional DRL-based methods face several challenges: (1) discrete-time decision frameworks, such as Markov decision processes (MDPs), often force offloading decisions at fixed time steps, which causes scheduling delays and inefficient resource utilization; (2) static computational structures struggle to adapt to a varying number of edge servers or user devices, which leads to scalability problems and system inefficiency. To overcome these limitations, we introduce a novel DRL-based real-time offloading mechanism designed for dynamic and scalable edge environments. Our approach reformulates the offloading problem as a semi-Markov decision process (SMDP) and introduces an adaptive optimization mechanism that uses attention-based graph operations for environments with heterogeneous resources. This mechanism, inspired by dynamic task prioritization and resource partitioning, aligns attention scores between tasks and edge servers to derive efficient allocation policies. In addition, a reward-shaping strategy bridges the discrepancies between theoretical models and real-time feedback, improving exploration and overall system performance. Extensive simulations confirm the effectiveness of the proposed approach: it significantly reduces system costs, achieves superior scalability across varying workloads and infrastructure sizes, and outperforms existing state-of-the-art solutions in edge computing scenarios.
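The two ideas named in the abstract, attention scores between a task and a variable number of edge servers, and decisions made at irregular (semi-Markov) times, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the feature vectors, the discount rate `beta`, and the assumption of a constant reward rate between decisions are all hypothetical choices made for illustration. The attention part is plain scaled dot-product scoring with a softmax, and the SMDP part is the standard continuous-time discounted one-step target.

```python
import math

def attention_scores(task, servers):
    """Softmax over scaled dot products between one task vector and each server vector.

    Works for any number of servers, which is the scalability property of interest.
    """
    d = len(task)
    logits = [sum(t * s for t, s in zip(task, srv)) / math.sqrt(d) for srv in servers]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def smdp_target(reward_rate, tau, next_value, beta=0.1):
    """One-step SMDP value target with sojourn time tau (hypothetical parameters).

    Assumes a constant reward rate during the sojourn, so the accumulated reward
    is the integral of reward_rate * exp(-beta * t) over [0, tau].
    """
    discount = math.exp(-beta * tau)
    accumulated = reward_rate * (1.0 - discount) / beta
    return accumulated + discount * next_value

# Hypothetical features: one task and three edge servers.
task = [1.0, 0.5]
servers = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
probs = attention_scores(task, servers)
best_server = probs.index(max(probs))
```

Because the scores are computed per task-server pair, adding or removing a server changes only the length of the output, not the function itself. In the SMDP target, a longer sojourn time `tau` discounts the next state's value more heavily, which is what distinguishes it from a fixed-step MDP update.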