Robust modeling using a combination of simulated annealing and reinforcement learning for route optimization
Subject areas: Multimedia processing, communication systems, intelligent systems
Hami Talebi 1, Mehran Khalaj 2, Davood Jafari 3
1 - PhD student, Department of Industrial Engineering, Parand and Robat Karim Branch, Islamic Azad University, Parand, Iran
2 - Assistant Professor, Department of Industrial Engineering, Parand and Robat Karim Branch, Islamic Azad University, Parand, Iran
3 - Assistant Professor, Department of Industrial Engineering, Parand and Robat Karim Branch, Islamic Azad University, Parand, Iran
Keywords: supply chain, robust optimization, simulated annealing, reinforcement learning, vehicle routing.
Abstract:
This paper addresses the optimization of resources, routes, and delivery schedules in a supply chain network comprising fixed suppliers, distribution centers, and customers. The proposed solution framework integrates robust optimization with simulated annealing enhanced by a reinforcement learning approach. The detailed testing and evaluation presented demonstrate the effectiveness of this approach across diverse scenarios. A real-world case study from the logistics sector is presented and shows a noticeable gain in efficiency. A sensitivity analysis then provides valuable insights into the system's adaptability to parameter changes. The reported results also highlight the considerable benefits of adding learning-based strategies. This research offers a practical and effective tool for supply chain optimization that enables organizations to meet challenges with agility and efficiency.
Introduction: In today’s dynamic and highly competitive business environment, optimizing supply chain operations is a critical necessity. This study proposes a robust optimization framework that synergistically integrates simulated annealing with reinforcement learning to enhance routing efficiency within a complex supply chain network.
Method: The problem is formulated as a mixed-integer linear programming (MILP) model whose objective function minimizes total transportation costs, vehicle utilization expenses, and time-window penalties. Real-world uncertainties in customer demand are represented as uncertain parameters and addressed through a robust optimization approach. To obtain practical, scalable solutions, a metaheuristic based on simulated annealing is employed and further augmented with reinforcement learning strategies to improve algorithmic effectiveness.
Results: A real-world case study from the logistics sector demonstrates the practical application and effectiveness of the proposed model and solution approach. The integrated simulated annealing–reinforcement learning algorithm yields efficient routing plans that reduce overall costs and enhance resource utilization under demand uncertainty.
Discussion: A comprehensive sensitivity analysis shows how the model adapts and performs across various operational scenarios. The findings offer supply chain decision-makers insights into how cost, vehicle usage, and service levels respond to changes in demand and system parameters, underscoring the framework’s practical value for robust, data-driven routing decisions.
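To make the Method paragraph more concrete, the following is a minimal Python sketch of simulated annealing in which a Q-learning agent chooses the neighborhood operator at each step. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the uncertainty set, the solution encoding, the operators, or the learning design. Here demand uncertainty is handled with a simple box worst case (nominal demand plus a maximum deviation), a customer permutation is split greedily into capacity-feasible routes, and the cost adds travel distance, a fixed cost per vehicle, and a lateness penalty. All function names, parameters, and the random test instance are hypothetical.

```python
import math
import random

def route_cost(perm, coords, demand, dev, due, cap, veh_cost, late_pen):
    """Cost of a customer permutation split greedily into capacity-feasible routes."""
    depot = (0.0, 0.0)
    total, load, t, pos, vehicles = 0.0, 0.0, 0.0, depot, 1
    for c in perm:
        worst = demand[c] + dev[c]           # box-uncertainty worst case (assumption)
        if load + worst > cap:               # close the route and open a new vehicle
            total += math.dist(pos, depot)
            load, t, pos = 0.0, 0.0, depot
            vehicles += 1
        d = math.dist(pos, coords[c])
        t += d                               # unit speed: travel time equals distance
        total += d + late_pen * max(0.0, t - due[c])
        load += worst
        pos = coords[c]
    return total + math.dist(pos, depot) + veh_cost * vehicles

def swap(p, rng):
    i, j = rng.sample(range(len(p)), 2)
    q = p[:]
    q[i], q[j] = q[j], q[i]
    return q

def relocate(p, rng):
    q = p[:]
    c = q.pop(rng.randrange(len(q)))
    q.insert(rng.randrange(len(q) + 1), c)
    return q

def reverse(p, rng):
    i, j = sorted(rng.sample(range(len(p)), 2))
    return p[:i] + p[i:j + 1][::-1] + p[j + 1:]

def sa_q_learning(cost, start, rng, iters=5000, t0=10.0, cooling=0.999,
                  alpha=0.1, gamma=0.9, eps=0.1):
    """Simulated annealing whose move operator is picked by epsilon-greedy Q-learning."""
    ops = [swap, relocate, reverse]
    # Two states (hypothetical design): 1 if the previous candidate improved, else 0.
    q = [[0.0] * len(ops) for _ in range(2)]
    cur, cur_cost = start[:], cost(start)
    best, best_cost = cur[:], cur_cost
    state, temp = 0, t0
    for _ in range(iters):
        if rng.random() < eps:               # explore a random operator
            a = rng.randrange(len(ops))
        else:                                # exploit the best-valued operator
            a = max(range(len(ops)), key=lambda k: q[state][k])
        cand = ops[a](cur, rng)
        cand_cost = cost(cand)
        delta = cand_cost - cur_cost
        improved = delta < 0
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if improved or rng.random() < math.exp(-delta / temp):
            cur, cur_cost = cand, cand_cost
        if cur_cost < best_cost:
            best, best_cost = cur[:], cur_cost
        nxt = 1 if improved else 0
        reward = 1.0 if improved else 0.0
        q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
        state = nxt
        temp *= cooling                      # geometric cooling schedule
    return best, best_cost

# Hypothetical random instance: 12 customers around a depot at the origin.
rng = random.Random(0)
n = 12
coords = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(n)]
demand = [rng.uniform(1, 5) for _ in range(n)]
dev = [0.2 * d for d in demand]              # up to 20% demand deviation (assumption)
due = [rng.uniform(20, 60) for _ in range(n)]

def cost(p):
    return route_cost(p, coords, demand, dev, due, cap=20.0, veh_cost=15.0, late_pen=2.0)

start = list(range(n))
best, best_cost = sa_q_learning(cost, start, rng)
print(round(cost(start), 2), "->", round(best_cost, 2))
```

The reward of 1 for an improving candidate is only one possible choice. A reward proportional to the cost reduction, or a richer state description, would be equally plausible readings of "augmented with reinforcement learning strategies".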