Determination of Reinforcement Learning Reward Parameters to Solve Path Planning of Unknown Environments by Design of Experiments
Authors: Issa Alali Alfares 1, Ahmadreza Khoogar 2
1 - Faculty of Materials & Manufacturing Technologies, Malek Ashtar University of Technology, Tehran, Iran.
2 - Assistant Professor, Malek Ashtar University of Technology, Tehran, Iran.
Keywords: Autonomous Path Planning, Design of Experiment, Mobile Robot, Reinforcement Learning, Reward, Training Parameters
Abstract:
The reinforcement learning (RL) approach is used to solve the path-planning problem of an autonomous mobile robot in unknown environments. Although RL is a recent and powerful tool, it requires extensive training because the agent's training process involves many parameters. Some of these parameters affect the convergence of the learning process more strongly than others, so identifying them and their suitable values makes training more efficient, saves time, and consequently enables the trained agent to execute the required task successfully. Since no analytical equations are available to determine the best values for these parameters, this paper applies design and analysis of experiments (DoE) methods to identify statistically the parameters with the largest effect on the training process. A further analysis then determines suitable values for the most influential parameters. Results show that the parameters determined in this way lead to successful autonomous path planning in different unknown environments.
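The screening idea described above can be sketched as a two-level full-factorial experiment over the training parameters. The Python sketch below is purely illustrative: the factor names and levels (goal_reward, collision_penalty, step_penalty, learning_rate), the train_agent stub, and its convergence score are assumptions, not the paper's actual design; a real study would replace train_agent with a full RL training run and apply a proper ANOVA rather than raw main effects.

# Hypothetical sketch of a two-level full-factorial DoE screening of RL
# reward/training parameters. All factor names, levels, and the stubbed
# training run are illustrative assumptions, not the paper's setup.
from itertools import product
from statistics import mean
import random

# Candidate parameters, each at a low and a high level (coded -1 / +1).
FACTORS = {
    "goal_reward":       (10.0, 100.0),
    "collision_penalty": (-10.0, -100.0),
    "step_penalty":      (-0.01, -0.1),
    "learning_rate":     (1e-4, 1e-3),
}

def train_agent(settings, seed):
    """Stand-in for a full RL training run; returns a convergence score.

    In a real study this would train the path-planning agent with the
    given parameters and report, e.g., episodes to convergence or the
    final success rate over evaluation episodes.
    """
    rng = random.Random(seed)
    # Toy response surface: pretend goal_reward and step_penalty dominate.
    return (0.5 * settings["goal_reward"] / 100.0
            - 2.0 * settings["step_penalty"]
            + rng.gauss(0.0, 0.05))

# Run every combination of low/high levels (2^4 = 16 runs), replicated
# over three random seeds to average out training noise.
runs = []
for levels in product(*FACTORS.values()):
    settings = dict(zip(FACTORS, levels))
    response = mean(train_agent(settings, seed) for seed in range(3))
    runs.append((settings, response))

# Main effect of each factor: mean response at its high level minus the
# mean response at its low level, averaged over all other factors.
for name, (low, high) in FACTORS.items():
    at_high = mean(r for s, r in runs if s[name] == high)
    at_low = mean(r for s, r in runs if s[name] == low)
    print(f"{name:18s} main effect = {at_high - at_low:+.3f}")

Each main effect estimates how much the response shifts when a factor moves from its low to its high level; the factors with the largest magnitudes are the candidates worth tuning further in a follow-up experiment.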