Simulation-based Optimization of Chemotherapeutic Drug Dosage: An Agent-based Q-learning Approach
Peyman Vafadoost Sabzevar
Department of Biomedical Engineering, Faculty of Electrical and Computer Engineering
Keywords: Cancer, Control, Reinforcement Learning, Q-Learning
Abstract:
Cancer is a growing threat to human health worldwide, and its prevalence and its impact on individuals and society continue to increase. The main objective of this article is to control and optimize the drug dosage so as to prevent the uncontrolled growth of cancer cells and to restore the patient's immune cells to normal levels by the end of the training process, in such a way that the disease is brought under control in the early days of treatment. Reinforcement learning methods are now widely applied in many domains and have attracted considerable research interest. In this article, we use Q-learning, one of the best-known model-free reinforcement learning methods, together with the four-state nonlinear dynamic model of de Pillis, to simulate and design the proposed controller. The controller's performance was evaluated in the presence of noise in three settings (during training, during simulation, and during both simultaneously), as well as under uncertainty in one of the parameters of the de Pillis model. For the uncertain case, a combination of chemotherapy and immunotherapy is suggested as the treatment approach.
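To make the Q-learning idea named in the abstract concrete, the following is a minimal tabular sketch in Python. It is an illustration under stated assumptions, not the controller developed in this article. The environment `toy_step` is a hypothetical five-level "tumor burden" chain, not the four-state de Pillis model. The reward, the three dose levels, and the hyperparameters `alpha`, `gamma`, and `epsilon` are also assumed for illustration only.

```python
import random

N_STATES = 5   # hypothetical discretized tumor-burden levels (0 = lowest)
N_ACTIONS = 3  # hypothetical dose levels: 0 = none, 1 = medium, 2 = high


def toy_step(state, action):
    """Hypothetical stand-in dynamics (NOT the de Pillis model).

    With no dose the burden rises one level, a medium dose holds it,
    and a high dose lowers it one level. The reward penalizes both
    the resulting burden and the dose given.
    """
    next_state = min(max(state + 1 - action, 0), N_STATES - 1)
    reward = -float(next_state) - 0.1 * action
    return next_state, reward


def q_update(q, s, a, r, s_next, alpha, gamma):
    """Standard tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])


def train(episodes=2000, steps=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Train with an epsilon-greedy behaviour policy; returns the Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        s = rng.randrange(N_STATES)  # random initial burden level
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(N_ACTIONS)                      # explore
            else:
                a = max(range(N_ACTIONS), key=lambda i: q[s][i])  # exploit
            s_next, r = toy_step(s, a)
            q_update(q, s, a, r, s_next, alpha, gamma)
            s = s_next
    return q


def greedy_policy(q):
    """Dose level with the highest learned value at each burden level."""
    return [max(range(N_ACTIONS), key=lambda i: q[s][i]) for s in range(len(q))]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The update is model-free in the sense that `q_update` only sees sampled transitions `(s, a, r, s_next)` and never the transition rule itself. In principle, replacing `toy_step` with a discretized simulation of a dynamic tumor model would leave the learning loop unchanged.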