Mapping the Knowledge Landscape of Machine Learning in Portfolio Optimization: A Bibliometric Analysis of Asset Allocation Research
Subject Areas: Financial and Economic Modelling
Mahsa Safavi Iranji, Mojgan Safa, Majid Zanjirdar, Hossein Jahangirnia
Keywords: Asset Allocation, Bibliometric Analysis, Machine Learning, Portfolio Optimization
Abstract:
This study presents a bibliometric analysis of research on asset allocation for portfolio optimization using machine learning algorithms. The primary objective is to identify and analyze the scientific literature in order to uncover key themes, authors, sources, highly cited articles, and countries involved in portfolio management research. To achieve this, 304 articles indexed in Scopus and Web of Science from 1990 to 2023 were analyzed. Using RStudio software, the study highlights various models employed in this field, along with tables, graphs, maps, and key performance metrics related to article production and citation impact. The findings reveal an upward trend in the use of machine learning for optimal portfolio management, asset allocation, and risk management since 2016. Additionally, the United States and China emerged as leading contributors to this literature. The results provide practical insights for market participants, especially those in the fintech and finance sectors, to identify optimal machine learning solutions for decision-making processes. These findings also guide students in focusing their research efforts on underexplored areas within this domain.
[1] De Prado, M.M.L. Machine learning for asset managers, Cambridge University Press, 2020. doi: https://doi.org/10.1017/9781108883658
[2] Zamanpour, A., Zanjirdar, M., Davodi Nasr, M. Identify and rank the factors affecting stock portfolio optimization with fuzzy network analysis approach, Financial Engineering and Portfolio Management, 2021; 12(47):210-236
[3] Zapata, H.O., Mukhopadhyay, S. A Bibliometric Analysis of Machine Learning Econometrics in Asset Pricing, Journal of Risk and Financial Management, 2022; 15(11):535. doi: https://doi.org/10.3390/jrfm15110535
[4] Ahmed, S., Alshater, M.M., El Ammari, A., Hammami, H. Artificial intelligence and machine learning in finance: A bibliometric review, Research in International Business and Finance, 2022; 61:101646. doi: https://doi.org/10.1016/j.ribaf.2022.101646
[5] Vo, N.N., He, X., Liu, S., Xu, G. Deep learning for decision making and the optimization of socially responsible investments and portfolio, Decision Support Systems, 2019; 124:113097. doi: https://doi.org/10.1016/j.dss.2019.113097
[6] Ozbayoglu, A.M., Gudelek, M.U., Sezer, O.B. Deep learning for financial applications: A survey, Applied Soft Computing, 2020; 93:106384. doi: https://doi.org/10.48550/arxiv.2002.05786
[7] Lakzaie, F., Bahiraie, A., Mohammadian, S. Visualized portfolio optimization of stock market: Case of TSE, Advances in Mathematical Finance and Applications, 2024; 2:707-722. doi: https://doi.org/10.71716/amfa.2024.2301-1853
[8] Zanjirdar, M. Overview of portfolio optimization models, Advances in Mathematical Finance and Applications, 2020; 5(4):419-435. doi: https://doi.org/10.22034/amfa.2020.674941
[9] Lopez de Prado, M. A robust estimator of the efficient frontier, SSRN, 2016. doi: https://doi.org/10.2139/ssrn.3469961
[10] Schwendner, P., Papenbrock, J., Jaeger, M., Krügel, S. ‘Adaptive Seriational Risk Parity’ and Other Extensions for Heuristic Portfolio Construction Using Machine Learning and Graph Theory, The Journal of Financial Data Science, 2021; 3(4):65-83. doi: https://doi.org/10.3905/jfds.2021.1.078
[11] Chen, W., Zhang, H., Mehlawat, M.K., Jia, L. Mean–variance portfolio optimization using machine learning-based stock price prediction, Applied Soft Computing, 2021; 100:106943. doi: https://doi.org/10.1016/j.asoc.2020.106943
[12] Wang, W., Li, W., Zhang, N., Liu, K. Portfolio formation with preselection using deep learning from long-term financial data, Expert Systems with Applications, 2020; 143:113042. doi: https://doi.org/10.1016/j.eswa.2019.113042
[13] Demey, P., Maillard, S., Roncalli, T. Risk-based indexation, Available at SSRN 1582998, 2010. doi: https://doi.org/10.2139/ssrn.1582998
[14] Choueifaty, Y. Towards maximum diversification, Available at SSRN 4063676, 2008. doi: https://doi.org/10.2139/ssrn.4063676
[15] Ferretti, S. On the modeling and simulation of portfolio allocation schemes: An approach based on network community detection, Comput. Econ, 2022; 1-37. doi: https://doi.org/10.48550/arxiv.2203.11780
[16] Ernst, P., Thompson, J., Miao, Y. Portfolio selection: The power of equal weight, arXiv preprint arXiv:1602.00782, 2016. doi: https://doi.org/10.48550/arxiv.2309.13696
[17] Tatsat, H., Puri, S., Lookabaugh, B. Machine Learning and Data Science Blueprints for Finance, O'Reilly Media, 2020.
[18] Gokhale, A., Mulay, P., Pramod, D., Kulkarni, R. A bibliometric analysis of digital image forensics, Sci. Technol. Libr, 2020; 39(1):96-113. doi: https://doi.org/10.1080/0194262X.2020.1714529
[19] Roemer, R. C., Borchardt, R. Meaningful metrics: A 21st century librarian's guide to bibliometrics, altmetrics, and research impact, Amer Library Assn, 2015. doi: https://doi.org/10.7710/2162-3309.2290
[20] Nourahmadi, M., Rasti, F., Sadeghi, H. A Review of Research on Clustering of Financial Time Series: A Knowledge Mapping Approach, Financ. Invest. Adv, 2021; 2(2):23-57. [In Persian] doi: https://doi.org/10.30495/afi.2021.1919857.1002
[21] Milian, E. Z., Spinola, M. d. M., de Carvalho, M. M. Fintechs: A literature review and research agenda, Electron. Commer. Res. Appl, 2019; 34:100833. doi: https://doi.org/10.1016/j.elerap.2019.100833
[22] Blanco-Mesa, F., Merigó, J. M., Gil-Lafuente, A. M. Fuzzy decision making: A bibliometric-based review, J. Intell. Fuzzy Syst, 2017; 32(3):2033-2050. doi: https://doi.org/10.3233/jifs-161640
[23] Martínez-López, F. J., Merigó, J. M., Valenzuela-Fernández, L., Nicolás, C. Fifty years of the European Journal of Marketing: A bibliometric analysis, Eur. J. Mark, 2018; 52(1-2):439-468. doi: https://doi.org/10.1108/ejm-11-2017-0853
[24] Mas-Tur, A., Modak, N. M., Merigó, J. M., Roig-Tierno, N., Geraci, M., Capecchi, V. Half a century of Quality & Quantity: A bibliometric review, Qual. Quant, 2019; 53:981-1020. doi: https://doi.org/10.1007/s11135-018-0799-1
[25] Donthu, N., Kumar, S., Mukherjee, D., Pandey, N., Lim, W. M. How to conduct a bibliometric analysis: An overview and guidelines, J. Bus. Res, 2021; 133:285-296. doi: https://doi.org/10.1016/j.jbusres.2021.04.070
[26] Small, H. Visualizing science by citation mapping, J. Am. Soc. Inf. Sci, 1999; 50(9):799-813. doi: https://doi.org/10.1002/%28sici%291097-4571%281999%2950%3A9%3C799%3A%3Aaid-asi9%3E3.0.co%3B2-g
[27] Cobo, M. J., López‐Herrera, A. G., Herrera‐Viedma, E., Herrera, F. Science mapping software tools: Review, analysis, and cooperative study among tools, J. Am. Soc. Inf. Sci. Technol, 2011; 62(7):1382-1402. doi: https://doi.org/10.1002/asi.21525
[28] Heradio, R., De La Torre, L., Galan, D., Cabrerizo, F. J., Herrera-Viedma, E., Dormido, S. Virtual and remote labs in education: A bibliometric analysis, Comput. Educ, 2016; 98:14-38. doi: https://doi.org/10.1016/j.compedu.2016.03.010
[29] Danvila-del-Valle, I., Estévez-Mendoza, C., Lara, F. J. Human resources training: A bibliometric analysis, J. Bus. Res, 2019; 101:627-636. doi: https://doi.org/10.1016/j.jbusres.2019.02.026
[30] Falagas, M. E., Pitsouni, E. I., Malietzis, G. A., Pappas, G. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: Strengths and weaknesses, FASEB J, 2008; 22(2):338-342. doi: https://doi.org/10.1096/fj.07-9492lsf
[31] Aria, M., Cuccurullo, C. bibliometrix: An R-tool for comprehensive science mapping analysis, J. Informetr, 2017; 11(4):959-975
[32] Wang, Y. Research on supply chain financial risk assessment based on blockchain and fuzzy neural networks, Wirel. Commun. Mob. Comput, 2021; 2021:1-8. doi: https://doi.org/10.1155/2021/5565980
[33] Zheng, J., Wang, Y., Li, S., Chen, H. The stock index prediction based on SVR model with bat optimization algorithm, Algorithms, 2021; 14(10):299. doi: https://doi.org/10.3390/a14100299
[34] Song, Z., Wang, Y., Qian, P., Song, S., Coenen, F., Jiang, Z., Su, J. From deterministic to stochastic: an interpretable stochastic model-free reinforcement learning framework for portfolio optimization, Appl. Intell, 2022; 1-16. doi: https://doi.org/10.1007/s10489-022-04217-5
[35] Chen, B., Zhong, J., Chen, Y. A hybrid approach for portfolio selection with higher-order moments: Empirical evidence from Shanghai Stock Exchange, Expert Syst. Appl, 2020; 145:113104. doi: https://doi.org/10.1016/j.eswa.2019.113104
[36] Rojas-Sánchez, M.A., Palos-Sánchez, P.R., Folgado-Fernández, J.A. Systematic literature review and bibliometric analysis on virtual reality and education, Educ. Inf. Technol, 2023; 28(1):155-192. doi: https://doi.org/10.1007/s10639-022-11167-5
[37] Barboza, F., Kimura, H., Altman, E. Machine learning models and bankruptcy prediction, Expert Syst. Appl, 2017; 83:405-417. doi: https://doi.org/10.1016/j.eswa.2017.04.006
[38] Huang, C.-F. A hybrid stock selection model using genetic algorithms and support vector regression, Appl. Soft Comput, 2012; 12(2):807-818. doi: https://doi.org/10.1016/j.asoc.2011.10.009
[39] Ghoddusi, H., Creamer, G.G., Rafizadeh, N. Machine learning in energy economics and finance: A review, Energy Econ, 2019; 81:709-727. doi: https://doi.org/10.1016/j.eneco.2019.05.006
[40] Briand, L.C., Morasca, S., Basili, V.R. Property-based software engineering measurement, IEEE Trans. Softw. Eng, 1996; 22(1):68-86. doi: https://doi.org/10.1109/32.481535
[41] Henrique, B.M., Sobreiro, V.A., Kimura, H. Stock price prediction using support vector regression on daily and up to the minute prices, J. Finance Data Sci, 2018; 4(3):183-201. doi: https://doi.org/10.1016/j.jfds.2018.04.003
[42] Groth, S.S., Muntermann, J. An intraday market risk management approach based on textual analysis, Decis. Support Syst, 2011; 50(4):680-691. doi: https://doi.org/10.1016/j.dss.2010.08.019
[43] Ban, G.-Y., El Karoui, N., Lim, A.E. Machine learning and portfolio optimization, Manage. Sci, 2018; 64(3):1136-1154. doi: https://doi.org/10.1287/mnsc.2016.2644
[44] Paiva, F.D., Cardoso, R.T.N., Hanaoka, G.P., Duarte, W.M. Decision-making for financial trading: A fusion approach of machine learning and portfolio selection, Expert Syst. Appl, 2019; 115:635-655. doi: https://doi.org/10.1016/j.eswa.2018.08.003
[45] Soundararajan, K., Ho, H.K., Su, B. Sankey diagram framework for energy and exergy flows, Appl. Energy, 2014; 136:1035-1042. doi: https://doi.org/10.1016/j.apenergy.2014.08.070
[46] Xie, H., Zhang, Y., Wu, Z., Lv, T. A bibliometric analysis on land degradation: Current status, development, and future directions, Land, 2020; 9(1):28. doi: https://doi.org/10.3390/land9010028
[47] Altınay Özdemir, M., Göktaş, L.S. Research trends on digital detox holidays: A bibliometric analysis, 2012-2020, 2021. doi: https://doi.org/10.18089/tms.2021.170302
[48] Guo, Y.-M., Huang, Z.-L., Guo, J., Li, H., Guo, X.-R., Nkeli, M.J. Bibliometric analysis on smart cities research, Sustainability, 2019; 11(13):3606. doi: https://doi.org/10.3390/su11133606
[49] Zhou, W., Zhu, W., Chen, Y., Chen, J. Dynamic changes and multi-dimensional evolution of portfolio optimization, Econ. Res.-Ekonomska Istraživanja, 2022; 35(1):1431-1456. doi: https://doi.org/10.1080/1331677x.2021.1968308
[50] Asawa, Y.S. Modern Machine Learning Solutions for Portfolio Selection, IEEE Eng. Manag. Rev, 2021; 50(1):94-112. doi: https://doi.org/10.1109/emr.2021.3131158
[51] Ciciretti, V., Bucci, A. Building optimal regime-switching portfolios, N. Am. J. Econ. Finance, 2023; 64:101837. doi: https://doi.org/10.1016/j.najef.2022.101837
[52] Lim, T., Ong, C.S. Portfolio management: A financial application of unsupervised shape-based clustering-driven machine learning method, Int. J. Comput. Digit.
[53] Duarte, F.G., de Castro, L.N. A fuzzy clustering algorithm for portfolio selection, 2019 IEEE 21st Conference on Business Informatics (CBI), 2019. doi: https://doi.org/10.1109/cbi.2019.00054
[54] Duarte, F.G., De Castro, L.N. A framework to perform asset allocation based on partitional clustering, IEEE Access, 2020; 8:110775-110788. doi: https://doi.org/10.1109/access.2020.3001944
[55] Koratamaddi, P., Wadhwani, K., Gupta, M., Sanjeevi, S.G. Market sentiment-aware deep reinforcement learning approach for stock portfolio allocation, Engineering Science and Technology, an International Journal, 2021; 24(4):848-859. doi: https://doi.org/10.1016/j.jestch.2021.01.007
[56] Ma, Y., Han, R., Wang, W. Portfolio optimization with return prediction using deep learning and machine learning, Expert Systems with Applications, 2021; 165:113973. doi: https://doi.org/10.1016/j.eswa.2020.113973
[57] Du, J. Mean–variance portfolio optimization with deep learning based-forecasts for cointegrated stocks, Expert Systems with Applications, 2022; 201:117005. doi: https://doi.org/10.1016/j.eswa.2022.117005
[58] Kwak, Y., Song, J., Lee, H. Neural network with fixed noise for index-tracking portfolio optimization, Expert Systems with Applications, 2021; 183:115298. doi: https://doi.org/10.1016/j.eswa.2021.115298
[59] Rubesam, A. Machine learning portfolios with equal risk contributions: Evidence from the Brazilian market, Emerging Markets Review, 2022; 51:100891. doi: https://doi.org/10.1016/j.ememar.2022.100891
[60] Durall, R. Asset Allocation: From Markowitz to Deep Reinforcement Learning, arXiv preprint arXiv:2208.07158, 2022. doi: https://doi.org/10.48550/arxiv.2208.07158
[61] Jang, J., Seong, N. Deep reinforcement learning for stock portfolio optimization by connecting with modern portfolio theory, Expert Systems with Applications, 2023; 119556. doi: https://doi.org/10.1016/j.eswa.2023.119556
[62] Yun, H., Lee, M., Kang, Y.S., Seok, J. Portfolio management via two-stage deep learning with a joint cost, Expert Systems with Applications, 2020; 143:113041. doi: https://doi.org/10.1016/j.eswa.2019.113041
[63] Chakravorty, G., Awasthi, A., Da Silva, B. Deep learning for global tactical asset allocation, Available at SSRN 3242432, 2018. doi: https://doi.org/10.2139/ssrn.3242432
[64] Mohamadi, M., Zanjirdar, M. On the Relationship between different types of institutional owners and accounting conservatism with cost stickiness, Journal of Management Accounting and Auditing Knowledge, 2018; 7(28):201-214
[65] Zanjirdar, M., Moslehi Araghi, M. The impact of changes in uncertainty, unexpected earning of each share and positive or negative forecast of profit per share in different economic condition, Quarterly Journal of Fiscal and Economic Policies, 2016; 4(13):55-76
[66] Nikumaram, H., Rahnamay Roodposhti, F., Zanjirdar, M. The explanation of risk and expected rate of return by using of Conditional Downside Capital Assets Pricing Model, Financial knowledge of securities analysis, 2008; 3(1):55-77
Adv. Math. Fin. App., 2025, 10(4), P. 392-415
Advances in Mathematical Finance & Applications
www.amfa.iau-arak.ac.ir
Print ISSN: 2538-5569
Online ISSN: 2645-4610
Doi: 10.71716/amfa.2025.51190546
Review Article
Mahsa Safavi Iranji (a), Mojgan Safa (b,*), Majid Zanjirdar (c), Hossein Jahangirnia (b)
a Department of Finance, Qom Branch, Islamic Azad University, Qom, Iran
b Department of Accounting, Qom Branch, Islamic Azad University, Qom, Iran
c Department of Finance, Ar.c., Islamic Azad University, Arak, Iran
Article Info
Article history: Received 2024-11-15; Accepted 2025-03-14
1 Introduction
In financial markets, risk management and portfolio optimization are among the primary objectives and intellectual challenges faced by finance professionals and academics. Achieving an efficient model for financial asset allocation has become a major issue, as decisions related to asset allocation are often made under uncertainty and with incomplete information [1]. Other research has assessed, ranked, and refined the factors affecting portfolio optimization through a systematic, judgment-based survey of 14 experts in capital market investment, combined with a quantitative, multivariate fuzzy network analysis model. Based on that analysis, profit volatility, return on capital, company value, market risk, stock profitability, financial structure, liquidity, and the survival index were identified as the most important factors affecting stock portfolio optimization [2]. Machine learning methods have been developed to handle and analyze large datasets, and they have shown high predictive accuracy, particularly in investment strategies and various branches of computer science. Recent studies in finance explore how machine learning models can enhance the performance of traditional asset allocation models [3]. These applications simplify the resolution of linear and nonlinear problems, which traditional models often struggle with. As a result, deep learning and machine learning techniques, as subsets of artificial intelligence, have found extensive applications in finance [4]. Given the importance of this subject, the current research reviews the literature on the application of machine learning in asset allocation, risk management, and portfolio optimization. The primary objective is to introduce and evaluate recent developments in computational methods.
The study highlights key findings that enhance understanding of machine learning applications in asset allocation and risk management. The reviewed literature shows minimal overlap with recent books and review articles. This research utilizes a quantitative, objective, and transparent bibliometric analysis, combining data from Scopus and Web of Science to differentiate it from previous studies. Additionally, it complements earlier work in the field by highlighting similar findings in related areas of finance. Accordingly, the main research question of this study is: To what extent have machine learning algorithms gained importance and played a significant role in optimizing asset allocation and managing risk in investment portfolios?
2 Research Background
2.1. Theoretical Foundations of the Research
Traditionally, investors adopt an active approach to analyzing financial reports in search of the best stocks, focusing on investment returns [5]. By prioritizing assets based on expected returns and risk tolerance, they aim to construct an optimal portfolio. Moreover, by periodically revisiting their strategies and rebalancing their portfolio mix, they work towards achieving both long-term and short-term financial objectives [6, 7]. Markowitz (1952) developed the mean-variance (MV) portfolio model to outperform the market index in terms of asset returns [8]. Portfolio construction in this model uses the average historical returns of stocks as an estimate of the expected return, and the standard deviation of those returns as a measure of risk. Notably, many investors continue to rely on the MV strategy today [6]. Researchers such as Lopez de Prado [9], Schwendner et al. [10], Chen et al. [11], and Wang et al. [12] have successfully integrated machine learning algorithms with traditional asset allocation strategies, yielding results that surpass models like the minimum variance portfolio, equal risk contribution portfolio [13], maximum diversification portfolio [14], inverse variance portfolio [15], and equal weight portfolio [16]. With the increasing complexity of methods and the growing need for computational efficiency, machine learning techniques have been developed to enhance the process.
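To make the traditional baselines named above concrete, the following minimal Python sketch computes equal weight, inverse variance, and minimum variance weights from a covariance matrix. It is an illustration only and not the method of any cited study: the covariance values are hypothetical, the function names are assumptions introduced here, and the minimum variance solution uses the textbook closed form for a fully invested portfolio with short positions allowed.

```python
import numpy as np


def equal_weight(n_assets: int) -> np.ndarray:
    """Assign the same weight to every asset."""
    return np.full(n_assets, 1.0 / n_assets)


def inverse_variance(cov: np.ndarray) -> np.ndarray:
    """Weight each asset in proportion to 1 / variance, ignoring correlations."""
    inv_var = 1.0 / np.diag(cov)
    return inv_var / inv_var.sum()


def minimum_variance(cov: np.ndarray) -> np.ndarray:
    """Closed-form fully invested minimum variance weights (shorting allowed):
    w = inv(cov) @ 1 / (1' @ inv(cov) @ 1)."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # solves cov @ raw = 1 without forming the inverse
    return raw / raw.sum()


def portfolio_volatility(weights: np.ndarray, cov: np.ndarray) -> float:
    """Standard deviation of portfolio returns, the MV model's risk measure."""
    return float(np.sqrt(weights @ cov @ weights))


# Hypothetical covariance matrix for three assets (illustrative values only).
cov = np.array([
    [0.040, 0.006, 0.000],
    [0.006, 0.090, 0.010],
    [0.000, 0.010, 0.010],
])

for name, w in [
    ("equal weight", equal_weight(3)),
    ("inverse variance", inverse_variance(cov)),
    ("minimum variance", minimum_variance(cov)),
]:
    print(f"{name:>17}: weights={np.round(w, 3)}, vol={portfolio_volatility(w, cov):.4f}")
```

Because the minimum variance weights minimize variance over all fully invested portfolios, their volatility can never exceed that of the other two baselines on the same covariance matrix; in practice the covariance matrix is estimated from data, which is where the estimation error discussed in this literature enters.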
Figure 1 illustrates the machine learning paradigms that have been designed to address different challenges. Machine learning methods can be classified into three categories: supervised learning, unsupervised learning, and reinforcement learning [17].
Fig. 1: Classification of Machine Learning Models [17]
All of these methods can be used in portfolio management; however, recent empirical studies indicate that unsupervised learning algorithms are currently regarded as the most practical and efficient approach [17]. This study offers an extensive examination of machine learning methods, with particular emphasis on unsupervised learning, in the context of portfolio optimization, providing deeper insight for scholars and financial market participants. By analyzing the merits and limitations of these methods, it helps asset managers and investors make more informed choices.
The research presents a systematic framework for comparing machine learning models, identifies existing research gaps, and suggests directions for future work. It assesses the performance of these methods against traditional asset allocation approaches in terms of risk and return. Notable gaps include the absence of comprehensive studies of unsupervised learning in portfolio optimization, limited exploration of its combination with clustering-based strategies, and inadequate comparative analyses of machine learning techniques. The study is relevant to academics, institutional investors, financial analysts, fintech developers, and regulatory authorities. As a comprehensive review, it outlines prospective research pathways and offers practical insights for integrating machine learning into portfolio management.
2.2. A Review of the Theoretical Literature on Knowledge Mapping
This research uses bibliometric analysis, a concept introduced by Pritchard in 1969, who observed that the approach is especially relevant to studies that seek to quantify the processes of written communication [18]. Bibliometric analysis uses quantitative methods to assess, monitor, and analyze scientific information [19]. This approach highlights authors' publications, prominent journals, methodologies employed, and main findings [20], thus offering a comprehensive view of any research field [21]. Bibliometric methods encompass extensive bibliographic data and are applied in examining diverse topics [22], journals [23], countries [24], and other research aspects. Today, the popularity of knowledge mapping analysis in research reflects its application for managing large volumes of bibliographic data, enhancing effectiveness in scientific studies, and identifying gaps across various fields of science.
The table below briefly presents suitable metrics for comparing and analyzing the main review methods [25]. Other findings suggest that cost stickiness has a positive impact on the relationship between institutional investors and passive institutional investors with conservatism [64]. Some researchers found a significant relationship between changes in stock market uncertainty and overall investment risk during an economic boom, a relationship that is not significant during an economic downturn. Investment risk during both boom and recession decreases with unexpected increases in earnings per share and the spread of positive news, whereas the spread of negative forecasts about shares increases risk [65]. Other findings show that the risk premium was a determining factor in explaining changes in investors' expected rate of return, and that there was a conditional relationship between the downside beta and expected return. Therefore, to explain the relationship between risk and return, one must pay attention to the market direction [66].
Table 1: Comparison of Different Pertinent Review Methods [25]
| Analysis Method | Objective | Application |
| Bibliometric Analysis | To uncover the status of emerging trends and provide the intellectual structure of a specific topic by summarizing large amounts of scientific data for both quantitative and qualitative analysis. | When datasets are large and the scope of the review is too extensive for manual analysis. |
| Meta-Analysis | To furnish a comprehensive synthesis of empirical data regarding the correlation between variables that have yet to be scrutinized in extant research pertaining to quantitative assessment. | When homogeneous studies are sufficiently extensive to summarize results without delving into content. |
| Systematic Literature Review | To summarize findings and results from the existing literature in a specific field for quantitative analysis. | If the content of all scientific data is small and manageable, and manual review is feasible. |
3 Methodology
This research conducts a quantitative investigation using bibliometric analysis to identify and examine literature on asset allocation in investment portfolios, offering a comprehensive mapping of the knowledge structure in this field. Science mapping and a performance analysis were then conducted. A science map represents how trends, specialties, individual and collective articles, and authors are related to one another [26].
To achieve a comprehensive quantitative analysis in the area of portfolio optimization, the following questions were initially posed:
Table 2: Research Questions
| Question | Research Question | Objective | Motivation |
| Question 1 | Which scholars and academic journals are leading the discourse on the utilization of machine learning for portfolio optimization, and which scholarly articles have received the highest number of citations? | To identify the most prolific sources and authors | In order to enhance comprehension of the principles of scientific leadership pertaining to the utilization of machine learning within the realm of portfolio optimization. |
| Question 2 | Which countries have the largest share in scientific production on the topic of this study, and which keywords are most frequently used in the relevant literature? | To show which topics receive the most attention from researchers in different countries | To identify the key topics that scientific research is currently focusing on within the field of portfolio optimization using machine learning. |
| Question 3 | Do bibliometric maps, charts, and data tables, along with the analysis of conceptual, intellectual, and social structures, demonstrate the widespread application of different asset allocation methods in portfolio optimization? | To conduct a thorough analysis and summarize it visually | To facilitate a better understanding of the current state of research in portfolio optimization by analyzing trends, themes, and influential contributions. |
| Question 4 | What are the main contributions of studies related to asset allocation for portfolio optimization using machine learning from an inductive analysis perspective? | To become familiar with key works, methods used, applications, and results obtained | To assist the scientific community in enhancing productivity in this field |
Fig. 2: Methodology Used [29]
3.1 Source Identification
Data were collected from scholarly articles indexed in Scopus and Web of Science, which were selected for their compatibility with the Biblioshiny software used to analyze bibliometric data. The search was conducted in English only, to obtain the most complete and representative set of documents on asset allocation for investment portfolio optimization. These two databases were selected on three criteria:
- They allow bulk downloading of a large number of sources, enabling efficient collection of relevant literature.
- They cover a long time span, encompassing relevant publications over many years.
- They allow large amounts of stored information to be downloaded simultaneously, improving the efficiency of data collection [30].
The search was conducted on May 10, 2023, using the search terms shown in Figure 3, and covered all documents published from 1990 through April 2023. The collected data included bibliometric information. After a review carried out with R, non-relevant documents were filtered out: duplicates, conference papers, editorials, books, book chapters, news articles, and items outside financial journals (such as those from medical or environmental sciences). This process yielded a final set of 304 qualified articles from an initial pool of 595 documents, capturing a representative sample of international scientific activity published in academic journals.
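The cleaning step described above can be sketched in a few lines. The study itself performed this step in R; the Python version below is only a hedged illustration, and the `Record` fields, the `EXCLUDED_TYPES` labels, the title-matching rule for duplicates, and the sample records are all assumptions introduced here. The subject-area screening (removing items outside financial journals) requires judgment and is not modeled.

```python
from dataclasses import dataclass

# Document types dropped during cleaning, following the list in the text.
# The exact labels are assumptions; real database exports use their own vocabularies.
EXCLUDED_TYPES = {"conference paper", "editorial", "book", "book chapter", "news", "note"}


@dataclass(frozen=True)
class Record:
    title: str
    doc_type: str
    year: int
    database: str  # hypothetical field: "scopus" or "wos"


def title_key(title: str) -> str:
    """Lowercase the title and collapse punctuation and whitespace for matching."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def clean_records(records, first_year=1990, last_year=2023):
    """Keep in-range journal articles and drop records that repeat a kept title."""
    seen = set()
    kept = []
    for rec in records:
        if rec.doc_type.lower() in EXCLUDED_TYPES:
            continue
        if not first_year <= rec.year <= last_year:
            continue
        key = title_key(rec.title)
        if key in seen:  # same article exported from both databases
            continue
        seen.add(key)
        kept.append(rec)
    return kept


# Hypothetical records, for illustration only.
sample = [
    Record("Example study on portfolio clustering", "article", 2018, "scopus"),
    Record("Example Study on Portfolio Clustering.", "article", 2018, "wos"),
    Record("Example workshop talk on asset allocation", "conference paper", 2019, "scopus"),
    Record("Example survey of deep learning in finance", "article", 2020, "wos"),
]
cleaned = clean_records(sample)
print(len(sample), "->", len(cleaned))
```

Matching on a normalized title is a simple design choice; matching on DOI, where available, would be stricter.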
Data analysis was diligently performed utilizing Biblioshiny, which is an innovative web interface that is part of the Bibliometrix package (version 5.0) developed by the esteemed scholars Aria and Cuccurullo [31]. The Biblioshiny tool allows for the graphical representation of statistical data, thereby significantly enhancing the visualization of key themes and trends within the dataset. In the context of this study, the resulting charts vividly depict various topics that are intricately related to investment portfolios and the applications of machine learning within the defined timeframe, thus providing a comprehensive overview of the research landscape in this domain.
This study follows the methodological framework advocated by Cobo et al. [27] to map the research topics in the dataset and the structures that connect them. In the next phase, the articles are sorted in descending order of total citations and of average citations per publication, which allows their impact to be compared. This step follows the guidelines set out by Heradio et al. [28]. Figure 2 illustrates the sequential steps of this methodological process.
| Dataset | ||
Fig. 3: Data Extraction and Cleaning Process Source: Researcher's Findings Note: The number of documents before removing conference papers, books, book chapters, notes, and other types of documents was 595, and after the removal, it was reduced to 304. |
4 Findings
The principal method for evaluating research performance in this study is citation analysis, in which a higher citation frequency signifies a greater impact within the discipline. The h-index serves as a robust metric that captures both the volume and the influence of a researcher's scientific contributions. The outcomes of the data analysis are summarized in the descriptive statistics presented in Table 3. The findings indicate that the application of machine learning in portfolio optimization attracts significant academic interest, as reflected by the 304 articles analyzed and an average of over 10 citations per article.
Table 3: Primary Data | |
Description | Results |
Main Data Information | |
Duration | 1990:2023 |
Sources (Journals, Books, etc.) | 373 |
Documents | 595 |
Annual Growth Rate(%) | 12.81% |
Average Document Age | 3.83 years |
Average Citations per Document | 10.82 |
References | 21,596 |
Documented Topics | |
Keywords (ID) | 2,763 |
Author Keywords (DE) | 1,443 |
Authors | |
Total Authors | 1,532 |
Authors of Single-Author Documents | 84 |
Author Collaboration | |
Single-Author Documents | 98 |
Collaborators per Document | 3.07 |
International Collaborators(%) | 1.18% |
Document Types | |
Article | 304 |
Book | 19 |
Book Chapter | 24 |
Conference Paper | 215 |
Review | 23 |
Other | 10 |
Source: Researcher's Findings
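As an illustration of how an annual growth rate such as the one in Table 3 can be obtained, the sketch below computes a compound annual growth rate from yearly publication counts. It assumes the compound-growth definition and uses hypothetical counts; it is not taken from the study's data or from the Bibliometrix source code.

```python
def annual_growth_rate(counts_by_year):
    """Compound annual growth rate (in percent) between the first and last year.

    counts_by_year maps a year to the number of documents published that year.
    Assumed definition: ((last / first) ** (1 / elapsed_years) - 1) * 100.
    """
    years = sorted(counts_by_year)
    elapsed = years[-1] - years[0]
    first = counts_by_year[years[0]]
    last = counts_by_year[years[-1]]
    if elapsed <= 0 or first <= 0:
        raise ValueError("need at least two distinct years and a positive first count")
    return ((last / first) ** (1 / elapsed) - 1) * 100


# Hypothetical example: output quadruples over two years, i.e. doubles each year.
example_rate = annual_growth_rate({2020: 10, 2021: 18, 2022: 40})
```

Only the first and last years enter the formula, so a single unusual year at either end can shift the reported rate noticeably.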
|
In addressing the first research question, Figure 4 highlights the most prolific authors over the past five years: Wang Y, Chen Y, Creamer G, Wu X, and Al J M. Wang Y has concentrated on topics such as financial risk assessment using neural networks [32], stock index prediction via the SVR model enhanced with the Bat optimization algorithm [33], and applying an interpretable reinforcement learning model for portfolio optimization [34]. Conversely, Chen Y has proposed a hybrid portfolio selection method based on empirical data from the Shanghai Stock Exchange [35]. Other authors have investigated various optimization algorithms to assess, enhance, and analyze investment portfolios according to specific economic indicators.
|
Fig. 4: Most Cited Authors |
Source: Researcher's Findings |
The main institutional affiliations are shown in Figure 5. Islamic Azad University, with 9 papers in the analyzed dataset, is the most productive institution in applying machine learning to portfolio optimization. It is followed by Spain's Complutense University of Madrid, Malaysia's Jinan University, and China's Nanjing University of Information Science and Technology, with 5 papers each.
|
Fig. 5: Most Related Affiliated Institutions |
Source: Researcher's Findings |
In this research, a similarity measure called connectivity strength was used to create the knowledge maps. This measure helps in preparing scientific maps and in displaying the dynamic and structural aspects of the data [36].
Table 4 presents the most influential journals in the fields of expert systems and finance. Judged by number of articles and h-index, "Expert Systems with Applications" has the greatest impact in this area, with 28 articles and an h-index of 17. Other journals, such as "Quantitative Finance" (10 articles, h-index 6) and "Journal of Financial Data Science" (16 articles, h-index 4), also contribute significantly to the scientific literature.
Table 4: Most Influential and Efficient Journals | ||
Source | Number of Articles | h_index |
Expert Systems with Applications | 28 | 17 |
Quantitative Finance | 10 | 6 |
European Journal of Operational Research | 7 | 5 |
Journal of Financial Data Science | 16 | 4 |
Journal of Risk and Financial Management | 7 | 4 |
Annals of Operations Research | 6 | 3 |
Applied Soft Computing | 3 | 3 |
Decision Support Systems | 3 | 3 |
IEEE Access | 9 | 3 |
Cognitive Computation | 3 | 2 |
Computational Economics | 4 | 2 |
Computational Intelligence and Neuroscience | 5 | 2 |
Frontiers in Artificial Intelligence | 3 | 2 |
Journal of Fuzzy Systems and Intelligent Systems | 2 | 2 |
Journal of Risk Management in Financial Institutions | 4 | 2 |
Source: Researcher's Findings
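For readers unfamiliar with the h-index used in Table 4, the following minimal sketch shows the standard definition: the largest h such that h papers each have at least h citations. The citation counts in the example are hypothetical.

```python
def h_index(citations):
    """Return the largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical journal with five papers in the dataset.
example_h = h_index([10, 8, 5, 4, 3])
```

A source-level h-index computed this way depends only on the papers in the analyzed dataset, so it will generally differ from the journal's h-index in the full Scopus or Web of Science database.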
|
4.1 Key Documents and Frequently Used Terms in The Dataset
A highly cited study strongly influences researchers working to advance the field [36]. Table 5 lists the documents with the highest citation counts. The most cited document, by Barboza et al. [37], has 336 citations, followed by Huang [38] with 171; they were published in Expert Systems with Applications and Applied Soft Computing, respectively. In third place is Ghoddusi et al. [39], with 158 citations, published in Energy Economics. Figure 6 shows the most frequent terms in the dataset. The top four terms correspond directly to phrases from the search strings, with "risk assessment" the most prevalent. Other frequent terms include "trade", "financial markets", "forecasting", "financial data processing", and "deep learning", which together underscore the role of machine learning in portfolio management.
Furthermore, the co-occurrence of these keywords highlights their individual significance and illustrates the knowledge structure of the literature in this field.
Table 5: Most Globally Cited Documents | |||||
Authors | Title | Source | Total Citations | Highlights | |
Barboza et al [37] | Machine learning models and bankruptcy prediction | Expert Systems with Applications | 336 | Examination and testing of machine learning models for bankruptcy prediction. | |
Huang [38] | A hybrid stock selection model using genetic algorithms and support vector regression | Applied Soft Computing | 171 | Development of an effective stock selection method using Genetic Algorithms (GAs) and Support Vector Regression (SVR). | |
Ghoddusi et al [39] | Machine learning in energy economics and finance: A review | Energy Economics | 158 | A review of the growth dedicated to machine learning applications in the fields of energy economics and finance. | |
Briand et al [40] | Property-based software engineering measurement | IEEE Transactions on Software Engineering | 149 | Introduction of a mathematical framework to provide definitions for several important measurement concepts such as size, length, complexity, continuity, and coupling. | |
Stock price prediction using support vector regression on daily and up to the minute prices | The Journal of Finance and Data Science | 136 | Introduction and use of Support Vector Regression (SVR) aimed at predicting stock prices. | ||
Groth and Muntermann [42] | An intraday market risk management approach based on textual analysis | Decision Support Systems | 115 | Utilization of four different learners: Naïve Bayes, k-Nearest Neighbors, Neural Networks, and Support Vector Machines to identify patterns in news data related to company disclosures that affect stock prices. | |
Machine learning and portfolio optimization | Management Science | 103 | Use of two machine learning methods: Performance-Based Regularization (PBR) for estimating models with lower error and cross-validation for portfolio optimization. | ||
Paiva et al [44] | Decision-making for financial trading: A fusion approach of machine learning and portfolio selection | Expert Systems with Applications | 99 | Development of a model using a fusion approach of machine learning classifiers based on Support Vector Machines (SVM) and Mean-Variance (MV) method for portfolio selection, compared to SVM + 1/N and Random + MV methods. | |
Source: Researcher's Findings |
|
Fig. 6: Most Frequent Keywords Source: Researcher's Findings |
4.2 Generating Countries/Regions
In response to the second research question, Figure 7 presents the publications and citations from six leading countries/regions. According to the data, the People's Republic of China ranks first in publication volume with 52 documents, followed by the United States (32), Germany (12), and South Korea (11). China also leads in citations with 694, followed by the United States (465), the United Kingdom (269), and South Korea (149). China has established itself as a global leader in machine learning technology, and Chinese universities are increasingly focused on research in this area to tackle challenges in financial portfolio risk management. This focus is largely driven by the high receptivity of the Chinese population to emerging technologies, making China a crucial market for machine learning advancements worldwide.
|
Fig. 7: Trend of Scientific Publications per Country (1990-2023) |
Source: Researcher's Findings |
The financial applications of machine learning are extensive and diverse. Figure 8 highlights the connections between key terms (investment, learning systems, risk assessment, and portfolio optimization) and countries such as China and the United States, as well as leading universities, indicating their higher academic prominence. Topics such as risk assessment in financial markets, forecasting, trading, deep learning, financial data processing, electronic trading, and learning algorithms are part of the broader research landscape.
As shown in Figure 9, the evolution of these topics is illustrated using a Sankey diagram, a specialized flow chart. This diagram visualizes the thematic progression over time in the field of portfolio management and machine learning. It helps to understand how different topics have evolved and been applied within the context of portfolio management. Furthermore, Figure 9 provides quantitative insights into the flow of topics, their directional movement, and the relationships between transformations in these topics [45].
|
Fig. 8: Network of Connections Between Countries, Keywords, and Authoring Universities |
Source: Researcher's Findings |
|
Fig. 9: Thematic Evolution |
Source: Researcher's Findings |
4.3 Main Topics in Keywords Based on Factor Analysis
In response to the third question, Figure 10 presents a two-dimensional map created from the keywords in “Keywords Plus”. Factor analysis, a powerful tool for summarizing data with multiple variables, is employed to facilitate the reduction of the dataset into a lower-dimensional framework. This method enables the visualization of hidden patterns within the data. Within the representation, keywords situated in proximity to the center of the cluster signify subjects that have attracted considerable scholarly interest in recent years, while keywords at the edges of the clusters indicate topics that have been less explored in the research [46].
|
Fig. 10: Cluster Analysis Factor Map |
Source: Researcher's Findings |
The larger cluster encompasses practical terms related to portfolio optimization and various types of machine learning algorithms used by investors, analysts, and asset managers. The second cluster also highlights the importance of the topic of time intervals in the statistical data of this research.
4.4 Co-occurrence Network Representation
To understand the conceptual framework, the terminology used by authors was mapped. The co-occurrence analysis in Figure 11 shows not only the predominant keywords, such as asset allocation and portfolio management, but also associated subjects including risk management, investment, and learning systems. Keyword co-occurrence analysis is an effective approach for investigating knowledge structures and identifying research directions. It also helps distinguish primary from secondary scholarly publications [47]. The analysis starts by identifying nodes, whose size reflects the number of documents; the lines connecting them represent relationships between keywords. A short line indicates a strong connection between the keywords, while a longer line suggests a weaker link [48]. In this analysis, the primary keywords identified in the first cluster are "risk management", "investment", and "machine learning". Each cluster groups related keywords and shows their frequency and the number of connections between them in publications. The clusters are color-coded: the blue cluster highlights "risk management", "investment", and "machine learning", while the red cluster emphasizes "risk assessment", "finance", "risk prediction", "algorithms", "risk analysis", and "value engineering".
Since this research mapped a quantitative knowledge structure, the number of keyword co-occurrence links was moderate. As depicted in Figure 11, two groups show a stronger relationship: "investment in financial markets utilizing machine learning" and "application of machine learning in risk assessment".
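The node sizes and link strengths behind a keyword network of this kind can be illustrated with a short sketch. It assumes that node size is the number of documents containing a keyword and that link strength is the number of documents in which two keywords appear together; the function name and sample keywords are hypothetical, and the clustering and layout steps performed by Biblioshiny are not reproduced.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count keyword frequencies and pairwise co-occurrences across documents.

    keyword_lists holds one list of keywords per document.
    Returns (node_size, link_strength), both Counter objects; link keys are
    alphabetically ordered pairs so that (a, b) and (b, a) are counted together.
    """
    node_size = Counter()
    link_strength = Counter()
    for keywords in keyword_lists:
        # Normalise case and spacing, and count each keyword once per document.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        node_size.update(unique)
        link_strength.update(combinations(unique, 2))
    return node_size, link_strength


# Hypothetical keyword lists for three documents.
docs = [
    ["Machine Learning", "Investment"],
    ["investment", "machine learning", "risk assessment"],
    ["risk assessment", "finance"],
]
nodes, links = cooccurrence(docs)
```

Here "investment" and "machine learning" co-occur in two documents, so they would be joined by a stronger link than "finance" and "risk assessment", which co-occur once.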
|
Fig. 11: Cluster Analysis Keyword Network |
Source: Researcher's Findings |
|
Fig. 12: Annual Scientific Production Source: Researcher's Findings Note: The annual productivity of the scientific output of machine learning applications in portfolio optimization. The trend line calculations are exponential. |
Figure 12 illustrates the scientific production in the field of machine learning applications for portfolio optimization from 1990 to April 2023, showing a significant increase in publications, especially from 2015 onwards, with a peak of 142 papers in 2021. This surge indicates a growing interest in machine learning techniques within capital management. Furthermore, the exponential trend equation and its coefficient of determination emphasize a notable yet variable growth pattern in this area.
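An exponential trend line and its coefficient of determination can be estimated with ordinary least squares on the logarithm of the yearly counts. The sketch below is one common way to do this and is an assumption about the method, not a reproduction of the calculation behind Figure 12; the function name and the data in the example are hypothetical.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y).

    Returns (a, b, r_squared), where r_squared is computed on the log scale.
    All y values must be positive.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxy / sxx
    ln_a = mean_l - b * mean_x
    ss_res = sum((l - (ln_a + b * x)) ** 2 for x, l in zip(xs, logs))
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    r_squared = 1.0 if ss_tot == 0 else 1 - ss_res / ss_tot
    return math.exp(ln_a), b, r_squared


# Hypothetical, perfectly exponential series: x is years since the first year.
a_hat, b_hat, r2 = fit_exponential([0, 1, 2, 3], [2 * math.exp(0.5 * x) for x in range(4)])
```

Using years since the start of the series as x, rather than calendar years, keeps the intercept a in a numerically sensible range. Years with zero publications must be excluded or offset, because the logarithm of zero is undefined.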
5 Stage Two Analysis
This section provides a brief evaluation and categorization of the most cited articles. Unlike other studies [49], which focus exclusively on a quantitative approach, this research adopts a dual quantitative-qualitative methodology based on the research of Nourahmadi et al. [20]. As a result, our analysis extends beyond merely counting articles, authors, or journals and examines the most relevant data on the application of machine learning in portfolio optimization.
The global onset of COVID-19 and the accompanying stagnation forced investors and fintech organizations to reconsider their traditional strategies for optimizing asset allocation. Strong black swan events may also recur, as they have in the past [50]. Consequently, the authors of this research believe that fund managers and other active market investors can draw on news and on modern technologies and tools to predict market movements and value investments accurately during periods of high volatility, with a focus on investing in quality companies, thereby enhancing potential returns and risk diversification.
This section focuses on gathering and reviewing recent advancements in machine learning models for asset allocation in an optimized portfolio. Figure 13 illustrates the categorization of groups and the methods considered within each group.
|
Fig. 13: Machine Learning Solutions in Portfolio Selection |
Source: Researcher's Findings
Table 6 reviews the background of research conducted on the application of machine learning methods in investment portfolios.
Table 6: Research on Machine Learning Applications in Portfolio Management | ||||||||
Algorithm | Author | Index/Stocks | Models | Portfolio Strategy | Time Period | Number of Stocks in Portfolio | Performance Evaluation | |
Clustering | Ciciretti Bucci et al [51] | Top 100 stocks in SP 500 | Bootstrapping operator | Comparison of the minimum cluster method, hierarchical risk parity (HRP), Markowitz maximum Sharpe ratio portfolio (MSR), and minimum variance portfolio (MVP) | January 2010 to September 2021 | 30 stocks | Average return = 14.65%, Standard deviation = 13.31%, Skewness = -1.29, Kurtosis = 2.45, Sharpe = 0.8, Sortino = 1.42, Information ratio = 4.81, Treynor = 0.98 | |
Lim and Ong [52] | Top 82 stocks listed on SGX | Shape-based clustering | Hierarchical clustering, AHC-DTW | 2015 to 2017 | Varies from 3 to 6 stocks per portfolio | Average return = 9.24%, Average Sharpe = 33.98 | ||
Duarte and de Castro [53] | Stocks traded on B3 (Brazilian Stock Exchange) | Fuzzy clustering | Markowitz mean-variance model and risk parity model | From December 30, 2009 to December 30, 2017 | N/A | Return = 151.9%, Risk = 14.2%, Asset turnover = 31.4%, Maximum drawdown = -16.9%, Sharpe = 1.8 | ||
Duarte and de Castro [54] | 20 stocks from B3 (Brazilian Stock Exchange) | Partitional clustering | Markowitz mean-variance model and risk parity model | From December 2005 to April 2020 | 20 stocks | Return = 9.1%, Risk = 22.3%, Asset turnover = 22.5%, Maximum drawdown = -47.7%, Sharpe = -0.1% | ||
Koratamaddi et al [55] | 30 companies in the Dow Jones Industrial Average | Deep Deterministic Policy Gradient (DDPG) | Comparison with mean-variance, minimum variance, DDPG, adaptive DDPG, and sentiment-aware adaptive DDPG trader agent | January 1, 2001 to February 10, 2018 | Maximum of 5 stocks per day | Sharpe ratio = 2.07, Annual return % = 22.05, Annual std error = 0.096 | ||
SVM | Paiva et al [44] | 53 to 73 stocks listed in Ibovespa, São Paulo Stock Exchange | SVM | Mean-variance portfolio compared to 1/N | June 2001 to December 2016 | 7 stocks | Accuracy = 54.97%, Feature = 70.29% | |
Ma et al [56] | 49 stocks from the China Securities Index (CSI) 100 | Random Forest > SVR, LSTM, CNN, DMLP and ARIMA | Mean-variance and Omega portfolio | January 4, 2007 to December 31, 2015 | N/A | Excess return = 121.53%, Standard deviation = 111.3980, Information ratio = 0.8693, Total return = 679.36%, Maximum drawdown = -70.42%, and Turnover rate = 149.72% | ||
Du [57] | 20 stocks from CSI 300 and SP 500 | Support Vector Machines (SVM), Random Forest algorithms, and Attention-driven Long Short-Term Memory networks (LSTM) | Mean-variance portfolio compared to 1/N | May 4, 2012 to August 4, 2020 | Less than 20 stocks | Accuracy = 92.59% (CSI 300) and 88.52% (SP 500), Sharpe ratio = 9.31 (CSI 300) and 2.77 (SP 500) | ||
Genetic Algorithm | Kwak et al [58] | 487 stocks in SP 500 and Hang Seng Index (HSI) | Neural Network | Equal-weight portfolio (1/N) | July 1, 2014 to December 31, 2019 | Number of stocks in the portfolio includes: 50, 100, 200, and 487 | RMSE = 0.0239, Average daily return = 0.09427, Daily return volatility = 0.00032 | |
Rubesam [59] | 572 Brazilian stocks, Ibovespa index | The application of linear regression, including both regularization techniques such as LASSO and Ridge, Bayesian variable selection methodologies, Random Forest algorithms, Gradient Boosting frameworks, Neural Network architectures, and ensemble risk parity models demonstrates superior performance compared to short-term strategic approaches. | Equal risk contribution is better than short-term strategies | January 2003 to December 2018 | Number of stocks per portfolio varies from 10 to 40 | Average monthly return before costs = +2.06% and after costs = +1.23%, Monthly standard deviation = +2.94%, Annual Sharpe ratio before costs = +2.10 and after costs = +1.13, Maximum drawdown = -19.27%, Average monthly turnover = +117.12%, Average leverage = +1.66 | ||
Reinforcement Learning | Durall [60] | Selected 8 stocks from the U.S. market | Reinforcement learning | Tangent portfolio, minimum variance portfolio, risk parity, equal weight, A2C, PPO, DDPG, SAC and TD3 | From January 1, 2010 to January 1, 2017 | 8 stocks | Statistics of best-performing indicators in a bear market: Annual return = +27.6%, Cumulative return = +0.42%, Annual volatility = +4.41%, Sharpe ratio = +0.79, Calmar ratio = +0.75, Maximum loss = -36.8%, Stability = +0.59 | |
Jang and Seong [61] | 29 stocks from the Dow Jones Index | Deep reinforcement learning and neural network; 1/N, the algorithms of Jiang et al. (2017) and Yang et al. (2020), and the authors' proposed method | 1/N, the algorithms of Jiang et al. (2017) and Yang et al. (2020), and the method proposed by the authors | From January 1, 2008 to December 31, 2019 | Portfolio includes: 29 stocks | Final accumulated portfolio value (cash) = $12,970.29; Max drawdown ratio = -8.29; Sharpe ratio = +2.67 | ||
Deep learning | Yun et al [62] | 32 selected ETFs from 28 countries | Deep learning; 1/N, Random Forest, Support Vector Regression, Multi-layer Perceptron, Long Short-Term Memory | 1/N, random forest, support vector regression, multilayer perceptron, long short-term memory | From May 20, 2002 to June 8, 2017 | N/A | Return = 0.5659, Risk = 0.0330, Sharpe Ratio = 0.1411, Maximum Drawdown = -0.1843, Value at Risk = -0.0504, Expected Shortfall = -0.0629 | |
Chakravorty [63] | 8 selected ETFs from the United States | Deep learning exploratory processes in machine learning, ETF clustering, risk parity approach | Machine learning exploratory processes, ETF grouping, risk parity approach | From 2014 to 2018 | N/A | Maximum Drawdown = -9.33, Sharpe Ratio = 0.76, Sortino Ratio = 1.037 | ||
Source: Researcher's Findings |
5.1. Analysis of Performance Evaluation Metrics
To compare the performance of different machine learning models, or to evaluate their effectiveness against other models, specific evaluation metrics are used. These are presented in Table 6 and Figure 14. However, a model's future performance in real-world scenarios cannot be ensured by relying on these metrics alone. Therefore, it is essential to incorporate financial metrics to assess the effectiveness of both machine learning and traditional algorithms. Key metrics include the maximum drawdown ratio, the Sortino ratio, the Sharpe ratio, and the return-to-turnover ratio (both annual and average), all of which are commonly used in portfolio management today.
| |
Fig. 14: Portfolio Performance Evaluation Criteria Source: Researcher's Findings |
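Three of the financial metrics named in this section can be sketched from a series of periodic returns. The definitions below are common textbook forms (sample standard deviation for the Sharpe ratio, downside deviation against a target for the Sortino ratio, peak-to-trough loss for maximum drawdown). The reviewed studies may use different conventions, so the function names, the annualization factor, and the default values are assumptions.

```python
import math


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    if len(excess) < 2:
        raise ValueError("need two or more returns")
    mean = sum(excess) / len(excess)
    variance = sum((e - mean) ** 2 for e in excess) / (len(excess) - 1)
    if variance == 0:
        raise ValueError("returns have zero variance")
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def sortino_ratio(returns, target=0.0, periods_per_year=252):
    """Like the Sharpe ratio, but penalizes only returns below the target."""
    excess = [r - target for r in returns]
    if not excess:
        raise ValueError("need one or more returns")
    mean = sum(excess) / len(excess)
    downside = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    if downside == 0:
        raise ValueError("no returns below the target")
    return mean / downside * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of cumulative wealth, as a negative fraction."""
    wealth = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1)
    return worst
```

For example, returns of +10%, -50%, and +20% take wealth from 1.10 down to 0.55, so `max_drawdown([0.10, -0.50, 0.20])` is about -0.5. Because conventions such as the annualization factor and the risk-free rate vary between papers, ratios reported in different studies are not directly comparable.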
6 Discussion and Conclusions
This research explores the literature on knowledge mapping at the convergence of two fields: machine learning, a branch of computer science, and its application in optimizing investment portfolios. With the growing use of machine learning in portfolio management, this study examines the trends, accuracy, and capabilities of this technology. The first section reviews 304 articles published in reputable financial and computer science journals indexed in Scopus and Web of Science since 1990, while the second section examines 12 articles selected by the authors. The study provides an overview of the types of models, datasets, and performance metrics used in various algorithms, followed by an analysis of the results. The findings indicate that the application of machine learning in the financial domain is expanding, and the effective performance of these algorithms in the reviewed studies challenges the efficient market hypothesis. In these studies, machine learning applied to asset allocation and portfolio optimization surpassed traditional models such as minimum variance portfolios and 1/N strategies, and even outperformed market index returns. From 2018 onwards, published studies in this field have increased significantly, reflecting growing interest in such research within financial markets. Chinese researchers have made notable contributions, ranking first, followed by scholars from the United States and the United Kingdom. Since this research drew exclusively on articles indexed in Scopus and Web of Science, papers from other sources may not be included in the data set. The use of various performance metrics, under the different conditions and characteristics of the financial data in the reviewed articles, makes direct comparisons across studies challenging.
However, the evidence suggests that machine learning, compared to classical models, has significant potential to enhance investor satisfaction and profitability by processing large amounts of data more autonomously and accurately in investment decisions.
[1] De Prado, M.M.L. Machine learning for asset managers, Cambridge University Press, 2020. doi: https://doi.org/10.1017/9781108883658
[2] Zamanpour, A., Zanjirdar, M., Davodi Nasr, M. Identify and rank the factors affecting stock portfolio optimization with fuzzy network analysis approach, Financial Engineering And Portfolio Management, 2021; 12(47):210-236
[3] Zapata, H.O., Mukhopadhyay, S. A Bibliometric Analysis of Machine Learning Econometrics in Asset Pricing, Journal of Risk and Financial Management, 2022; 15(11):535. doi: https://doi.org/10.3390/jrfm15110535
[4] Ahmed, S., Alshater, M.M., El Ammari, A., Hammami, H. Artificial intelligence and machine learning in finance: A bibliometric review, Research in International Business and Finance, 2022; 61:101646. doi: https://doi.org/10.1016/j.ribaf.2022.101646
[5] Vo, N.N., He, X., Liu, S., Xu, G. Deep learning for decision making and the optimization of socially responsible investments and portfolio, Decision Support Systems, 2019; 124:113097. doi: https://doi.org/10.1016/j.dss.2019.113097
[6] Ozbayoglu, A.M., Gudelek, M.U., Sezer, O.B. Deep learning for financial applications: A survey, Applied Soft Computing, 2020; 93:106384. doi: https://doi.org/10.48550/arxiv.2002.05786
[7] Lakzaie, F., Bahiraie, A., Mohammadian, S. Visualized portfolio optimization of stock market: Case of TSE, Advances in Mathematical Finance and Applications, 2024; 2:707-722. doi: https://doi.org/10.71716/amfa.2024.2301-1853
[8] Zanjirdar, M. Overview of portfolio optimization models, Advances in Mathematical Finance and Applications, 2020; 5(4):419-435. doi: https://doi.org/10.22034/amfa.2020.674941
[9] Lopez de Prado, M. A robust estimator of the efficient frontier, SSRN, 2016. doi: https://doi.org/10.2139/ssrn.3469961
[10] Schwendner, P., Papenbrock, J., Jaeger, M., Krügel, S. ‘Adaptive Seriational Risk Parity’ and Other Extensions for Heuristic Portfolio Construction Using Machine Learning and Graph Theory, The Journal of Financial Data Science, 2021; 3(4):65-83. doi: https://doi.org/10.3905/jfds.2021.1.078
[11] Chen, W., Zhang, H., Mehlawat, M.K., Jia, L. Mean–variance portfolio optimization using machine learning-based stock price prediction, Applied Soft Computing, 2021; 100:106943. doi: https://doi.org/10.1016/j.asoc.2020.106943
[12] Wang, W., Li, W., Zhang, N., Liu, K. Portfolio formation with preselection using deep learning from long-term financial data, Expert Systems with Applications, 2020; 143:113042. doi: https://doi.org/10.1016/j.eswa.2019.113042
[13] Demey, P., Maillard, S., Roncalli, T. Risk-based indexation, Available at SSRN 1582998, 2010. doi: https://doi.org/10.2139/ssrn.1582998
[14] Choueifaty, Y. Towards maximum diversification, Available at SSRN 4063676, 2008. doi: https://doi.org/10.2139/ssrn.4063676
[15] Ferretti, S. On the modeling and simulation of portfolio allocation schemes: An approach based on network community detection, Comput. Econ, 2022; 1-37. doi: https://doi.org/10.48550/arxiv.2203.11780
[16] Ernst, P., Thompson, J., Miao, Y. Portfolio selection: The power of equal weight, arXiv preprint arXiv:1602.00782, 2016. doi: https://doi.org/10.48550/arxiv.2309.13696
[17] Tatsat, H., Puri, S., Lookabaugh, B. Machine Learning and Data Science Blueprints for Finance, O'Reilly Media, 2020.
[18] Gokhale, A., Mulay, P., Pramod, D., Kulkarni, R. A bibliometric analysis of digital image forensics, Sci. Technol. Libr, 2020; 39(1):96-113. doi: https://doi.org/10.1080/0194262X.2020.1714529
[19] Roemer, R. C., Borchardt, R. Meaningful metrics: A 21st century librarian's guide to bibliometrics, altmetrics, and research impact, Amer Library Assn, 2015. doi: https://doi.org/10.7710/2162-3309.2290
[20] Nourahmadi, M., Rasti, F., Sadeghi, H. A Review of Research on Clustering of Financial Time Series: A Knowledge Mapping Approach, Financ. Invest. Adv, 2021; 2(2):23-57. [In Persian] doi: https://doi.org/10.30495/afi.2021.1919857.1002
[21] Milian, E. Z., Spinola, M. d. M., de Carvalho, M. M. Fintechs: A literature review and research agenda, Electron. Commer. Res. Appl, 2019; 34:100833. doi: https://doi.org/10.1016/j.elerap.2019.100833
[22] Blanco-Mesa, F., Merigó, J. M., Gil-Lafuente, A. M. Fuzzy decision making: A bibliometric-based review, J. Intell. Fuzzy Syst, 2017; 32(3):2033-2050. doi: https://doi.org/10.3233/jifs-161640
[23] Martínez-López, F. J., Merigó, J. M., Valenzuela-Fernández, L., Nicolás, C. Fifty years of the European Journal of Marketing: A bibliometric analysis, Eur. J. Mark, 2018; 52(1-2):439-468. doi: https://doi.org/10.1108/ejm-11-2017-0853
[24] Mas-Tur, A., Modak, N. M., Merigó, J. M., Roig-Tierno, N., Geraci, M., Capecchi, V. Half a century of Quality & Quantity: A bibliometric review, Qual. Quant, 2019; 53:981-1020. doi: https://doi.org/10.1007/s11135-018-0799-1
[25] Donthu, N., Kumar, S., Mukherjee, D., Pandey, N., Lim, W. M. How to conduct a bibliometric analysis: An overview and guidelines, J. Bus. Res, 2021; 133:285-296. doi: https://doi.org/10.1016/j.jbusres.2021.04.070
[26] Small, H. Visualizing science by citation mapping, J. Am. Soc. Inf. Sci, 1999; 50(9):799-813. doi: https://doi.org/10.1002/%28sici%291097-4571%281999%2950%3A9%3C799%3A%3Aaid-asi9%3E3.0.co%3B2-g
[27] Cobo, M. J., López‐Herrera, A. G., Herrera‐Viedma, E., Herrera, F. Science mapping software tools: Review, analysis, and cooperative study among tools, J. Am. Soc. Inf. Sci. Technol, 2011; 62(7):1382-1402. doi: https://doi.org/10.1002/asi.21525
[28] Heradio, R., De La Torre, L., Galan, D., Cabrerizo, F. J., Herrera-Viedma, E., Dormido, S. Virtual and remote labs in education: A bibliometric analysis, Comput. Educ, 2016; 98:14-38. doi: https://doi.org/10.1016/j.compedu.2016.03.010
[29] Danvila-del-Valle, I., Estévez-Mendoza, C., Lara, F. J. Human resources training: A bibliometric analysis, J. Bus. Res, 2019; 101:627-636. doi: https://doi.org/10.1016/j.jbusres.2019.02.026
[30] Falagas, M. E., Pitsouni, E. I., Malietzis, G. A., Pappas, G. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: Strengths and weaknesses, FASEB J, 2008; 22(2):338-342. doi: https://doi.org/10.1096/fj.07-9492lsf
[31] Aria, M., Cuccurullo, C. bibliometrix: An R-tool for comprehensive science mapping analysis, J. Informetr, 2017; 11(4):959-975
[32] Wang, Y. Research on supply chain financial risk assessment based on blockchain and fuzzy neural networks, Wirel. Commun. Mob. Comput, 2021; 2021:1-8. doi: https://doi.org/10.1155/2021/5565980
[33] Zheng, J., Wang, Y., Li, S., Chen, H. The stock index prediction based on SVR model with bat optimization algorithm, Algorithms, 2021; 14(10):299. doi: https://doi.org/10.3390/a14100299
[34] Song, Z., Wang, Y., Qian, P., Song, S., Coenen, F., Jiang, Z., Su, J. From deterministic to stochastic: an interpretable stochastic model-free reinforcement learning framework for portfolio optimization, Appl. Intell, 2022; 1-16. doi: https://doi.org/10.1007/s10489-022-04217-5
[35] Chen, B., Zhong, J., Chen, Y. A hybrid approach for portfolio selection with higher-order moments: Empirical evidence from Shanghai Stock Exchange, Expert Syst. Appl, 2020; 145:113104. doi: https://doi.org/10.1016/j.eswa.2019.113104
[36] Rojas-Sánchez, M.A., Palos-Sánchez, P.R., Folgado-Fernández, J.A. Systematic literature review and bibliometric analysis on virtual reality and education, Educ. Inf. Technol, 2023; 28(1):155-192. doi: https://doi.org/10.1007/s10639-022-11167-5
[37] Barboza, F., Kimura, H., Altman, E. Machine learning models and bankruptcy prediction, Expert Syst. Appl, 2017; 83:405-417. doi: https://doi.org/10.1016/j.eswa.2017.04.006
[38] Huang, C.-F. A hybrid stock selection model using genetic algorithms and support vector regression, Appl. Soft Comput, 2012; 12(2):807-818. doi: https://doi.org/10.1016/j.asoc.2011.10.009
[39] Ghoddusi, H., Creamer, G.G., Rafizadeh, N. Machine learning in energy economics and finance: A review, Energy Econ, 2019; 81:709-727. doi: https://doi.org/10.1016/j.eneco.2019.05.006
[40] Briand, L.C., Morasca, S., Basili, V.R. Property-based software engineering measurement, IEEE Trans. Softw. Eng, 1996; 22(1):68-86. doi: https://doi.org/10.1109/32.481535
[41] Henrique, B.M., Sobreiro, V.A., Kimura, H. Stock price prediction using support vector regression on daily and up to the minute prices, J. Finance Data Sci, 2018; 4(3):183-201. doi: https://doi.org/10.1016/j.jfds.2018.04.003
[42] Groth, S.S., Muntermann, J. An intraday market risk management approach based on textual analysis, Decis. Support Syst, 2011; 50(4):680-691. doi: https://doi.org/10.1016/j.dss.2010.08.019
[43] Ban, G.-Y., El Karoui, N., Lim, A.E. Machine learning and portfolio optimization, Manage. Sci, 2018; 64(3):1136-1154. doi: https://doi.org/10.1287/mnsc.2016.2644
[44] Paiva, F.D., Cardoso, R.T.N., Hanaoka, G.P., Duarte, W.M. Decision-making for financial trading: A fusion approach of machine learning and portfolio selection, Expert Syst. Appl, 2019; 115:635-655. doi: https://doi.org/10.1016/j.eswa.2018.08.003
[45] Soundararajan, K., Ho, H.K., Su, B. Sankey diagram framework for energy and exergy flows, Appl. Energy, 2014; 136:1035-1042. doi: https://doi.org/10.1016/j.apenergy.2014.08.070
[46] Xie, H., Zhang, Y., Wu, Z., Lv, T. A bibliometric analysis on land degradation: Current status, development, and future directions, Land, 2020; 9(1):28. doi: https://doi.org/10.3390/land9010028
[47] Altınay Özdemir, M., Göktaş, L.S. Research trends on digital detox holidays: A bibliometric analysis, 2012-2020, 2021. doi: https://doi.org/10.18089/tms.2021.170302
[48] Guo, Y.-M., Huang, Z.-L., Guo, J., Li, H., Guo, X.-R., Nkeli, M.J. Bibliometric analysis on smart cities research, Sustainability, 2019; 11(13):3606. doi: https://doi.org/10.3390/su11133606
[49] Zhou, W., Zhu, W., Chen, Y., Chen, J. Dynamic changes and multi-dimensional evolution of portfolio optimization, Econ. Res.-Ekonomska Istraživanja, 2022; 35(1):1431-1456. doi: https://doi.org/10.1080/1331677x.2021.1968308
[50] Asawa, Y.S. Modern Machine Learning Solutions for Portfolio Selection, IEEE Eng. Manag. Rev, 2021; 50(1):94-112. doi: https://doi.org/10.1109/emr.2021.3131158
[51] Ciciretti, V., Bucci, A. Building optimal regime-switching portfolios, N. Am. J. Econ. Finance, 2023; 64:101837. doi: https://doi.org/10.1016/j.najef.2022.101837
[52] Lim, T., Ong, C.S. Portfolio management: A financial application of unsupervised shape-based clustering-driven machine learning method, Int. J. Comput. Digit.
[53] Duarte, F. G., de Castro, L. N. A fuzzy clustering algorithm for portfolio selection, 2019 IEEE 21st Conference on Business Informatics (CBI), 2019. doi: https://doi.org/10.1109/cbi.2019.00054
[54] Duarte, F. G., de Castro, L. N. A framework to perform asset allocation based on partitional clustering, IEEE Access, 2020; 8:110775-110788. doi: https://doi.org/10.1109/access.2020.3001944
[55] Koratamaddi, P., Wadhwani, K., Gupta, M., Sanjeevi, S. G. Market sentiment-aware deep reinforcement learning approach for stock portfolio allocation, Engineering Science and Technology, an International Journal, 2021; 24(4):848-859. doi: https://doi.org/10.1016/j.jestch.2021.01.007
[56] Ma, Y., Han, R., Wang, W. Portfolio optimization with return prediction using deep learning and machine learning, Expert Syst. Appl, 2021; 165:113973. doi: https://doi.org/10.1016/j.eswa.2020.113973
[57] Du, J. Mean–variance portfolio optimization with deep learning based-forecasts for cointegrated stocks, Expert Syst. Appl, 2022; 201:117005. doi: https://doi.org/10.1016/j.eswa.2022.117005
[58] Kwak, Y., Song, J., Lee, H. Neural network with fixed noise for index-tracking portfolio optimization, Expert Syst. Appl, 2021; 183:115298. doi: https://doi.org/10.1016/j.eswa.2021.115298
[59] Rubesam, A. Machine learning portfolios with equal risk contributions: Evidence from the Brazilian market, Emerging Markets Review, 2022; 51:100891. doi: https://doi.org/10.1016/j.ememar.2022.100891
[60] Durall, R. Asset Allocation: From Markowitz to Deep Reinforcement Learning, arXiv preprint arXiv:2208.07158, 2022. doi: https://doi.org/10.48550/arxiv.2208.07158
[61] Jang, J., Seong, N. Deep reinforcement learning for stock portfolio optimization by connecting with modern portfolio theory, Expert Syst. Appl, 2023; 119556. doi: https://doi.org/10.1016/j.eswa.2023.119556
[62] Yun, H., Lee, M., Kang, Y. S., Seok, J. Portfolio management via two-stage deep learning with a joint cost, Expert Syst. Appl, 2020; 143:113041. doi: https://doi.org/10.1016/j.eswa.2019.113041
[63] Chakravorty, G., Awasthi, A., Da Silva, B. Deep learning for global tactical asset allocation, Available at SSRN 3242432, 2018. doi: https://doi.org/10.2139/ssrn.3242432
[64] Mohamadi, M., Zanjirdar, M. On the Relationship between different types of institutional owners and accounting conservatism with cost stickiness, Journal of Management Accounting and Auditing Knowledge, 2018; 7(28):201-214
[65] Zanjirdar, M., Moslehi Araghi, M. The impact of changes in uncertainty, unexpected earning of each share and positive or negative forecast of profit per share in different economic condition, Quarterly Journal of Fiscal and Economic Policies, 2016; 4(13):55-76
[66] Nikumaram, H., Rahnamay Roodposhti, F., Zanjirdar, M. The explanation of risk and expected rate of return by using of Conditional Downside Capital Assets Pricing Model, Financial knowledge of securities analysis, 2008; 3(1):55-77