Scalable Fuzzy Decision Tree Induction Using Fast Data Partitioning and Incremental Approach for Large Dataset
Subject Areas : Data MiningSomayeh Lotfi 1 , Mohammad Ghasemzadeh 2 , Mehran Mohsenzadeh 3 , Mitra Mirzarezaee 4
1 - Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
2 - Computer Engineering Department, Yazd University, Yazd, Iran.
3 - Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran ,IRAN
4 - Department of Computer Engineering, Science and Research Branch, Islamic Azad University,
Daneshgah Blvd., Simon Bulivar Blvd., P.O. Box: 14515/775, Tehran, Iran
Keywords:
Abstract :
[1] Khakata, E., Omwenga, V. and Msanjila, S., 2020. Prediction of Student Learning Styles using Data Mining Techniques. Journal of Advances in Computer Engineering and Technology, 6(2), pp.107-118.
[2] Goetz, T., 2010. The decision tree: taking control of your health in the new era of personalized medicine. Rodale.
[3] Han, J., Pei, J. and Kamber, M., 2011. Data mining: concepts and techniques. Elsevier.
[4] Rokach, L. and Maimon, O.Z., 2008. Data mining with decision trees: theory and applications (Vol. 69). World scientific.
[5] Quinlan, J.R., 2014. C4. 5: programs for machine learning. Elsevier.
[6] Breiman, L., Friedman, J., Stone, C.J. and Olshen, R.A., 1984. Classification and regression trees. CRC press.
[7] Kotsiantis, S.B., 2013. Decision trees: a recent overview. Artificial Intelligence Review, 39(4), pp.261-283.
[8] Fayyad, U. and Irani, K., 1993. Multi-interval discretization of continuous-valued attributes for classification learning.
[9] Fayyad, U.M. and Irani, K.B., 1992. On the handling of continuous-valued attributes in decision tree generation. Machine learning, 8(1), pp.87-102.
[10] Zheng, H., He, J., Zhang, Y., Huang, G., Zhang, Z. and Liu, Q., 2019. A general model for fuzzy decision tree and fuzzy random forest. Computational Intelligence, 35(2), pp.310-335.
[11] Yu, H., Lu, J. and Zhang, G., 2017, November. Learning a fuzzy decision tree from uncertain data. In 2017 12th International Conf. on Intelligent Systems and Knowledge Engineering (pp. 1-7).
[12] Hishamuddin, M.N.F., Hassan, M.F. and Mokhtar, A.A., 2020, February. Improving Classification Accuracy of Random Forest Algorithm Using Unsupervised Discretization with Fuzzy Partition and Fuzzy Set Intervals. In Proceedings of the 2020 9th International Conference on Software and Computer Applications (pp. 99-104).
[13] Garcia, S., Luengo, J., Sáez, J.A., Lopez, V. and Herrera, F., 2012. A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning. IEEE Trans. on Knowledge and Data Engineering, 25(4), pp.734-750.
[14] Zeinalkhani, M. and Eftekhari, M., 2014. Fuzzy partitioning of continuous attributes through discreti-zation methods to construct fuzzy decision tree classifiers. Information Sciences, 278, pp.715-735.
[15] Lomax, S. and Vadera, S., 2013. A survey of cost-sensitive decision tree induction algorithms. ACM Computing Surveys (CSUR), 45(2), pp.1-35.
[16] Lomax, S. and Vadera, S., 2013. A survey of cost-sensitive decision tree induction algorithms. ACM Computing Surveys (CSUR), 45(2), pp.1-35.
[17] Mehta, M., Agrawal, R. and Rissanen, J., 1996, March. SLIQ: A fast scalable classifier for data mining. In International conference on extending database technology (pp. 18-32). Springer, Berlin, Heidelberg.
[18] Hulten, G. and Domingos, P., 2016. Mining Decision Trees from Streams. In Data Stream Management (pp. 189-208). Springer, Berlin, Heidelberg.
[19] Segatori, A., Marcelloni, F. and Pedrycz, W., 2017. On distributed fuzzy decision trees for big data. IEEE Transactions on Fuzzy Systems, 26(1), pp.174-192.
[20] Mahmud, M.S., Huang, J.Z., Salloum, S., Emara, T.Z. and Sadatdiynov, K., 2020. A survey of data partitioning and sampling methods to support big data analysis. Big Data Mining and Analytics, 3(2), pp.85-101.
[21] Franco-Arcega, A., Carrasco-Ochoa, J.A., Sánchez-Díaz, G. and Martínez-Trinidad, J.F., 2011. Decision tree induction using a fast splitting attribute selection for large datasets. Expert Systems with Applications, 38(11), pp.14290-14300.
[22] Bahri, M., Pfahringer, B., Bifet, A. and Maniu, S., 2020, April. Efficient batch-incremental classification using umap for evolving data streams. In International Symposium on Intelligent Data Analysis (pp. 40-53). Springer, Cham.
[23] Couso, I., Borgelt, C., Hullermeier, E. and Kruse, R., 2019. Fuzzy sets in data analysis: From statistical foundations to machine learning. IEEE Comput-ational Intelligence Magazine, 14(1), pp.31-44.
[24] Peng, Y. and Flach, P., 2001. Soft discretization to enhance the continuous decision tree induction. Integrating Aspects of Data Mining, Decision Support and Meta-Learning, 1(34), pp.109-118.
[25] Chen, Y.L., Wang, T., Wang, B.S. and Li, Z.J., 2009. A survey of fuzzy decision tree classifier. Fuzzy Information and Engineering, 1(2), pp.149-159.
[26] Dong, M. and Kothari, R., 2001. Look-ahead based fuzzy decision tree induction. IEEE Transactions on fuzzy systems, 9(3), pp.461-468.
[27] Wang, X. and Borgelt, C., 2004, July. Information measures in fuzzy decision trees. In 2004 IEEE International Conference on Fuzzy Systems (IEEE Cat. No. 04CH37542) (Vol. 1, pp. 85-90). IEEE.
[28] Wang, X., Liu, X., Pedrycz, W. and Zhang, L., 2015. Fuzzy rule based decision trees. Pattern Recognition, 48(1), pp.50-59.
[29] Grossi, V., Romei, A. and Turini, F., 2017. Survey on using constraints in data mining. Data mining and knowledge discovery, 31(2), pp.424-464.
[30] Afify, A.A., 2016. A fuzzy rule induction algorithm for discovering classification rules. Journal of Intelligent & Fuzzy Systems, 30(6), pp.3067-3085.
[31] Jin, C., Li, F. and Li, Y., 2014. A generalized fuzzy ID3 algorithm using generalized information entropy. Knowledge-Based Systems, 64, pp.13-21.