Minimizing job execution time in Data Grid by a fuzzy dynamic replication algorithm
Subject areas: Strategic Management
Mahsa Beigrezaei 1, Seyedeh Leili Mirtaheri 2
1 - Department of Computer and Engineering, Yadegar-e-Imam Khomeini (RAH) Shahr-e-Rey Branch, Islamic Azad University, Tehran, Iran
2 - Computer Engineering Department, Kharazmi University, Tehran, Iran
Keywords: Fuzzy, Grid, Distributed System, Data Replication
Abstract:
Data Grids are dynamic by nature: users' data access patterns and network latency may change over time, while the system must still meet data availability and reliability requirements. Data replication is a well-known method for improving performance parameters such as data access time, availability, load balancing, and reliability. Here, a novel dynamic algorithm is proposed that uses fuzzy inference systems to manage replication and improve performance. The proposed algorithm applies a new comprehensive set of decision parameters and fuzzy logic in each phase to reduce the inefficiency caused by wrong decisions in different phases in a practical Grid. It uses two fuzzy inference systems: one selects an appropriate site for a new replica, and the other selects a less valuable file to delete when storage space is full. The new replica is placed at a site where the file is likely to be needed soon. A fuzzy valuation function also prevents valuable files from being deleted. The algorithm was simulated with the OptorSim simulator. The results showed that the algorithm is more effective than other replication methods in terms of the number of replicas created, the percentage of storage used, and the job execution time.
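The abstract does not list the decision parameters or the rule base, so the following is only an illustrative sketch of how a fuzzy valuation function for replica replacement might look. The two inputs (`access_freq` and `recency`, both normalized to [0, 1]), the triangular membership functions, the rule table, and the weighted-average defuzzification are all assumptions made for this example, not the authors' design.

```python
from dataclasses import dataclass


def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x: float) -> dict:
    """Map a value in [0, 1] to degrees of membership in three fuzzy sets."""
    x = min(max(x, 0.0), 1.0)  # clamp so at least one set always fires
    return {
        "low": tri(x, -0.5, 0.0, 0.5),
        "medium": tri(x, 0.0, 0.5, 1.0),
        "high": tri(x, 0.5, 1.0, 1.5),
    }


# Hypothetical rule base: (access frequency, recency) -> crisp file value.
RULES = {
    ("low", "low"): 0.0, ("low", "medium"): 0.25, ("low", "high"): 0.5,
    ("medium", "low"): 0.25, ("medium", "medium"): 0.5, ("medium", "high"): 0.75,
    ("high", "low"): 0.5, ("high", "medium"): 0.75, ("high", "high"): 1.0,
}


def file_value(access_freq: float, recency: float) -> float:
    """Fire every rule with min() as AND, then take the weighted average."""
    mu_f, mu_r = fuzzify(access_freq), fuzzify(recency)
    num = den = 0.0
    for (f_set, r_set), out in RULES.items():
        strength = min(mu_f[f_set], mu_r[r_set])
        num += strength * out
        den += strength
    return num / den


@dataclass
class Replica:
    name: str
    access_freq: float  # assumed: normalized access count in [0, 1]
    recency: float      # assumed: 1.0 means accessed very recently


def select_victim(replicas: list) -> str:
    """Return the name of the least valuable replica, the deletion candidate."""
    return min(replicas, key=lambda r: file_value(r.access_freq, r.recency)).name


if __name__ == "__main__":
    stored = [Replica("a", 0.9, 0.8), Replica("b", 0.1, 0.2), Replica("c", 0.5, 0.5)]
    print(select_victim(stored))
```

When storage is full, a site would call `select_victim` on its stored replicas and delete the returned file only if its value is below that of the incoming replica; that comparison is likewise an assumption about how such a valuation might be used.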