A Comparative Study of Open-Source Software for Deployment and Management of Cloud Computing Utilizing a Big Data Processing Quality Model
Subject Areas : Multimedia Processing, Communications Systems, Intelligent Systems
Mahdi Jafari 1 , Amir Kalbasi 2
1 - Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran
2 - Assistant Professor, Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran
Keywords: Big Data Processing, Cloud Management, Big Data Storage, Cloud Computing, Quality Model
Abstract :
Introduction: The volume of data produced by human society is growing rapidly. Data is produced in many industries, such as manufacturing, transportation, healthcare, and social networks. At this scale, storage and processing become central concerns. The main challenges when dealing with big data are data storage and management, data processing and analytics, and the resource management needed to provide infrastructure for the first two. Cloud computing, due to its features and architecture, is a promising infrastructure for storing and processing big data. Several cloud deployment models exist, namely public cloud, private cloud, community cloud, and hybrid cloud. To store and process big data in a cloud environment, individuals and organizations may prefer to deploy and manage private clouds in order to gain greater control over resources and their data. Numerous open-source software systems have been developed for the deployment and management of private clouds. Evaluating and choosing among them is a challenging task, especially for those who are new to these large-scale systems. Furthermore, because each cloud infrastructure management system continuously delivers new releases with major changes or new features and modules, choosing among them can be a challenge even for an experienced user.
Method: In this paper, we first present the Quality Model for Cloud Infrastructure (QMCI) for evaluating cloud infrastructure management software. QMCI focuses on quality factors that are important when processing big data. The top-level factors of this model are (1) Functionality, (2) Usability, (3) Reliability, (4) Supportability, and (5) Performance. Each top-level factor is divided into sub-factors to further refine the quality model, and metrics can be defined for the sub-factors to evaluate a given cloud infrastructure management system.
Discussion: Based on QMCI, multiple-criteria decision-making can be used to choose the cloud infrastructure management software that best suits a given set of criteria. In the remainder of this paper, three of the most popular open-source cloud infrastructure management systems, namely Eucalyptus, OpenStack, and Apache CloudStack, are evaluated against QMCI to compare their capabilities, weaknesses, and strengths from a big data processing perspective. Previous literature that considered these three systems was studied and used to perform the comparative study.
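As a rough illustration of how multiple-criteria decision-making could be applied on top of the five QMCI top-level factors, the following minimal sketch uses a simple weighted sum. The weighted-sum method, the candidate names, the weights, and all scores are assumptions made for illustration only; they are not the paper's method or its evaluation results.

```python
from typing import Dict, List

# The five top-level QMCI factors named in the abstract.
FACTORS = ("functionality", "usability", "reliability",
           "supportability", "performance")


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted-sum score over the QMCI factors (weights are normalized)."""
    total = sum(weights[f] for f in FACTORS)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[f] * weights[f] / total for f in FACTORS)


def rank(candidates: Dict[str, Dict[str, float]],
         weights: Dict[str, float]) -> List[str]:
    """Return candidate names ordered from highest to lowest score."""
    return sorted(candidates,
                  key=lambda name: weighted_score(candidates[name], weights),
                  reverse=True)


# Hypothetical weights: a user who cares most about functionality
# and performance for big data workloads.
weights = {"functionality": 3, "usability": 1, "reliability": 2,
           "supportability": 1, "performance": 3}

# Hypothetical placeholder scores in [0, 1]; NOT measurements of any
# real cloud infrastructure management software.
candidates = {
    "candidate_a": {"functionality": 0.8, "usability": 0.5, "reliability": 0.7,
                    "supportability": 0.6, "performance": 0.9},
    "candidate_b": {"functionality": 0.6, "usability": 0.9, "reliability": 0.8,
                    "supportability": 0.9, "performance": 0.5},
}

if __name__ == "__main__":
    for name in rank(candidates, weights):
        print(name, round(weighted_score(candidates[name], weights), 3))
```

In practice, each factor score would itself be aggregated from the sub-factor metrics, and a different decision-making method could replace the weighted sum without changing the overall structure.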