Introduction: Analyzing big data, especially streaming social network data, requires real-time, distributed systems that process streams with high speed and efficiency. In this paper, we introduce a distributed architecture, based on the Kappa architecture, for collecting, ingesting, processing, storing, and visualizing streaming social network data. The proposed architecture also includes a component for detecting anomalous data.
Method: We use the 4+1 architectural view model to visually illustrate the architectural layers, components, and their interactions.
Results: The proposed architecture is a distributed solution for processing streaming social network data. We used the 4+1 architectural view model and UML diagrams to document the proposed architecture. This documentation lays out the data processing pipeline and specifies both functional and non-functional system requirements.
Discussion: The proposed architecture is designed to process streaming social network data, leveraging distributed and parallel solutions for improved efficiency. Anomaly detection is a pivotal component integrated within the architecture to identify outlier data, enhancing processing precision and quality. By utilizing the 4+1 architectural view model and UML diagrams, the proposed architecture is effectively outlined, ensuring a well-defined structure that aids in organizing information. This structured approach provides stakeholders with tailored architectural views that cater to their individual needs and priorities. Notable functional requirements include real-time processing, while non-functional requirements encompass scalability, interoperability, portability, usability, and efficiency.
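Illustration: The abstract does not specify which anomaly detection method the architecture uses. As a minimal sketch only, the following shows how a detection stage could sit in a single streaming path, as in a Kappa-style pipeline. The `Event` type, its `value` field, the running z-score rule, and the threshold of 3.0 are all assumptions made for illustration and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class Event:
    user: str
    value: float  # hypothetical numeric feature, e.g. posts per minute


class RunningZScore:
    """Online mean/variance (Welford's method) used to flag outliers."""

    def __init__(self, threshold: float = 3.0, warmup: int = 5):
        self.threshold = threshold  # assumed cutoff, in standard deviations
        self.warmup = warmup        # events to observe before flagging
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def is_anomalous(self, x: float) -> bool:
        # Score x against the events seen so far, then fold x into the stats.
        flagged = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            flagged = std > 0 and abs(x - self.mean) / std > self.threshold
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return flagged


def process_stream(events: Iterable[Event]) -> Iterator[Tuple[Event, bool]]:
    """Single processing path: each event is seen once, in arrival order."""
    detector = RunningZScore()
    for event in events:
        yield event, detector.is_anomalous(event.value)
```

In a deployed system the input iterable would be replaced by a consumer reading from a distributed log, and the flagged events would be forwarded to the storage and visualization layers. Those parts are omitted here.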