A Survey of Fake News: definition, features, detection approaches, crisis management process and open research issues
Subject areas: Journal of Computer & Robotics
1 - ITRC
Keywords: fake news detection, fake news features, crisis management, dataset, social media
Abstract:
Today, people often use social media to follow and share news because of its fast dissemination of information, low cost, and easy access. However, since there are no specific rules or framework for publishing news on these media, the quality and accuracy of this news is lower than that of news published by traditional news sources. The spread of false information can cause irreversible damage to organizations, governments, companies and even individuals; therefore, addressing fake news has become an emerging issue, and large companies such as Google are looking for practical solutions to validate content and detect fake news. However, due to the dynamic nature of social media and the complexity and diversity of the available data, fake news detection remains challenging. This survey reviews and discusses the approaches to detecting fake news from five perspectives: (1) learning method, (2) detection method, (3) learning approach, (4) implementation model and (5) independence from language, field and platform. It also presents a state-of-the-art crisis management process for the fake news age and proposes actions for each step. The results show that, despite the many studies conducted in recent years, there is still a long way to go to reach an effective and efficient system for fake news detection, so we highlight some open issues for future research. We hope this survey can facilitate collaborative efforts among experts in computer and information sciences, social sciences, management science, and journalism to research fake news, where such efforts can lead to fake news detection that is not only efficient but, more importantly, explainable.
Journal of Computer & Robotics 17 (2), Summer and Autumn 2024, 51-76
A Survey of Fake News: Features, Detection Approaches, Crisis Management Process and Open Research Issues
Mojgan Farhoodi
IT Faculty, ICT Research Institute (ITRC), Tehran, Iran
Received 07 November 2023; Accepted 02 May 2024
1. Introduction
1.1. Background
In recent years, the rapid growth of mass media such as the Internet, social networks and smartphones has made it easy to produce diverse content and deliver it quickly to users around the world. One of the most important effects of social networks on society is their impact on the news industry [1]. In other words, a large amount of the content on the Internet and social networks is news content, which is a serious competitor to traditional news media such as newspapers, magazines, radio and television. Some news items may contain false information that was created intentionally or unintentionally. Most social media users not only read the news but also publish it, so news is easily broadcast. Thus, alongside the many benefits of new technologies in the field of media activity, this feature is a disadvantage with destructive side effects that can have adverse consequences for society [2].
Fake news was not a common term until a few years ago, but it has become one of the biggest threats today. The emergence of social media was itself a major factor in this regard. The term came into widespread use in 2016 with the US presidential election and the candidacy of Donald Trump [3], but fake news is not limited to the United States. A look at the news published on social media in any country shows the spread of lies, which are sometimes published for commercial motives or political purposes. Sometimes false information is spread in cyberspace for humor and entertainment; it is then not considered a particular threat and does not have a long-term effect. In other cases, however, false information can be purposeful and effective at the national level and can disrupt society, security, the economy and so on, so this problem should not be lightly ignored.
On the other hand, the creation and publication of fake news is not limited to the present time and may be as old as human existence itself; but the volume of fake news, the extent of its distribution and its sphere of influence today are not comparable to those of the past. In the past, the person who created fake news had to work hard to get it to the audience, and gaining the audience's trust was very costly; today, there are users who surpass the media in publishing news that is interesting to them, regardless of its accuracy. Therefore, although fake news is not a new issue, its spread and scope of influence are not comparable to previous decades, and from this point of view it appears to be a new phenomenon.
Given the above, the motivation of this survey is to raise awareness of, prevent and confront the consequences of publishing and propagating fake news in today's societies, because indifference to them may create crises that are either impossible to solve or impose exorbitant costs. In other words, engaging with the validation of cyberspace content and, as one of its important branches, with detecting and dealing with fake news, is one of the requirements of the present age.
1.2. Fake News Definition
Although the concept of fake news is very old and goes back to the antiquity of human existence, according to Webster's dictionary the term fake news has a history of 125 years [4]. Unfortunately, there is no universal definition of fake news, even in journalism, and this has to some extent led to research challenges in this area [5]. In the Cambridge Dictionary, fake news refers to false stories that appear to be news, spread on the internet or using other media, usually created to influence political views or as a joke [6]. Wikipedia considers fake news to be a form of false or misleading information presented as news and often published with the aim of damaging the reputation of a person or entity, or making money through advertising revenue [7]. Reference [5] provides a broad definition: fake news is false news, where news broadly includes articles, claims, statements, speeches, posts, among other types of information related to public figures and organizations. It can be created by journalists and non-journalists. Thus, fake news refers, on the one hand, to information that results from the inadvertent sharing of false information (misinformation) and, on the other hand, to misleading information that results from the deliberate dissemination of false information (disinformation) and is usually published as propaganda to discredit powerful and rich people [8]. The common denominator of these two types of information is that the audience is exposed to messages that have been deliberately produced, with the difference that in the case of misinformation the audience unknowingly forwards these messages, whereas in the case of disinformation the audience is considered the victim of these messages [8]. Numerous studies have presented different typologies of fake news [8, 9, 10, 11, 12, 13]. In the following, we present one that is accepted by many experts in the field of communication sciences:
- Fabricated News: these are completely fictional stories that have nothing to do with reality and are often written in the style of news articles. They aim to provide false information and are often published on blogs or social networks.
- Propaganda: fake content that aims to harm the interests of a particular party. It usually has a political background and is easily and quickly disseminated through social media. Its purpose is to deceive the minds of the people and attract them with lies.
- Conspiracy Theories: fictional content that tries to explain a situation or event by citing a conspiracy without proof. It is about illegal acts committed by governments or powerful individuals. This news targets the user's misconceptions and beliefs.
- Hoaxes: events that are either wrong or inaccurate but are presented as legitimate facts, often published by websites. Their purpose is to deceive the audience into believing a falsehood to be true.
- Click-bait: deliberate use of misleading news headlines to attract the user's attention. Its purpose is to tempt and deceive the user into clicking on a link to read an article.
- Photo Manipulation: manipulation of real images or videos to create a false narrative, with the purpose of misleading. With the advent of digital photography and powerful image manipulation software, this type of news has become more common.
- News Satire: content that typically makes fun of news programs and uses humor to engage its audience. It is broadcast both on television and on websites.
- Rumors: content whose accuracy is either not known or will never be proven. It is usually passed from person to person without any evidence, so it is changed according to the desires of the people and then passed on to another person.
According to the above, all types of fake news have three common features: 1) they are not true, 2) they pretend to be real news, and 3) they can easily spread on the web and social media. Table 1 shows the characteristics of the different types of fake news.
Table 1
Characteristics of various types of fake news
Type of News | Unreal | Fictitious | Based on reality | Published as real news | Published on the web and social media | Intention to mislead | Fake disclosure |
Fabricated News | ✓ | ✓ |  | ✓ | ✓ | ✓ |  |
Propaganda | ✓ |  | ✓ | ✓ | ✓ | ✓ |  |
Conspiracy Theories | ✓ | ✓ |  | ✓ | ✓ | ✓ |  |
Hoaxes | ✓ | ✓ |  | ✓ | ✓ | ✓ |  |
Click-bait | ✓ |  | ✓ | ✓ | ✓ | ✓ |  |
Photo Manipulation | ✓ |  | ✓ | ✓ | ✓ | ✓ |  |
News Satire | ✓ |  | ✓ | ✓ | ✓ |  | ✓ |
Rumor | ✓ |  | ✓ | ✓ | ✓ | ✓ |  |
1.3. Contribution of this Paper
Recently, some survey papers have examined the topic of online fake news and its related issues. Table 2 compares some existing surveys on fake news classification. Although these surveys are informative and examine the identification of fake news from various aspects, none of them offers a comprehensive classification that covers the wide and current range of research conducted in this field.
Table 2
Comparison of existing surveys with the proposed survey
Related Surveys | Key Contributions | Limitations and open issues |
[14] | Surveyed various state-of-the-art approaches for detecting fake news | Lacks a proper survey of future challenges |
[2] | Discussed the ways to define fake news and summarized fundamental theories across disciplines; also presented different fake news detection methods | Focused only on fake news detection methods from four perspectives: knowledge, style, propagation and source |
[15] | Focused on the challenges of automatic fake news detection and provided the first comprehensive review of Natural Language Processing solutions for related challenges | Focuses only on NLP approaches |
[16] | Presented a comprehensive set of features that can be used for online fake news identification; also summarized both practical-based and research-based approaches for online fake news detection | Lacks analysis of the advantages and disadvantages of past studies |
[17] | Reviewed and analyzed different fake news detection techniques | Issues and future scope of fake news detection are absent; analyzed only research between 2017 and 2021 |
[18] | Analyzed studies focusing on machine learning techniques for fake news detection | Issues and future challenges were not discussed |
[19] | Systematically reviewed and summarized the current status of deep learning techniques for fake news detection | Focuses only on deep learning approaches |
Therefore, in this survey, we provide an up-to-date and comprehensive classification of research-based approaches to detecting fake news online, so that researchers can find useful knowledge in our work. We also propose a state-of-the-art crisis management process for the age of fake news, together with the necessary measures in each step. This process can help the responsible organizations and institutions to prevent the creation and publication of fake news as much as possible and to reduce its effects and consequences if it occurs.
Fig. 2: Time distribution of the papers remaining after filtering by the selection criteria
1.4. Survey Structure
The rest of this paper is organized as follows. Section 2 presents the research method used in this study. Section 3 describes the characteristics and features of fake news. Section 4 explains the latest research-based approaches to online fake news detection. Section 5 introduces the available fake news datasets and compares their features. Section 6 analyzes the examined studies. Section 7 presents the process of crisis management in the fake news era and the necessary actions for each step of the process. Section 8 presents the open research issues. Finally, Section 9 recaps the conclusions and discusses future research directions.
2. Research Method
2.1. Research Objective
The purpose of this paper is to categorize the approaches used to identify fake news. To do this, a systematic literature review was conducted. In this section, the search terms used, selection criteria and source selection are presented.
2.2. Search Terms
To find the relevant articles, we used the following search terms:
(“what is fake news” OR “fake news” OR “types of fake news” OR “fake news delimitation” OR “fake news features” OR “fake news characteristics”)
AND (“fake news detection” OR “rumor detection” OR “approaches to identify fake news” OR “Automated detection of fake news” OR “supervised/unsupervised ways to detect fake news”)
2.3. Selection Criteria
Inclusion criteria. Studies that met all of the following criteria were included: (1) published between 2010 and 2023; (2) focused mainly on fake news on social media or digital platforms; (3) written in English; (4) published in information technology journals or other technology-related journals, or in conference proceedings.
Exclusion criteria. Studies that met any of the following criteria were excluded: (1) research not presented as a journal article (e.g. in the form of a slide show or overhead presentation); (2) published studies not related to technology or IT.
2.4. Timeline of used Papers
After filtering the retrieved articles using the selection criteria, 133 articles remained. The distribution of these articles by year is shown in Fig. 2.
3. Characteristics of Fake News
As fake news detection has become an emerging issue, more large technology companies such as Google are seeking solutions for recognizing fake information online. However, accurate fake news detection is still challenging, due to the dynamic nature of social media and the complexity and diversity of online communication data. Therefore, to develop an effective and efficient detection system, it is important to identify the characteristics and features of fake news. The most important characteristics of fake news are as follows [16]:
- Volume: Fake news is easily written on the Internet without the need for any confirmation procedure and is distributed through the Internet, even without users' awareness, so its volume is very high.
- Veracity: As mentioned before, there are several types of fake news, such as rumors, satire news and conspiracy theories, which affect every aspect of people's lives. With the increasing popularity of social media, fake news can dominate the public's opinions, interests and decisions. In addition, fake news changes the way that people interact with real news.
- Velocity: Fake news usually focuses on hot topics and is published very quickly in a short time and therefore does not have a long life [20].
Fig. 4 describes each stage of the fake news life cycle [21]. As shown in this figure, the first step is creation, in which fake news content is created by one or more authors for specific purposes, inside or outside social media. Each news item includes different sections such as the headline, the body and, if necessary, an image. After the fake news is created, it is injected into social media by one or more publishers, each with a specific identity that can be defined through features such as friends, followers and history of activities.
In the next step, each news article enters a phase that depends entirely on the behavior of the recipients, who may share, comment on or like the news, or leave it without any action. Finally, the authenticity of the news can be verified using existing evidence, and its falsity can therefore be detected. Based on the different stages of this cycle, it is possible to examine who the news sources are, what their purpose is in creating false information online, what writing skills are more likely to be used in fake news, how fake news is distributed via the Internet or social media, and how it can affect online readers [16]. Based on the above, in Fig. 5, we present the features of fake news from different aspects:
Fig. 3: Characteristic of fake news
3.1. Fake News Creator/Publisher
It is important to determine who is behind the fake news and why it is created and shared through social media. The creator/publisher of fake news can be either a real human being or a non-human.
Fig. 4: Fake news life cycle [21]
Fig. 5: Fake news features
- Non-humans: Social bots and cyborgs are the most common non-human fake news creators. Social bots are computer algorithms that behave similarly to humans, automatically generating content and interacting with humans on social media [22]. Cyborg refers to either bot-assisted humans or human-assisted bots. Once registered by a human, a cyborg can tweet and interact with the social community.
- Real humans: Real humans are crucial sources of fake news diffusion. In fact, social bots and cyborgs are only the carriers of fake news on social media; these automated accounts are programmed by humans to spread false messages [16]. Regardless of whether fake news is spread manually or automatically, real humans who aim to disrupt the credibility of society are behind it. Fake content is often produced intentionally by malicious users, but some legitimate users also participate in distributing fake news without any malice. Because individuals on the Internet can remain anonymous, users are not held responsible for the content they post, share or comment on. This is problematic, since unidentified messages may undergo far-reaching dissemination and may have material impacts on the Internet.
3.2. News Content
Each news item usually includes physical and non-physical content [16]:
- Physical content: as shown in Fig. 5, the physical content of the news comprises the headline, the main body, and other items such as images or videos. Other items in the news, such as a URI, a hashtag or an emoji, are also considered physical content. In other words, everything that is explicitly seen in the text of a news item is considered its physical content.
- Non-physical content: this content is the core of fake news. Unlike the physical content, which carries the news format, it includes the opinions, emotions, attitudes, and feelings that news creators want to express. Sentiment polarity is another important non-physical feature of fake news: fake news creators often express strong positive or negative feelings in the text in order to make their news persuasive. Fake news may also target certain areas and fields such as finance, society, politics or information technology. In general, everything that is implicitly derived from the news is considered non-physical content.
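As a minimal sketch of the distinction above, the following Python snippet counts a few physical elements (words, URLs, hashtags) and computes a toy sentiment polarity score as one non-physical feature. The function names, the regular expressions and the tiny word lists are illustrative assumptions, not part of the surveyed work; a practical system would rely on a full sentiment lexicon or a trained model.

```python
import re

# Hypothetical toy lexicon, used only for illustration.
POSITIVE = {"great", "amazing", "miracle", "best"}
NEGATIVE = {"terrible", "disaster", "shocking", "worst"}

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
WORD_RE = re.compile(r"[a-z']+")


def physical_features(headline, body):
    """Count elements that are explicitly visible in the news text."""
    text = f"{headline} {body}"
    return {
        "headline_words": len(headline.split()),
        "body_words": len(body.split()),
        "urls": len(URL_RE.findall(text)),
        "hashtags": len(HASHTAG_RE.findall(text)),
    }


def sentiment_polarity(text):
    """Return (positive - negative) / (positive + negative), in [-1, 1].

    Returns 0.0 when no lexicon word occurs in the text.
    """
    words = WORD_RE.findall(text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total
```

A score close to -1 or +1 indicates the strongly polarized wording that the text associates with fake news, although polarity alone is of course not evidence of falsity.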
3.3. Social Context
Social context refers to the whole environment in which news is published. It includes how social data is distributed, the temporal pattern of news dissemination on social media, how users interact with each other, and what stance users take toward the news.
- Propagation Model: This model shows how the news was spread. It is usually represented as a tree in which the root is the first user to share the news and the other nodes are the users who received the news and, unless they are leaves of the tree, sent it on to others. The edges of the tree indicate the sequence in which the news was sent (Fig. 6).
Based on criteria derived from the analysis of such a tree, it is possible to estimate the prevalence of a news item and to suspect that it is fake, because one of the characteristics of fake news is its high prevalence [23]. Of course, in the early stage after publication, when no propagation model is available yet, content-based and user-based features are more informative; but once the news has spread among users, its accuracy can be estimated by extracting its propagation model [24, 25].
- User Network: By analyzing the network, the behavior patterns of malicious users and ordinary users can be obtained. For example, a typical user behavior feature for detecting anomalous information online is the user anomaly score, which is computed as the number of the user's interactions in a time window divided by the user's monthly average [16]. User stance can also be considered one of the features obtained from analyzing the user network [26, 27]. A user's stance indicates whether the user supports or denies the news, requests more information, or comments without regard to its accuracy [28]. Stance detection is the first and most important step in detecting fake news [29], and it is still in the early stages of research.
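The two social-context features described above can be sketched in a few lines of Python. The first function builds the propagation tree from a list of (sender, receiver) share events and reports its size, depth and maximum breadth, which are common proxies for prevalence; the second computes the user anomaly score as the ratio stated in the text. The function names and the input format are assumptions made for illustration, and the code assumes the shares form a tree (each user receives the news once).

```python
from collections import defaultdict


def propagation_metrics(root, shares):
    """Compute size, depth and maximum breadth of a propagation tree.

    `root` is the first user to share the news; `shares` is a list of
    (sender, receiver) edges. Assumes the edges form a tree (no cycles).
    """
    children = defaultdict(list)
    for sender, receiver in shares:
        children[sender].append(receiver)

    size, depth, max_breadth = 0, 0, 0
    level = [root]
    while level:  # breadth-first traversal, one tree level at a time
        size += len(level)
        max_breadth = max(max_breadth, len(level))
        next_level = [c for user in level for c in children.get(user, [])]
        if next_level:
            depth += 1
        level = next_level
    return {"size": size, "depth": depth, "max_breadth": max_breadth}


def anomaly_score(interactions_in_window, monthly_average):
    """Interactions in a time window divided by the user's monthly average."""
    if monthly_average <= 0:
        raise ValueError("monthly_average must be positive")
    return interactions_in_window / monthly_average
```

For example, a user with 30 interactions in the window and a monthly average of 10 gets a score of 3.0; what threshold counts as anomalous is a design choice that the text does not specify.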
4. Existing Research-Based Approaches to Detect Fake News
Fig. 6: Propagation model of a news in social media
In general, detecting fake news is a very complex task, and without supporting information it is almost impossible to determine the accuracy of a news item. In recent years, many studies have focused on the automatic detection of fake news. To do this, they collect and analyze various data such as news content, the propagation model and user behavior. In this section, we categorize the existing research approaches along different dimensions. To this end, we reviewed studies that examined the problem of detecting fake news from different aspects and presented various categories [30, 31, 16, 32, 33, 34, 35, 36]. In this paper, we propose a comprehensive classification of the existing research-based approaches to detect fake news (Fig. 7).
4.1. Implementation-Based Category
In terms of how the system is executed, fake news detection can be divided into two categories: real-time detection and offline detection [16]. Offline detection systems are important for classifying online fake news because they can analyze anomalous information in a descriptive manner, for example by selecting the most effective features for distinguishing false information among large amounts of social messages. However, offline systems are limited: the datasets they use may not reflect the important features of fake news, and learning models trained offline may not transfer to other circumstances.
Fig. 7: The existing research-based approaches to detect fake news
Real-time detection systems, by contrast, are powerful tools for capturing the dynamic nature of online information and counteracting fake news. They use various real-time analysis techniques to determine whether news content is fake or not. Real-time detection has to rely only on the source or content of the news, because online communication data is time-sensitive, continuous, and heterogeneous. Moreover, new events often contain new and unexpected knowledge that does not exist in the system's prior knowledge or is very difficult to infer. Also, features that reflected the style of fake news in the past may not be usable in the future or in other fields, and may therefore reduce system performance. To tackle this problem, we can use knowledge graphs that are updated dynamically [5], or features that are independent of field, subject or language and can represent the style of fake news [37]. Another way is to extract minimal information (such as the news headline and part of the content) in such a way that it can still increase the efficiency of the fake news detection system [38, 39].
4.2. Detection Method-Based Category
From this perspective, the methods of fake news detection can be divided into two categories: knowledge-based and feature-based methods.
4.2.1. Knowledge-based methods: These methods usually rely on fact-checking, which is carried out in one of the following ways [5, 32, 40]:
- Human-oriented fact-checking: These approaches are human-centric: a person or group analyses the facts behind a piece of information. They were first used by journalists. This approach can be further divided into expert-oriented fact-checking, where the fact checker is a domain expert (e.g., factchecker, PolitiFact), and crowdsourcing-based fact-checking, where the fact checkers are ordinary people in the crowd. Expert-based fact-checking is often performed by a small group of human experts; it is easy to manage and leads to very accurate results, but it becomes costly and time-consuming as the volume of news increases. Crowdsourcing-based fact-checking relies on a large population of ordinary people who check the accuracy of the news; in other words, it depends on collective intelligence. Compared to the expert-based method, this method is relatively scalable, but its results are relatively difficult to aggregate because of conflicting opinions among individuals. Therefore, this method requires two tasks: (1) filtering out or removing unreliable users and (2) resolving conflicting results. Both requirements become more vital as the number of reviewers increases.
- Computational fact-checking: As the volume of news increases, fact-checking by humans becomes very difficult and time-consuming. The main purpose of computational fact-checking is to identify claims automatically and assign them a truth value. To assign a truth value to a claim made in the news, it is necessary to automatically extract all relevant facts from the available primary and secondary sources and from all other external sources, including the open web and structured knowledge graphs [32]. The claims in the news are then compared with the facts in the knowledge graph and their accuracy is measured.
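The two tasks required by crowdsourcing-based fact-checking, filtering unreliable users and resolving conflicting results, can be illustrated with a reliability-weighted vote. This is only a sketch under stated assumptions: the function name, the `reliability` scores (assumed to be known in advance, for instance from control questions) and the `min_reliability` threshold are hypothetical and are not taken from the surveyed papers.

```python
def aggregate_votes(votes, reliability, min_reliability=0.5):
    """Aggregate crowd judgements on whether a news item is true.

    votes:        {worker_id: True/False} judgement of each worker
    reliability:  {worker_id: score in [0, 1]}, assumed to be given
    Returns True, False, or None when the remaining votes are tied.
    """
    weight_true = 0.0
    weight_false = 0.0
    for worker, says_true in votes.items():
        score = reliability.get(worker, 0.0)
        if score < min_reliability:
            continue  # task (1): drop unreliable or unknown workers
        if says_true:
            weight_true += score
        else:
            weight_false += score
    if weight_true == weight_false:
        return None  # conflict could not be resolved
    return weight_true > weight_false  # task (2): weighted majority
```

With votes `{"a": True, "b": False, "c": False, "d": True}` and reliabilities `{"a": 0.9, "b": 0.6, "c": 0.2, "d": 0.1}`, a plain majority would be tied, whereas the weighted vote over the two remaining workers returns `True`.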
In general, automating the fact-checking process can help experts to detect fake news. The process includes the following steps [41]:
- Finding Claims Worth Fact-Checking: As fact-checkers are flooded with claims, they need to decide what is actually worth fact-checking. AI solutions are used for this purpose; the task features in shared conference tasks such as the CLEF CheckThat! lab [41] and is also applied on fact-checking sites such as Full Fact [42]. Systems such as ClaimBuster [43] and ClaimRank [44], which were developed for this purpose, use AI solutions. Later, CLEF CheckThat! systems were introduced that used deep learning methods and pre-trained transformers.
- Evidence Retrieval: The purpose of this step is to find external evidence that helps fact checkers make better decisions on the factuality of an input claim.
- Claim Verification: Automatic claim verification approaches can be divided into explainable and non-explainable. Explainable approaches are more relevant to assisting human fact checkers; they verify the input claim against a trusted source. Non-explainable approaches make a prediction based on the content of documents retrieved from the Web or social media, by modeling the message, its propagation, the users and so on.
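To make the evidence retrieval step more concrete, the sketch below ranks candidate documents by their token overlap (Jaccard similarity) with the input claim and returns the best matches. This is a deliberately simple stand-in chosen for illustration; the systems cited above use far stronger retrieval and ranking models, and the function names and the `top_k` parameter are assumptions.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_evidence(claim, documents, top_k=2):
    """Return up to `top_k` documents most similar to the claim.

    Similarity is the Jaccard index of the token sets; documents that
    share no token with the claim are never returned.
    """
    claim_tokens = tokens(claim)
    scored = []
    for doc in documents:
        doc_tokens = tokens(doc)
        union = claim_tokens | doc_tokens
        score = len(claim_tokens & doc_tokens) / len(union) if union else 0.0
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]
```

The retrieved documents would then be passed to the claim verification step, which is outside the scope of this sketch.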
At present, there is a very small number of fact checking organizations in the world so automating any parts of the fact checking process could cut down the time it takes to respond to a claim [45]. For example, by creating and updating a knowledge graph based on credible news sources, the new claims are compared to the facts of that knowledge base, and then it is determined whether the news is fake or not. Numerous researches use techniques based on natural language processing (NLP), knowledge representation, and knowledge graph to automatically predict the accuracy of claims [45, 46; 47, 41].
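As a minimal sketch of the knowledge-graph comparison described above, the following Python fragment stores facts as (subject, predicate, object) triples and assigns a truth value to a claim. The triple format, the toy fact, and the names `KnowledgeGraph`, `add_fact` and `verify` are illustrative assumptions and are not taken from any of the cited systems. The sketch also assumes that each subject and predicate pair has a single correct object, which does not hold for real knowledge graphs in general.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class KnowledgeGraph:
    """Toy knowledge graph built from facts taken from credible sources."""

    def __init__(self) -> None:
        self._facts: Set[Triple] = set()

    def add_fact(self, subject: str, predicate: str, obj: str) -> None:
        # Normalise case so that lookups are case-insensitive.
        self._facts.add((subject.lower(), predicate.lower(), obj.lower()))

    def verify(self, subject: str, predicate: str, obj: str) -> Optional[bool]:
        """Return True if the claim matches a stored fact, False if the graph
        stores a different object for the same subject and predicate, and
        None if the graph holds no relevant fact (the claim cannot be checked)."""
        claim = (subject.lower(), predicate.lower(), obj.lower())
        if claim in self._facts:
            return True
        for s, p, _ in self._facts:
            if (s, p) == claim[:2]:
                return False
        return None


kg = KnowledgeGraph()
kg.add_fact("Paris", "capital_of", "France")

print(kg.verify("Paris", "capital_of", "France"))   # True
print(kg.verify("Paris", "capital_of", "Germany"))  # False
print(kg.verify("Rome", "capital_of", "Italy"))     # None
```

Returning `None` for claims the graph cannot check keeps "unverifiable" separate from "false", which matters because a knowledge graph is usually incomplete.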
4.2.2. Feature-Based Methods:
These methods use different features, which are shown in Fig. 5. This feature set contains three different subcategories as follows:
4.2.2.1. Creator-Based Features
Many researchers believe that the best approach to detecting fake news is to focus not on the claims themselves but on the news sources or their creators [16]. Unfortunately, online users lack the clues needed to assess the credibility of social information. Meanwhile, malicious social media accounts intend to manipulate people's decisions and pollute truthful news content by purposely spreading misinformation, so creator/user analysis is a critical aspect of fake news detection. It involves the following features [48]:
- Account profile features: Basic user profile information includes the language used by the user, his/her geographic location, the account creation time, the number of posts/tweets sent by the account, and so on [16]. Analysis of this information shows how active or suspicious a social account is. For example, [49] extracted users' news-sharing behavior and then grouped users by explicit and implicit features of their profiles; analyzing these features makes it possible to measure the probability that each group shares fake news. It is also sometimes necessary to check whether the news comes from a popular domain or an unknown one. A malicious website can even be identified just by checking its URL: if it has an unusual domain name (such as com.co) or contains unusual tokens, it can be suspected of being fake. Sometimes even the "About Us" section contains information that can be used as a credibility indicator [16].
- Temporal-based features: Using time information such as the average time between two consecutive posts, the number of replies, shares, etc., bots and cyborgs can be distinguished to some extent from human users, because social bots and cyborgs are usually more active over a period of time, whereas human users have complex temporal behaviors. Some studies have used this information to detect fake content; for example, [50] used time pattern analysis as well as other user information to detect fake images posted during Hurricane Sandy.
- Account credibility feature: The number of friends and followers of a user can also be a good feature for differentiating between malicious and legitimate accounts. The number of followers of a normal user is often close to the number of his/her friends, but social bots have many more friends than followers. The following equation is usually used for this purpose; its value is close to 1 for celebrities and close to 0 for social bots [16]:
Account Reputation = followers / (followers + friends)
- Sentiment features: These features capture the emotions, attitudes, and opinions conveyed on online social media, and they are among the key attributes for identifying suspicious user accounts. Malicious accounts often exaggerate content and mislead users by arousing their emotions. [30] examined the various uses of sentiment analysis in fake news detection.
It should be noted that finding clusters of malicious users who create forums to spread false news is a basic idea for detecting and preventing the spread of fake news on social networks; this is the focus of [51].
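The account reputation ratio given above, followers / (followers + friends), can be computed directly. The sketch below is illustrative only: the function name `account_reputation`, the example counts, and the choice to return 0.0 for an account with no followers and no friends are assumptions, since the source does not define that edge case.

```python
def account_reputation(followers: int, friends: int) -> float:
    """Return followers / (followers + friends).

    Assumption: an account with no followers and no friends gets 0.0,
    because the ratio is undefined in that case.
    """
    if followers < 0 or friends < 0:
        raise ValueError("counts must be non-negative")
    total = followers + friends
    return followers / total if total else 0.0


# Hypothetical accounts, used only to show the two extremes of the ratio.
print(account_reputation(1_000_000, 200))  # celebrity-like account, close to 1
print(account_reputation(15, 4_800))       # bot-like account, close to 0
print(account_reputation(300, 300))        # balanced account, 0.5
```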
4.2.2.2. Content-Based Features
- Textual modality: The most important and common of these features are [32]:
· Linguistic features: Linguistic features are traditionally divided into lexical, syntactic and semantic features.
- Lexical features: include the average length of news items, word statistics, word-level fake news patterns, the pronouns used, positive and negative emotional words extracted from the text, and so on [52].
- Syntactic features: include part-of-speech (POS) tagging, the average sentence length, the frequency of punctuation (such as question marks, parentheses, and commas), the average polarity of the sentence (positive, neutral or negative), and so on. Research has shown that these features have a great impact on fake news detection. For example, fake news creators often use punctuation, adverbs and verbs more than real news creators do [53]. Some studies have also noted that fake news creators, unlike real news creators, do not necessarily follow grammatical rules.
- Semantic features: include the polarity of the sentence (positive, neutral or negative), readability, the thematic features of the messages extracted from the text, and so on [52].
Many studies have extracted textual features using NLP techniques for fake news detection [1, 54, 55, 56, 57, 58, 59, 60, 61].
· Psycho-linguistic features: Psycho-linguistics, or the psychology of language, is used to extract the psychological characteristics of language from text data, including embedded emotions, self-references, cognitive complexity, etc. Fake news usually uses fewer self-references, more negative emotion words and fewer markers of cognitive complexity [62]. Other features of this category are affective information (positive and negative emotion), exclusive words (but, without), motion words (walk, move, go), social processes (talk, us, friends), cognitive processes (cause, know), etc. [32]. LIWC is one of the most popular software tools for examining the psycho-linguistic features of a text [63]; it is based on a large dictionary of word categories that represent linguistic and psycho-linguistic features.
· Stylometric features: Stylometry is the statistical analysis of variations in literary style between one writer or genre and another, and it has been used in numerous studies to detect whether a news item is fake [64, 65, 3]. The assumption behind style-based methods is that malicious entities prefer to write fake news in a “special” style that encourages others to read it and convinces them to trust it. News style analysis is a content-based method that uses text writing features and machine learning methods [5]. When people hide their writing styles, some linguistic features change, and if we identify these features, we can most likely detect whether a news item is fake [66]. Some of these features are letter-related features (such as the number of letters, the percentage of digits, the number of common n-grams, and the number of special letters and punctuation marks), word-related features (such as the total number of words, the average number of letters per word, and the number of long words), function words, POS tags, etc. [66] also mentions two other categories of features: text-related features (such as quantitative features and lexical and grammatical complexity) and author-related features (such as the number of unique words the author uses, the number of sentences, and the average sentence length).
· Statistical or empirical features: These features are very helpful for characterizing and understanding the hidden patterns in fake or real news when fake or real article detection is treated as a traditional supervised or unsupervised learning problem. The most important of these features are bag of words and word embeddings.
- Multimedia features: Images are more important than text because they attract more attention, being vivid and easily comprehensible. There are many images on social networks, so fake information can be transmitted to many users by manipulating them. The more the content is visualized, the less likely it is to be criticized and the more likely it is to be perceived as credible [67]. Unfortunately, few studies use image-based features to validate a news article, although they often try to use the textual features around images. [68] stated that fake news includes images and video in addition to text, but existing research focuses on only one of these; to achieve high accuracy, it is better to consider all textual and non-textual content of the news. [69] also stated that the textual features of tweets, which contain short content, are not enough to determine with good accuracy whether they are real or fake, and therefore additional information such as multimedia features should be considered. [70] also presents a survey of deep learning methods for multimodal fake news detection on social media.
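Several of the textual features listed in this subsection (word statistics, average sentence length, punctuation frequency, number of long words) reduce to simple counts over the text. The following sketch shows one way to compute a few of them. The function name `text_features`, the feature names, the seven-letter threshold for a "long" word, the ASCII-only tokenisation (which suits English text only) and the sample sentence are all assumptions made for illustration, not definitions from the cited studies.

```python
import re
import string
from typing import Dict

LONG_WORD_MIN_LEN = 7  # assumption: words with 7 or more letters count as "long"


def text_features(text: str) -> Dict[str, float]:
    """Compute a few lexical, syntactic and stylometric counts for a text."""
    # Naive sentence split on ., ! and ?; empty pieces are discarded.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word tokenisation: runs of ASCII letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    n_words = len(words)
    n_chars = len(text)
    punct = sum(ch in string.punctuation for ch in text)
    digits = sum(ch.isdigit() for ch in text)
    return {
        "num_words": float(n_words),
        "avg_word_length": sum(map(len, words)) / n_words if n_words else 0.0,
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        "long_words": float(sum(len(w) >= LONG_WORD_MIN_LEN for w in words)),
        "punctuation_ratio": punct / n_chars if n_chars else 0.0,
        "digit_ratio": digits / n_chars if n_chars else 0.0,
        "question_marks": float(text.count("?")),
        "exclamation_marks": float(text.count("!")),
    }


sample = "SHOCKING!!! Doctors hate this trick. Is it real?"
features = text_features(sample)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

A feature dictionary of this kind would typically be fed to a classifier together with other feature groups; the counts alone do not decide whether a text is fake.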
4.2.2.3. Context-based features
It is difficult to distinguish fake from real news using content-based feature analysis alone [71]. On the other hand, context-based methods are less effective than content-based methods at finding fake news early, because fake news must circulate on the network for some time before a model of its dissemination can be formed. According to various studies, fake news in the political field usually spreads faster and more widely than fake news in other fields such as business, science, and entertainment [23]. Many studies use context-based features for fake news detection [38, 72, 73, 74]. These features can be divided into several types:
- Stance: is usually considered a subset of sentiment analysis and aims to identify the stance of an author or a post towards a target (which can be an idea, event, claim, topic, or even a news item) [75]; it can therefore be very helpful in fake news detection.
- Propagation features: include the news distribution model, the time of publication, and so on. There are many open questions about the propagation of fake news, for example: How can the propagation model of the news be described? Is there a difference between the propagation models of fake and real news? Does fake news propagate differently across fields (such as political, economic, and cultural), topics (such as natural disasters, presidential elections and health), platforms (such as Twitter) or languages (such as English, Chinese, and Russian)?
- Temporal-based features: These features describe the behavior of the sender of the news as a time series. They are good attributes for detecting suspicious posting activity and can be used to indicate how likely online news is to be false. Some of these features are the interval between two posts; the frequency of posting, replying and commenting for a given account; the time of day when the original information is posted, shared or commented on; and the day of the week on which the post is published [16].
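The temporal features listed above (interval between posts, posting frequency) can be derived from an account's post timestamps. The sketch below is a minimal illustration; the function name `temporal_features`, the feature names, and the sample timestamps are assumptions. Reading a very low spread of intervals as a possible sign of automated posting is likewise only an illustrative assumption, not a rule stated by the cited studies.

```python
from datetime import datetime
from statistics import mean, pstdev
from typing import Dict, List


def temporal_features(timestamps: List[datetime]) -> Dict[str, float]:
    """Summarise the posting rhythm of one account from its post times."""
    ordered = sorted(timestamps)
    # Gaps in seconds between consecutive posts.
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return {"mean_gap_s": 0.0, "std_gap_s": 0.0, "posts_per_hour": 0.0}
    span_hours = (ordered[-1] - ordered[0]).total_seconds() / 3600
    return {
        "mean_gap_s": mean(gaps),
        "std_gap_s": pstdev(gaps),  # spread of the gaps around their mean
        "posts_per_hour": len(ordered) / span_hours if span_hours else 0.0,
    }


# Hypothetical account that posts exactly every ten minutes.
posts = [datetime(2023, 1, 1, 12, minute) for minute in (0, 10, 20, 30)]
stats = temporal_features(posts)
print(stats)
```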
Numerous studies have used a combination of these methods and achieved good results. For example, [76] used various content, user and context features to identify Persian rumors on Twitter and achieved relatively high accuracy. [77] also used a combination of news content, publisher and context features. It stated that although early detection of fake news is very important for preventing its further spread, the accuracy of fake news detection increases the longer the news has been published, because more user interactions with that news become available. [78] used a transformer-based approach based on news content and social contexts.
4.3. Learning Approach-Based Category
Studies in the field of fake news detection can be divided into two parts. Until 2013, most research focused on content related to debates and online forums, but in recent years, the focus of research has shifted to posts and tweets on social media [28]. The following are the main approaches to implementing the fake news detection system [79]:
4.3.1. Feature-based machine learning approaches
Most existing studies, especially before the rise of deep learning, used this approach to detect fake news [69, 76, 80]. These approaches apply machine learning algorithms to manually extracted features. The features used in this type of approach are [16]:
§ Word-level features: bag-of-words, n-grams, term frequency (TF), and term frequency-inverse document frequency (TF-IDF) are the most commonly used linguistic features in natural language processing. The presence of special and suspicious tokens such as exclamation marks, question marks, user mentions, hashtags, smiley emoticons and so on can also be used to identify fake content. Similarly, the presence of stylistic words such as stop-words, punctuation, quotes, negations (no, never, not, etc.), informal/swear words, interrogatives (how, when, what, why), nouns, personal pronouns, possessive pronouns, determiners, cardinal numbers, adverbs, verbs, quantifying words, comparison words and so on can be used for online fake news detection.
§ Sentence-level features: refer to attributes measured at the sentence scale; they include part-of-speech (POS) tagging, the average sentence length, the frequency of punctuation, the function words in a sentence, the average polarity of the sentence (positive, neutral or negative), the sentence complexity, and so on.
§ Content-level features: refer to the raw information of the news content and include the news topic (politics, finance, technology, etc.), the certainty of the news, the number of special tags or symbols in the whole news item, and so on.
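To make the word-level features above concrete, the following sketch computes TF-IDF weights for a tiny corpus using only the Python standard library. It uses the plain, unsmoothed variant tf(t, d) × log(N / df(t)); other variants with smoothing or normalisation are common, and the choice here is an assumption made for brevity. The function names and the two example headlines are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lower-case whitespace tokenisation (a deliberate simplification)."""
    return text.lower().split()


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} dictionary per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # bag-of-words counts for this document
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


docs = ["shocking cure found", "cure approved by agency"]
vectors = tf_idf(docs)
print(vectors[0])
```

In this example "cure" occurs in both documents, so its weight is zero, while "shocking" occurs in only one and receives a positive weight; this is how TF-IDF emphasises terms that discriminate between documents.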
4.3.2. Deep learning Approaches
These approaches often use deep neural networks such as recurrent neural networks (RNNs). In particular, results show that long short-term memory (LSTM) networks, which are a type of RNN, perform well [81]. Some of the common features used in these approaches are word representations (Word2Vec, GloVe [82]), phrase embeddings and word/letter n-grams. For example, [83] learned discriminative features from tweet content by following its non-sequential propagation structure, generated more powerful representations for identifying different types of rumors, and showed that recursive neural network models outperform previous approaches and can identify rumors with relatively good accuracy in the early stages after publication. [84] examined different machine learning approaches to fake news detection and showed that their limitations can be partially overcome by using deep learning. [33] also acknowledged that deep learning techniques can improve fake news detection systems. Wang et al. proposed a hybrid convolutional neural network model that performs better than other machine learning models [85]. Rashkin et al. conducted an extensive analysis of language-based features and reported promising results with LSTM [86]. [87] used a hybrid CNN-BiLSTM-AM model for COVID-19 fake news detection. [88] proposed a model for fake news detection in Dravidian languages using transfer learning with adaptive fine-tuning.
4.3.3. Learning approaches based on advanced language models or pre-trained models
These approaches have been used recently in many studies [89, 90, 91, 92] and often use pre-trained models such as BERT (Bidirectional Encoder Representations from Transformers), RoBERTa (Robustly optimized BERT approach), DistilBERT, ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately) and ELMo (Embeddings from Language Models). BERT is a transformer-based technique for natural language processing (NLP) pre-training, developed by Google and designed to learn word representations from unlabeled text [93]. RoBERTa was first proposed in [94] and is an optimized version of BERT. DistilBERT is a smaller, faster, cheaper and lighter version of the original BERT; it has 40% fewer parameters and is 60% faster [95]. ELECTRA is a method for self-supervised language representation learning that can be used to pre-train transformer networks with relatively little compute [96]. ELMo is a deep contextualized word representation that represents words as vectors or embeddings [97]. [89] identified fake news by analyzing the relationship between the title and the body of the news, and claimed that BERT improved the F-score compared with previous advanced models. [91] also presented a hybrid architecture connecting BERT with an RNN to create models for detecting fake news.
4.3.4. Ensemble Learning Approaches
These approaches combine more than one classifier to arrive at a final output; the simplest combination scheme is majority voting. The random forest algorithm, which combines several decision trees to cover the whole data set, is one of the most common algorithms of this kind. For example, [98] detected rumors on Twitter using different methods and stated that the random forest algorithm was the most effective of them. [99] presented a rumor detection system focused on a specific topic, namely health-related rumors on Twitter. To this end, it constructed a new subset of features, including influence potential and network characteristics, and achieved relatively good accuracy with the random forest algorithm.
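Majority voting, the simplest combination scheme mentioned above, can be sketched as follows. The three rule-based "classifiers" are hypothetical stand-ins chosen only to keep the example self-contained; in practice each voter would be a trained model such as a decision tree.

```python
from collections import Counter
from typing import Callable, Sequence

Classifier = Callable[[str], str]  # maps a text to the label "fake" or "real"


# Hypothetical toy classifiers; real ensembles would use trained models.
def by_exclamations(text: str) -> str:
    return "fake" if text.count("!") >= 2 else "real"


def by_capitals(text: str) -> str:
    shouted = any(w.isupper() and len(w) > 3 for w in text.split())
    return "fake" if shouted else "real"


def by_keywords(text: str) -> str:
    lowered = text.lower()
    return "fake" if any(k in lowered for k in ("miracle", "shocking")) else "real"


def majority_vote(classifiers: Sequence[Classifier], text: str) -> str:
    """Return the label predicted by the largest number of classifiers."""
    votes = Counter(clf(text) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [by_exclamations, by_capitals, by_keywords]
print(majority_vote(ensemble, "Miracle cure announced!!"))                    # fake
print(majority_vote(ensemble, "The council approved the budget on Monday."))  # real
```

Using an odd number of voters avoids ties when there are only two classes.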
4.4. Learning Method-Based Category
Learning models can usually be divided into three categories: supervised, semi-supervised and unsupervised. In the following, we describe the supervised and unsupervised categories, which are the ones mainly used for fake news detection.
§ Supervised learning: Supervised machine learning algorithms such as decision trees, random forests, support vector machines (SVM), logistic regression, and K-nearest neighbors (kNN) are commonly used to detect and classify hoaxes and fraud [16]. There are many evaluation criteria for assessing the performance of different machine learning techniques. The most common are based on the numbers of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). Other evaluation criteria are precision, recall and F-score, which are calculated as follows:
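Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F-Score = 2 × Precision × Recall / (Precision + Recall)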
Recently, deep learning algorithms have brought significant improvements in speech recognition and visual object detection. Unlike conventional machine learning methods, which require manual feature extraction, deep learning algorithms such as recurrent neural networks (RNNs) can be fed raw data and detect patterns automatically. These methods are also very effective in detecting fake news [25, 64, 76, 98, 99, 100].
§ Unsupervised learning: The performance of a supervised learning model depends strongly on the quality of the labeled dataset. However, it is difficult to create a dataset with wide coverage and good quality for fake news detection. First, real-world online data are usually big, incomplete, unstructured and unlabeled; second, a large amount of false information with diverse intentions and different linguistic characteristics is created on social media every day [16]. Determining the true label for the data is therefore difficult, and an unsupervised learning model is more practical and feasible for real-world problems. Unfortunately, only a few studies have worked directly on detecting online fake news with unsupervised methods, and most of them have focused on semantic similarity analysis. [48] used content-based and social-context-based features to detect fake news in Twitter news posts, and claimed that the proposed model is very practical because no labelled data are required. [101] proposed a model for content-based unsupervised fake news detection on the Ukraine-Russia war. [102] observed that when people encounter news whose truth they doubt, they usually ask questions such as "Is this news true?", "Really?" or "How is this possible?". It therefore developed a technique based on searching for such enquiry phrases, clustering similar posts together, and then collecting related posts that do not contain these phrases. The resulting clusters indicate whether the news is fake, and they are then provided to human experts for further investigation.
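The enquiry-phrase technique described above for [102] follows three steps: find posts with enquiry phrases, cluster similar ones, and attach related posts without such phrases. The sketch below is a loose illustration of those steps and not the implementation used in [102]. The regular expression, the Jaccard similarity measure, the 0.3 threshold, the greedy comparison with the first post of each cluster, and the sample posts are all assumptions.

```python
import re
from typing import List, Set

# Enquiry phrases modelled on the examples quoted in the text (assumed pattern).
ENQUIRY = re.compile(
    r"\bis this (?:news )?true\b|\breally\s*\?|\bhow is this possible\b",
    re.IGNORECASE,
)


def tokens(post: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9']+", post.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_enquiries(posts: List[str], threshold: float = 0.3) -> List[List[str]]:
    clusters: List[List[str]] = []
    # Steps 1 and 2: seed clusters with enquiry posts and group similar ones.
    for post in posts:
        if not ENQUIRY.search(post):
            continue
        for cluster in clusters:
            if jaccard(tokens(post), tokens(cluster[0])) >= threshold:
                cluster.append(post)
                break
        else:
            clusters.append([post])
    # Step 3: attach related posts that contain no enquiry phrase.
    for post in posts:
        if ENQUIRY.search(post):
            continue
        for cluster in clusters:
            if jaccard(tokens(post), tokens(cluster[0])) >= threshold:
                cluster.append(post)
                break
    return clusters


posts = [
    "Is this true? Bridge collapsed downtown",
    "Really? Bridge collapsed downtown today",
    "Bridge collapsed downtown, avoid the area",
    "Lovely weather at the beach",
]
clusters = cluster_enquiries(posts)
print(clusters)
```

Each resulting cluster is only a candidate rumor; as the text notes, the clusters are passed to human experts for further investigation.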
Some studies have used semi-supervised methods for fake news detection. For example, [103] developed a semi-supervised bidirectional RNN for misinformation detection, and [104] proposed a robust semi-supervised fake news recognition model based on effective augmentations and an ensemble of diverse deep learners.
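The evaluation criteria used for the learning methods in this section follow the standard definitions: precision is TP / (TP + FP), recall is TP / (TP + FN), and the F-score is the harmonic mean of the two. The sketch below computes them from true and predicted labels. The function names, the choice of "fake" as the positive class, the convention of returning 0.0 when a denominator is zero, and the sample labels are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[str], y_pred: Sequence[str],
                     positive: str = "fake") -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN), treating `positive` as the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Assumption: a metric with a zero denominator is reported as 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


y_true = ["fake", "fake", "fake", "real", "real", "real"]
y_pred = ["fake", "fake", "real", "fake", "real", "real"]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, tn, fp, fn)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```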
4.5. Independence-Based Category
Most studies on automatic detection of fake news focus on a specific language (mostly English), a specific platform (Twitter, Facebook, etc.) or a specific field (COVID-19, Cindy Flood, and earthquakes). However, because the publication of fake news is a global problem that may arise in any language, in any field and on any platform, some studies have tried to present methods or features that can detect fake news independently of these factors. For example, [105] proposed some language-independent features for detecting fake news, and [106, 107] used cross-lingual methods and pre-trained transformers such as BERT to solve this problem. When no dataset is available in the relevant language, a translation-based solution can also be used. In this method, either the system is trained on the training data of the source language and the test data of the target language are translated so that each input sample can be labelled, or the entire data set is translated into the target language, a model is built on it, and the corresponding label or class is assigned to each test or input sample. [117, 118] used this method but noted that it is not sufficient on its own to improve the performance of a fake news detection system: each language has unique features, some of which are lost or changed in translation, so other measures must be taken to improve accuracy. [119] noted that most research has focused on a specific platform such as Twitter, but stated that fake news can be detected by using some text-based features regardless of the news propagation platform and even the language.
[1] https://www.factcheck.org/
[2] http://www.politifact.com/
[3] Recurrent Neural Network
[4] Long-Short Term Memory
[5] Bidirectional Encoder Representations from Transformers
[6] Robustly optimized BERT approach
[7] Efficiently Learning an Encoder that Classifies Token Replacements Accurately
[8] Embeddings from Language Models
Table 3 summarizes some of the research conducted in this regard. As the table shows, most studies have used supervised methods and little work has been done on unsupervised methods. Machine learning approaches are also more popular, but in recent years researchers have shifted to transfer learning approaches and the use of transformers. The table also shows that most of the papers have worked on the news text and have not paid attention to images or video.
5. Available Fake News Dataset
Various datasets for fake news detection have been published in recent years; Table 4 introduces the most important of them along with their characteristics. Most of the available datasets contain few samples, so they are not useful for machine learning models that require a lot of data. Also, many of these datasets categorize their data into only fake and real classes, and only a few provide finer-grained labels. Some focus on specific domains that may involve only certain writing styles. Moreover, most existing datasets collect textual data and only a small number gather visual data. The available fake image datasets are limited in size and variety, so there is still much work to be done in this area. Some of the existing datasets, such as FEVER, Fauxtography and Fakeddit, are used for fact-checking.
6. Critical Analysis of Fake News Detection Studies
To highlight the challenges and limitations of fake news detection models and identify their weaknesses, we analyze the studies listed in Table 3, which are a sample of the reviewed articles.
The following figures present statistics from the analysis of the studies in Table 3. Fig. 8 shows the percentage of studies conducted in English and in non-English languages. Most of the studies listed in Table 3 address English, so there is a need to focus more on this issue in other languages. According to Fig. 9, one weakness of the studies is that fake news detection models have mostly been developed for the Twitter platform, and less work has been done on other platforms. It would be better for researchers to propose models that identify fake news without depending on a specific platform. Fig. 10 shows that most studies have used supervised learning methods. Because the datasets created in this field are not large, the accuracy of the models is often not very high, so it may be useful to focus on models that can identify fake news with high accuracy without relying on labeled data. As Fig. 11 shows, most studies have used machine learning approaches, but in recent years the use of deep learning approaches has increased strongly, and the availability of transformers has improved the quality of algorithms in this field. According to Fig. 12, content-based features are used in many articles; however, because information from other sources such as context and user is also useful, it is better to use a combination of different features to detect fake news. Unfortunately, according to Fig. 13, most studies have used only the features of the news text and not those of news images, etc., although the use of multimodality could improve the accuracy of fake news detection models.
Table 3
A comparison of research conducted in the field of fake news
Paper | Year | Language | Source | Learning method | Learning approach | Detection method | Type |
[100] | 2011 | English | | Unsupervised | Machine learning | Feature-based (content) | Text |
[80] | 2013 | English | | Supervised | Machine learning | Feature-based (content, context, user) | Text |
[50] | 2013 | English | | Supervised | Machine learning | Feature-based (context, user) | Image |
[102] | 2015 | English | | Unsupervised | Machine learning | Knowledge-based, Feature-based (content) | Text |
[108] | 2015 | English | Wikipedia | - | | | Text |
[69] | 2016 | English | | Supervised | Machine learning | Feature-based (content, context) | Text |
[76] | 2017 | Persian | | Supervised | Machine learning | Feature-based (content, context, user) | Text |
[25] | 2017 | English | | Supervised | Machine learning | Feature-based (content, context, user) | Text |
[99] | 2017 | English | | Supervised | Ensemble learning | Feature-based (context, user) | Text |
[65] | 2017 | English | BuzzFeed | Supervised | Deep learning | Feature-based (content) | Text |
[109] | 2017 | Chinese | | Supervised | Deep learning | Feature-based (content) | Text |
[98] | 2018 | Persian | | Supervised | Ensemble learning | Feature-based (content, context, user) | Text |
[110] | 2018 | English | - | | | Knowledge-based | Text |
[111] | 2018 | English | Wikipedia | | | Knowledge-based | Text |
[83] | 2018 | English | | Supervised | Deep learning | Feature-based (context) | Text |
[112] | 2018 | English | | Unsupervised | Deep learning | Feature-based (user) | Text |
[48] | 2019 | English | | Unsupervised | Machine learning | Feature-based (content, context) | Text |
[64] | 2019 | English | journalistic texts | Supervised | Ensemble learning | Feature-based (content) | Text |
[47] | 2019 | English | News article | | | Feature-based (user) | Text |
[113] | 2019 | German | News article | Supervised | Deep learning | Feature-based (content) | Text |
[81] | 2020 | Persian | | Supervised | Deep learning, Pre-trained learning | Feature-based (content) | Text |
[45] | 2021 | English | - | Supervised | Deep learning | Knowledge-based | Text |
[1] | 2021 | Persian | Twitter, Telegram | Supervised | Ensemble learning | Feature-based (context) | Text |
[114] | 2021 | Arabic | | Supervised | Pre-trained learning | Feature-based (content) | Text |
[115] | 2022 | Arabic | News article | Supervised | Deep learning | Feature-based (content) | Text |
[116] | 2022 | Arabic | | Supervised | Pre-trained learning | Feature-based (content) | Text |
7. Crisis management in Fake News Age
As mentioned, fake news in the digital age has taken on a new form due to the emergence of social media, which has led to many challenges and threats [52]. Here are the most significant negative effects of fake news:
- Damage to the media ecosystem and loss of trust in the media
- Increased anxiety, uncertainty and pessimism among individuals
- Misleading public opinion in order to achieve particular goals
- Confusion and hesitation among community members or decision makers due to the lack of reliable information for decision making in various fields
- Threats to domestic and international security and diplomacy
- Adverse social, political, cultural and economic effects and consequences
- Reduced productivity and production
Table 4
Available fake news datasets
News domain | Source | Type | Number of classes | Size | Dataset |
Political | Politifact | text | 6 | 12863 | LIAR [85] |
various | Wikipedia | text | 3 | 185445 | FEVER [111] |
Political | | text | 4 | 2282 | BUZZFEEDNEWS |
Political | | text | 4 | 2263 | BUZZFACE [120] |
Technology | | text | 2 | 15500 | some-like-it-hoax [121] |
various | | text | 2 | 330 | PHEME [122] |
various | | text | 5 | 60000000 | CREDBANK [123] |
Political | BS Detector | text | 2,3 | 700 | Breaking! |
various | 194 news outlets | text | 8 | 713000 | NELA-GT-2018 |
Political/celebrity | | text | 2 | 602659 | FAKENEWSNET [124] |
various | Opensources.co | text | 10 | 9400000 | FakeNewsCorpus¹ |
Syrian war | 15 news outlets | text | 2 | 804 | FA-KES |
various | self-taken | image | 2 | 48 | Image Manipulation [125] |
various | Snopes, Reuters | text, image | 2 | 1233 | Fauxtography [126] |
various | | text, image | 2 | 17806 | image-verification-corpus [127] |
Manipulated images | | image | 2 | 102028 | The PS-Battles Dataset [128] |
various | | text, image | 2,3,6 | 1063106 | Fakeddit |
[1] https://github.com/several27/FakeNewsCorpus
On the other hand, fact-checking resources can only detect fake news after the misleading information has been created and disseminated through the Internet, and warn online users against similar claims or topics; they cannot completely prevent the propagation of false information on social networks [16]. It is therefore necessary to consider other aspects of combating fake news. In this regard, we propose the process presented in figure 6. Since the publication and propagation of fake news leads to a crisis in society, we have drawn on the crisis management cycle [129] and set out the proposed actions and activities for each stage.
Crisis management seeks to minimize the damage a crisis causes. However, this does not mean crisis management is the same thing as crisis response.
Fig. 8: Percentage of studies conducted in English and non-English languages
Fig. 9: Percentage of studies conducted on Twitter and other social media
Fig. 10: Percentage of papers using supervised and unsupervised learning methods
Fig. 11: Percentage of papers using different approaches for model development
Fig. 12: Percentage of studies using various fake news detection methods
Fig. 13: Percentage of studies using different modalities for fake news detection
Fig. 8: Crisis management process in fake news age and proposed actions
1) Prevention (before the crisis): The purpose of this step is to identify any trending or potential false news as early as possible to avoid publishing and spreading it. To achieve this, by using various methods such as AI-based methods, the existing historical data are analyzed, then topics that are susceptible to misinformation are identified and potential fake news before they occur are predicted. Also, by introducing malicious accounts and domains, it is possible to inform the people about the inaccuracy of the news published from these sources.
2) Preparation (beginning of the crisis): Facing the crisis involves any action that prevents further consequences and damage, because, according to the validity effect [130], the more fake news is spread on the network, the more likely people are to trust it. Therefore, at this stage, all suspicious items such as news content, platforms, sources and even suspicious users should be investigated more carefully to identify newly emerging fake news and to determine from which source and where this news was published. Other activities that can be carried out in this step are: promoting the media literacy of users, passing laws to prevent the publication and dissemination of fake news, strengthening professional journalism and using new methods for stating facts.
3) Response (during the crisis): At this stage, the most important action is to detect rumors or fake news that have been published and spread. The response strategy can be based on the network structure or on users [5]. In the first case, the spread of fake news in the predicted directions is prevented by analyzing the structure of its dissemination network. In the second case, malicious users who have the most influence among users are restricted so that they cannot spread fake news further. Another activity in this step is to minimize the scope of propagation of the fake news so that fewer users are exposed to its harmful influence. The experience and cooperation of other countries can also be used to detect and deal with false news in various fields, especially on international hot topics.
4) Recovery (after the crisis): At this stage, unfortunately, fake news has already spread among users and misled them. Therefore, it is necessary to carry out various activities, such as cultural activities aimed at raising public awareness, and to immunize people with real news. For example, users whose role is to correct news and then publish it can be identified [131]. It is important to have such corrective mechanisms in place before the false information is imprinted as correct information in the reader's mind. These mechanisms include crowdsourcing techniques, fact-checking by experts before publishing news, or a combination of them [132]. Also, the intervention strategy should differ between malicious and normal users: for example, malicious users should be fined or have their accounts deleted, whereas normal users should be trained to improve their ability to recognize fake news and to refrain from spreading it. Finally, reviewing and modifying programs and plans is another activity that should be done in this step.
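The user-based response strategy in step 3 can be illustrated with a minimal sketch. It assumes that a reshare graph is available as a list of (source, target) pairs and that a set of suspicious accounts has already been flagged. The function names, the graph and the account names are hypothetical, and downstream reach is only one possible proxy for influence.

```python
from collections import defaultdict, deque


def downstream_reach(edges, user):
    """Count the accounts reachable from `user` along reshare edges (BFS)."""
    reshared_by = defaultdict(list)
    for source, target in edges:
        reshared_by[source].append(target)
    seen = {user}
    queue = deque([user])
    while queue:
        current = queue.popleft()
        for nxt in reshared_by[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) - 1  # do not count the user itself


def rank_by_reach(edges, suspicious_users):
    """Order suspicious accounts from largest to smallest downstream reach."""
    return sorted(suspicious_users,
                  key=lambda u: downstream_reach(edges, u),
                  reverse=True)


# Hypothetical reshare graph: (a, b) means that b reshared a post from a.
edges = [("u1", "u2"), ("u1", "u3"), ("u3", "u4"), ("u5", "u6")]
ranking = rank_by_reach(edges, ["u5", "u1", "u3"])
print(ranking[0])  # the account whose restriction would cut off the most users
```

In practice the ranking would be recomputed as the dissemination network evolves, and a real system would likely combine reach with other signals about the account.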
8. Open Research Issues
Although many studies have been conducted in recent years in the field of fake news, there is still a long way to go to reach an effective and efficient system for fake news detection, and various issues remain for further research. We highlight some of them:
§ Cross-lingual fake news detection: With increasing globalization, news from different countries, in different languages, has become readily available and has become a way for many people to learn about other cultures. As people around the world become more reliant on social media, the impact of fake news on society also increases. However, most fake news detection research focuses only on English, and the existing datasets are mostly prepared in a few languages such as English and French; therefore, detecting fake news in many other languages remains a problem. One efficient method for solving this problem, which can guide future research, is the use of cross-lingual techniques: the system is trained with training data available in a specific language (such as English) and then used to predict test data in another language.
§ Fake news detection in other types of content: Almost all related research on detecting fake news has been done on textual content, so detecting manipulated or fake content in images and videos can be an important research issue.
§ Account profiling: Because of the devastating societal effects of fake news, fake news detection has attracted increasing attention. However, detection performance is generally not satisfactory when only news content is used, because fake news is written to mimic true news. There is thus a need for an in-depth understanding of the relationship between user profiles on social media and fake news, and account profiling, especially personality profiling, can be an important future research direction.
§ Stance detection: The first useful step in identifying fake news is to understand what other news sources are saying about the same topic. Stance detection comprises the estimation of the relative perspectives of two different text pieces on the same topic. There are different types of stance: 1) stance of post to post, 2) stance of post to topic, 3) stance of replies of a post to the post; each of them can be an important research topic.
§ Unsupervised learning for fake news detection: As mentioned earlier, limited access to high-quality labeled datasets is one of the main challenges in detecting fake news. Therefore, unsupervised learning methods are useful for analyzing real-world news and deserve more attention in future research.
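For the stance detection item above, a common first step is to decide whether two text pieces discuss the same topic at all, before estimating whether one agrees or disagrees with the other. The following minimal sketch shows only this relatedness step, using bag-of-words cosine similarity. The threshold value, the function names and the example texts are assumptions chosen for illustration. Classifying the actual stance (agree, disagree, discuss) would require a trained model and is not attempted here.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_related(post, reply, threshold=0.2):
    """Treat two texts as on-topic when their lexical overlap is high enough.

    The threshold is an illustrative assumption, not a tuned value.
    """
    return cosine_similarity(bag_of_words(post), bag_of_words(reply)) >= threshold


# Hypothetical post and replies.
post = "Vaccine causes illness, officials say"
reply_on_topic = "Officials deny that the vaccine causes illness"
reply_off_topic = "Great weather for football today"
print(is_related(post, reply_on_topic), is_related(post, reply_off_topic))
```

Note that the on-topic reply in this example contradicts the post, which shows why lexical relatedness alone cannot stand in for stance.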
9. Discussion and Conclusion
This survey extensively reviewed and discussed current fake news research by (1) defining fake news and differentiating it from fabricated news, propaganda, conspiracy theories, hoaxes, click-bait, news satire and rumors; (2) presenting the characteristics and features of fake news; (3) reviewing and categorizing fake news detection approaches from five perspectives; (4) describing the steps of the crisis management process and proposing the important activities in each step in the fake news age; and (5) highlighting various research issues for the future.
Based on the analysis of various studies, the key challenges in fake news detection are datasets, feature representation and data fusion. Data augmentation methods can be used to overcome the problem of small or imbalanced datasets. These methods are effective for addressing the lack of data in the early stages, especially when new events emerge, such as the Covid-19 pandemic. On the other hand, visual features have not been used much in detecting fake news. Images in fake news are manipulated in sophisticated ways to trick viewers, grab their attention, and convince them to share the news. Because textual and visual features are related, combining text and image features can increase detection accuracy.
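As an illustration of data augmentation for small text datasets, the sketch below generates perturbed copies of a news text by randomly swapping two words and randomly deleting words. The survey does not prescribe a specific augmentation method; this token-level technique, the function names, the deletion probability and the example sentence are all assumptions. Such perturbations are assumed to preserve the label, which should be checked on real data.

```python
import random


def random_swap(tokens, rng):
    """Return a copy of `tokens` with two randomly chosen positions swapped."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, rng, p=0.1):
    """Drop each token with probability `p`, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() > p]
    return kept if kept else [rng.choice(tokens)]


def augment(text, n_variants=3, seed=0):
    """Create `n_variants` perturbed copies of `text` (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    tokens = text.split()
    variants = []
    for _ in range(n_variants):
        variant = random_deletion(random_swap(tokens, rng), rng)
        variants.append(" ".join(variant))
    return variants


# Hypothetical minority-class example to be augmented.
for variant in augment("breaking news claims miracle cure found"):
    print(variant)
```

Each variant would be added to the training set with the same label as the original text, which helps balance classes when few labeled examples of a new event exist.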
Because social media users come from different cultures, ages and educational backgrounds, the language of social media is often informal and full of slang and spelling mistakes, which leads to problems when using word-level models such as GloVe and Word2Vec. To mitigate this problem, models such as BERT and fastText, which operate on subword units, can be used.
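The benefit of subword units for noisy text can be seen in a small sketch. fastText builds a word representation from character n-grams of the word wrapped in boundary markers, so a misspelled word still shares many units with its correct form, whereas a word-level vocabulary treats the two as unrelated. The code below only extracts and compares the n-gram sets; it does not compute embeddings. The function names and the example words are illustrative assumptions, and the n-gram range of 3 to 6 follows the fastText default.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of `word` wrapped in '<' and '>' boundary markers."""
    marked = f"<{word}>"
    return {
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    }


def jaccard(a, b):
    """Share of n-grams that two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


correct, typo = "government", "goverment"
overlap = jaccard(char_ngrams(correct), char_ngrams(typo))

# A word-level lookup sees two different tokens; the subword view does not.
print("exact match:", correct == typo)
print("shared subword units:", overlap > 0)
```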
Although traditional machine learning methods have been used to detect fake news, their detection accuracy is lower. Deep learning algorithms have outperformed traditional machine learning algorithms and provided outstanding results due to their ability to handle large amounts of data, extract features efficiently, and capture complex patterns.
Finally, combining data from different modalities (text and image) may be very useful for identifying fake news. However, the true importance of each modality cannot be determined by direct correlation of features. The unique characteristics of each modality must be preserved while relevant information is combined across modalities. One of the most effective ways to deal with this issue is the attention mechanism.
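A minimal sketch of attention-based fusion is given below. It scores each modality vector against a query vector with a scaled dot product, turns the scores into weights with a softmax, and returns the weighted sum. In a trained model the query and the modality features would be learned; here they are fixed, hypothetical numbers, and both modalities are assumed to be already projected to the same dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(query, modality_vectors):
    """Weight each modality by scaled dot-product attention and sum them."""
    d = len(query)
    scores = [
        sum(q * v for q, v in zip(query, vec)) / math.sqrt(d)
        for vec in modality_vectors
    ]
    weights = softmax(scores)
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, modality_vectors))
        for i in range(d)
    ]
    return weights, fused


# Hypothetical 3-dimensional features for one news item.
text_vec = [0.9, 0.1, 0.4]
image_vec = [0.2, 0.8, 0.5]
query = [1.0, 0.0, 0.5]  # stands in for a learned query vector

weights, fused = attention_fuse(query, [text_vec, image_vec])
print("modality weights (text, image):", weights)
print("fused representation:", fused)
```

Because the weights are computed per news item, the model can rely more on the image for one item and more on the text for another, instead of applying one fixed mixing ratio.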
References
Jahanbakhsh-Nagadeh, Z, Feizi-Derakhshi, M. R., & Sharifi, A. (2021). A model for detecting of Persian rumors based on the analysis of contextual features in the content of social networks. Signal and Data Processing, 18(1), 50-29. | [1] |
Ράπτη, Μ. (2019). Fake News in the era of online intentional misinformation; a review of existing approaches. | [2] |
Horne, B. D., & Adali, S. (2017, May). This just in: Fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. In Eleventh international AAAI conference on web and social media. | [3] |
Merriam-Webster. (2017). The Real Story of 'Fake News'. https://www.merriam-webster.com/words-at-play/the-real-story-of-fake-news | [4] |
Zhou, X., & Zafarani, R. (2020). A survey of fake news: Fundamental theories, detection methods, and opportunities. ACM Computing Surveys (CSUR), 53(5), 1-40. | [5] |
Cambridge dictionary, Definition of fake news, (2017), http://dictionary.cambridge.org/us/dictionary/english/fake-news | [6] |
Wikipedia, Fake news, (2022). https://en.wikipedia.org/wiki/Fake_news | [7] |
Soltanifar, M, Salimi, M., & Fasafi, Gh. (2017). Fake News and Skills of Fighting Them, Quarterly Journal of Media Scientific, 4(3), 43-69. | [8] |
Pedram, A. (2017). Fake News: Misinformation ecosystem in cyberspace. Society Culture Media, 6(24), 51-72. | [9] |
Rubin, V. L., Chen, Y., & Conroy, N. K. (2015). Deception detection for news: three types of fakes. Proceedings of the Association for Information Science and Technology, 52(1), 1-4. | [10] |
Tandoc Jr, E. C., Lim, Z. W., & Ling, R. (2018). Defining “fake news” A typology of scholarly definitions. Digital journalism, 6(2), 137-153. | [11] |
de Oliveira, N. R., Pisa, P. S., Lopez, M. A., de Medeiros, D. S. V., & Mattos, D. M. (2021). Identifying fake news on social networks based on natural language processing: trends and challenges. Information, 12(1), 38. | [12] |
Aimeur, E., Amri, S., & Brassard, G. (2023). Fake news, disinformation and misinformation in social media: a review. Social Network Analysis and Mining, 13(1), 30. | [13] |
Horne, B. D., Nørregaard, J., & Adali, S. (2019). Robust fake news detection over time and attack. ACM Transactions on Intelligent Systems and Technology (TIST), 11(1), 1-23. | [14] |
Oshikawa, R., Qian, J., & Wang, W. Y. (2018). A survey on natural language processing for fake news detection. arXiv preprint arXiv:1811.00770. | [15] |
Zhang, X., & Ghorbani, A. A. (2020). An overview of online fake news: Characterization, detection, and discussion. Information Processing & Management, 57(2), 102025. | [16] |
Kumar, S., Kumar, S., Yadav, P., & Bagri, M. (2021, March). A survey on analysis of fake news detection techniques. In 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS) (pp. 894-899). IEEE. | [17] |
Lahby, M., Aqil, S., Yafooz, W. M., & Abakarim, Y. (2022). Online fake news detection using machine learning techniques: A systematic mapping study. Combating Fake News with Computational Intelligence Techniques, 3-37. | [18] |
Hu, L., Wei, S., Zhao, Z., & Wu, B. (2022). Deep learning for fake news detection: A comprehensive survey. AI Open. | [19] |
Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election. Journal of economic perspectives, 31(2), 211-36. | [20] |
Jarrahi, A., & Safari, L. (2021). FR-Detect: A Multi-Modal Framework for Early Fake News Detection on Social Media Using Publishers Features. arXiv preprint arXiv:2109.04835. | [21] |
Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. (2016). The rise of social bots. Communications of the ACM, 59(7), 96-104. | [22] |
Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359(6380), 1146-1151. | [23] |
Gupta, M., Zhao, P., & Han, J. (2012, April). Evaluating event credibility on twitter. In Proceedings of the 2012 SIAM international conference on data mining (pp. 153-164). Society for Industrial and Applied Mathematics. | [24] |
Kwon, S., Cha, M., & Jung, K. (2017). Rumor detection over varying time windows. PloS one, 12(1), e0168344. | [25] |
AlDayel, A., & Magdy, W. (2021). Stance detection on social media: State of the art and trends. Information Processing & Management, 58(4), 102597. | [26] |
Li, Y., & Caragea, C. (2019, November). Multi-task stance detection with sentiment and stance lexicons. In Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP) (pp. 6299-6305). | [27] |
Küçük, D., & Can, F. (2020). Stance detection: A survey. ACM Computing Surveys (CSUR), 53(1), 1-37. | [28] |
Pomerleau, D., & Rao, D. (2021). The fake news challenge: exploring how artificial intelligence technologies could be leveraged to combat fake news (2017). | [29] |
Alonso, M. A., Vilares, D., Gómez-Rodríguez, C., & Vilares, J. (2021). Sentiment analysis for fake news detection. Electronics, 10(11), 1348. | [30] |
Collins, B., Hoang, D. T., Nguyen, N. T., & Hwang, D. (2021). Trends in combating fake news on social media–a survey. Journal of Information and Telecommunication, 5(2), 247-266. | [31] |
Anoop, K., Gangan, M. P., & Lajish, V. L. (2019). Leveraging heterogeneous data for fake news detection. In Linking and mining heterogeneous and multi-view data (pp. 229-264). Springer, Cham. | [32] |
Manzoor, S. I., & Singla, J. (2019, April). Fake news detection using machine learning approaches: A systematic review. In 2019 3rd international conference on trends in electronics and informatics (ICOEI) (pp. 230-234). IEEE. | [33] |
Parikh, S. B., & Atrey, P. K. (2018, April). Media-rich fake news detection: A survey. In 2018 IEEE conference on multimedia information processing and retrieval (MIPR) (pp. 436-441). IEEE. | [34] |
Kondamudi, M. R., Sahoo, S. R., Chouhan, L., & Yadav, N. (2023). A comprehensive survey of fake news in social networks: Attributes, features, and detection approaches. Journal of King Saud University-Computer and Information Sciences, 35(6), 101571. | [35] |
Jabeen, K., Alshahrani, H., Islam, N., Rajab, K., Elmagzoub, M. A., & Shaikh, A. (2023). Machine Learning Models to Detect Online Fake News: a Systematic Literature Review. Telematique, 22(01), 552-561. | [36] |
Wang, Y., Ma, F., Jin, Z., Yuan, Y., Xun, G., Jha, K., ... & Gao, J. (2018, July). Eann: Event adversarial neural networks for multi-modal fake news detection. In Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining (pp. 849-857). | [37] |
Liu, Y., & Wu, Y. F. (2018, April). Early detection of fake news on social media through propagation path classification with recurrent and convolutional networks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 32, No. 1). | [38] |
Zhou, X., Jain, A., Phoha, V. V., & Zafarani, R. (2020). Fake news early detection: A theory-driven model. Digital Threats: Research and Practice, 1(2), 1-25. | [39] |
Thorne, J., & Vlachos, A. (2018). Automated fact checking: Task formulations, methods and future directions. arXiv preprint arXiv:1806.07687. | [40] |
Nakov, P., Corney, D., Hasanain, M., Alam, F., Elsayed, T., Barrón-Cedeño, A., ... & Martino, G. D. S. (2021). Automated fact-checking for assisting human fact-checkers. arXiv preprint arXiv:2103.07769. | [41] |
Corney, D. (2019). How we use AI to help fact check party manifestos. | [42] |
Hassan, N., Zhang, G., Arslan, F., Caraballo, J., Jimenez, D., Gawsane, S., ... & Tremayne, M. (2017). Claimbuster: The first-ever end-to-end fact-checking system. Proceedings of the VLDB Endowment, 10(12), 1945-1948. | [43] |
Jaradat, I., Gencheva, P., Barrón-Cedeño, A., Màrquez, L., & Nakov, P. (2018). ClaimRank: Detecting check-worthy claims in Arabic and English. arXiv preprint arXiv:1804.07587. | [44] |
Konstantinovskiy, L., Price, O., Babakar, M., & Zubiaga, A. (2021). Toward automated factchecking: Developing an annotation schema and benchmark for consistent automated claim detection. Digital threats: research and practice, 2(2), 1-16. | [45] |
Guo, Z., Schlichtkrull, M., & Vlachos, A. (2022). A survey on automated fact-checking. Transactions of the Association for Computational Linguistics, 10, 178-206. | [46] |
Thorne, J., Vlachos, A., Christodoulopoulos, C., & Mittal, A. (2018). Fever: a large-scale dataset for fact extraction and verification. arXiv preprint arXiv:1803.05355. | [47] |
Hussein, A., Ahmad, F., & Kamaruddin, S. (2019). Content-social based features for fake news detection model from Twitter. International Journal of Advanced Trends in Computer Science and Engineering, 8(6), 2806-2810. | [48] |
Shu, K., Zhou, X., Wang, S., Zafarani, R., & Liu, H. (2019, August). The role of user profiles for fake news detection. In Proceedings of the 2019 IEEE/ACM international conference on advances in social networks analysis and mining (pp. 436-439). | [49] |
Gupta, A., Lamba, H., Kumaraguru, P., & Joshi, A. (2013, May). Faking sandy: characterizing and identifying fake images on twitter during hurricane sandy. In Proceedings of the 22nd international conference on World Wide Web (pp. 729-736). | [50] |
Mohammadi, B., & Izadkhah, H. (2019). Fake News detection on Social Networks Using Clustering of Fake Users (in Farsi), 5th National Conference on Distributed Computing and Big Data Processing. | [51] |
Sadeghi, F., & Jalaly Bidgoly, A. (2019). A survey of rumor detection methods in social networks. Biannual Journal Monadi for Cyberspace Security (AFTA), 8(1), 3-14. | [52] |
Pérez-Rosas, V., Kleinberg, B., Lefevre, A., & Mihalcea, R. (2017). Automatic detection of fake news. arXiv preprint arXiv:1708.07104. | [53] |
Islam, N., Shaikh, A., Qaiser, A., Asiri, Y., Almakdi, S., Sulaiman, A., ... & Babar, S. A. (2021). Ternion: An Autonomous Model for Fake News Detection. Applied Sciences, 11(19), 9292. | [54] |
Saikh, T., Anand, A., Ekbal, A., & Bhattacharyya, P. (2019, June). A novel approach towards fake news detection: deep learning augmented with textual entailment features. In International Conference on Applications of Natural Language to Information Systems (pp. 345-358). Springer, Cham. | [55] |
Traylor, T., Straub, J., & Snell, N. (2019, January). Classifying fake news articles using natural language processing to identify in-article attribution as a supervised learning estimator. In 2019 IEEE 13th International Conference on Semantic Computing (ICSC) (pp. 445-449). IEEE. | [56] |
Girgis, S., Amer, E., & Gadallah, M. (2018, December). Deep learning algorithms for detecting fake news in online text. In 2018 13th international conference on computer engineering and systems (ICCES) (pp. 93-97). IEEE. | [57] |
Ahmed, H., Traore, I., & Saad, S. (2017, October). Detection of online fake news using n-gram analysis and machine learning techniques. In International conference on intelligent, secure, and dependable systems in distributed and cloud environments (pp. 127-138). Springer, Cham. | [58] |
Ksieniewicz, P., Choraś, M., Kozik, R., & Woźniak, M. (2019, November). Machine learning methods for fake news classification. In International Conference on Intelligent Data Engineering and Automated Learning (pp. 332-339). Springer, Cham. | [59] |
Zhou, X., Jain, A., Phoha, V. V., & Zafarani, R. (2020). Fake news early detection: A theory-driven model. Digital Threats: Research and Practice, 1(2), 1-25. | [60] |
Garg, S., & Sharma, D. K. (2022). Linguistic features based framework for automatic fake news detection. Computers & Industrial Engineering, 172, 108432. | [61] |
Newman, M. L., Pennebaker, J. W., Berry, D. S., & Richards, J. M. (2003). Lying words: Predicting deception from linguistic styles. Personality and social psychology bulletin, 29(5), 665-675. | [62] |
Pennebaker, J. W., Francis, M. E., & Booth, R. J. (2001). Linguistic inquiry and word count: LIWC 2001. Mahway: Lawrence Erlbaum Associates, 71(2001), 2001. | [63] |
Ribeiro Bezerra, J. F. (2021). Content-based fake news classification through modified voting ensemble. Journal of Information and Telecommunication, 5(4), 499-513. | [64] |
Potthast, M., Kiesel, J., Reinartz, K., Bevendorff, J., & Stein, B. (2017). A stylometric inquiry into hyperpartisan and fake news. arXiv preprint arXiv:1702.05638. | [65] |
Afroz, S., Brennan, M., & Greenstadt, R. (2012, May). Detecting hoaxes, frauds, and deception in writing style online. In 2012 IEEE Symposium on Security and Privacy (pp. 461-475). IEEE. | [66] |
Wardle, C. (2017). Fake news. It’s complicated. First Draft, 16, 1-11. | [67] |
Shah, P., & Kobti, Z. (2020, July). Multimodal fake news detection using a Cultural Algorithm with situational and normative knowledge. In 2020 IEEE Congress on Evolutionary Computation (CEC) (pp. 1-7). IEEE. | [68] |
Cao, J., Jin, Z., Zhang, Y., & Zhang, Y. (2016, October). MCG-ICT at MediaEval 2016 Verifying Tweets from both Text and Visual Content. In MediaEval. | [69] |
Comito, C., Caroprese, L., & Zumpano, E. (2023). Multimodal fake news detection on social media: a survey of deep learning techniques. Social Network Analysis and Mining, 13(1), 101. | [70] |
Shu, K., Sliva, A., Wang, S., Tang, J., & Liu, H. (2017). Fake news detection on social media: A data mining perspective. ACM SIGKDD explorations newsletter, 19(1), 22-36. | [71] |
Conroy, N. K., Rubin, V. L., & Chen, Y. (2015). Automatic deception detection: Methods for finding fake news. Proceedings of the association for information science and technology, 52(1), 1-4. | [72] |
Hu, G., Ding, Y., Qi, S., Wang, X., & Liao, Q. (2019, October). Multi-depth graph convolutional networks for fake news detection. In CCF International conference on natural language processing and chinese computing (pp. 698-710). Springer, Cham. | [73] |
Zhang, J., Dong, B., & Philip, S. Y. (2019, December). Deep diffusive neural network based fake news detection from heterogeneous social networks. In 2019 IEEE International Conference on Big Data (Big Data) (pp. 1259-1266). IEEE. | [74] |
Hardalov, M., Arora, A., Nakov, P., & Augenstein, I. (2022, June). Few-shot cross-lingual stance detection with sentiment-based pre-training. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 10, pp. 10729-10737). | [75] |
Zamani, S., Asadpour, M., & Moazzami, D. (2017, May). Rumor detection for persian tweets. In 2017 Iranian Conference on Electrical Engineering (ICEE) (pp. 1532-1536). IEEE. | [76] |
Shu, K., Wang, S., & Liu, H. (2019, January). Beyond news contents: The role of social context for fake news detection. In Proceedings of the twelfth ACM international conference on web search and data mining (pp. 312-320). | [77] |
Raza, S., & Ding, C. (2022). Fake news detection based on news content and social contexts: a transformer-based approach. International Journal of Data Science and Analytics, 13(4), 335-362. | [78] |
Sharma, S., & Sharma, D. K. (2019, November). Fake News Detection: A long way to go. In 2019 4th International Conference on Information Systems and Computer Networks (ISCON) (pp. 816-821). IEEE. | [79] |
Kwon, S., Cha, M., Jung, K., Chen, W., & Wang, Y. (2013, December). Prominent features of rumor propagation in online social media. In 2013 IEEE 13th international conference on data mining (pp. 1103-1108). IEEE. | [80] |
Shirazi, H., Dadahstabar, K., & Hashemi Golpaygani, S. A. (2020). A New Preprocessing Method for Rumor Detection in Social Networks based on LSTM-CNN. C4I Journal, 4(1), 38-51. | [81] |
Pennington, J., Socher, R., & Manning, C. D. (2014, October). Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543). | [82] |
Ma, J., Gao, W., & Wong, K. F. (2018). Rumor detection on twitter with tree-structured recursive neural networks. Association for Computational Linguistics. | [83] |
Goyal, P., Taterh, S., & Saxena, A., (2021). Fake News Detection using Machine Learning: A Review, International Journal of Advanced Engineering, Management and Science (IJAEMS), 7 (3), 33-38. | [84] |
Wang, W. Y. (2017). " liar, liar pants on fire": A new benchmark dataset for fake news detection. arXiv preprint arXiv:1705.00648. | [85] |
Rashkin, H., Choi, E., Jang, J. Y., Volkova, S., & Choi, Y. (2017, September). Truth of varying shades: Analyzing language in fake news and political fact-checking. In Proceedings of the 2017 conference on empirical methods in natural language processing (pp. 2931-2937). | [86] |
Xia, H., Wang, Y., Zhang, J. Z., Zheng, L. J., Kamal, M. M., & Arya, V. (2023). COVID-19 fake news detection: A hybrid CNN-BiLSTM-AM model. Technological Forecasting and Social Change, 195, 122746 | [87] |
Raja, E., Soni, B., & Borgohain, S. K. (2023). Fake news detection in dravidian languages using transfer learning with adaptive finetuning. Engineering Applications of Artificial Intelligence, 126, 106877. | [88] |
Jwa, H., Oh, D., Park, K., Kang, J. M., & Lim, H. (2019). exbake: Automatic fake news detection model based on bidirectional encoder representations from transformers (bert). Applied Sciences, 9(19), 4062. | [89] |
Tida, V. S., Hsu, D., & Hei, D. (2022). Unified fake news detection using transfer learning of bidirectional encoder representation from transformers model. arXiv preprint arXiv:2202.01907. | [90] |
Kula, S., Choraś, M., & Kozik, R. (2019, May). Application of the BERT-based architecture in fake news detection. In Computational Intelligence in Security for Information Systems Conference (pp. 239-249). Springer, Cham. | [91] |
Kaliyar, R. K., Goswami, A., & Narang, P. (2021). FakeBERT: Fake news detection in social media with a BERT-based deep learning approach. Multimedia tools and applications, 80(8), 11765-11788. | [92] |
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. | [93] |
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., ... & Stoyanov, V. (2019). Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692. | [94] |
Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108. | [95] |
Clark, K., Luong, M. T., Le, Q. V., & Manning, C. D. (2020). Electra: Pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555. | [96] |
Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. arXiv preprint arXiv:1802.05365. | [97] |
Mahmoodabad, S. D., Farzi, S., & Bakhtiarvand, D. B. (2018, December). Persian rumor detection on twitter. In 2018 9th international symposium on telecommunications (IST) (pp. 597-602). IEEE. | [98] |
Sicilia, R., Giudice, S. L., Pei, Y., Pechenizkiy, M., & Soda, P. (2017, November). Health-related rumour detection on Twitter. In 2017 IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 1599-1606). IEEE. | [99] |
Castillo, C., Mendoza, M., & Poblete, B. (2011, March). Information credibility on twitter. In Proceedings of the 20th international conference on World wide web (pp. 675-684). | [100] |
Shin, Y., Sojdehei, Y., Zheng, L., & Blanchard, B. (2023). Content-Based Unsupervised Fake News Detection on Ukraine-Russia War. SMU Data Science Review, 7(1), 3. | [101] |
Zhao, Z., Resnick, P., & Mei, Q. (2015, May). Enquiring minds: Early detection of rumors in social media from enquiry posts. In Proceedings of the 24th international conference on world wide web (pp. 1395-1405). | [102] |
Dong, X., & Qian, L. (2022). Semi-supervised bidirectional RNN for misinformation detection. Machine Learning with Applications, 10, 100428. | [103] |
Al Obaid, A., Khotanlou, H., Mansoorizadeh, M., & Zabihzadeh, D. (2023). Robust Semisupervised Fake News Recognition by Effective Augmentations and Ensemble of Diverse Deep Learners. IEEE Access | [104] |
Abonizio, H. Q., de Morais, J. I., Tavares, G. M., & Barbon Junior, S. (2020). Language-independent fake news detection: English, Portuguese, and Spanish mutual features. Future Internet, 12(5), 87. | [105] |
De, A., Bandyopadhyay, D., Gain, B., & Ekbal, A. (2021). A Transformer-Based Approach to Multilingual Fake News Detection in Low-Resource Languages. Transactions on Asian and Low-Resource Language Information Processing, 21(1), 1-20. | [106] |
Chu, S. K. W., Xie, R., & Wang, Y. (2021). Cross-Language fake news detection. Data and Information Management, 5(1), 100-109. | [107] |
Ciampaglia, G. L., Shiralkar, P., Rocha, L. M., Bollen, J., Menczer, F., & Flammini, | [108] |
Ma, B., Lin, D., & Cao, D. (2017). Content representation for microblog rumor detection. In Advances in Computational Intelligence Systems (pp. 245-251). Springer, Cham. | [109] |
Ahuja, H., & Michels, A. Computational Fact-Checking through Relational Similarity based Path Mining. Joint Mathematics Meeting. | [110] |
Thorne, J., & Vlachos, A. (2018). Automated fact checking: Task formulations, methods and future directions. arXiv preprint arXiv:1806.07687. | [111] |
Chen, W., Zhang, Y., Yeo, C. K., Lau, C. T., & Lee, B. S. (2018). Unsupervised rumor detection based on users’ behaviors using neural networks. Pattern Recognition Letters, 105, 226-233. | [112] |
Vogel, I., & Jiang, P. (2019, September). Fake news detection with the new German dataset “GermanFakeNC”. In International Conference on Theory and Practice of Digital Libraries (pp. 288-295). Springer, Cham. | [113] |
Ameur, M. S. H., & Aliane, H. (2021). AraCOVID19-MFH: Arabic COVID-19 Multi-label Fake News & Hate Speech Detection Dataset. Procedia Computer Science, 189, 232-241. | [114] |
Fouad, Kh., Sabbeh, S., & Medhat, W. (2022). Arabic news detection using deep learning. Computers, Materials & Continua, 77(2), 3647-3665. | [115] |
Al-Yahya, M., Al-Khalifa, H., Al-Baity, H., AlSaeed, D., & Essam, A. (2021). Arabic fake news detection: comparative study of neural networks and transformer-based approaches. Complexity, 2021. | [116] |
Saghayan, M. H., Ebrahimi, S. F., & Bahrani, M. (2021, May). Exploring the Impact of Machine Translation on Fake News Detection: A Case Study on Persian Tweets about COVID-19. In 2021 29th Iranian Conference on Electrical Engineering (ICEE) (pp. 540-544). IEEE. | [117] |
Amjad, M., Sidorov, G., & Zhila, A. (2020, May). Data augmentation using machine translation for fake news detection in the Urdu language. In Proceedings of the 12th language resources and evaluation conference (pp. 2537-2542). | [118] |
Faustini, P. H. A., & Covões, T. F. (2020). Fake news detection in multiple platforms and languages. Expert Systems with Applications, 158, 113503. | [119] |
Santia, G. C., & Williams, J. R. (2018, June). Buzzface: A news veracity dataset with facebook user commentary and egos. In Twelfth international AAAI conference on web and social media. | [120] |
Tacchini, E., Ballarin, G., Della Vedova, M. L., Moret, S., & De Alfaro, L. (2017). Some like it hoax: Automated fake news detection in social networks. arXiv preprint arXiv:1704.07506. | [121] |
Zubiaga, A., Liakata, M., Procter, R., Wong Sak Hoi, G., & Tolmie, P. (2016). Analysing how people orient to and spread rumours in social media by looking at conversational threads. PloS one, 11(3), e0150989. | [122] |
Mitra, T., & Gilbert, E. (2015). Credbank: A large-scale social media corpus with associated credibility annotations. In Proceedings of the international AAAI conference on web and social media (Vol. 9, No. 1, pp. 258-267). | [123] |
Shu, K., Mahudeswaran, D., Wang, S., Lee, D., & Liu, H. (2018). Fakenewsnet: A data repository with news content, social context and spatialtemporal information for studying fake news on social media. arXiv preprint arXiv:1809.01286. | [124] |
Christlein, V., Riess, C., Jordan, J., Riess, C., & Angelopoulou, E. (2012). An evaluation of popular copy-move forgery detection approaches. IEEE Transactions on information forensics and security, 7(6), 1841-1854. | [125] |
Zlatkova, D., Nakov, P., & Koychev, I. (2019). Fact-checking meets fauxtography: Verifying claims about images. arXiv preprint arXiv:1908.11722. | [126] |
Boididou, C., Papadopoulos, S., Zampoglou, M., Apostolidis, L., Papadopoulou, O., & Kompatsiaris, Y. (2018). Detection and visualization of misleading content on Twitter. International Journal of Multimedia Information Retrieval, 7(1), 71-86. | [127] |
Heller, S., Rossetto, L., & Schuldt, H. (2018). The ps-battles dataset-an image collection for image manipulation detection. arXiv preprint arXiv:1804.04866. | [128] |
SG Class Online, The 4 Key Steps for a Crisis Management Plan. | [129] |
Boehm, L. E. (1994). The validity effect: A search for mediating variables. Personality and Social Psychology Bulletin, 20(3), 285-293. | [130] |
Vo, N., & Lee, K. (2018, June). The rise of guardians: Fact-checking url recommendation to combat fake news. In The 41st international ACM SIGIR conference on research & development in information retrieval (pp. 275-284). | [131] |
Alsmadi, I., Alazzam, I., & AlRamahi, M. A. (2021). An ontological analysis of misinformation in online social networks. arXiv preprint arXiv:2102.11362. | [132] |