Fake News Detection on COVID 19 tweets via Supervised Learning Approach
Cevallos, Maria; De Biase, Matteo; Vocaturo, Eugenio; Zumpano, Ester
2022-01-01
Abstract
In December 2019, the first cases of COVID-19, the disease caused by the SARS-CoV-2 virus, were recorded in the Chinese city of Wuhan. Over the following months, the virus gave rise to a global pandemic that has not yet ended. Information about COVID-19 disseminated on digital platforms varies widely in content, which makes it difficult to recognize whether published news is true or false and to identify the sentiments associated with it. We therefore consider the hypothesis that sentiments about COVID-19 differ between fake news and real news. The aim of the present study is to support the separation of real tweets from fake ones and to compare the sentiments that users express in them. To this end, two datasets of English-language tweets from Twitter were used: the first was downloaded from Kaggle and relates to the year 2021, while the second is more recent and was collected using Python. Supervised learning techniques were applied to the Kaggle dataset, while also highlighting variables not recognizable at first glance (metadata) and the sentiments expressed in the posts. From the analysis of the first dataset, we derived an algorithm, which we subsequently applied to the second dataset to recognize fake and real news. The performance obtained is of interest, and the results show the change and trend in the emotions and sentiments conveyed by the tweets.