Authors
Marco Polignano, Pierpaolo Basile, Marco De Gemmis, Giovanni Semeraro
Publication date
2019/11/19
Conference
NL4AI@AI*IA
Pages
1-13
Description
Identifying hate speech in social networks has recently attracted considerable interest in the natural language processing community. The task is important for detecting cyberattacks on minors, bullying, misogyny, and other forms of hate-based discrimination that can cause illness. Identifying them quickly and accurately can therefore help resolve situations that are dangerous for the health of the people attacked. Numerous national and international initiatives have addressed this problem, providing many resources and solutions. In particular, we focus on the Hate Speech Detection (HaSpeeDe) evaluation campaign held at Evalita 2018, which aims to develop strategies for identifying hate speech in Italian-language posts on Twitter and Facebook. We apply the classification approach proposed in this work to the dataset released for the task, to demonstrate that the task can be solved efficiently and accurately. Our solution is based on AlBERTo, an Italian language understanding model that uses the BERT architecture and is trained on 200M Italian tweets. We fine-tuned AlBERTo into a hate speech classifier, obtaining state-of-the-art results compared with the best systems presented at the HaSpeeDe workshop. We therefore propose AlBERTo as one of the most versatile resources for classifying social media text in Italian. The claim is supported by the similar results AlBERTo obtained in sentiment analysis and irony detection, demonstrated in …
Total citations
[Bar chart: citations per year, 2019 to 2024]