AlBERTo: Italian BERT language understanding model for NLP challenging tasks based on tweets

Authors

Marco Polignano, Pierpaolo Basile, Marco De Gemmis, Giovanni Semeraro, Valerio Basile

Publication date

2019

Journal

CEUR workshop proceedings

Volume

2481

Pages

1-6

Publisher

CEUR

Description

Recent scientific studies on natural language processing (NLP) report the outstanding effectiveness of context-dependent, task-free language understanding models such as ELMo, GPT, and BERT. Specifically, these models have achieved state-of-the-art performance in numerous complex NLP tasks in English, such as question answering and sentiment analysis. Following the great popularity and effectiveness that these models are gaining in the scientific community, we trained a BERT language understanding model for the Italian language (AlBERTo). In particular, AlBERTo is focused on the language used in social networks, specifically on Twitter. To demonstrate its robustness, we evaluated AlBERTo on the EVALITA 2016 task SENTIPOLC (SENTIment POLarity Classification), obtaining state-of-the-art results in subjectivity, polarity, and irony detection on Italian tweets. To facilitate future research, the pre-trained AlBERTo model will be publicly distributed through GitHub at https://github.com/marcopoli/AlBERTo-it

Total citations

Cited by 274

2019: 6, 2020: 49, 2021: 44, 2022: 81, 2023: 73, 2024: 20
