Building a large-scale corpus for evaluating event detection on twitter

Authors

Andrew J McMinn, Yashar Moshfeghi, Joemon M Jose

Publication date

2013/10/27

Book

Proceedings of the 22nd ACM international conference on Information & Knowledge Management

Pages

409-418

Description

Despite the popularity of Twitter for research, there are very few publicly available corpora, and those that are available are either too small or unsuitable for tasks such as event detection. This is partially due to a number of issues associated with the creation of Twitter corpora, including restrictions on the distribution of tweets and the difficulty of creating relevance judgements at such a large scale. For event detection, creating relevance judgements is made harder still by ambiguity in the definition of event. In this paper, we propose a methodology for the creation of an event detection corpus. Specifically, we first create a new corpus that covers a period of 4 weeks and contains over 120 million tweets, which we make available for research. We then propose a definition of event which fits the characteristics of Twitter, and using this definition, we generate a set of relevance judgements …

Total citations

Cited by 266

Citations per year: 2013: 1, 2014: 9, 2015: 20, 2016: 32, 2017: 28, 2018: 25, 2019: 23, 2020: 27, 2021: 33, 2022: 31, 2023: 15, 2024: 21
