Authors
Antonio J Reinoso, Jesus M Gonzalez-Barahona, Rocio Munoz-Mansilla, Israel Herraiz
Publication date
2013
Journal
New Challenges in Distributed Information Filtering and Retrieval: DART 2011: Revised and Invited Papers
Pages
71-89
Publisher
Springer Berlin Heidelberg
Description
This chapter presents an empirical study of the temporal patterns characterizing the requests submitted by users to Wikipedia. The study is based on an analysis of the log lines registered by the Wikimedia Foundation Squid servers after serving the content requested by users. The analysis covers the ten most visited editions of Wikipedia and involves more than 14,000 million log lines corresponding to the traffic of the entire year 2009. The methodology consisted mainly of parsing and filtering users’ requests according to the study directives; the relevant information fields were then stored in a database for persistence and further characterization. In this way, we first assessed whether the traffic to Wikipedia could serve as a reliable estimator of the overall traffic to all the Wikimedia Foundation projects. Our …