Authors
Chamikara Jayalath, Julian Stephen, Patrick Eugster
Publication date
2013/5/27
Journal
IEEE Transactions on Computers
Volume
63
Issue
1
Pages
74-87
Description
Efficiently analyzing big data is a major issue in our current era. Examples of analysis tasks include identification or detection of global weather patterns, economic changes, social phenomena, or epidemics. The cloud computing paradigm, along with software tools such as implementations of the popular MapReduce framework, offers a response to the problem by distributing computations among large sets of nodes. In many scenarios, however, input data are geographically distributed (geo-distributed) across data centers, and straightforwardly moving all data to a single data center before processing it can be prohibitively expensive. The above-mentioned tools are designed to work within a single cluster or data center and perform poorly or not at all when deployed across data centers. This paper deals with executing sequences of MapReduce jobs on geo-distributed data sets. We analyze possible ways of executing …
Total citations
Citations per year, 2012 to 2024 (bar chart)