Authors
Alexander Schätzle, Martin Przyjaciel-Zablocki, Thorsten Berberich, Georg Lausen
Publication date
2016
Conference
Biomedical Data Management and Graph Online Querying: VLDB 2015 Workshops, Big-O (Q) and DMAH, Waikoloa, HI, USA, August 31–September 4, 2015, Revised Selected Papers 1
Pages
155-168
Publisher
Springer International Publishing
Description
RDF has steadily gained attention for data publishing due to its flexible data model, raising the need for distributed querying. However, existing approaches using general-purpose cluster frameworks take a record-oriented view of RDF, ignoring its inherent graph-like structure. Recently, GraphX was published as a graph abstraction on top of Spark, an in-memory cluster computing system. It allows graph-parallel and data-parallel computation to be combined seamlessly in a single system, a unique feature not available in other systems. In this paper we introduce S2X, a SPARQL query processor for Hadoop, where we leverage this unified abstraction by implementing basic graph pattern matching of SPARQL as a graph-parallel task while other operators are implemented in a data-parallel manner. To the best of our knowledge, this is the first approach to combine graph-parallel and data-parallel …
Total citations
[Citations-per-year chart, 2016 to 2024; per-year counts not recoverable from the extracted text]