Authors
Guilherme Schievelbein, Victor Anthony Arrascue Ayala, Fang Wei-Kleiner, Georg Lausen
Publication date
2019
Conference
ISWC (Satellites)
Pages
81-84
Description
Translating SPARQL to Spark SQL has been proposed to achieve better scalability in query evaluation. Recent investigations show that the database design used to store the RDF graph plays a significant role in performance, owing to intrinsic characteristics of Spark’s computation model. The analysis points to the interesting fact that a Wide Property Table (WPT), a single-table design with one row for each subject and one column for each property, is well suited to storing RDF graphs. In addition to the WPT’s simplicity, SPARQL queries, in particular those with many joins on subjects, are translated to an efficient Spark execution plan. We aim to extend the WPT with inverse properties to broaden this benefit to other kinds of queries. In this paper we therefore propose a framework that can leverage one WPT extension or a combination of them. Our experiments on a widely used benchmark reveal that a combination of three different kinds of WPT leads to the best performance for almost all query types, the exception being linear-shaped queries.
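
The WPT layout described above can be illustrated with a minimal in-memory sketch in plain Python (not Spark). It is not the paper's implementation: the function names `build_wpt` and `star_query`, the sample triples, and the reading of an "inverse" table as one row per object whose cells hold subjects are all assumptions made for illustration. The sketch shows why a query with several patterns on the same subject needs no join: every requested property sits in the same row.

```python
from collections import defaultdict


def build_wpt(triples, inverse=False):
    """Build a wide property table as {row_key: {property: [values]}}.

    inverse=False: one row per subject, cells hold objects.
    inverse=True:  one row per object, cells hold subjects
                   (an assumed layout for inverse properties).
    """
    table = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        key, value = (o, s) if inverse else (s, o)
        table[key][p].append(value)
    # Convert to plain dicts so the result compares and prints cleanly.
    return {key: dict(row) for key, row in table.items()}


def star_query(table, properties):
    """Return rows that have a value for every requested property.

    All patterns share the row key, so this is a single scan
    with a per-row filter and no join.
    """
    return {
        key: {p: row[p] for p in properties}
        for key, row in table.items()
        if all(p in row for p in properties)
    }


# Hypothetical sample data.
triples = [
    ("alice", "worksFor", "acme"),
    ("alice", "livesIn", "berlin"),
    ("bob", "worksFor", "acme"),
    ("bob", "knows", "alice"),
]

wpt = build_wpt(triples)
inverse_wpt = build_wpt(triples, inverse=True)

# Subject-subject star: ?s worksFor ?x . ?s livesIn ?y
subject_star = star_query(wpt, ["worksFor", "livesIn"])

# Pattern sharing an object: ?a worksFor ?o (grouped by ?o)
object_star = star_query(inverse_wpt, ["worksFor"])
```

In this sketch the subject-keyed table answers patterns that share a subject from a single row, while the object-keyed table does the same for patterns that share an object, which is the kind of query the abstract says the plain WPT does not directly benefit.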