Authors
Paolo Missier, Norman W Paton, Khalid Belhajjame
Publication date
2010/3/22
Book
Proceedings of the 13th International Conference on Extending Database Technology
Pages
299-310
Description
The management and querying of workflow provenance data underpins a collection of activities, including the analysis of workflow results and the debugging of workflows or services. Such activities require efficient evaluation of lineage queries over potentially complex and voluminous provenance logs. Naïve implementations of lineage queries navigate provenance logs by joining tables that represent the flow of data between connected processors invoked from workflows. In this paper we provide an approach to provenance querying that: (i) avoids joins over provenance logs by using information about the workflow definition to inform the construction of queries that directly target relevant lineage results; (ii) provides fine-grained provenance querying, even for workflows that create and consume collections; and (iii) scales effectively to address complex workflows, workflows with large intermediate data sets, and …
Total citations
[Chart: citations per year, 2010 to 2023; per-year counts not recoverable]