Authors
George Papadakis, Georgia Koutrika, Themis Palpanas, Wolfgang Nejdl
Publication date
2013/3/27
Journal
IEEE Transactions on Knowledge and Data Engineering
Volume
26
Issue
8
Pages
1946-1960
Publisher
IEEE
Description
Entity Resolution is an inherently quadratic task that typically scales to large data collections through blocking. In the context of highly heterogeneous information spaces, blocking methods rely on redundancy in order to ensure high effectiveness at the cost of lower efficiency (i.e., more comparisons). This effect is partially ameliorated by coarse-grained block processing techniques that discard entire blocks either a priori or during the resolution process. In this paper, we introduce meta-blocking as a generic procedure that intervenes between the creation and the processing of blocks, transforming an initial set of blocks into a new one with substantially fewer comparisons and equally high effectiveness. In essence, meta-blocking aims at extracting the most similar pairs of entities by leveraging the information that is encapsulated in the block-to-entity relationships. To this end, it first builds an abstract graph …
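The idea in the description can be illustrated with a minimal sketch. The description is cut off before it says how the graph is built, so everything here is an assumption made for illustration: entities are nodes, an edge joins two entities that share at least one block, the edge weight is the number of shared blocks, and edges lighter than the mean weight are pruned. The function name `meta_block` and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def meta_block(blocks):
    """Return the entity pairs kept after pruning a block co-occurrence graph.

    blocks: dict mapping a block id to an iterable of entity ids.
    """
    # Assumed weighting: an edge's weight is the number of blocks
    # in which the two entities co-occur.
    weights = defaultdict(int)
    for entities in blocks.values():
        for a, b in combinations(sorted(set(entities)), 2):
            weights[(a, b)] += 1
    if not weights:
        return set()
    # Assumed pruning rule: keep edges whose weight reaches the mean weight.
    threshold = sum(weights.values()) / len(weights)
    return {pair for pair, w in weights.items() if w >= threshold}


# Hypothetical redundant blocks: the same pair can appear in several blocks.
blocks = {
    "b1": ["e1", "e2", "e3"],
    "b2": ["e1", "e2"],
    "b3": ["e2", "e3", "e4"],
}
retained = meta_block(blocks)
```

On this toy input the three blocks imply seven pairwise comparisons, two of them repeated. The sketch keeps only the two pairs that share more than one block, which is the kind of reduction in comparisons the description refers to.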
Total citations
Citations per year, 2013 to 2024 (bar chart)
Scholar articles
G Papadakis, G Koutrika, T Palpanas, W Nejdl - IEEE Transactions on Knowledge and Data …, 2013