Authors
Ahmad Assaf, Aline Senart, Raphaël Troncy
Publication date
2015/5/18
Book
Proceedings of the 24th International Conference on World Wide Web
Pages
159-162
Description
Data is being published by both the public and private sectors and covers a diverse set of domains ranging from life sciences to media or government data. An example is the Linked Open Data (LOD) cloud which is potentially a gold mine for organizations and individuals who are trying to leverage external data sources in order to produce more informed business decisions. Considering the significant variation in size, the languages used and the freshness of the data, one realizes that spotting spam datasets or simply finding useful datasets without prior knowledge is increasingly complicated. In this paper, we propose Roomba, a scalable automatic approach for extracting, validating, correcting and generating descriptive linked dataset profiles. While Roomba is generic, we target CKAN-based data portals and we validate our approach against a set of open data portals including the Linked Open Data (LOD) cloud …
Total citations
[Chart: citations per year, 2014-2024]
Scholar articles
A Assaf, A Senart, R Troncy - Proceedings of the 24th international conference on …, 2015