Authors
Yair Weiss, Antonio Torralba, Rob Fergus
Publication date
2008
Journal
Advances in neural information processing systems
Volume
21
Description
Semantic hashing seeks compact binary codes of datapoints so that the Hamming distance between codewords correlates with semantic similarity. Hinton et al. used a clever implementation of autoencoders to find such codes. In this paper, we show that the problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP-hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian. By utilizing recent results on convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigenfunctions of manifolds, we show how to efficiently calculate the code of a novel datapoint. Taken together, both learning the code and applying it to a novel point are extremely simple. Our experiments show that our codes significantly outperform the state of the art.
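The spectral relaxation mentioned in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian affinity, the `sigma` parameter, the unnormalized Laplacian, and the function names `spectral_codes` and `hamming` are assumptions made for illustration, and it covers only codes for the training points, not the extension to novel datapoints.

```python
import numpy as np

def spectral_codes(X, n_bits, sigma=1.0):
    """Binary codes from thresholded graph-Laplacian eigenvectors.

    Illustrative only: the Gaussian affinity and sigma are assumptions.
    """
    # Pairwise squared Euclidean distances between rows of X.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Affinity matrix W with no self-loops.
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order; column 0 is the
    # constant eigenvector (eigenvalue 0), which carries no information.
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, 1:n_bits + 1]
    # Threshold each eigenvector at zero to obtain one bit per column.
    return (Y > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two codewords."""
    return int(np.count_nonzero(a != b))

# Two small 1-D clusters; a single bit should separate them.
X = np.array([[0.0], [0.1], [0.2], [3.0], [3.1], [3.2]])
codes = spectral_codes(X, n_bits=1, sigma=1.0)
```

In this toy example, points in the same cluster receive the same bit and points in different clusters receive different bits, so Hamming distance tracks neighborhood structure. Which cluster gets 0 or 1 depends on the arbitrary sign of the eigenvector.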
Total citations
Citations per year: 2009: 28, 2010: 51, 2011: 88, 2012: 139, 2013: 183, 2014: 231, 2015: 276, 2016: 339, 2017: 353, 2018: 307, 2019: 299, 2020: 269, 2021: 241, 2022: 186, 2023: 151, 2024: 57
Scholar articles
Y Weiss, A Torralba, R Fergus - Advances in neural information processing systems, 2008