Authors
Ben Dai, Xiaotong Shen, Junhui Wang
Publication date
2022/1/2
Journal
Journal of the American Statistical Association
Volume
117
Issue
537
Pages
307-319
Publisher
Taylor & Francis
Description
Numerical embedding has become one standard technique for processing and analyzing unstructured data that cannot be expressed in a predefined fashion. It stores the main characteristics of data by mapping it onto a numerical vector. An embedding is often unsupervised and constructed by transfer learning from large-scale unannotated data. Given an embedding, a downstream learning method, referred to as a two-stage method, is applicable to unstructured data. In this article, we introduce a novel framework of embedding learning to deliver a higher learning accuracy than the two-stage method while identifying an optimal learning-adaptive embedding. In particular, we propose a concept of U-minimal sufficient learning-adaptive embeddings, based on which we seek an optimal one to maximize the learning accuracy subject to an embedding constraint. Moreover, when specializing the general framework to …
Total citations
[Citations-per-year chart covering 2021 to 2024; per-year counts not recoverable from the extracted text]
Scholar articles
B Dai, X Shen, J Wang - Journal of the American Statistical Association, 2022