Inventors
Gary King, Connor T Jerzak, Anton Strezhnev
Publication date
2022/11/29
Patent office
US
Patent number
11514233
Application number
16415065
Description
Embodiments of the invention use a feature-extraction approach and/or a matching approach, in combination with a nonparametric approach, to estimate the proportion of documents in each of multiple labeled categories with high accuracy. The feature-extraction approach automatically generates continuously valued text features optimized for estimating the category proportions. The matching approach constructs, from an observed set, a matched set that closely resembles an unobserved data set, thereby improving the degree to which the distributions of the observed and unobserved sets resemble each other.
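As a rough illustration of the idea in the description, the sketch below estimates category proportions directly, without classifying individual documents. It is not taken from the patent text. It assumes that documents have already been converted to continuous feature vectors, that matching is done by Euclidean nearest neighbors, and that the proportions can be recovered by solving a small linear system relating per-category feature means in the observed (labeled) set to the overall feature mean in the unobserved (unlabeled) set. The function names `matched_subset` and `estimate_proportions`, the parameter `k`, and the synthetic data are all hypothetical.

```python
import numpy as np


def matched_subset(X_lab, X_unlab, k=3):
    """Return sorted indices of labeled rows that are among the k nearest
    labeled neighbors (Euclidean) of at least one unlabeled row.
    Hypothetical helper: one simple way to build a matched set."""
    # Pairwise squared distances, shape (n_unlabeled, n_labeled).
    d2 = ((X_unlab[:, None, :] - X_lab[None, :, :]) ** 2).sum(axis=2)
    k = min(k, X_lab.shape[0])
    nearest = np.argsort(d2, axis=1)[:, :k]
    return np.unique(nearest)


def estimate_proportions(X_lab, y_lab, X_unlab, n_categories):
    """Estimate category proportions pi in the unlabeled set by solving
    M @ pi ~= s in the least-squares sense, where column c of M is the mean
    feature vector of labeled category c and s is the unlabeled feature mean.
    Assumes every category still has at least one labeled row."""
    M = np.column_stack(
        [X_lab[y_lab == c].mean(axis=0) for c in range(n_categories)]
    )
    s = X_unlab.mean(axis=0)
    # Append a heavily weighted row that pushes sum(pi) toward 1.
    w = 1e3
    A = np.vstack([M, w * np.ones((1, n_categories))])
    b = np.append(s, w)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Clip negatives and renormalize so the result is a valid distribution.
    pi = np.clip(pi, 0.0, None)
    return pi / pi.sum()


# Synthetic data (assumption): three well-separated categories in 3-D.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
y_lab = np.repeat(np.arange(3), 50)               # balanced labeled set
X_lab = centers[y_lab] + 0.05 * rng.normal(size=(150, 3))
y_unlab = np.repeat(np.arange(3), [120, 60, 20])  # true shares 0.6, 0.3, 0.1
X_unlab = centers[y_unlab] + 0.05 * rng.normal(size=(200, 3))

keep = matched_subset(X_lab, X_unlab)
pi_hat = estimate_proportions(X_lab[keep], y_lab[keep], X_unlab, 3)
print(np.round(pi_hat, 3))
```

The labeled set here is balanced while the unlabeled set is not, so a method that simply copied the labeled class shares would be wrong. The linear-system estimate only relies on the per-category feature means being similar across the two sets, which is the property the matching step is meant to improve.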
Scholar articles
G King, CT Jerzak, A Strezhnev - US Patent 11,514,233, 2022