Authors
Babak Shahbaba, Radford Neal
Publication date
2009/8/1
Journal
Journal of Machine Learning Research
Volume
10
Issue
8
Description
We introduce a new nonlinear model for classification, in which we model the joint distribution of the response variable, y, and covariates, x, non-parametrically using Dirichlet process mixtures. We keep the relationship between y and x linear within each component of the mixture. The overall relationship becomes nonlinear if the mixture contains more than one component, with different regression coefficients. We use simulated data to compare the performance of this new approach to alternative methods such as multinomial logit (MNL) models, decision trees, and support vector machines. We also evaluate our approach on two classification problems: identifying the folding class of protein sequences and detecting Parkinson’s disease. Our model can sometimes improve predictive accuracy. Moreover, by grouping observations into sub-populations (i.e., mixture components), our model can sometimes provide insight into hidden structure in the data.
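The claim that component-wise linear relationships combine into a nonlinear overall relationship can be illustrated with a small sketch. This is not the authors' implementation: it uses a fixed two-component mixture instead of a Dirichlet process, performs no posterior inference, and assumes one Gaussian covariate and a binary response with a logistic link. All parameter values and the names `COMPONENTS` and `prob_y1` are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def sigmoid(t):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


# Hypothetical components: (weight, mean of x, sd of x, intercept, slope).
# Within each component the log-odds of y = 1 are linear in x.
COMPONENTS = [
    (0.5, -2.0, 1.0, 4.0, 2.0),   # log-odds increase with x
    (0.5, 2.0, 1.0, 4.0, -2.0),   # log-odds decrease with x
]


def prob_y1(x, components=COMPONENTS):
    """P(y = 1 | x) under the joint mixture model of (x, y).

    Each component's linear-logistic prediction is weighted by how
    plausible x is under that component (weight times density of x).
    """
    num = 0.0
    den = 0.0
    for weight, mean, sd, intercept, slope in components:
        resp = weight * normal_pdf(x, mean, sd)
        num += resp * sigmoid(intercept + slope * x)
        den += resp
    return num / den


if __name__ == "__main__":
    for x in (-4.0, -2.0, 0.0, 2.0, 4.0):
        print(f"x = {x:+.1f}  P(y=1|x) = {prob_y1(x):.3f}")
```

With these values the predicted probability is low for strongly negative x, high near zero, and low again for strongly positive x, a pattern that a single logistic regression in x cannot reproduce.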
Total citations
[Bar chart: citations per year, 2008-2024]
Scholar articles
B Shahbaba, R Neal - Journal of Machine Learning Research, 2009