Authors
Mohammed Khalilia, Sounak Chakraborty, Mihail Popescu
Publication date
2011/12
Journal
BMC medical informatics and decision making
Volume
11
Pages
1-13
Publisher
BioMed Central
Description
Background
We present a method that uses the Healthcare Cost and Utilization Project (HCUP) dataset to predict individuals' disease risk from their medical diagnosis history. The methodology may be incorporated into a variety of applications, such as risk management, tailored health communication, and decision support systems in healthcare.
Methods
We used the National Inpatient Sample (NIS) data, which is publicly available through the Healthcare Cost and Utilization Project (HCUP), to train random forest classifiers for disease prediction. Because the HCUP data is highly imbalanced, we employed an ensemble learning approach based on repeated random sub-sampling. This technique divides the training data into multiple sub-samples while ensuring that each sub-sample is fully balanced. We compared the performance of support …
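The balanced sub-sampling step described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `balanced_subsamples` and `majority_vote`, the binary 0/1 labels, drawing the majority class without replacement within each sub-sample, and combining classifiers by simple majority vote are all hypothetical choices that the abstract does not confirm.

```python
import random


def balanced_subsamples(labels, n_subsamples, seed=0):
    """Return index lists that each hold equal counts of both classes.

    Assumption: every sub-sample keeps all minority-class records and adds
    an equally sized random draw from the majority class.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    subsamples = []
    for _ in range(n_subsamples):
        # Draw as many majority-class records as there are minority records.
        drawn = rng.sample(majority, len(minority))
        subsamples.append(sorted(minority + drawn))
    return subsamples


def majority_vote(predictions):
    """Combine the 0/1 predictions of several classifiers for one record."""
    return int(sum(predictions) * 2 > len(predictions))


# Toy data: 5 positive records out of 100, mimicking a highly imbalanced set.
labels = [1] * 5 + [0] * 95
subsets = balanced_subsamples(labels, n_subsamples=10)
```

Each index list in `subsets` would be used to train one classifier (for example, a random forest), and `majority_vote` shows one simple way the resulting per-classifier predictions could be combined into an ensemble decision.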
Total citations
[Chart: citations per year, from 2011 onward; per-year counts not recoverable]