Authors
Caitlin E Carey, Rebecca Shafee, Amanda Elliott, Duncan S Palmer, John Compitello, Masahiro Kanai, Liam Abbott, Patrick Schultz, Konrad J Karczewski, Samuel C Bryant, Caroline M Cusick, Claire Churchhouse, Daniel P Howrigan, Daniel King, George Davey Smith, Robbee Wedow, Benjamin M Neale, Raymond K Walters, Elise B Robinson
Publication date
2022/9/4
Journal
medRxiv
Pages
2022.09.02.22279546
Publisher
Cold Spring Harbor Laboratory Press
Description
Broad yet detailed data collected in biobanks capture variation reflective of human health and behavior, but insights are hard to extract given their complexity and scale. In the largest factor analysis to date, we distill hundreds of medical record codes, physical assays, and survey items from UK Biobank into 35 understandable latent constructs. The identified factors recapitulate known disease classifications, highlight the relevance of psychiatric constructs, improve measurement of health-related behavior, and disentangle elements of socioeconomic status. We demonstrate the power of this principled data reduction approach to clarify genetic signal, enhance discovery, and identify associations between underlying phenotypic structure and health outcomes such as mortality. We emphasize the importance of considering the interwoven nature of the human phenome when evaluating large-scale patterns relevant to public health.