Digital medicine and the curse of dimensionality

Visar Berisha; Chelsea Krantsevich; P. Richard Hahn; Shira Hahn; Gautam Dasarathy; Pavan Turaga; Julie Liss

doi:10.1038/s41746-021-00521-5

Research output: Contribution to journal › Review article › peer-review

109 Scopus citations

Abstract

Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high volume, personalized data stream aggregated over patients’ lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting—their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models—a phenomenon known as “the curse of dimensionality” in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers.

Original language	English (US)
Article number	153
Journal	npj Digital Medicine
Volume	4
Issue number	1
DOIs	https://doi.org/10.1038/s41746-021-00521-5
State	Published - Dec 2021

ASJC Scopus subject areas

Medicine (miscellaneous)
Health Informatics
Computer Science Applications
Health Information Management

Cite this

@article{f1a866ec77ac4b2793300faad728c0cb,
  title = "Digital medicine and the curse of dimensionality",
  abstract = "Digital health data are multimodal and high-dimensional. A patient{\textquoteright}s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high volume, personalized data stream aggregated over patients{\textquoteright} lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting---their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models---a phenomenon known as ``the curse of dimensionality'' in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers.",
  author = "Visar Berisha and Chelsea Krantsevich and Hahn, {P. Richard} and Shira Hahn and Gautam Dasarathy and Pavan Turaga and Julie Liss",
  note = "Publisher Copyright: {\textcopyright} 2021, The Author(s).",
  year = "2021",
  month = dec,
  doi = "10.1038/s41746-021-00521-5",
  language = "English (US)",
  volume = "4",
  journal = "npj Digital Medicine",
  issn = "2398-6352",
  publisher = "Nature Publishing Group",
  number = "1",
}

TY  - JOUR
T1  - Digital medicine and the curse of dimensionality
AU  - Berisha, Visar
AU  - Krantsevich, Chelsea
AU  - Hahn, P. Richard
AU  - Hahn, Shira
AU  - Dasarathy, Gautam
AU  - Turaga, Pavan
AU  - Liss, Julie
PY  - 2021/12
Y1  - 2021/12
N2  - Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high volume, personalized data stream aggregated over patients’ lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting—their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models—a phenomenon known as “the curse of dimensionality” in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers.
AB  - Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high volume, personalized data stream aggregated over patients’ lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting—their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models—a phenomenon known as “the curse of dimensionality” in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers.
UR  - http://www.scopus.com/inward/record.url?scp=85118356942&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85118356942&partnerID=8YFLogxK
U2  - 10.1038/s41746-021-00521-5
DO  - 10.1038/s41746-021-00521-5
M3  - Review article
AN  - SCOPUS:85118356942
SN  - 2398-6352
VL  - 4
JO  - npj Digital Medicine
JF  - npj Digital Medicine
IS  - 1
M1  - 153
ER  - 
