TY - GEN
T1 - Foundation Ark: Accruing and Reusing Knowledge for Superior and Robust Performance
T2 - 26th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2023
AU - Ma, Dong Ao
AU - Pang, Jiaxuan
AU - Gotway, Michael B.
AU - Liang, Jianming
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - Deep learning now offers expert-level, and sometimes even super-expert-level, performance, but achieving such performance demands massive annotated data for training (e.g., Google’s proprietary CXR Foundation Model (CXR-FM) was trained on 821,544 labeled, and mostly private, chest X-rays (CXRs)). Numerous datasets are publicly available in medical imaging, but they are individually small and heterogeneous in expert labels. We envision a powerful and robust foundation model that can be trained by aggregating numerous small public datasets. To realize this vision, we have developed Ark, a framework that accrues and reuses knowledge from heterogeneous expert annotations in various datasets. As a proof of concept, we have trained two Ark models on 335,484 and 704,363 CXRs, respectively, by merging several datasets, including ChestX-ray14, CheXpert, MIMIC-II, and VinDr-CXR. We evaluated them on a wide range of imaging tasks covering both classification and segmentation via fine-tuning, linear probing, and gender-bias analysis, and demonstrated Ark’s superior and robust performance over state-of-the-art (SOTA) fully/self-supervised baselines and Google’s proprietary CXR-FM. This enhanced performance is attributed to our simple yet powerful observation that aggregating numerous public datasets diversifies patient populations and accrues knowledge from diverse experts, yielding unprecedented performance while saving annotation cost. All code and pretrained models are released at GitHub.com/JLiangLab/Ark. We hope that Ark exerts an important impact on open science: accruing and reusing knowledge from expert annotations in public datasets can potentially surpass the performance of proprietary models trained on unusually large data, inspiring many more researchers worldwide to share code and datasets to build open foundation models, accelerate open science, and democratize deep learning for medical imaging.
AB - Deep learning now offers expert-level, and sometimes even super-expert-level, performance, but achieving such performance demands massive annotated data for training (e.g., Google’s proprietary CXR Foundation Model (CXR-FM) was trained on 821,544 labeled, and mostly private, chest X-rays (CXRs)). Numerous datasets are publicly available in medical imaging, but they are individually small and heterogeneous in expert labels. We envision a powerful and robust foundation model that can be trained by aggregating numerous small public datasets. To realize this vision, we have developed Ark, a framework that accrues and reuses knowledge from heterogeneous expert annotations in various datasets. As a proof of concept, we have trained two Ark models on 335,484 and 704,363 CXRs, respectively, by merging several datasets, including ChestX-ray14, CheXpert, MIMIC-II, and VinDr-CXR. We evaluated them on a wide range of imaging tasks covering both classification and segmentation via fine-tuning, linear probing, and gender-bias analysis, and demonstrated Ark’s superior and robust performance over state-of-the-art (SOTA) fully/self-supervised baselines and Google’s proprietary CXR-FM. This enhanced performance is attributed to our simple yet powerful observation that aggregating numerous public datasets diversifies patient populations and accrues knowledge from diverse experts, yielding unprecedented performance while saving annotation cost. All code and pretrained models are released at GitHub.com/JLiangLab/Ark. We hope that Ark exerts an important impact on open science: accruing and reusing knowledge from expert annotations in public datasets can potentially surpass the performance of proprietary models trained on unusually large data, inspiring many more researchers worldwide to share code and datasets to build open foundation models, accelerate open science, and democratize deep learning for medical imaging.
KW - Accruing and Reusing Knowledge
KW - Large-scale Pretraining
UR - http://www.scopus.com/inward/record.url?scp=85174576327&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174576327&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-43907-0_62
DO - 10.1007/978-3-031-43907-0_62
M3 - Conference contribution
AN - SCOPUS:85174576327
SN - 9783031439063
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 651
EP - 662
BT - Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 - 26th International Conference, Proceedings
A2 - Greenspan, Hayit
A2 - Madabhushi, Anant
A2 - Mousavi, Parvin
A2 - Salcudean, Septimiu
A2 - Duncan, James
A2 - Syeda-Mahmood, Tanveer
A2 - Taylor, Russell
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 8 October 2023 through 12 October 2023
ER -