Daiju Ueda, Toshimasa Matsumoto, Akira Yamamoto, Shannon L Walston, Yasuhito Mitsuyama, Hirotaka Takita, Kazuhisa Asai, Tetsuya Watanabe, Koji Abo, Tatsuo Kimura, Shinya Fukumoto, Toshio Watanabe, Tohru Takeshita, Yukio Miki
Lancet Digit Health. 2024 Jul 8:S2589-7500(24)00113-4. doi: 10.1016/S2589-7500(24)00113-4. Online ahead of print.
Background: Chest x-ray is a basic, cost-effective, and widely available imaging method that is used for static assessments of organic diseases and anatomical abnormalities, but its ability to estimate dynamic measurements such as pulmonary function is unknown. We aimed to estimate two major pulmonary functions from chest x-rays.
Methods: In this retrospective model development and validation study, we trained, validated, and externally tested a deep learning-based artificial intelligence (AI) model to estimate forced vital capacity (FVC) and forced expiratory volume in 1 s (FEV1) from chest x-rays. We included consecutively collected results of spirometry and any associated chest x-rays that had been obtained between July 1, 2003, and Dec 31, 2021, from five institutions in Japan (labelled institutions A-E). Eligible x-rays had been acquired within 14 days of spirometry and were labelled with the FVC and FEV1. X-rays from three institutions (A-C) were used for training, validation, and internal testing, with the testing dataset being independent of the training and validation datasets, and then x-rays from the two other institutions (D and E) were used for independent external testing. Performance for estimating FVC and FEV1 was evaluated by calculating the Pearson's correlation coefficient (r), intraclass correlation coefficient (ICC), mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) compared with the results of spirometry.
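As an illustration of the agreement metrics named above, the following is a minimal sketch of how r, ICC, MSE, RMSE, and MAE could be computed for paired spirometry and model values (in litres). It is not the authors' code. The function name `agreement_metrics` is hypothetical, and the ICC form used here, ICC(2,1) (two-way random effects, absolute agreement, single measurement), is an assumption, because the abstract does not state which form was used.

```python
import numpy as np


def agreement_metrics(measured, estimated):
    """Compare model estimates with spirometry values (both in litres).

    Hypothetical helper for illustration only; the ICC form, ICC(2,1),
    is an assumption and is not specified in the abstract.
    """
    y = np.asarray(measured, dtype=float)
    y_hat = np.asarray(estimated, dtype=float)
    err = y_hat - y

    mse = float(np.mean(err ** 2))        # mean square error, in L^2
    rmse = float(np.sqrt(mse))            # root mean square error, in L
    mae = float(np.mean(np.abs(err)))     # mean absolute error, in L
    r = float(np.corrcoef(y, y_hat)[0, 1])  # Pearson's correlation coefficient

    # ICC(2,1) from a two-way ANOVA decomposition, treating spirometry
    # and the model as two "raters" of the same subjects.
    data = np.column_stack([y, y_hat])
    n, k = data.shape
    grand = data.mean()
    ss_rows = k * np.sum((data.mean(axis=1) - grand) ** 2)  # between subjects
    ss_cols = n * np.sum((data.mean(axis=0) - grand) ** 2)  # between raters
    ss_err = np.sum((data - grand) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    icc = float(
        (ms_rows - ms_err)
        / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)
    )

    return {"r": r, "icc": icc, "mse": mse, "rmse": rmse, "mae": mae}
```

Under this sketch, a constant offset between model and spirometry leaves r at 1 but lowers the ICC, which is why an agreement measure is reported alongside the correlation.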
Findings: We included 141 734 x-ray and spirometry pairs from 81 902 patients from the five institutions. The training, validation, and internal test datasets included 134 307 x-rays from 75 768 patients (37 718 [50%] female, 38 050 [50%] male; mean age 56 years [SD 18]), and the external test datasets included 2137 x-rays from 1861 patients (742 [40%] female, 1119 [60%] male; mean age 65 years [SD 17]) from institution D and 5290 x-rays from 4273 patients (1972 [46%] female, 2301 [54%] male; mean age 63 years [SD 17]) from institution E. External testing for FVC yielded r values of 0·91 (99% CI 0·90-0·92) for institution D and 0·90 (0·89-0·91) for institution E, ICC of 0·91 (99% CI 0·90-0·92) and 0·89 (0·88-0·90), MSE of 0·17 L² (99% CI 0·15-0·19) and 0·17 L² (0·16-0·19), RMSE of 0·41 L (99% CI 0·39-0·43) and 0·41 L (0·39-0·43), and MAE of 0·31 L (99% CI 0·29-0·32) and 0·31 L (0·30-0·32). External testing for FEV1 yielded r values of 0·91 (99% CI 0·90-0·92) for institution D and 0·91 (0·90-0·91) for institution E, ICC of 0·90 (99% CI 0·89-0·91) and 0·90 (0·90-0·91), MSE of 0·13 L² (99% CI 0·12-0·15) and 0·11 L² (0·10-0·12), RMSE of 0·37 L (99% CI 0·35-0·38) and 0·33 L (0·32-0·35), and MAE of 0·28 L (99% CI 0·27-0·29) and 0·25 L (0·25-0·26).
Interpretation: This deep learning model allowed estimation of FVC and FEV1 from chest x-rays, showing high agreement with spirometry. The model offers an alternative to spirometry for assessing pulmonary function, which is especially useful for patients who are unable to undergo spirometry, and might enhance the customisation of CT imaging protocols based on insights gained from chest x-rays, improving the diagnosis and management of lung diseases. Future studies should investigate the performance of this AI model in combination with clinical information to enable more appropriate and targeted use.
Funding: None.