
Rudolph, Jan ORCID: 0000-0002-4849-8034; Huemmer, Christian; Preuhs, Alexander; Buizza, Giulia; Dinkel, Julien; Koliogiannis, Vanessa; Fink, Nicola; Goller, Sophia Samira; Schwarze, Vincent; Heimer, Maurice; Hoppe, Boj Friedrich; Liebig, Thomas; Ricke, Jens; Sabel, Bastian Oliver; Rueckel, Johannes (2025): Threshold optimization in AI chest radiography analysis: integrating real-world data and clinical subgroups. European Radiology Experimental, 9: 95. ISSN 2509-9280

Creative Commons Attribution (CC BY)
Published version
s41747-025-00632-8.pdf

Abstract

Background

Manufacturer-defined AI thresholds for chest x-ray (CXR) often lack customization options. Threshold optimization strategies utilizing users’ clinical real-world data along with pathology-enriched validation data may better address subgroup-specific and user-specific needs.

Materials and methods

A pathology-enriched dataset (study cohort, 563 CXRs) with pleural effusions, consolidations, pneumothoraces, nodules, and unremarkable findings was analysed by an AI system and six reference radiologists. The same AI model was applied to a routine dataset (clinical cohort, 15,786 consecutive routine CXRs). Iterative receiver operating characteristic analysis linked achievable sensitivities (study cohort) to the resulting AI alert rates in clinical routine inpatient or outpatient subgroups. “Optimized” thresholds (OTs) were defined as the point at which a further 1% increase in sensitivity would lead to more than a 1% rise in the AI alert rate. Threshold comparisons (OTs versus the AI vendor’s default thresholds (AIDTs) versus Youden’s thresholds) were based on 400 clinical cohort cases with expert radiologists’ reference.
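The OT stopping rule described above can be sketched in a few lines of Python. This is only an illustrative reading of the abstract, not the authors' implementation: the function names, the starting sensitivity of 50%, the convention that an alert fires when the AI score is at or above the threshold, and the interpretation of the "1% rise" as an absolute change in alert rate are all assumptions.

```python
def threshold_for_sensitivity(pos_scores, pct):
    """Highest threshold whose sensitivity on pathology-positive
    study-cohort scores is at least `pct` percent (hypothetical helper)."""
    n = len(pos_scores)
    for thr in sorted(set(pos_scores), reverse=True):
        hits = sum(s >= thr for s in pos_scores)
        if hits * 100 >= pct * n:  # integer comparison avoids float rounding
            return thr
    raise ValueError("pct must be between 0 and 100")


def alert_rate(routine_scores, thr):
    """Fraction of routine (clinical-cohort) CXRs that would trigger an alert."""
    return sum(s >= thr for s in routine_scores) / len(routine_scores)


def optimized_threshold(pos_scores, routine_scores, start_pct=50, max_rise=0.01):
    """Lower the threshold in steps of one sensitivity percentage point and
    stop before the step whose alert-rate rise exceeds `max_rise`.
    Assumption: the rise is measured as an absolute difference."""
    thr = threshold_for_sensitivity(pos_scores, start_pct)
    for pct in range(start_pct + 1, 101):
        candidate = threshold_for_sensitivity(pos_scores, pct)
        rise = alert_rate(routine_scores, candidate) - alert_rate(routine_scores, thr)
        if rise > max_rise:
            break  # the next sensitivity gain costs too many extra alerts
        thr = candidate
    return thr
```

Running `optimized_threshold` separately on inpatient and outpatient routine scores would give one threshold per subgroup, which mirrors the subgroup tailoring described in the abstract.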

Results

AIDTs, OTs, and Youden’s thresholds varied across scenarios, with OTs differing depending on whether they were tailored to inpatient or outpatient CXRs. Lowering the AIDT most reasonably improved sensitivity for pleural effusion, with increases from 46.8% (AIDT) to 87.2% (OT) for outpatients and from 76.3% (AIDT) to 93.5% (OT) for inpatients; similar trends appeared for consolidations. Conversely, for inpatient nodule detection, increasing the threshold improved accuracy from 69.5% (AIDT) to 82.5% (OT) without compromising sensitivity. Graphical analysis supports threshold selection by illustrating estimated sensitivities and clinical routine AI alert rates.
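For context, Youden's threshold, one of the comparators above, is the cutoff that maximizes Youden's index J = sensitivity + specificity − 1. A minimal sketch follows, assuming binary labels (1 = finding present, 0 = absent) and an alert whenever the score is at or above the threshold; the function name and the toy data in the usage note are hypothetical and are not taken from the study.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_thr, best_j = None, float("-inf")
    for thr in sorted(set(scores)):
        sens = sum(s >= thr for s in pos) / len(pos)  # true-positive rate
        spec = sum(s < thr for s in neg) / len(neg)   # true-negative rate
        j = sens + spec - 1
        if j > best_j:  # on ties, the lowest threshold is kept
            best_thr, best_j = thr, j
    return best_thr, best_j
```

With perfectly separated toy scores such as `[0.1, 0.2, 0.3, 0.4, 0.8, 0.9]` and labels `[0, 0, 0, 1, 1, 1]`, the function returns the lowest positive score, 0.4, with J = 1. Unlike the OT approach, this criterion does not take the alert rate in the routine cohort into account.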

Conclusion

An innovative, subgroup-specific AI threshold optimization is proposed; it is implemented automatically and is transferable to other AI algorithms and varying clinical subgroup settings.
