
Predictive diagnosis of papillary thyroid carcinoma using interpretable machine learning

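  • Summary: Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer, and its insidious early symptoms often lead to delayed diagnosis. To improve this situation, this study aimed to develop and validate a machine learning-based predictive model for PTC diagnosis in support of clinical decision-making. Demographic, ultrasound imaging, and laboratory features were collected and integrated from a cohort of 2,907 patients with benign nodules or PTC. Lasso regression was used for feature selection, nine machine learning algorithms were used to build PTC diagnostic models, and the SHAP method was used to analyze the interpretability of the best model. Lasso regression identified 11 biomarkers associated with PTC. The XGBoost model performed best in assisting PTC diagnosis, with an area under the receiver operating characteristic curve of 0.9066 and an accuracy of 0.8744. Calibration curves and decision curve analysis also showed that XGBoost had the best model performance. SHAP results showed that TI-RADS classification, nodule morphology, and ultrasound echogenicity were the three most important predictors of PTC diagnosis, while age, uric acid level, potassium concentration, thyroglobulin, anti-thyroid peroxidase antibodies, hypertension status, calcification, and alcohol consumption also showed varying degrees of influence. This study developed and validated a comprehensive machine learning model that combines multiple patient factors to identify PTC. The model shows considerable potential for PTC diagnosis, and the SHAP analysis improves its transparency and credibility, helping clinicians better understand the roles of the key predictors.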

     

    Abstract: Papillary thyroid carcinoma (PTC) is the most prevalent type of thyroid cancer, and the insidious nature of its early symptoms often leads to delayed diagnosis. To address this challenge, this study aimed to develop and validate a machine learning (ML)-based predictive model for PTC diagnosis to support clinical decision-making. We retrospectively enrolled a cohort of 2,907 patients, comprising 1,005 individuals with benign thyroid nodules and 1,902 with histologically confirmed PTC. Demographic, ultrasonographic, and laboratory data encompassing 70 initial features were collected. The dataset was partitioned into a training set and an independent test set in an 8:2 ratio. Feature selection was performed using least absolute shrinkage and selection operator (Lasso) regression, which identified 11 clinically relevant biomarkers associated with PTC: age, TI-RADS classification, ultrasound echogenicity, nodule morphology, calcification status, hypertension, alcohol consumption, uric acid level, potassium concentration, thyroglobulin (Tg), and anti-thyroid peroxidase antibodies (anti-TPO). These features were used to construct diagnostic models with nine ML algorithms: random forest (RF), adaptive boosting, extreme gradient boosting (XGBoost), classification and regression tree, light gradient boosting machine, gradient boosting decision tree, support vector machine, multilayer perceptron, and logistic regression (LR). To improve the generalizability and stability of the models, training combined 10-fold cross-validation with grid-search hyperparameter tuning, and the hyperparameters were selected to maximize the area under the receiver operating characteristic curve (AUC).
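The feature-selection step described above can be illustrated with a minimal sketch of Lasso fitted by cyclic coordinate descent. This is not the study's code: the synthetic data, the penalty value `lam`, and the function names `soft_threshold`, `standardize`, and `lasso_coordinate_descent` are all assumptions made for illustration, and the abstract does not state which Lasso implementation or penalty the authors used. The point is only to show how the L1 penalty drives weak coefficients to exactly zero, which is what turns a regression into a feature selector.

```python
import random

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values inside [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def standardize(X):
    """Scale every column to mean 0 and (population) variance 1."""
    n, p = len(X), len(X[0])
    out = [[0.0] * p for _ in range(n)]
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        for i in range(n):
            out[i][j] = (col[i] - mean) / sd
    return out

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    Assumes the columns of X are standardized, so each coordinate update
    reduces to a single soft-thresholding step.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]  # residual for b = 0 after centering y
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam)
            delta = beta[j] - new_bj
            if delta != 0.0:
                for i in range(n):
                    residual[i] += X[i][j] * delta
            beta[j] = new_bj
    return beta

# Synthetic data (hypothetical): 6 candidate features, only the first two drive the label.
random.seed(0)
X_raw = [[random.gauss(0, 1) for _ in range(6)] for _ in range(300)]
y = [1.0 if 2.0 * r[0] - 1.5 * r[1] + random.gauss(0, 0.5) > 0 else 0.0 for r in X_raw]

X = standardize(X_raw)
beta = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("coefficients:", [round(b, 3) for b in beta])
print("selected feature indices:", selected)
```

In practice the penalty strength is usually chosen by cross-validation rather than fixed by hand, and a binary outcome is often modeled with an L1-penalized logistic regression instead of the squared-error loss used in this sketch.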
Model performance was evaluated on the test set using metrics derived from the confusion matrix (accuracy, recall, F1 score, positive predictive value, and negative predictive value) together with the AUC. In addition, calibration curve analysis and decision curve analysis (DCA) were conducted to assess the reliability and clinical utility of the predicted probabilities across risk thresholds. Among the nine models, XGBoost performed best, achieving an accuracy of 0.8744, an F1 score of 0.9098, and an AUC of 0.9066. Calibration analysis showed that XGBoost and LR aligned most closely with the ideal diagonal, indicating well-calibrated probability estimates. DCA further showed that XGBoost and RF provided a higher net clinical benefit than the other models within the threshold probability range of 0.1–0.5, supporting their practical utility in guiding clinical interventions. To enhance interpretability and foster clinician trust, Shapley additive explanations (SHAP) were applied to the optimal XGBoost model. SHAP analysis identified TI-RADS classification, nodule morphology, and ultrasound echogenicity as the three most influential predictors of PTC. Secondary contributors included age, uric acid, potassium, Tg, anti-TPO, hypertension, calcification, and alcohol consumption, each with a different magnitude of impact. These findings not only align with established clinical knowledge but also highlight novel interactions between metabolic markers and imaging features in PTC pathogenesis. This study successfully developed a multimodal ML framework that integrates diverse patient characteristics for PTC diagnosis. The XGBoost-based model demonstrated superior diagnostic accuracy and clinical applicability, while the SHAP-based interpretation helped bridge the gap between algorithmic predictions and actionable clinical insights.
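The evaluation metrics named above follow from a few standard formulas, sketched here on a hypothetical set of eight predictions. The toy labels, the probabilities, and the helper names (`confusion_counts`, `classification_metrics`, `auc`, `net_benefit`) are assumptions for illustration and are unrelated to the study's data. The net benefit uses the usual DCA definition, TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (TP, FP, TN, FN) when probabilities >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def safe_div(num, den):
    """Avoid ZeroDivisionError when a class or prediction group is empty."""
    return num / den if den else 0.0

def classification_metrics(y_true, y_prob, threshold=0.5):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    recall = safe_div(tp, tp + fn)   # sensitivity
    ppv = safe_div(tp, tp + fp)      # positive predictive value (precision)
    npv = safe_div(tn, tn + fn)      # negative predictive value
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "recall": recall,
        "ppv": ppv,
        "npv": npv,
        "f1": safe_div(2 * ppv * recall, ppv + recall),
    }

def auc(y_true, y_prob):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return safe_div(wins, len(pos) * len(neg))

def net_benefit(y_true, y_prob, pt):
    """Decision-curve net benefit at threshold probability pt (0 < pt < 1)."""
    tp, fp, _, _ = confusion_counts(y_true, y_prob, pt)
    n = len(y_true)
    return tp / n - (fp / n) * pt / (1 - pt)

# Hypothetical test-set labels (1 = PTC, 0 = benign) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.55, 0.2]

print(classification_metrics(y_true, y_prob))
print("AUC:", auc(y_true, y_prob))
for pt in (0.1, 0.2, 0.3, 0.4, 0.5):
    print("net benefit at", pt, "=", net_benefit(y_true, y_prob, pt))
```

A decision curve plots the net benefit against pt for each model, alongside the "treat all" and "treat none" reference strategies; a model is useful at a given threshold when its curve lies above both.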
By elucidating the roles of the key predictors, this study advances personalized risk stratification and helps clinicians make informed decisions regarding PTC management. Future studies should focus on external validation in multicenter cohorts and on real-world implementation to assess translational impact.
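The attribution idea behind SHAP can be sketched by computing exact Shapley values for a very small scoring function. Everything in this example is hypothetical: `toy_risk_score`, its weights, and the three binary inputs are invented for illustration and are not the study's XGBoost model. SHAP libraries use faster algorithms for tree ensembles; brute-force enumeration is feasible here only because there are three features.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    A feature "absent" from a coalition is replaced by its baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_risk_score(features):
    """Hypothetical score with one interaction term; NOT the study's model."""
    high_tirads, irregular_shape, hypoechoic = features
    return (2.0 * high_tirads + 1.0 * irregular_shape + 0.5 * hypoechoic
            + 1.0 * high_tirads * irregular_shape)

x = [1, 1, 1]          # hypothetical nodule with all three findings present
baseline = [0, 0, 0]   # reference nodule with none of them

phi = shapley_values(toy_risk_score, x, baseline)
names = ["high_tirads", "irregular_shape", "hypoechoic"]
for name, contribution in zip(names, phi):
    print(name, round(contribution, 3))

# Efficiency property: contributions sum to the gap between the two predictions.
print(round(sum(phi), 3), toy_risk_score(x) - toy_risk_score(baseline))
```

The interaction term is split equally between the two features that share it, which is why each of them receives more than its own additive weight. Such attributions describe how the model uses its inputs and do not by themselves establish a biological mechanism.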

     
