
Interpretable prediction model for hand-foot-and-mouth disease incidence based on improved LSTM and XGBoost

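  • Summary: To address the low accuracy and poor interpretability of existing hand-foot-and-mouth disease (HFMD) incidence prediction models, this paper combines multiple meteorological factors and proposes an interpretable prediction model, ARIMA–LSTM–XGBoost, built on the autoregressive integrated moving average (ARIMA) model, long short-term memory (LSTM), extreme gradient boosting (XGBoost), the grey wolf optimizer (GWO), the genetic algorithm (GA), and Shapley additive explanations (SHAP). First, the ARIMA model captures the linear trend in the data, produces a forecast, and yields the residual data. Second, the residuals are fed into an LSTM network whose key parameters are adaptively optimized by GWO, so as to capture complex nonlinear relationships and long-term dependencies. Third, the global search capability of GA is used to optimize the parameters of XGBoost, compensating for its slow convergence. Finally, the improved ARIMA–LSTM and XGBoost models are fused by the reciprocal-error method to improve prediction accuracy, and SHAP is used to attribute feature importance and analyze interpretability. Comparative experiments on daily HFMD case counts and meteorological monitoring data from a southern city for 2014 to 2019 show that, compared with other machine learning prediction models, ARIMA–LSTM–XGBoost achieves higher prediction accuracy, predicts HFMD case counts accurately, and efficiently identifies latent features associated with HFMD.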
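The SHAP attribution mentioned above rests on Shapley values, which average a feature's marginal contribution over all feature coalitions. The following is a minimal sketch of that definition by brute-force enumeration, not the `shap` package and not the authors' implementation. The function `shapley_values`, the `toy_model`, the feature names, and all numbers are hypothetical and chosen only for illustration; practical SHAP tools use faster approximations or tree-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every feature coalition."""
    n = len(instance)
    features = range(n)

    def value(coalition):
        # Features outside the coalition are set to their baseline values.
        x = [instance[i] if i in coalition else baseline[i] for i in features]
        return model(x)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


def toy_model(x):
    # Hypothetical stand-in for a fitted predictor (not from the paper).
    temperature, humidity, rainfall = x
    return 2.0 * temperature + 0.5 * humidity + 0.1 * temperature * rainfall


instance = [30.0, 80.0, 10.0]   # made-up day to explain
baseline = [20.0, 60.0, 0.0]    # made-up reference day
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that makes the per-feature values comparable.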

     

    Abstract: As global warming intensifies, climate change is affecting the occurrence, transmission, and variation of infectious diseases. The adverse effects of weather-related infectious diseases on human health have become a major public concern worldwide. Accurate and reliable forecasting of daily hand-foot-and-mouth disease (HFMD) cases is essential for implementing preventive and intervention measures promptly. To address the low accuracy and poor interpretability of existing HFMD incidence prediction models, we propose an interpretable prediction model, ARIMA–LSTM–XGBoost, which integrates multiple meteorological factors with the autoregressive integrated moving average (ARIMA) model, long short-term memory (LSTM), extreme gradient boosting (XGBoost), the grey wolf optimizer (GWO), the genetic algorithm (GA), and Shapley additive explanations (SHAP). The model accounts for the potential impact of meteorological factors on HFMD incidence and aims, by integrating these algorithms, to predict HFMD incidence trends precisely and to analyze the key underlying factors. First, the ARIMA model analyzes historical HFMD incidence data to capture linear trends. Through differencing, autoregression, and moving-average operations, it extracts structural features from the time series and generates initial predictions together with a residual sequence. The residual sequence contains the complex information that the ARIMA model fails to capture and provides the basis for the subsequent nonlinear analysis. Second, LSTM is applied to the ARIMA residuals to capture complex nonlinear relationships and long-term dependencies. LSTM networks are well suited to modeling long-term dependencies in time-series data.
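The two-stage idea described above (a linear model first, then a nonlinear model on what is left over) can be illustrated with a minimal sketch. It assumes a least-squares AR(1) fit as a simplified stand-in for the full ARIMA model; the names `fit_ar1` and `linear_stage` and the toy case counts are hypothetical and do not come from the paper.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] (simplified stand-in for ARIMA)."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def linear_stage(series):
    """Return one-step-ahead linear fits and the residuals they leave behind."""
    c, phi = fit_ar1(series)
    fitted = [c + phi * previous for previous in series[:-1]]
    residuals = [actual - fit for actual, fit in zip(series[1:], fitted)]
    return fitted, residuals


# Made-up daily case counts, for illustration only.
daily_cases = [12, 15, 14, 18, 21, 19, 25, 30, 28, 26, 31, 35]
fitted, residuals = linear_stage(daily_cases)

# A second-stage learner (LSTM in the paper) would be trained on `residuals`;
# the combined forecast is the linear fit plus the predicted residual.
print(residuals)
```

Because the linear stage includes an intercept, its residuals average to zero, so the second stage only has to model the structure the linear fit missed rather than a level offset.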
To further improve LSTM performance, GWO is used to adaptively optimize the key parameters of the LSTM. Third, to exploit the strengths of XGBoost in handling nonlinear relationships and high-dimensional data while overcoming its complex parameter tuning and slow convergence, GA is used to optimize the parameters of XGBoost. By simulating the selection, crossover, and mutation mechanisms of biological evolution, GA searches the parameter space efficiently for good solutions. Finally, the predictions of ARIMA–LSTM and XGBoost are fused with a reciprocal-error weighting method to improve overall prediction accuracy. The SHAP method is then used to analyze feature importance. SHAP provides a fair and consistent way to assess the contribution of each feature to the model's predictions: it shows which factors are most critical for HFMD incidence prediction and quantifies their degree of influence, thereby enhancing the interpretability of the model. Comparative experiments on daily HFMD incidence and meteorological monitoring data from a southern city between 2014 and 2019 were conducted to evaluate the proposed model. The results show that ARIMA–LSTM–XGBoost achieves higher prediction accuracy than the other machine learning prediction models compared. The model not only predicts HFMD incidence trends accurately but also identifies the key meteorological factors influencing incidence, providing a scientific basis for public health decision-making.
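The reciprocal-error fusion step can be sketched as follows. The abstract does not state which error measure is used, so this example assumes mean absolute error on a held-out set; the function names and all numbers are hypothetical and only illustrate the weighting rule, in which each model's weight is proportional to the inverse of its error.

```python
def mean_abs_error(actual, predicted):
    """Mean absolute error (assumed error measure; the paper's choice is not stated)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def reciprocal_error_weights(errors):
    """Weights proportional to 1/error, normalized to sum to 1.

    Assumes every error is strictly positive; a zero error would need special handling.
    """
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse(predictions, weights):
    """Weighted average of the models' predictions at each time step."""
    return [sum(w * p for w, p in zip(weights, step)) for step in zip(*predictions)]


# Made-up validation data, for illustration only.
actual = [20.0, 24.0, 30.0, 27.0]
pred_arima_lstm = [22.0, 23.0, 27.0, 29.0]   # hypothetical ARIMA-LSTM output
pred_xgboost = [21.0, 24.0, 31.0, 27.0]      # hypothetical XGBoost output

errors = [mean_abs_error(actual, pred_arima_lstm), mean_abs_error(actual, pred_xgboost)]
weights = reciprocal_error_weights(errors)
fused = fuse([pred_arima_lstm, pred_xgboost], weights)
print(weights, fused)
```

In this toy case the errors are 2.0 and 0.5, so the weights come out as 0.2 and 0.8: the model with one quarter of the error receives four times the weight. In practice the weights would be estimated on data separate from the data used to report the fused model's accuracy.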

     
