Abstract: Conventional PM2.5 prediction methods rely on large, high-precision instruments to measure pollutant concentrations, which makes prediction costly. This work instead uses image data to predict PM2.5 concentration. Changes in atmospheric PM2.5 concentration are closely linked to an image's dark channel intensity, contrast, and HSI (hue-saturation-intensity) color difference: as the PM2.5 concentration rises, the dark channel intensity of non-sky regions, the image contrast, and the HSI color difference all decrease. Based on an analysis of the relationship between PM2.5 concentration and these image features, a column-generation PM2.5 prediction model based on an image mixture kernel is proposed. First, scene images were captured under a variety of weather conditions, with a sampling period of 1 h between 8:00 and 17:00 each day, and five image features covering contrast, dark channel intensity, and HSI color difference were extracted. Second, the data are characterized by a large sample size and an uneven sample distribution, so a prediction model built on a single kernel function struggles to meet the accuracy requirement. Therefore, following the principle of moving from simple to complex kernel structure, the linear, polynomial, and Gaussian kernel functions were selected to build a composite model. Then the Gram matrix of each kernel was computed on the training samples, and all Gram matrices were placed side by side to form a mixture kernel matrix. The column generation algorithm and the mixture kernel matrix were used to build the prediction model and solve for its parameters. Finally, simulation experiments show that the proposed model meets the prediction accuracy requirement and, compared with single-kernel prediction models, achieves higher prediction accuracy and better stability. A computational complexity analysis shows that the proposed model does not significantly increase the computational cost relative to a single-kernel prediction model.
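The mixture kernel matrix described in the abstract can be illustrated with a minimal NumPy sketch. It assumes that "placed side by side" means horizontal concatenation of the three n × n Gram matrices into one n × 3n matrix whose columns serve as candidates for column generation. The function names and the hyperparameters `degree`, `coef0`, and `gamma` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def linear_gram(X):
    # Linear kernel: K[i, j] = <x_i, x_j>
    return X @ X.T

def polynomial_gram(X, degree=2, coef0=1.0):
    # Polynomial kernel: K[i, j] = (<x_i, x_j> + coef0) ** degree
    # (degree and coef0 are assumed values, not the paper's)
    return (X @ X.T + coef0) ** degree

def gaussian_gram(X, gamma=0.5):
    # Gaussian kernel: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    # (gamma is an assumed value, not the paper's)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clamp round-off negatives

def mixture_gram(X, degree=2, coef0=1.0, gamma=0.5):
    # Concatenate the three Gram matrices column-wise: shape (n, 3n)
    return np.hstack([
        linear_gram(X),
        polynomial_gram(X, degree, coef0),
        gaussian_gram(X, gamma),
    ])

# Hypothetical data: 6 training samples with 5 image features each
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 5))
K_demo = mixture_gram(X_demo)
print(K_demo.shape)  # (6, 18)
```

Each column of the resulting matrix corresponds to one kernel evaluated against one training sample, so a column generation solver can add columns one at a time rather than working with the full matrix at once.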
Table 1. Correlation between characteristics and PM2.5

| Fig | Fid | Fih | Fis | Fii |
|---|---|---|---|---|
| -0.55 | -0.46 | -0.36 | -0.4 | -0.29 |

Table 2. Performance comparison of the four models

| Kernel | e_mse | e_mape (%) | R² |
|---|---|---|---|
| L | 11.959 | 13.603 | 0.814 |
| P | 13.924 | 15.601 | 0.751 |
| R | 11.188 | 12.213 | 0.843 |
| L+P+R | 9.553 | 9.955 | 0.895 |
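
The three columns of Table 2 can be read against the usual definitions of mean squared error, mean absolute percentage error, and the coefficient of determination. The sketch below uses those textbook definitions as an assumption. This excerpt does not give the formulas the authors used (for instance, e_mse may be a root-mean-square error), and the sample values are made up for illustration only.

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def mape_percent(y_true, y_pred):
    # Mean absolute percentage error, in percent (requires nonzero y_true)
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up PM2.5 values, for illustration only
y_true_demo = [50.0, 100.0, 150.0, 200.0]
y_pred_demo = [55.0, 90.0, 150.0, 210.0]
print(mse(y_true_demo, y_pred_demo))           # 56.25
print(mape_percent(y_true_demo, y_pred_demo))  # 6.25
print(r2(y_true_demo, y_pred_demo))            # 0.982
```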