A survey of model compression for deep neural networks

Li Jiangyun (李江昀); Zhao Yikai (趙義凱); Xue Zhuoer (薛卓爾); Cai Zheng (蔡錚); Li Qing (李擎)

doi:10.13374/j.issn2095-9389.2019.03.27.002

Abstract: In recent years, deep neural networks (DNNs) have repeatedly set new state-of-the-art results in tasks such as computer vision and natural language processing, and they have become one of the most closely watched research directions. The success of deep learning stems from models with more layers and more parameters, which gives them stronger nonlinear fitting ability. In addition, continuing advances in hardware make it possible to train deep learning models quickly. The development of deep learning is also driven by the growing amount of available annotated and unannotated data: large-scale data give models more to learn from and stronger generalization ability. Although deep neural networks perform remarkably well, their large number of parameters and high storage and computing costs make them difficult to deploy on embedded or mobile devices with limited hardware. Recent studies have found that deep models based on convolutional neural networks contain redundant parameters, including parameters that contribute nothing to the final result, which provides theoretical support for compressing deep network models. How to reduce model size while preserving model accuracy has therefore become a hot research issue. Model compression refers to reducing a trained model through some operation to obtain a lightweight network with equivalent performance. After compression, the network has fewer parameters and usually requires less computation, which greatly reduces computational and storage costs and allows the model to be deployed under restricted hardware conditions. This paper classifies and summarizes the achievements and progress made in recent years by researchers in China and abroad in model compression, covering network pruning, parameter sharing, quantization, network decomposition, and network distillation, and evaluates the advantages and disadvantages of these methods. It then discusses the existing problems and future development directions of model compression.
