Text detection in natural scenes: a literature review

BAI Zhi-cheng; LI Qing; CHEN Peng; GUO Li-qing

doi:10.13374/j.issn2095-9389.2020.03.24.002

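BAI Zhi-cheng^{1, 2},
LI Qing^{1, 2},
CHEN Peng^{3},
GUO Li-qing^{1}

1. School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China
2. Key Laboratory of Knowledge Automation for Industrial Processes, Ministry of Education, Beijing 100083, China
3. FINTECH Innovation Division, Postal Savings Bank of China, Beijing 100808, China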

Funding: National Natural Science Foundation of China (11296089)

Corresponding author: E-mail: liqing@ies.ustb.edu.cn

CLC number: TP18
Publication history
- Received: 2020-03-24
- Published: 2020-11-25

Abstract: Text detection is widely applied in autonomous driving and cross-modal image retrieval, and it is an important preliminary step in optical character-based text recognition. Text detection in complex natural scenes nevertheless remains challenging. Because text distribution and orientation vary across scenes and domains, existing computer vision-based text detection methods still leave room for improvement. To complicate matters, natural scene text, such as that on guideposts and shop signs, often contains words in different languages, and some characters may even be missing. These circumstances make feature extraction and feature description more difficult and thereby weaken the detection capability of existing computer vision and image processing methods. In this context, this paper surveys text detection in natural scenes, reviews the classical and newly presented techniques, and analyzes the research progress and status. First, natural scene text detection and its associated concepts are defined based on an analysis of the main characteristics of the problem. Next, the classic natural scene text detection technologies, namely connected component analysis-based methods and sliding detection window-based methods, are introduced, compared, and discussed. Common deep learning models for scene text detection from the past decade are then reviewed and divided into two main categories: region proposal-based models and segmentation-based models. Accordingly, the deep learning methods reviewed here build on typical detection and semantic segmentation frameworks, including Faster R-CNN, SSD, Mask R-CNN, FCN, and FCIS. Hybrid algorithms that combine region proposal ideas with segmentation strategies are also analyzed. As a supplement, several end-to-end text recognition strategies that can automatically identify characters in natural scenes are described. Finally, possible research directions and prospects in this field are discussed.
Keywords:
- text detection
- scene text
- connected component analysis
- image processing
- statistical learning
- deep learning

Figure 1. Sample images of natural scenes

Figure 2. Definition of the stroke width^[13]: (a) a typical stroke; (b) a pixel on the boundary of the stroke; (c) each pixel along the ray

Figure 3. Comparison of the detection results between polygon sliding windows and rectangular sliding windows^[25]: (a) detection results of the polygon sliding window; (b) detection results of the rectangular sliding window

Figure 4. Illustration of the TextSnake representation^[54]

Figure 5. Architecture of PixelLink^[56]

Table 1. Common datasets for text detection

Dataset	Presenter	Type	Sample size (training/test)	Language	Direction
CTW	THU, Tencent	Scene	32285	Chinese	Horizontal
ICDAR2003	ICDAR	Scene	2276 (1110/115)	English	Horizontal
ICDAR2011	ICDAR	Scene	484 (229/255)	English	Horizontal
ICDAR2011	ICDAR	Graph	522 (420/102)	English	Curve
ICDAR2013	ICDAR	Scene	463 (229/233)	English	Horizontal
ICDAR2013	ICDAR	Graph	551 (410/141)	English	Multiple
ICDAR2013	ICDAR	Video	28 (13/15)	English, French, Spanish	Multiple
MSRA-TD500	HUST	Scene	500 (300/200)	English, Chinese	Multiple
COCO-Text	Microsoft	Scene	63686	English	Multiple
RCTW-17	HUST	Scene	12263 (8034/4229)	Chinese, English	Horizontal
MLT2017	ICDAR	Scene	18000 (7200/10800)	Multi-lingual	Horizontal
MLT2019	ICDAR	Scene	20000 (10000/10000)	Multi-lingual	Horizontal
Total-Text	UM	Scene	1525 (1225/300)	English	Multiple
SCUT-CTW1500	SCUT	Scene	1500 (1000/500)	Multi-lingual	Multiple
ArT	UM, SCUT, Baidu	Scene	10166 (5603/4563)	English, Chinese	Multiple

References (101)

[1]	Dai J. Review of research on text detection technology in natural scenes. Comput CD Software Appl, 2013(18): 104
[2]	Zhuo L, Long H X, Peng Y F, et al. Image processing in encrypted domain: a comprehensive survey. J Beijing Univ Technol, 2016, 42(2): 174
[3]	Fan Y L. Natural Scene Text Detection Algorithm Research Based on Mobile Terminal [Dissertation]. Xi’an: Xidian University, 2015
[4]	Wang R M, Sang N, Ding D, et al. Text detection in natural scene image: a survey. Acta Autom Sin, 2018, 44(12): 2113
[5]	Li J G, Li L J, Zhang Y, et al. A method which is suitable for the training of convolutional neural networks with multiple classifiers. J Beijing Univ Technol, 2018, 44(10): 1291
[6]	Li X L, Zhang B, Wang K, et al. The development and application of artificial intelligence. J Beijing Univ Technol, 2020, 46(6): 583
[7]	Canny J. A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell, 1986, 8(6): 679
[8]	Liu C M, Wang C H, Dai R W. Text detection in images based on unsupervised classification of edge-based features // Eighth International Conference on Document Analysis and Recognition (ICDAR'05). Seoul, 2005: 610
[9]	Sobel I E. Camera Models and Machine Perception [Dissertation]. Stanford: Stanford University, 1970
[10]	Shivakumara P, Phan T Q, Tan C L. A laplacian approach to multi-oriented text detection in video. IEEE Trans Pattern Anal Mach Intell, 2011, 33(2): 412 doi: 10.1109/TPAMI.2010.166
[11]	Yu C, Song Y, Meng Q, et al. Text detection and recognition in natural scene with edge analysis. IET Computer Vision, 2015, 9(4): 603 doi: 10.1049/iet-cvi.2013.0307
[12]	Buta M, Neumann L, Matas J. FASText: Efficient unconstrained scene text detector // Proceedings of the IEEE International Conference on Computer Vision. Santiago, 2015: 1206
[13]	Epshtein B, Ofek E, Wexler Y. Detecting text in natural scenes with stroke width transform // 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. California, 2010: 2963
[14]	Yao C, Bai X, Liu W Y, et al. Detecting texts of arbitrary orientations in natural images // 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, 2012: 1083
[15]	Huang W L, Lin Z, Yang J C, et al. Text localization in natural images using stroke feature transform and text covariance descriptors // Proceedings of the 2013 IEEE International Conference on Computer Vision. Sydney, 2013: 1241
[16]	Matas J, Chum O, Urban M, et al. Robust wide-baseline stereo from maximally stable extremal regions. Image Vision Computing, 2004, 22(10): 761 doi: 10.1016/j.imavis.2004.02.006
[17]	Gomez L, Karatzas D. Object proposals for text extraction in the wild // 2015 13th International Conference on Document Analysis and Recognition (ICDAR). Tunis, 2015: 206
[18]	Neumann L, Matas J. A method for text localization and recognition in real-world images // 10th Asian Conference on Computer Vision. Queenstown, 2010: 770
[19]	Neumann L, Matas J. Real-time scene text localization and recognition // 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, 2012: 3538
[20]	Sun L, Huo Q. A component-tree based method for user-intention guided text extraction // Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012). Tsukuba, 2012: 633
[21]	Sun L, Huo Q, Jia W, et al. A robust approach for text detection from natural scene images. Pattern Recognit, 2015, 48(9): 2906 doi: 10.1016/j.patcog.2015.04.002
[22]	Zhou P F. Research on Text Detection and Recognition in Natural Scene Images [Dissertation]. Xi’an: Xi’an University of Technology, 2019
[23]	Chen X R, Yuille A L. Detecting and reading text in natural scenes // Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington, 2004: II
[24]	Lee J J, Lee P H, Lee S W, et al. AdaBoost for text detection in natural scene // 2011 International Conference on Document Analysis and Recognition. Beijing, 2011: 429
[25]	Liu Y L, Jin L W. Deep matching prior network: Toward tighter multi-oriented text detection // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 1962
[26]	Yin B C, Wang W T, Wang L C. Review of deep learning research. J Beijing Univ Technol, 2015, 41(1): 48
[27]	Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell, 2017, 39(6): 1137 doi: 10.1109/TPAMI.2016.2577031
[28]	Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector // European Conference on Computer Vision. Amsterdam, 2016: 21
[29]	Dai J F, Li Y, He K M, et al. R-FCN: object detection via region-based fully convolutional networks // Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, 2016: 379
[30]	Yu Z, Wang Q Q, Lü Y. Scene text detection based on feature fusion network. Comput Syst Appl, 2018, 27(10): 1
[31]	Karatzas D, Mestre S R, Mas J, et al. ICDAR 2011 robust reading competition-challenge 1: reading text in born-digital images (web and email) // 2011 International Conference on Document Analysis and Recognition. Beijing, 2011: 1485
[32]	Karatzas D, Shafait F, Uchida S, et al. ICDAR 2013 robust reading competition // 2013 12th International Conference on Document Analysis and Recognition. Washington, 2013: 1484
[33]	Ma J Q, Shao W Y, Ye H, et al. Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans Multimedia, 2018, 20(11): 3111 doi: 10.1109/TMM.2018.2818020
[34]	Jiang Y Y, Zhu X Y, Wang X B, et al. R2CNN: rotational region CNN for orientation robust scene text detection [J/OL]. arXiv preprint (2017-06-30)[2020-03-01]. https://arxiv.org/abs/1706.09579
[35]	Zhong Z Y, Sun L, Huo Q. An anchor-free region proposal network for Faster R-CNN-based text detection approaches. Int J Doc Anal Recognit, 2019, 22(3): 315 doi: 10.1007/s10032-019-00335-y
[36]	Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J/OL]. arXiv preprint (2015-04-10)[2020-03-01]. https://arxiv.org/abs/1409.1556
[37]	Shi B G, Bai X, Belongie S. Detecting oriented text in natural images by linking segments // Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 2550
[38]	Liao M H, Shi B G, Bai X, et al. TextBoxes: a fast text detector with a single deep neural network // Thirty-First AAAI Conference on Artificial Intelligence. San Francisco, 2017: 4161
[39]	Liao M H, Shi B G, Bai X. TextBoxes++: a single-shot oriented scene text detector. IEEE Trans Image Process, 2018, 27(8): 3676 doi: 10.1109/TIP.2018.2825107
[40]	He P, Huang W L, He T, et al. Single shot text detector with regional attention // Proceedings of the 2017 IEEE International Conference on Computer Vision. Venice, 2017: 3047
[41]	Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-ResNet and the impact of residual connections on learning // Thirty-First AAAI Conference on Artificial Intelligence. San Francisco, 2017: 4278
[42]	Liao M H, Zhu Z, Shi B G, et al. Rotation-sensitive regression for oriented scene text detection // Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 5909
[43]	Liu X, Zhang R, Zhou Y S, et al. Scene text detection with feature pyramid network and linking segments // 2019 International Conference on Document Analysis and Recognition (ICDAR). Sydney, 2019: 508
[44]	Zhang S, Liu Y L, Jin L W, et al. Feature enhancement network: a refined scene text detector // Thirty-Second AAAI Conference on Artificial Intelligence. New Orleans, 2018: 2612
[45]	Tian Z, Huang W L, He T, et al. Detecting text in natural image with connectionist text proposal network // European Conference on Computer Vision. Amsterdam, 2016: 56
[46]	Wang X B, Jiang Y Y, Luo Z B, et al. Arbitrary shape scene text detection with adaptive text region representation // Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 6449
[47]	He K M, Gkioxari G, Dollár P, et al. Mask R-CNN // Proceedings of the 2017 IEEE International Conference on Computer Vision. Venice, 2017: 2961
[48]	Shelhamer E, Long J, Darrell T. Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell, 2017, 39(4): 640 doi: 10.1109/TPAMI.2016.2572683
[49]	Li Y, Qi H Z, Dai J F, et al. Fully convolutional instance-aware semantic segmentation // Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 4438
[50]	Lyu P Y, Liao M H, Yao C, et al. Mask TextSpotter: an end-to-end trainable neural network for spotting text with arbitrary shapes // Proceedings of the European Conference on Computer Vision. Munich, 2018: 67
[51]	Liao M H, Lyu P Y, He M H, et al. Mask TextSpotter: an end-to-end trainable neural network for spotting text with arbitrary shapes. IEEE Trans Pattern Anal Mach Intell, 2019: 1
[52]	Xie E Z, Zang Y H, Shao S, et al. Scene text detection with supervised pyramid context network. Proc AAAI Conf Artif Intell, 2019, 33: 9038
[53]	Zhang Z, Zhang C Q, Shen W, et al. Multi-oriented text detection with fully convolutional networks // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, 2016: 4159
[54]	Long S B, Ruan J Q, Zhang W J, et al. TextSnake: a flexible representation for detecting text of arbitrary shapes // Proceedings of the European Conference on Computer Vision. Munich, 2018: 19
[55]	He T, Huang W L, Qiao Y, et al. Accurate text localization in natural image with cascaded convolutional text network [J/OL]. arXiv preprint (2016-03-31)[2020-03-01]. https://arxiv.org/abs/1603.09423
[56]	Deng D, Liu H F, Li X L, et al. PixelLink: Detecting scene text via instance segmentation [J/OL]. arXiv preprint (2018-01-04)[2020-03-01]. https://arxiv.org/abs/1801.01315
[57]	Yang Q P, Cheng M L, Zhou W M, et al. IncepText: a new inception-text module with deformable PSROI pooling for multi-oriented scene text detection // Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, 2018: 1071
[58]	Dai Y C, Huang Z, Gao Y T, et al. Fused text segmentation networks for multi-oriented scene text detection // 2018 24th International Conference on Pattern Recognition (ICPR). Beijing, 2018: 3604
[59]	Wang W H, Xie E Z, Li X, et al. Shape robust text detection with progressive scale expansion network // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 9328
[60]	Wang W H, Xie E Z, Song X G, et al. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network // Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Seoul, 2019: 8439
[61]	He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 770
[62]	Baek Y, Lee B, Han D, et al. Character region awareness for text detection // Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 9357
[63]	Tian Z T, Shu M, Lyu P Y, et al. Learning shape-aware embedding for scene text detection // Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 4229
[64]	Liao M H, Wan Z Y, Yao C, et al. Real-time scene text detection with differentiable binarization [J/OL]. arXiv preprint (2019-12-03)[2020-03-01]. https://arxiv.org/abs/1911.08947
[65]	Lyu P Y, Yao C, Wu W H, et al. Multi-oriented scene text detection via corner localization and region segmentation // Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 7553
[66]	Li Y, Yu Y J, Li Z F, et al. Pixel-Anchor: a fast oriented scene text detector with combined networks [J/OL]. arXiv preprint (2018-11-19)[2020-03-01]. http://export.arxiv.org/abs/1811.07432
[67]	Jiang F, Hao Z H, Liu X R. Deep scene text detection with connected component proposals [J/OL]. arXiv preprint (2017-08-17)[2020-03-01]. http://export.arxiv.org/abs/1708.05133
[68]	Qiao L, Tang S L, Cheng Z Z, et al. Text perceptron: towards end-to-end arbitrary-shaped text spotting [J/OL]. arXiv preprint (2020-02-17)[2020-03-01]. https://arxiv.org/abs/2002.06820
[69]	Zhou X Y, Yao C, Wen H, et al. EAST: an efficient and accurate scene text detector // Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 2642
[70]	Li J R, Zhou Z J, Su Z Z, et al. A new parallel detection-recognition approach for end-to-end scene text extraction // 2019 International Conference on Document Analysis and Recognition (ICDAR). Sydney, 2019: 1358
[71]	He T, Tian Z, Huang W L, et al. An end-to-end TextSpotter with explicit alignment and attention // 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, 2018: 5020
[72]	Kim K H, Hong S, Roh B, et al. PVANET: Deep but lightweight neural networks for real-time object detection [J/OL]. arXiv preprint (2019-09-30)[2020-03-01]. https://arxiv.org/abs/1608.08021
[73]	He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition // IEEE Conference on Computer Vision & Pattern Recognition. Las Vegas, 2016: 770
[74]	Wang F F, Zhao L M, Li X, et al. Geometry-aware scene text detection with instance transformation network // Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 1381
[75]	Duan J Q, Xu Y J, Kuang Z H, et al. Geometry normalization networks for accurate scene text detection // Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Seoul, 2019: 9136
[76]	Liu Z C, Lin G S, Yang S, et al. Towards robust curve text detection with conditional spatial expansion // Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 7261
[77]	Liu Y L, Chen H, Shen C H, et al. ABCNet: real-time scene text spotting with adaptive Bezier-curve network [J/OL]. arXiv preprint (2020-02-25)[2020-03-01]. https://arxiv.org/abs/2002.10200v2
[78]	Wang H, Lu P, Zhang H, et al. All you need is boundary: toward arbitrary-shaped text spotting [J/OL]. arXiv preprint (2019-11-21)[2020-03-01]. https://arxiv.org/abs/1911.09550
[79]	Zhang A X. Research on Natural Scene Text Detection Algorithms Based on Deep Learning [Dissertation]. Beijing: North China University of Technology, 2019
[80]	Zhou X Y, Gao Z H. Research on inclined text location method of natural scene based on YOLO. Comput Eng Appl, 2020, 56(9): 213 doi: 10.3778/j.issn.1002-8331.1911-0032
[81]	Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 779
[82]	Niu Z D, Li H D. Natural scene text detection algorithm with attention mechanism. Comput Appl Software, 2019, 36(9): 198 doi: 10.3969/j.issn.1000-386x.2019.09.035
[83]	Yuan T L, Zhu Z, Xu K, et al. Chinese text in the wild [J/OL]. arXiv preprint (2018-02-26)[2020-03-01]. https://arxiv.org/abs/1803.00085
[84]	Lucas S M, Panaretos A, Sosa L, et al. ICDAR 2003 robust reading competitions // Seventh International Conference on Document Analysis and Recognition. Edinburgh, 2003: 682
[85]	Veit A, Matera T, Neumann L, et al. COCO-Text: dataset and benchmark for text detection and recognition in natural images [J/OL]. arXiv preprint (2016-06-19)[2020-03-01]. https://arxiv.org/abs/1601.07140
[86]	Shi B G, Yao C, Liao M H, et al. ICDAR2017 competition on reading Chinese text in the wild (RCTW-17) // 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Kyoto, 2017: 1429
[87]	Ch’ng C K, Chan C S. Total-Text: a comprehensive dataset for scene text detection and recognition // 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Kyoto, 2017: 935
[88]	Nayef N, Yin F, Bizid I, et al. ICDAR2017 robust reading challenge on multi-lingual scene text detection and script identification-RRC-MLT // 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Kyoto, 2017: 1454
[89]	Chng C K, Liu Y L, Sun Y P, et al. ICDAR2019 robust reading challenge on arbitrary-shaped text-RRC-ArT // 2019 International Conference on Document Analysis and Recognition (ICDAR). Sydney, 2019: 1571
[90]	Liu Y L, Jin L W, Zhang S T, et al. Detecting curve text in the wild: new dataset and new solution [J/OL]. arXiv preprint (2017-12-06)[2020-03-01]. https://arxiv.org/abs/1712.02170
[91]	Wang J X, Wang Z Y, Tian X. Review of natural scene text detection and recognition based on deep learning. J Software, 2020, 31(5): 1465
[92]	Liu Y L, Jin L W, Zhang S T, et al. Curved scene text detection via transverse and longitudinal sequence connection. Pattern Recognit, 2019, 90: 337 doi: 10.1016/j.patcog.2019.02.002
[93]	Peng B F. Tencent Cloud University expert sharing | Decoding OCR text recognition technology [EB/OL]. Tencent Cloud Community Column (2019-08-13) [2020-03-01]. https://cloud.tencent.com/developer/article/1473262
[94]	Youdao Zhiyun. Text recognition OCR service [EB/OL]. Youdao Intelligent Cloud·AI Open Platform (2019-12-17) [2020-03-01]. https://ai.youdao.com/product-ocr.s
[95]	Baidu Cloud. Universal text recognition [EB/OL]. Baidu Intelligent Cloud (2020-02-05) [2020-03-01]. https://cloud.baidu.com/product/ocr/general
[96]	Chuangyejun. Chuang Lan 253: the image recognition OCR technology of the Chuanglan Myriads platform [EB/OL]. Chuang Lan 253 Column (2018-07-19) [2020-03-01]. https://blog.csdn.net/chuangyejun/article/details/81113833
[97]	ZJULearning. Pixel_link [EB/OL]. GitHub (2019-11-21) [2020-03-01]. https://github.com/ZJULearning/pixel_link
[98]	Huoyijie. AdvancedEAST [EB/OL]. GitHub (2020-04-03) [2020-03-01]. https://github.com/huoyijie/AdvancedEAST
[99]	Dengdan. Seglink [EB/OL]. GitHub (2018-05-03) [2020-03-01]. https://github.com/dengdan/seglink
[100]	Tianzhi0549. CTPN [EB/OL]. GitHub (2020-04-03) [2020-03-01]. https://github.com/tianzhi0549/CTPN
[101]	Tian Z, Huang W L, He T, et al. Detecting text in natural image with connectionist text proposal network // ECCV 2016: European Conference on Computer Vision. Amsterdam, 2016: 56