
Entity and attribute extraction of terrorism event based on text corpus

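  • Summary: Based on semantic role analysis, this paper proposes a triple-based method for extracting the entities and attributes of terrorism events, providing technical support for monitoring and early warning of terrorism-related activity in cyberspace. First, text corpus data from the Anti-Terrorism Information Site of the Northwest University of Political Science and Law are collected, cleaned, and otherwise preprocessed. A naive Bayes text classification algorithm identifies terrorism-event texts, and the keyword extraction algorithm TF-IDF (term frequency-inverse document frequency) is used to build a terrorism-specific lexicon, which is then annotated with part-of-speech tags using natural language processing techniques. Next, semantic role analysis and syntactic dependency analysis are used to extract four types of terrorism-related triple structures: subject-predicate-object relations, postposed-attributive verb-object relations, person name/place name/organization entities, and subject-predicate verb-complement relations containing a preposition-object relation. Finally, regular expressions and analysis of the part-of-speech-tagged terrorism-specific terms are used to extract six types of entity attributes from the short texts of the four triple types: time of occurrence, location, casualties, attack method, weapon type, and terrorist organization. In experiments on 4221 collected articles, the F1 scores for all six types of entity attributes exceeded 80%. The method has practical significance for monitoring and early warning of terrorism events in cyberspace and for maintaining public safety.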
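The lexicon-building step relies on TF-IDF keyword scoring. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the documents are already tokenized (for Chinese text, word segmentation would come first), uses the plain `tf * log(N / df)` weighting without smoothing (the exact variant used in the paper is not stated), and runs on toy English documents invented for illustration. The function names `tf_idf` and `top_keywords` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    docs is a list of token lists; tokenization is assumed to have
    been done beforehand.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        scores.append({
            # tf = relative frequency in this document,
            # idf = log(N / df); a term found in every document scores 0.
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(docs, k=2):
    """Return the k highest-scoring terms of each document."""
    result = []
    for doc_scores in tf_idf(docs):
        ranked = sorted(doc_scores, key=doc_scores.get, reverse=True)
        result.append(ranked[:k])
    return result


# Toy, already-tokenized documents (invented for illustration).
docs = [
    ["bomb", "attack", "market", "bomb"],
    ["attack", "police", "station", "attack"],
    ["attack", "market", "rifle"],
]
keywords = top_keywords(docs, k=1)
```

In this toy corpus "attack" appears in every document, so its score is zero and it never becomes a keyword, while terms concentrated in a single document rank highest. Candidate terms selected this way could then be reviewed and tagged with parts of speech to form the lexicon.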

     

    Abstract: Driven by complex international factors, terrorist incidents have become increasingly frequent in many countries in recent years, posing a great threat to the global community. As emerging technologies spread through military and commercial fields, terrorist organizations have begun to use them for destructive activities. With the development of the Internet and information technology, terrorism has spread rapidly in cyberspace. Terrorist organizations have created terrorism websites, established multinational networks, released recruitment information, and even conducted training through mainstream websites with a worldwide reach. Compared with traditional terrorist activities, cyber terrorist activities are more destructive, and cybercrime and cyber terrorism have become the most serious challenges for societies. Terrorist organizations exploit the Internet to disseminate extremist ideas rapidly and to recruit large numbers of terrorists and supporters around the world, especially in developed Western countries. They even use the Internet and "dark net" networks to conduct terrorist training, which conceals their activities. As a result, "lone wolf" terrorist attacks have emerged in an endless stream in various countries and are difficult to prevent. This study proposes a method for extracting the entities and attributes of terrorism events based on semantic role analysis, providing technical support for monitoring and early warning of terrorism activities in cyberspace. First, a naive Bayes text classification algorithm is used to identify terrorism events in the cleaned text corpus collected from the Anti-Terrorism Information Site of the Northwest University of Political Science and Law.
The TF-IDF (term frequency-inverse document frequency) keyword extraction algorithm is then applied to the classified corpus to construct a terrorism-specific vocabulary, which is annotated with part-of-speech tags using natural language processing techniques. Next, semantic role analysis and syntactic dependency analysis are used to extract four types of terrorism-related triples: subject-predicate-object relations, postposed-attributive verb-object relations, person name/place name/organization entities, and subject-predicate verb-complement relations containing a preposition-object relation. Finally, regular expressions and the part-of-speech-tagged terrorism-specific vocabulary are used to extract six types of entity attributes (occurrence time, occurrence location, casualties, attack method, weapon type, and terrorist organization) from the short texts of the four triple types. In experiments on 4221 collected articles, the F1 scores for all six types of entity attributes exceeded 80%. The proposed method therefore has practical significance for monitoring and early warning of terrorism events in cyberspace and for maintaining public safety.
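The final stage, combining regular expressions with a typed lexicon and scoring the result with F1, can be illustrated roughly as follows. This is a sketch under stated assumptions rather than the paper's code: the patterns, the `LEXICON` entries, the attribute labels, and the example sentence are all hypothetical, English text stands in for the Chinese corpus, and lexicon lookup is simplified to substring matching.

```python
import re

# Hypothetical typed lexicon: term -> attribute type.
LEXICON = {
    "car bomb": "weapon_type",
    "suicide bombing": "attack_method",
    "rifle": "weapon_type",
}

# Hypothetical patterns for dates and casualty counts in English text.
DATE_RE = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}\b"
)
CASUALTY_RE = re.compile(
    r"\b(\d+) (?:people |civilians |soldiers )?(?:were )?"
    r"(killed|injured|wounded)\b"
)


def extract_attributes(text):
    """Return a set of (attribute_type, value) pairs found in text."""
    found = set()
    for match in DATE_RE.finditer(text):
        found.add(("time", match.group(0)))
    for match in CASUALTY_RE.finditer(text):
        found.add(("casualties", f"{match.group(1)} {match.group(2)}"))
    lowered = text.lower()
    for term, attr_type in LEXICON.items():
        # Simplification: substring lookup instead of token-level matching.
        if term in lowered:
            found.add((attr_type, term))
    return found


def f1_score(predicted, gold):
    """F1 = harmonic mean of precision and recall over two sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented example sentence and gold annotations.
text = ("A car bomb exploded in Kabul on March 3, 2019; "
        "12 people were killed and 30 injured.")
predicted = extract_attributes(text)
gold = {
    ("time", "March 3, 2019"),
    ("location", "Kabul"),
    ("casualties", "12 killed"),
    ("casualties", "30 injured"),
    ("weapon_type", "car bomb"),
}
score = f1_score(predicted, gold)
```

Here the sketch recovers four of the five gold attributes and misses the location, which has no pattern or lexicon entry. In the pipeline described above, place names would come from the named-entity triples rather than from a regular expression.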

     
