Scientific Information Research

Keywords

political data, risk identification, text classification, TextRNN, declaration commodities, intelligence value, HS code

Abstract

[Purpose/significance] To exploit the intelligence value of government affairs big data using intelligence methods, we take the automatic classification of customs declaration commodities as an example and explore a typical application of large-scale data accumulated over a long period in automatic data processing and analysis, so as to demonstrate that value in practice. [Method/process] We extract feature information about customs declaration commodities and their categories from the accumulated government affairs data, model and train on it with deep learning methods, and use the trained model to automatically classify customs declaration commodities of unknown category, with the aim of risk avoidance. [Result/conclusion] We first compare different deep learning text classification models. After analyzing the intelligence, we choose to build a TextRNN model with an attention mechanism. The experimental results show that this model performs best among those compared, classifies customs declaration commodities better for risk avoidance, and more fully exploits the intelligence value of customs declaration data. [Limitation] The discussion of risk characteristics in the experiment is limited. Feature selection relied only on historical research, expert opinions and correlation values, so other effective characteristics may have been filtered out.
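
As an illustration of the attention mechanism named in the abstract, the following is a minimal sketch of attention pooling over a sequence of recurrent hidden states. It is not the authors' implementation: the function names `softmax` and `attention_pool`, the scoring form (dot product of `tanh(h)` with a context vector), and the toy inputs are all assumptions chosen for brevity. In a real TextRNN the hidden states would come from a trained recurrent layer and the context vector would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Collapse T hidden vectors into one vector using attention weights.

    hidden_states: list of T vectors, each of length d (assumed RNN outputs).
    context: vector of length d (hypothetical; learned in a real model).
    Returns (pooled_vector, weights).
    """
    # Score each time step: dot product of tanh(h_t) with the context vector.
    scores = [
        sum(math.tanh(h_i) * c_i for h_i, c_i in zip(h, context))
        for h in hidden_states
    ]
    weights = softmax(scores)
    d = len(context)
    # Weighted sum of the hidden states, one output component at a time.
    pooled = [
        sum(w * h[j] for w, h in zip(weights, hidden_states))
        for j in range(d)
    ]
    return pooled, weights


# Toy example: two time steps, two-dimensional hidden states.
pooled, weights = attention_pool([[0.0, 0.0], [1.0, 1.0]], [1.0, 1.0])
```

The pooled vector would then be passed to a classification layer that predicts the commodity category. The step that aligns more closely with the context vector receives the larger weight.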

First Page

74

Digital Object Identifier (DOI)

10.19809/j.cnki.kjqbyj.2020.04.007

