
Scientific Information Research

Keywords

policy text; text mining; bibliometrics; quantitative research; research review

Abstract

[Purpose/significance] Quantitative analysis of policy texts, enabled by information technology, is an emerging interdisciplinary research direction. [Method/process] This paper systematically reviews current progress in quantitative research on policy texts from three perspectives: data sources, methods, and applications. It first summarizes the metadata of policy texts and the distribution of their data sources. At the method level, it groups existing work into three categories: content analysis, bibliometric methods, and text mining. At the application level, policy text mining is used mainly for policy topic mining, policy target and tool mining, political position analysis, analysis of the distribution of issuing agencies, and policy diffusion research. [Result/conclusion] Future research should pay more attention to mining policy content and combine it with quantitative analysis.

First Page

92

