Journal of Scientific Information Research
Keywords
digital humanities; large language model; GLM-4; prompt engineering; chain of thought; machine learning
Abstract
[Purpose/significance] This paper explores the evolution of research methods in the field of digital humanities with the help of large language model technology. [Method/process] Focusing on CNKI journal article data, this paper selects the general-purpose Chinese large language model GLM-4, uses prompt engineering and chain-of-thought prompting to extract and cluster research methods from paper abstracts, and analyzes their evolution through quantitative processing. [Result/conclusion] The study shows that GLM-4 can reliably identify and extract research methods from complex abstract data. A chronological analysis finds that methods such as interview surveys and grounded theory are gradually being marginalized, while machine learning and related methods are becoming mainstream. This article reveals the evolution of research methods in Chinese digital humanities and gives digital humanities research a richer and more comprehensive cultural connotation.
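The extraction-and-counting pipeline described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' code: `call_llm` is a hypothetical stand-in for a GLM-4 API call (here stubbed with a fixed answer), and the chain-of-thought prompt wording is an assumption modeled on the paper's description.

```python
from collections import Counter, defaultdict

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a GLM-4 completion call.
    In the paper's pipeline this would send the prompt to the model;
    here it returns a fixed answer purely for illustration."""
    return "grounded theory; machine learning"

# Chain-of-thought style template (assumed wording): the model is walked
# through intermediate steps before emitting the final method list.
COT_PROMPT = (
    "You are a bibliometrics assistant. Read the abstract below.\n"
    "Step 1: find the sentences that describe how the study was conducted.\n"
    "Step 2: name each research method mentioned in those sentences.\n"
    "Step 3: output only the method names, separated by semicolons.\n\n"
    "Abstract: {abstract}"
)

def extract_methods(abstract: str) -> list:
    """Prompt the model with the CoT template and parse its semicolon list."""
    answer = call_llm(COT_PROMPT.format(abstract=abstract))
    return [m.strip() for m in answer.split(";") if m.strip()]

def method_trend(records):
    """Tally extracted methods per publication year: {year: Counter}."""
    trend = defaultdict(Counter)
    for year, abstract in records:
        trend[year].update(extract_methods(abstract))
    return trend
```

With per-year counts in hand, the chronological evolution (e.g. the decline of "grounded theory" relative to "machine learning") can be read off by comparing each method's frequency across years.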
First Page
65
Last Page
74
Submission Date
12-Aug-2024
Revision Date
13-Sep-2024
Acceptance Date
27-Sep-2024
Published Date
01-Jan-2025
Reference
[1] 朱本军, 聂华. 跨界与融合: 全球视野下的数字人文: 首届北京大学“数字人文论坛”会议综述[J]. 大学图书馆学报, 2016, 34 (05): 16-21.
[2] OUYANG L, WU J, XU J, et al. Training language models to follow instructions with human feedback[EB/OL]. (2022-05-04) [2024-07-16]. https://arxiv.org/pdf/2203.02155.
[3] 车万翔, 窦志成, 冯岩松, 等. 大模型时代的自然语言处理: 挑战、机遇与发展[J]. 中国科学: 信息科学, 2023, 53 (09): 1645-1687.
[4] 黄水清, 刘浏, 王东波. 计算人文教育的回顾和探讨[J]. 情报理论与实践, 2024, 47 (07): 1-9.
[5] 刘炜, 叶鹰. 数字人文的技术体系与理论结构探讨[J]. 中国图书馆学报, 2017, 43 (05): 32-41.
[6] 柯平, 宫平. 数字人文研究演化路径与热点领域分析[J]. 中国图书馆学报, 2016, 42 (06): 13-30.
[7] 张琪, 王东波, 黄水清, 等. 史书多维知识重组与可视化研究: 以《史记》为对象[J]. 情报学报, 2022, 41 (02): 130-141.
[8] 张卫, 王昊, 王东波, 等. 以数据关联促文学认知: 古诗隐喻文化图式的语义组织方法[J]. 图书情报工作, 2024, 68 (04): 109-123.
[9] 傅予, 李博然, 徐拥军. 数字人文视角下文化资源数字化开发和传播要素与影响机理研究[J]. 图书情报工作, 2023, 67 (20): 45-57.
[10] 邱伟云. 我国台湾数字人文研究进程 (2009—2017) [J]. 图书馆论坛, 2020, 40 (07): 9-19.
[11] 邵晓宁, 叶鹰. 国内外数字学术类研究的高引论文特征简析: 兼论数字学术、数字人文、人文计算关联[J]. 情报理论与实践, 2020, 43 (12): 120-125.
[12] 王丽华, 骆雨辰. 数字人文领域主题热点与演化趋势: 基于ADHO数字人文年会论文的分析[J]. 图书情报工作, 2024, 68 (13): 15-27.
[13] ZHOU Z, SHI J X, SONG P X, et al. LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model[EB/OL]. (2024-06-07) [2024-07-16]. https://arxiv.org/pdf/2406.04614.
[14] ZHANG H, CHEN J, JIANG F, et al. Huatuogpt, towards taming language model to be a doctor[EB/OL]. (2023-05-24) [2024-07-16]. https://arxiv.org/pdf/2305.15075.
[15] ZHANG X, YANG Q. XuanYuan 2.0: A large Chinese financial chat model with hundreds of billions parameters[C]//Proceedings of the 32nd ACM International Conference on Information and Knowledge Management. New York: ACM, 2023: 4435-4439.
[16] 夏翠娟, 林海青, 刘炜. 面向循证实践的中文古籍数据模型研究与设计[J]. 中国图书馆学报, 2017, 43 (06): 16-34.
[17] 李娜, 包平. 面向数字人文的馆藏方志古籍地名自动识别模型构建[J]. 图书馆, 2018, (05): 67-73.
[18] 刘浏, 齐月, 刘雏菲, 等. 计算人文下的古籍引书研究及全文本知识库的构建[J]. 情报学报, 2023, 42 (12): 1498-1512.
[19] 赵志枭, 胡蝶, 刘畅, 等. 人文社科领域中文通用大模型性能评测[J]. 图书情报工作, 2024, 68 (13): 132-143.
[20] 赵雪, 赵志枭, 孙凤兰, 等. 面向语言文学领域的大语言模型性能评测研究[J]. 外语电化教学, 2023 (06): 57-65, 114.
[21] 张宏玲, 沈立力, 韩春磊, 等. 大语言模型对图书馆数字人文工作的挑战及应对思考[J]. 图书馆杂志, 2023, 42 (11): 31-39, 61.
[22] WEI J, TAY Y, BOMMASANI R, et al. Emergent abilities of large language models[EB/OL]. (2022-10-26) [2024-07-16]. https://arxiv.org/pdf/2206.07682.
[23] PARK D, AN G T, KAMYOD C, et al. A Study on Performance Improvement of Prompt Engineering for Generative AI with a Large Language Model[J]. Journal of Web Engineering, 2023 (08): 22.
[24] 张玲玲, 黄务兰. 基于ChatGPT API和提示词工程的专利知识图谱构建[J/OL]. 情报杂志, 1-8 [2024-09-10]. http://kns.cnki.net/kcms/detail/61.1167.G3.20240828.1212.006.html.
[25] WEI J, WANG X, SCHUURMANS D, et al. Chain-of-thought prompting elicits reasoning in large language models[J]. Advances in Neural Information Processing Systems, 2022, 35: 24824-24837.
[26] LIU P, YUAN W, FU J, et al. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing[EB/OL]. (2021-07-28) [2024-07-16]. https://arxiv.org/pdf/2107.13586.
[27] ZHANG Z, ZHANG A, LI M, et al. Automatic chain of thought prompting in large language models[EB/OL]. (2022-10-07) [2024-07-16]. https://arxiv.org/pdf/2210.03493.
[28] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[EB/OL]. (2019-05-24) [2024-07-16]. https://arxiv.org/pdf/1810.04805.
[29] 何悄吟, 王晓光. 数实共生: 预见数字人文未来图景: 2023年中国数字人文年会综述[J]. 数字人文研究, 2024, 4 (01): 3-17.
Digital Object Identifier (DOI)
10.19809/j.cnki.kjqbyj.2025.01.006
Recommended Citation
SUN, Guangyao and WANG, Dongbo (2025) "The Evolution of Research Methods in the Digital Humanities Perspective: A Quantitative Analysis Based on CNKI Data and a Large Language Model," Journal of Scientific Information Research: Vol. 7: Iss. 1, Article 6.
DOI: 10.19809/j.cnki.kjqbyj.2025.01.006
Available at:
https://eng.kjqbyj.com/journal/vol7/iss1/6