Evaluation of innovativeness in stem cell research articles using DeepSeek

doi:10.3976/j.issn.1002-4026.2025172

Abstract

Current approaches to assessing the innovativeness of scientific and technological papers rely predominantly on expert peer review, which is often inefficient and subject to bias. Quantitative metrics are more objective but largely retrospective, and they offer limited foresight or explanatory insight. This study proposes a framework for evaluating the innovativeness of stem cell scientific and technological papers using the DeepSeek model. From a corpus of stem cell research articles, the titles and abstracts of representative papers were vectorized with the bge-large-en-v1.5 model to construct a semantic vector database. The deepseek-reasoner model was then applied to extract innovation-related features, which were vectorized into an innovation feature database. The two databases were integrated using a weighted fusion strategy. Target papers were evaluated through FAISS-based vector retrieval and Top-k similarity matching within the unified database, yielding a final innovativeness score and ranking. These results were compared against scores generated by the unassisted DeepSeek model to assess the framework's effectiveness in evaluating innovativeness in biomedical scientific and technological papers. Empirical results indicate that the DeepSeek model tends to overestimate innovation when used without calibration. After targeted training, however, the model shows substantially improved stability and validity in innovation assessment, indicating its potential for identifying innovative dimensions and distinguishing features in scientific literature.

Key words: generative large language models, innovation assessment, semantic embeddings, automated evaluation, peer review
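
The fusion and retrieval steps named in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two databases are weighted or how the final score is derived, so the weighted sum of normalized vectors, the weight `alpha`, and the rule "score = 1 minus mean Top-k cosine similarity" are all hypothetical choices. A brute-force search stands in for the FAISS index, and short toy vectors stand in for bge-large-en-v1.5 embeddings.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse(semantic_vec, feature_vec, alpha=0.5):
    """Assumed fusion rule: weighted sum of the two unit vectors, renormalized.

    alpha is a hypothetical weight on the semantic embedding; (1 - alpha)
    goes to the innovation-feature embedding.
    """
    if len(semantic_vec) != len(feature_vec):
        raise ValueError("both embeddings must have the same dimension")
    s = l2_normalize(semantic_vec)
    f = l2_normalize(feature_vec)
    mixed = [alpha * a + (1.0 - alpha) * b for a, b in zip(s, f)]
    return l2_normalize(mixed)


def top_k_similar(query, corpus, k=3):
    """Brute-force Top-k search by cosine similarity (stand-in for FAISS)."""
    sims = [
        (sum(q * c for q, c in zip(query, vec)), idx)
        for idx, vec in enumerate(corpus)
    ]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    return sims[:k]


def novelty_score(query, corpus, k=3):
    """Assumed scoring rule: 1 minus the mean similarity of the Top-k neighbors."""
    top = top_k_similar(query, corpus, k)
    if not top:
        raise ValueError("corpus is empty")
    mean_sim = sum(sim for sim, _ in top) / len(top)
    return 1.0 - mean_sim


# Toy unified database: each entry fuses a semantic and a feature vector.
database = [
    fuse([1.0, 0.0], [1.0, 0.0]),
    fuse([0.0, 1.0], [0.0, 1.0]),
]
target = fuse([1.0, 0.0], [1.0, 0.0])
score = novelty_score(target, database, k=1)
```

Under this assumed rule, a target paper whose fused vector coincides with an existing entry scores near 0, and one orthogonal to its nearest neighbors scores near 1. Ranking several target papers then amounts to sorting them by `novelty_score`.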

Cite this article

MA Chunjian, WANG Chao, XU Haiyun, WANG Lekang, ZHANG Xin, CHEN Liang. Evaluation of innovativeness in stem cell research articles using DeepSeek[J]. Shandong Science, 0, (): 1-.



Open Access This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), which permits third parties to freely share (i.e., copy and redistribute the material in any medium or format) and adapt (i.e., remix, transform, or build upon the material) the articles published in this journal, provided that appropriate credit is given, a link to the license is provided, and any changes made are indicated. The material may not be used for commercial purposes. For details of the CC BY-NC 4.0 license, please visit: https://creativecommons.org/licenses/by-nc/4.0
