Shandong Science

   

Construction of a knowledge graph for the educational statistical indicator system based on large language models

WANG Pengyu1, JIANG Shuming1*, WEI Zhiqiang1, YU Jun1, ZHANG Mengmeng2

  1. Information Research Institute of Shandong Academy of Sciences, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China; 2. Liaocheng Science and Technology Information Research Center, Liaocheng 252000, China
  • Received: 2025-07-25; Accepted: 2025-11-02; Online: 2026-01-07
  • Contact: JIANG Shuming; E-mail: jsm@qlu.edu.cn

Abstract: Statistical work on educational development currently faces challenges such as inefficient knowledge retrieval and high technical barriers to data use. This study focuses on the knowledge contained in the indicator system that underpins educational statistics and proposes a large language model (LLM)-based method for constructing a knowledge graph of that indicator system, providing knowledge support for addressing these challenges. The research framework comprises four layers: data processing, ontology construction, graph construction, and application prospects. In the data processing layer, original documents are converted into Markdown format and cleaned so that their structure can be parsed more reliably. In the ontology construction layer, a seven-step method is used to construct the ontology of the educational statistical indicator system. In the graph construction layer, an enhanced prompting strategy that integrates chain-of-thought prompting with the ontology structure guides the LLM in knowledge extraction, and Neo4j is used for storage and visualization. Experimental results show that this strategy achieved an F1 score of 96.22% for entity–attribute extraction and 92.23% for entity–relation extraction, significantly outperforming the basic prompting strategy and verifying the effectiveness of chain-of-thought prompting in complex information extraction. The constructed knowledge graph effectively represents the complex knowledge in the educational statistical indicator system, provides a knowledge foundation for intelligent information retrieval and natural language data querying, and offers insights for knowledge organization in other statistical fields.
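
As a rough illustration of the graph construction layer described in the abstract, the following Python sketch shows one way an ontology-guided chain-of-thought prompt might be assembled and how extracted triples could be scored with F1. It is a minimal sketch under stated assumptions, not the authors' implementation: the ontology fragment (`ONTOLOGY`), the prompt wording, and the function names `build_cot_prompt` and `precision_recall_f1` are all hypothetical, and exact-match scoring over triples is only one plausible evaluation convention.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical ontology fragment for illustration only; the abstract does not
# list the actual classes or relations of the paper's ontology.
ONTOLOGY: Dict[str, List[str]] = {
    "classes": ["Indicator", "IndicatorCategory", "Unit"],
    "relations": ["belongs_to", "measured_in"],
}


def build_cot_prompt(passage: str, ontology: Dict[str, List[str]]) -> str:
    """Compose an extraction prompt that states the ontology constraints
    and asks the model to reason step by step before emitting triples."""
    classes = ", ".join(ontology["classes"])
    relations = ", ".join(ontology["relations"])
    return (
        f"Allowed entity classes: {classes}\n"
        f"Allowed relations: {relations}\n"
        "Think step by step: first list the entities, then assign each one "
        "a class, then state the relations between them.\n"
        "Finally, output one triple per line as (head, relation, tail).\n\n"
        f"Text:\n{passage}"
    )


def precision_recall_f1(
    predicted: Iterable[Triple], gold: Iterable[Triple]
) -> Tuple[float, float, float]:
    """Score predicted triples against gold triples by exact match."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

In such a setup, the prompt would be sent to the LLM, the returned triples parsed, and the result compared against manually annotated triples; the validated triples could then be written to a graph database such as Neo4j.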

Key words: Educational Statistics, Indicator System, Knowledge Graph, Large Language Model, Ontology Construction, Chain-of-Thought Prompting

CLC Number: 

  • TP391.1

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), which permits third parties to freely share (i.e., copy and redistribute the material in any medium or format) and adapt (i.e., remix, transform, or build upon the material) the articles published in this journal, provided that appropriate credit is given, a link to the license is provided, and any changes made are indicated. The material may not be used for commercial purposes. For details of the CC BY-NC 4.0 license, please visit: https://creativecommons.org/licenses/by-nc/4.0