Bin Ge, Chunhui, Chong Zhang, Jibing Wu, National University of Defense Technology, China
Identifying high-quality frontier topics in massive volumes of scientific research data is of paramount importance for helping researchers direct their work accurately. Traditional analysis methods suffer from limited cross-domain adaptability, high resource consumption, and low efficiency. To address these challenges, this study proposes an AI-agent-based frontier topic mining method built on a generation-verification dual-agent (D-Agents) architecture. Specifically, prompt engineering is used to build a generative agent (G-Agent), which leverages the semantic understanding capabilities of large-scale pre-trained language models to automatically generate candidate frontier topics. A verification agent (V-Agent) then applies a multi-dimensional evaluation system that automatically verifies each candidate topic along dimensions including academic novelty, topic accuracy, and topic completeness. The method is validated on three manually labeled datasets covering computer vision (CV), natural language processing (NLP), and machine learning (ML). Experimental results show that the D-Agents framework can perform frontier topic mining across multiple domains simultaneously. On the three labeled datasets (CV-DataSet, NLP-DataSet, and ML-DataSet), D-Agents achieves precision above 74% while maintaining recall above 85%. Compared with the traditional bibliometric method, the proposed method significantly improves the precision and recall of frontier topic mining in the recommendation systems domain, where performance reaches 86%. By combining automatic generation with self-verification, the D-Agents framework effectively mitigates the hallucination problem of the G-Agent and thereby substantially improves the efficiency of frontier topic mining.
LLMs, Frontier Topics, Prompt Engineering, D-Agents, G-Agent, V-Agent, RAG
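The generate-then-verify flow summarized in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `Verdict` class, the `mine_topics` and `precision_recall` functions, the threshold rule (every dimension must clear the same cutoff), and the stub agents, which stand in for LLM calls so that the example runs without any model. Exact string matching against gold labels is also a simplification.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    """Hypothetical per-topic scores in [0, 1] for the three dimensions."""
    novelty: float
    accuracy: float
    completeness: float

    def passes(self, threshold: float) -> bool:
        # Assumed acceptance rule: every dimension must reach the threshold.
        return min(self.novelty, self.accuracy, self.completeness) >= threshold


def mine_topics(
    abstracts: List[str],
    generate: Callable[[List[str]], List[str]],
    verify: Callable[[str, List[str]], Verdict],
    threshold: float = 0.5,
) -> List[str]:
    """Keep only generated candidates that the verifier accepts."""
    accepted: List[str] = []
    for candidate in generate(abstracts):
        if candidate not in accepted and verify(candidate, abstracts).passes(threshold):
            accepted.append(candidate)
    return accepted


def precision_recall(predicted: List[str], gold: List[str]) -> Tuple[float, float]:
    """Exact-match precision and recall against manually labeled topics."""
    pred, ref = set(predicted), set(gold)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


def stub_generate(abstracts: List[str]) -> List[str]:
    # Stand-in for the G-Agent: fixed candidates, one unsupported by the corpus.
    return ["vision transformers", "diffusion models", "quantum teleportation"]


def stub_verify(topic: str, abstracts: List[str]) -> Verdict:
    # Stand-in for the V-Agent: a topic counts as accurate only if some
    # abstract mentions it; novelty and completeness are placeholders.
    mentioned = any(topic in text.lower() for text in abstracts)
    return Verdict(novelty=1.0, accuracy=1.0 if mentioned else 0.0, completeness=1.0)


if __name__ == "__main__":
    corpus = [
        "We study vision transformers for object detection.",
        "Diffusion models generate high-fidelity images.",
    ]
    topics = mine_topics(corpus, stub_generate, stub_verify)
    print(topics)
    print(precision_recall(topics, ["vision transformers", "diffusion models"]))
```

In this toy run the unsupported candidate is rejected by the verifier, which is the role the abstract assigns to the V-Agent in reducing hallucinated topics. A real system would replace both stubs with prompted model calls and would likely need fuzzier matching than exact string equality when scoring against labels.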