CiteGuard: Faithful Citation Attribution for LLMs via Retrieval-Augmented Validation
Abstract
CiteGuard, a retrieval-aware agent framework, enhances citation accuracy in LLM-generated text by aligning citations with human choices, achieving near-human performance.
Large Language Models (LLMs) have emerged as promising assistants for scientific writing. However, concerns remain about the quality and reliability of the generated text, among them citation accuracy and faithfulness. While most recent work relies on methods such as LLM-as-a-Judge, the reliability of LLM-as-a-Judge alone is itself in doubt. In this work, we reframe citation evaluation as a problem of citation attribution alignment: assessing whether LLM-generated citations match those a human author would include for the same text. We propose CiteGuard, a retrieval-aware agent framework designed to provide more faithful grounding for citation validation. CiteGuard improves on the prior baseline by 12.3% and achieves up to 65.4% accuracy on the CiteME benchmark, on par with human performance (69.7%). It also enables the identification of alternative but valid citations.
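The alignment criterion described above can be sketched in a few lines. This is a minimal illustration, not CiteGuard's actual agent pipeline: the function names and the representation of retrieved alternatives are hypothetical, and real matching would compare paper identifiers rather than normalized titles.

```python
def normalize(title: str) -> str:
    """Normalize a paper title for comparison (case and whitespace only)."""
    return " ".join(title.lower().split())

def attribution_aligned(predicted: str, human: str,
                        retrieved_alternatives: list[str]) -> bool:
    """A predicted citation is aligned if it matches the human author's
    choice, or appears among retrieval-validated alternatives for the
    same excerpt (the 'alternative but valid citations' case)."""
    if normalize(predicted) == normalize(human):
        return True
    return normalize(predicted) in {normalize(a) for a in retrieved_alternatives}
```

In this toy form, an exact match with the human citation succeeds immediately, while a mismatch is still accepted if retrieval surfaced the predicted paper as a valid alternative for the same claim.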
Community
Project page: https://kathcym.github.io/CiteGuard_Page/
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Generation-Time vs. Post-hoc Citation: A Holistic Evaluation of LLM Attribution (2025)
- SQuAI: Scientific Question-Answering with Multi-Agent Retrieval-Augmented Generation (2025)
- Enhancing Factual Accuracy and Citation Generation in LLMs via Multi-Stage Self-Verification (2025)
- Concise and Sufficient Sub-Sentence Citations for Retrieval-Augmented Generation (2025)
- FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering (2025)
- VeriCite: Towards Reliable Citations in Retrieval-Augmented Generation via Rigorous Verification (2025)
- Who Gets Cited Most? Benchmarking Long-Context Language Models on Scientific Articles (2025)