GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers
TL;DR Highlight
GPTZero scanned 4841 NeurIPS 2025 papers and found 53 with 100+ fabricated citations (hallucinated references) — a serious academic integrity issue.
Who Should Read
Academic researchers, conference organizers, and anyone evaluating whether AI-generated content in scholarly work is detectable and problematic.
Core Mechanics
- GPTZero's AI detection tool scanned the entire NeurIPS 2025 accepted paper corpus (4841 papers) and flagged 53 papers with 100 or more citations that appear to be hallucinated.
- Hallucinated citations are plausible-sounding but nonexistent references — they often have realistic author names, paper titles, and venues but don't correspond to real publications.
- This is a different problem from AI-generated text detection — it's specifically about fabricated scholarly references, which can cascade through the literature when others cite the citing paper.
- The 53-paper figure likely understates the problem — GPTZero's threshold was 100+ hallucinated citations, so papers with fewer fabricated references weren't flagged.
- NeurIPS 2025 acceptance rate is around 25% — if accepted papers have this issue, rejected papers likely have higher rates.
- The academic community has no established process for systematically checking citations for hallucination at scale, making this a systemic gap.
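The threshold-based flagging described above (100+ unverifiable references per paper) can be sketched as a simple count-and-cutoff pass. This is a hypothetical illustration, not GPTZero's actual pipeline; the function and data names are invented for the example.

```python
# Hypothetical sketch of threshold-based flagging, mirroring the 100+ cutoff.
def flag_papers(papers: dict[str, list[bool]], threshold: int = 100) -> list[str]:
    """papers maps a paper ID to per-citation verification results
    (True = found in a bibliographic database, False = not found).
    Returns IDs of papers with at least `threshold` unverifiable citations."""
    return [
        paper_id
        for paper_id, checks in papers.items()
        if sum(1 for ok in checks if not ok) >= threshold
    ]

# Toy usage with a threshold of 2 for illustration
sample = {
    "paper-a": [True, True, True],           # all citations verified
    "paper-b": [True, False, False, False],  # three unverifiable citations
}
print(flag_papers(sample, threshold=2))  # ['paper-b']
```

Note the design trade-off the bullets point out: any fixed threshold trades false positives for false negatives, so papers with fewer fabricated references slip through.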
Evidence
- GPTZero published their methodology and a list of flagged paper IDs, enabling community verification.
- Several researchers independently verified a sample of flagged citations and confirmed the hallucination pattern.
- HN discussion was alarmed: academic citation networks are a foundational trust mechanism, and systematic hallucination corrupts that infrastructure.
- Debate about whether the authors knew (intentional misconduct) or didn't know (accidentally included AI-generated reference lists without checking). Both are problematic for different reasons.
- Conference organizers don't have the capacity to manually verify all citations — this points to a need for automated citation verification at submission time.
How to Apply
- If you're writing academic papers with any AI assistance: run every reference through a citation verifier (Semantic Scholar, CrossRef, Google Scholar) before submission.
- For reviewers: spot-check 5-10 citations in every paper you review — hallucinated references often cluster in the related work section and may not be obvious at a glance.
- Conference organizers: consider adding automated citation verification as part of the submission pipeline — tools like GPTZero and Semantic Scholar can flag suspicious references.
- For research teams: establish a policy that every reference must be independently verified before inclusion, regardless of how the draft was generated.
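As one way to act on the verification advice above, a reference can be checked against the CrossRef works API. This is a minimal sketch under stated assumptions: the function names are invented here, and the exact-match comparison is deliberately strict, so real papers with minor title variants may be rejected and should be reviewed by hand.

```python
import requests

def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())

def found_on_crossref(title: str, timeout: float = 10.0) -> bool:
    """Search CrossRef for a title and check whether the top hit matches
    (exact match after normalization). A False result means 'unverified',
    not proof of fabrication -- check other databases before concluding."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.title": title, "rows": 1},
        timeout=timeout,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return False
    top_title = items[0].get("title", [""])[0]
    return normalize(top_title) == normalize(title)
```

In practice a team would run this over the whole reference list and hand-review every unmatched entry, since legitimate preprints and workshop papers are not always indexed.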
Code Example
```python
# Example of verifying paper existence using the Semantic Scholar Graph API.
# Caveat: this is a keyword search, so a hit only means *some* paper matched
# the query -- compare titles before trusting the result for a specific citation.
import requests

def verify_citation(title: str) -> bool:
    url = "https://api.semanticscholar.org/graph/v1/paper/search"
    resp = requests.get(url, params={"query": title, "limit": 1}, timeout=10)
    resp.raise_for_status()
    data = resp.json()
    return data.get("total", 0) > 0

# Usage
print(verify_citation("Attention Is All You Need"))    # expect True
print(verify_citation("Fake Paper by John Doe 2024"))  # likely False
```
Terminology
Citation hallucination: When an LLM generates references to nonexistent papers — plausible-sounding but fabricated author names, titles, and venues.
Citation network: The web of references between academic papers — a foundational infrastructure for scientific knowledge building, vulnerable to corruption by systematic hallucination.
GPTZero: An AI detection tool originally focused on AI-generated text, recently expanded to detect hallucinated academic citations.