GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers
TL;DR Highlight
GPTZero scanned 4,841 accepted NeurIPS 2025 papers and found 100 fabricated citations (hallucinated references) spread across 53 of them: a serious academic integrity issue.
Who Should Read
Academic researchers, conference organizers, and anyone evaluating whether AI-generated content in scholarly work is detectable and problematic.
Core Mechanics
- GPTZero's AI detection tool scanned the entire NeurIPS 2025 accepted-paper corpus (4,841 papers) and flagged 53 papers that together contain 100 citations that appear to be hallucinated.
- Hallucinated citations are plausible-sounding but nonexistent references — they often have realistic author names, paper titles, and venues but don't correspond to real publications.
- This is a different problem from AI-generated text detection — it's specifically about fabricated scholarly references, which can cascade through the literature when others cite the citing paper.
- Both figures are likely lower bounds: an automated scan can only flag references that clearly match no real publication, so borderline fabrications go uncounted.
- NeurIPS 2025 acceptance rate is around 25% — if accepted papers have this issue, rejected papers likely have higher rates.
- The academic community has no established process for systematically checking citations for hallucination at scale, making this a systemic gap; a sketch of what such a check could look like follows this list.
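A minimal sketch of such a batch check, assuming the public Semantic Scholar search API and a conservative one-request-per-second throttle (illustrative only, not GPTZero's actual method):

import time
import requests

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"

def batch_verify(titles: list[str], delay: float = 1.0) -> dict[str, bool]:
    """Map each cited title to whether any matching paper turned up."""
    found = {}
    for title in titles:
        resp = requests.get(SEARCH_URL, params={"query": title, "limit": 1}, timeout=10)
        resp.raise_for_status()
        found[title] = resp.json().get("total", 0) > 0
        time.sleep(delay)  # throttle: the unauthenticated API is rate-limited
    return found

A fuzzy-search hit is only a weak signal that a reference is real; a stricter title comparison is sketched under Code Example below.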
Evidence
- GPTZero published their methodology and a list of flagged paper IDs, enabling community verification.
- Several researchers independently verified a sample of flagged citations and confirmed the hallucination pattern.
- The HN discussion was alarmed: academic citation networks are a foundational trust mechanism, and systematic hallucination corrupts that infrastructure.
- Commenters debated whether the authors knew (intentional misconduct) or didn't (accidentally pasted in AI-generated reference lists without checking them); both are problematic for different reasons.
- Conference organizers don't have the capacity to manually verify all citations — this points to a need for automated citation verification at submission time.
How to Apply
- If you're writing academic papers with any AI assistance: run every reference through a citation verifier (Semantic Scholar, CrossRef, Google Scholar) before submission; the Code Example section below sketches checks against both Semantic Scholar and CrossRef.
- For reviewers: spot-check 5-10 citations in every paper you review — hallucinated references are often in the related work section and may not be obvious.
- Conference organizers: consider adding automated citation verification as part of the submission pipeline — tools like GPTZero and Semantic Scholar can flag suspicious references.
- For research teams: establish a policy that every reference must be independently verified before inclusion, regardless of how the draft was generated.
Code Example
# Example of verifying that a cited paper exists via the Semantic Scholar API
import requests

def verify_citation(title: str) -> bool:
    """Return True if Semantic Scholar finds at least one paper matching the title."""
    url = "https://api.semanticscholar.org/graph/v1/paper/search"
    resp = requests.get(url, params={"query": title, "limit": 1}, timeout=10)
    resp.raise_for_status()  # surface HTTP errors (e.g., rate limiting) instead of bad JSON
    # The search is fuzzy, so a nonzero match count is only a weak signal
    return resp.json().get("total", 0) > 0

# Usage
print(verify_citation("Attention Is All You Need"))    # True
print(verify_citation("Fake Paper by John Doe 2024"))  # likely False
Related Papers
Can LLMs model real-world systems in TLA+?
A benchmark study that systematically tests LLM-written TLA+ specifications: they mostly pass syntax checks, but their behavioral conformance to the actual system is only around 46%, showing the practical limits of AI-driven formal verification.
Natural Language Autoencoders: Turning Claude's Thoughts into Text
Anthropic released NLA, a technique that converts the numeric activation vectors inside an LLM into natural language that can be read directly. It marks a new advance in interpretability research into what the AI is actually thinking.
ProgramBench: Can language models rebuild programs from scratch?
A new benchmark that measures whether LLMs can reimplement real software such as FFmpeg, SQLite, and a PHP interpreter from scratch using only documentation; even the best model passed 95%+ of tests on only 3% of all tasks.
MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents
Split the work into three tickets and even Claude/GPT will go ahead and write security-vulnerable code 53-86% of the time.
Refusal in Language Models Is Mediated by a Single Direction
Open-source chat models encode safety as a single vector direction, and removing it disables safety fine-tuning.
Show HN: A new benchmark for testing LLMs for deterministic outputs
Structured Output Benchmark assesses LLM JSON handling across seven metrics, revealing performance differences that schema compliance alone doesn't capture.