Mitigating Prompt-Induced Hallucinations in Large Language Models via Structured Reasoning

Jan 6, 2026 • Jinbo Hao, Kai Yang, Qingzhen Su +3

TL;DR Highlight

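A method that embeds Knowledge Graph exploration code directly in the Chain-of-Thought prompt, reducing hallucinations in GPT-4 and LLaMA 3.3 and raising HIT@1 by more than 15 percentage points over the KDCM baseline.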

Who Should Read

AI backend developers dealing with hallucination in LLM-based QA systems or knowledge retrieval pipelines, especially those who already have a Knowledge Graph or are designing a structured reasoning pipeline.

Core Mechanics

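Adds a code module to the Knowledge Distillation Chain Model (KDCM) to structure Knowledge Graph exploration, blocking the error propagation that arises when reasoning relies on natural language alone
Embeds code inside the Chain-of-Thought (step-by-step reasoning) prompt to inject external knowledge explicitly, so the model relies less on its own internal guesses
The processing flow has three stages: decompose the prompt into sub-tasks → constrain the intermediate reasoning steps with the code-guided Knowledge Graph → generate the final answer from the verified intermediate results
Validated with GPT-4 and LLaMA 3.3 on five public datasets: WebQSP, CWQ, GSM8K, MWP, and Dr.SPIDER
Scores roughly 8 to 9 percentage points higher on HIT@1 than existing RAG and Self-Check approaches, with the gap described as largest on multi-step math problems
Performance holds up even when the prompt is ambiguous or incomplete: structured reasoning acts as a buffer against noise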
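
The three-stage flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the in-memory `KG` dictionary, the hard-coded `decompose` stub, and every function name here are hypothetical stand-ins for a real Knowledge Graph and an LLM-driven decomposition step.

```python
# Toy knowledge graph: (subject, relation) -> object. Hypothetical data.
KG = {
    ("France", "capital"): "Paris",
    ("Paris", "located_on"): "Seine",
}


def decompose(question: str) -> tuple[str, list[str]]:
    """Stage 1 (stubbed): split the prompt into a start entity and a chain
    of relations. A real system would derive this from the question."""
    return "France", ["capital", "located_on"]


def constrained_steps(start: str, relations: list[str]) -> list[tuple[str, str, str]]:
    """Stage 2: resolve each sub-task against the graph, so every
    intermediate result is a looked-up fact rather than a model guess."""
    steps = []
    entity = start
    for relation in relations:
        obj = KG.get((entity, relation))
        if obj is None:
            # Stop instead of letting an unverified step propagate.
            raise LookupError(f"No fact for ({entity}, {relation})")
        steps.append((entity, relation, obj))
        entity = obj  # the verified result feeds the next sub-task
    return steps


def build_final_prompt(question: str, steps: list[tuple[str, str, str]]) -> str:
    """Stage 3: ask for the final answer on top of the verified steps."""
    lines = [f"- {s} --{r}--> {o}" for s, r, o in steps]
    return (
        "Verified intermediate results:\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}\n"
        + "Answer using only the facts above."
    )


question = "Which river runs through the capital of France?"
start, relations = decompose(question)
steps = constrained_steps(start, relations)
prompt = build_final_prompt(question, steps)
print(prompt)
```

Raising on a missing fact is one possible design choice: it keeps an unverifiable intermediate step from silently reaching the final prompt.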

Evidence

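Adding the code module to KDCM alone raises HIT@1 on WebQSP from 82.36% to 99.33% (about +17 points) and on CWQ from 81.36% to 97.86% (+16.5 points)
Average over the five datasets: HIT@1 98.40%, HIT@3 96.83%, HIT@5 95.51%; the RAG baseline scores 90.23%, 90.28%, and 90.18% respectively
In the generalization test, the proposed method also reaches HIT@1 99.18%, versus 90.36% for RAG and 90.28% for Self-Check
Overall improvement reported by the paper: HIT@1 +15.64%, HIT@3 +13.38%, HIT@5 +13.28% (over the KDCM baseline)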
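
The gains quoted above are plain differences between two HIT@1 scores. The short check below recomputes them from the figures listed in this section. The dictionary labels and the `point_gain` helper are illustrative names, not something taken from the paper.

```python
# HIT@1 figures (in percent) copied from the Evidence lines above.
reported = {
    "WebQSP: KDCM -> KDCM + code module": (82.36, 99.33),
    "CWQ: KDCM -> KDCM + code module": (81.36, 97.86),
    "Five-dataset average: RAG -> proposed": (90.23, 98.40),
    "Generalization: RAG -> proposed": (90.36, 99.18),
    "Generalization: Self-Check -> proposed": (90.28, 99.18),
}


def point_gain(before: float, after: float) -> float:
    """Difference in percentage points, rounded to two decimals."""
    return round(after - before, 2)


gains = {label: point_gain(b, a) for label, (b, a) in reported.items()}
for label, gain in gains.items():
    print(f"{label}: +{gain:.2f} points")
```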

How to Apply

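Before sending a question to the LLM, query the relevant entities and relations from a Knowledge Graph (e.g., Virtuoso, Neo4j) in code, then insert the results at the front of the CoT prompt as structured context, similar to how RAG prepends document chunks
Add a prompt step that explicitly decomposes a complex question into sub-tasks, and design the pipeline as a chain in which each intermediate conclusion is verified against the Knowledge Graph before being passed to the next step
The idea can be applied in part even without Knowledge Graph infrastructure: use a "Code-first CoT" pattern that computes verifiable facts in code (Python/SQL) before generating the answer and includes those results in the prompt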
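
As a rough sketch of the first two steps, the example below uses an in-memory SQLite table of triples as a stand-in for a real graph store such as Neo4j or Virtuoso. The table layout, the sample triples, and the helpers `fetch_context`, `is_supported`, and `build_prompt` are all assumptions made for illustration. A production setup would issue Cypher or SPARQL queries instead.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create a tiny in-memory triple store (stand-in for a graph database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (subject TEXT, relation TEXT, object TEXT)")
    conn.executemany(
        "INSERT INTO triples VALUES (?, ?, ?)",
        [
            ("Paris", "capital_of", "France"),
            ("France", "member_of", "European Union"),
            ("Paris", "located_on", "Seine"),
        ],
    )
    return conn


def fetch_context(conn: sqlite3.Connection, entity: str) -> list[str]:
    """Return every triple that mentions the entity, formatted for a prompt."""
    rows = conn.execute(
        "SELECT subject, relation, object FROM triples "
        "WHERE subject = ? OR object = ? ORDER BY subject, relation",
        (entity, entity),
    ).fetchall()
    return [f"({s}, {r}, {o})" for s, r, o in rows]


def is_supported(conn: sqlite3.Connection, subject: str, relation: str, obj: str) -> bool:
    """Check an intermediate conclusion against the store before chaining on."""
    row = conn.execute(
        "SELECT 1 FROM triples WHERE subject = ? AND relation = ? AND object = ?",
        (subject, relation, obj),
    ).fetchone()
    return row is not None


def build_prompt(question: str, facts: list[str]) -> str:
    """Place the structured context at the front of the CoT prompt."""
    header = "[Structured context from knowledge graph]\n" + "\n".join(facts)
    return (
        f"{header}\n\n[Question]\n{question}\n\n"
        "Reason step by step using only the context above."
    )


conn = make_store()
facts = fetch_context(conn, "Paris")
prompt = build_prompt("Which country is Paris the capital of?", facts)
print(prompt)

# Gate the next reasoning step on a verified intermediate claim.
print(is_supported(conn, "Paris", "capital_of", "France"))
print(is_supported(conn, "Paris", "capital_of", "Germany"))
```

Parameterized queries (`?` placeholders) keep entity names taken from user questions from being interpreted as SQL.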

Code Example

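# Code-first CoT pattern (partial application without a Knowledge Graph)
# Question: "As of 2024, what is the ratio of South Korea's population to France's?"

# Step 1: Compute the verifiable facts in code first.
# The population figures are approximate, illustrative values.
facts = {
    "korea_population": 51_700_000,
    "france_population": 68_000_000,
}
ratio = facts["korea_population"] / facts["france_population"]

# Step 2: Embed the computed results in the CoT prompt.
prompt = f"""
[Structured facts - verified]
- Population of South Korea: {facts['korea_population']:,}
- Population of France: {facts['france_population']:,}
- Computed ratio (South Korea / France): {ratio:.3f}

[Reasoning instructions]
Answer step by step, based only on the verified figures above.
1. Interpret the ratio: ...
2. Final answer: ...
"""


# Step 3: Call the LLM.
# PlaceholderLLM is a stand-in so the snippet runs as-is; swap in a real
# client. `complete` is an assumed method name, not a specific library's API.
class PlaceholderLLM:
    def complete(self, prompt: str) -> str:
        return "(model output)"


llm = PlaceholderLLM()
response = llm.complete(prompt)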

Terminology

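Hallucination: An LLM confidently generating plausible-sounding content that is not true, like answering a question it does not know with full confidence and getting it wrong.

Chain-of-Thought: A prompting technique that tells the LLM to "think step by step". Having the model write out its working instead of jumping straight to the answer improves accuracy.

Knowledge Graph: A knowledge base that stores entities (people, places, concepts) and their relations as a graph. Facts are expressed as nodes and edges, such as "Paris → capital → France".

Knowledge Distillation: A training technique that transfers the knowledge of a large model (Teacher) to a smaller model (Student), similar to learning by following a teacher's worked solutions.

HIT@K: The fraction of questions for which the correct answer appears among the top K candidate answers. HIT@1 counts a hit only when the first answer is correct; HIT@5 counts a hit if any of the top five is correct.

Prompt-Induced Hallucination: A hallucination caused by an ambiguous or incomplete prompt rather than by the model itself, where a poorly posed question sends the model's reasoning in the wrong direction.

RAG: Retrieval-Augmented Generation, a technique that searches external documents or databases at query time and attaches the results to the LLM's input, so the model grounds its answer in retrieved evidence instead of its internal memory.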
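
To make the HIT@K entry in the glossary above concrete, here is a small sketch of how the metric can be computed from ranked candidate lists. The function name `hit_at_k` and the sample data are hypothetical, and the paper's own evaluation script may differ in its details.

```python
def hit_at_k(ranked_candidates: list[list[str]], gold_answers: list[str], k: int) -> float:
    """Fraction of questions whose gold answer is within the top-k candidates."""
    if not gold_answers:
        raise ValueError("gold_answers must not be empty")
    hits = sum(
        1
        for candidates, gold in zip(ranked_candidates, gold_answers)
        if gold in candidates[:k]
    )
    return hits / len(gold_answers)


# Four questions, each with three candidates ranked best-first (made-up data).
ranked = [
    ["Paris", "Lyon", "Nice"],
    ["Berlin", "Munich", "Bonn"],
    ["Rome", "Milan", "Turin"],
    ["Oslo", "Bergen", "Tromso"],
]
gold = ["Paris", "Munich", "Turin", "Madrid"]

for k in (1, 2, 3):
    print(f"HIT@{k} = {hit_at_k(ranked, gold, k):.2f}")
```

Under this definition the score cannot decrease as K grows, because every top-1 hit is also a top-3 and a top-5 hit.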

Original Abstract

To address hallucination issues in large language models (LLMs), this paper proposes a method for mitigating prompt-induced hallucinations. Building on a knowledge distillation chain-style model, we introduce a code module to guide knowledge-graph exploration and incorporate code as part of the chain-of-thought prompt, forming an external knowledge input that provides more accurate and structured information to the model. Based on this design, we develop an improved knowledge distillation chain-style model and leverage it to analyze and constrain the reasoning process of LLMs, thereby improving inference accuracy. We empirically evaluate the proposed approach using GPT-4 and LLaMA-3.3 on multiple public datasets. Experimental results demonstrate that incorporating code modules significantly enhances the model's ability to capture contextual information and effectively mitigates prompt-induced hallucinations. Specifically, HIT@1, HIT@3, and HIT@5 improve by 15.64%, 13.38%, and 13.28%, respectively. Moreover, the proposed method achieves HIT@1, HIT@3, and HIT@5 scores exceeding 95% across several evaluation settings. These results indicate that the proposed approach substantially reduces hallucination behavior while improving the accuracy and verifiability of large language models.