Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph
TL;DR Highlight
LLMs traverse Knowledge Graphs step-by-step to reduce hallucinations and improve accuracy.
Who Should Read
Backend/AI developers who want to reduce LLM hallucinations, especially those building domain-knowledge-based QA systems or fact-checking pipelines.
Core Mechanics
- Uses LLM as an agent on a Knowledge Graph (KG), traversing KG nodes step-by-step (beam search style) to answer questions
- Instead of fetching text chunks like traditional RAG, constructs reasoning paths by following entity-relation triples in the KG
- The LLM autonomously decides which relation to follow at each traversal step and self-evaluates whether the answer is sufficient
- Reasoning is traceable via KG paths, making the answer explainable (see the path-trace sketch after this list)
- Queries external KGs (Freebase, Wikidata, etc.) in real time, so no model retraining is needed to keep knowledge up to date
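Because every hop is an explicit (head, relation, tail) triple, the supporting path can be stored and shown next to the answer. The following is a minimal sketch of that idea; the Triple alias, the render_path helper, and the Canberra example are illustrative choices, not part of the paper.

# Illustrative sketch (not from the paper): keep each hop as a
# (head, relation, tail) triple so the final answer can cite its path.
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

def render_path(question: str, path: List[Triple], answer: str) -> str:
    """Format a traversal path as a human-readable explanation."""
    hops = " -> ".join(f"{h} --[{r}]--> {t}" for h, r, t in path)
    return f"Q: {question}\nPath: {hops}\nA: {answer}"

# Example trace for a 2-hop question (entity and relation names are made up):
print(render_path(
    "What currency is used in the country whose capital is Canberra?",
    [("Canberra", "capital_of", "Australia"),
     ("Australia", "currency", "Australian dollar")],
    "Australian dollar",
))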
Evidence
- ~8-12% accuracy improvement over traditional RAG on KGQA benchmarks (WebQSP, CWQ)
- Up to 15%+ performance improvement over standalone Chain-of-Thought on multi-hop reasoning questions
- Provides explicit reasoning paths (traceability) compared to black-box LLMs
How to Apply
- If you have a domain KG (e.g., medical or legal ontology), extract entities from user queries and build a KG traversal agent with an LLM to create a multi-hop answer pipeline
- Replacing chunk retrieval with KG triple path search in your RAG pipeline improves accuracy, especially for relational questions like 'who has what relationship with whom'
- Present a list of KG relation candidates in the LLM prompt and let it choose which relation to explore; for a quick implementation you can connect to the Freebase/Wikidata APIs (a sketch follows below)
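A minimal sketch of backing the last point above with the public Wikidata SPARQL endpoint (https://query.wikidata.org/sparql). The function names get_relations/get_neighbors mirror the hypothetical kg_client interface used in the code example below; the specific queries, limits, and User-Agent string are assumptions for illustration, not the paper's implementation.

# Fetch candidate relations and neighbors for a Wikidata entity via SPARQL.
import requests

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

def _sparql(query: str) -> list:
    """Run a SPARQL query against the public Wikidata endpoint and return result rows."""
    resp = requests.get(
        WIKIDATA_SPARQL,
        params={"query": query, "format": "json"},
        headers={"User-Agent": "tog-demo/0.1"},  # WDQS asks clients to identify themselves
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["results"]["bindings"]

def get_relations(entity_qid: str, limit: int = 50):
    """Return (property id, English label) pairs for outgoing relations of an entity."""
    rows = _sparql(f"""
    SELECT DISTINCT ?p ?pLabel WHERE {{
      wd:{entity_qid} ?prop ?o .
      ?p wikibase:directClaim ?prop .
      SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
    }} LIMIT {limit}
    """)
    return [(r["p"]["value"].rsplit("/", 1)[-1], r["pLabel"]["value"]) for r in rows]

def get_neighbors(entity_qid: str, prop_pid: str, limit: int = 20):
    """Return English labels of entities reached from entity_qid via property prop_pid."""
    rows = _sparql(f"""
    SELECT DISTINCT ?oLabel WHERE {{
      wd:{entity_qid} wdt:{prop_pid} ?o .
      SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
    }} LIMIT {limit}
    """)
    return [r["oLabel"]["value"] for r in rows]

# Example: relations for Douglas Adams (Q42), then neighbors via P31 ("instance of")
# print(get_relations("Q42")[:5])
# print(get_neighbors("Q42", "P31"))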
Code Example
# Think-on-Graph core loop pseudo-code
def think_on_graph(question, kg_client, llm, max_depth=3, beam_width=3):
    # 1. Extract starting entities from the question
    start_entities = llm.extract_entities(question)
    paths = [(entity, []) for entity in start_entities]  # (current node, path of triples)

    for depth in range(max_depth):
        candidates = []
        for current_node, path in paths:
            # 2. Retrieve relation candidates for the current node from the KG
            relations = kg_client.get_relations(current_node)

            # 3. LLM selects which relations are worth exploring
            prompt = f"""
Question: {question}
Current exploration node: {current_node}
Path so far: {path}
Available relations: {relations}
From the above relations, select the ones to explore in order to answer the question.
If none are relevant, output 'none'."""
            selected_relations = llm.call(prompt)  # assumed to return a list of relation names, or 'none'
            if selected_relations == 'none':
                continue

            for rel in selected_relations[:beam_width]:
                next_nodes = kg_client.get_neighbors(current_node, rel)
                for node in next_nodes:
                    candidates.append((node, path + [(current_node, rel, node)]))

        # 4. LLM decides whether the explored candidates suffice to answer
        answer_check_prompt = f"""
Question: {question}
Explored paths and entities: {candidates}
Can the question be answered with the current information?
If yes, provide the answer; otherwise, output 'continue'."""
        result = llm.call(answer_check_prompt)
        if result != 'continue':
            return result, candidates  # return the answer plus its supporting paths

        paths = candidates[:beam_width]  # keep only the top beam_width paths

    # Depth budget exhausted: answer with whatever has been collected
    return llm.call(f"Question: {question}\nCollected information: {paths}\nFinal answer:"), paths

Terminology
Related Papers
Show HN: Airbyte Agents – context for agents across multiple data sources
Airbyte has launched a Context Store that pre-indexes data from multiple SaaS systems such as Slack, Salesforce, and Linear, so agents no longer have to dig through each API individually. They report cutting token usage by up to 90% compared with the existing MCP approach.
A polynomial autoencoder beats PCA on transformer embeddings
A technique that attaches a quadratic polynomial decoder to a PCA encoder to substantially improve embedding compression quality in closed form; it can be implemented with numpy alone, without SGD.
From Unstructured Recall to Schema-Grounded Memory: Reliable AI Memory via Iterative, Schema-Aware Extraction
Storing memory as schema-defined structured records instead of using RAG-style text retrieval yields dramatically higher accuracy on exact fact lookup, state tracking, and aggregation queries.
Show HN: Atomic – Local-first, AI-augmented personal knowledge base
Atomic is a self-hosted, open-source personal knowledge graph app that automatically embeds, tags, and links notes, web clips, and RSS feeds, with support for semantic search, LLM-powered wiki synthesis, and MCP integration.
We replaced RAG with a virtual filesystem for our AI documentation assistant
Explains how Mintlify overcame RAG chunking limitations by building a virtual filesystem (ChromaFs) on top of Chroma DB that mimics UNIX commands, reducing session boot time from 46 seconds to 100ms.
Chroma Context-1: Training a Self-Editing Search Agent