Different Language Models Learn Similar Number Representations
TL;DR Highlight
Language models consistently learn periodic number representations with dominant periods T=2, 5, and 10, regardless of architecture (from Transformers to LSTMs). The paper gives a mathematical account of this 'convergent evolution' across architectures.
Who Should Read
ML researchers curious about how LLM internal representations are formed, and AI system developers aiming to better embed numerical reasoning capabilities into models.
Core Mechanics
- Language models trained on natural language text consistently learn dominant periodic features with periods T=2, 5, and 10 when internally representing numbers, naturally corresponding to the decimal system and even/odd distinctions.
- Researchers discovered a 'two-tiered hierarchy' within these periodic features: a spike at specific frequencies in the Fourier domain, and a 'mod-T geometrically separable' representation. Not all models exhibit the latter.
- Mathematically, sparsity in the Fourier domain is a necessary but not sufficient condition for mod-T geometric separability; periodic patterns alone do not guarantee a representation capable of linearly classifying numbers.
- Geometrically separable representations arise through two paths: learning from 'text-number co-occurrence' and 'number-number interactions' in general language data, or training on 'multi-token addition problems'.
- Structurally disparate models—Transformers, Linear RNNs, LSTMs, and even classical word embeddings—all learn similar numerical representations, a phenomenon the researchers liken to 'convergent evolution' in biology.
- The quality of numerical representations (geometric separability) is influenced by training data, architecture, optimizer, and tokenizer; no single factor determines it.
- This research informs attempts to connect external mathematical computation circuits to LLMs (neurosymbolic approaches): if different models share compatible numerical representations, a 'common representation' could be leveraged across them.
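The Fourier-domain spike described in the list above can be illustrated on toy data. The following is a minimal sketch, not the paper's procedure: the embeddings are synthetic (a noisy cosine/sine pair per period) rather than taken from a real model, and the function names `synthetic_embeddings`, `power_spectrum`, and `dominant_periods` are hypothetical. It computes a plain discrete Fourier transform over the number axis and reports the periods with the most power.

```python
import cmath
import math
import random


def synthetic_embeddings(n_numbers, periods=(2, 5, 10), noise=0.05, seed=0):
    """Toy stand-in for model embeddings: one noisy (cos, sin) pair per period."""
    rng = random.Random(seed)
    rows = []
    for n in range(n_numbers):
        row = []
        for t in periods:
            angle = 2 * math.pi * n / t
            row.append(math.cos(angle) + rng.gauss(0, noise))
            row.append(math.sin(angle) + rng.gauss(0, noise))
        rows.append(row)
    return rows


def power_spectrum(rows):
    """Return (frequency index k, power summed over dimensions) for k = 1..N/2."""
    n = len(rows)
    dims = len(rows[0])
    spectrum = []
    for k in range(1, n // 2 + 1):
        total = 0.0
        for d in range(dims):
            # Discrete Fourier coefficient of dimension d at frequency index k.
            coeff = sum(
                rows[i][d] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)
            )
            total += abs(coeff) ** 2
        spectrum.append((k, total))
    return spectrum


def dominant_periods(rows, top=3):
    """Convert the `top` strongest frequency indices into periods T = N / k."""
    n = len(rows)
    ranked = sorted(power_spectrum(rows), key=lambda pair: pair[1], reverse=True)
    return sorted(n / k for k, _ in ranked[:top])


if __name__ == "__main__":
    embeddings = synthetic_embeddings(100)
    print(dominant_periods(embeddings))
```

To try this on a real model, one would replace `synthetic_embeddings` with the embedding rows for the number tokens 0 to N-1. Choosing N as a multiple of 10 keeps the periods 2, 5, and 10 aligned with exact frequency indices.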
Evidence
- A comment on HN suggested these results support the 'Platonic Representation Hypothesis' (that models converge to a common reality representation when trained on the same data), noting shared representations could simplify connecting mathematical circuits between models.
- Conversely, critical comments argued the 'learning reality' phrasing is overstated, claiming models only learn statistical regularities, and warned against misusing the hypothesis to justify claims of LLM objectivity.
- Observations of eigenvalue distributions resembling Benford's Law prompted questions about whether this pattern is expected in human-curated text corpora.
- Comments also questioned whether the phenomenon stems from training data or model architecture, despite the paper noting that all four factors (data, architecture, optimizer, and tokenizer) play a role.
- Finally, a self-promotional comment introduced 'turnstyle', a library implementing neurosymbolic programming leveraging shared representations.
How to Apply
- To improve numerical accuracy in LLMs, fine-tuning with multi-token addition problems may be more effective than single-token addition for creating mod-T geometrically separable numerical representations.
- When designing systems to share or transfer numerical representations across models (e.g., sharing math modules), consider whether a T=2, 5, 10 period-based Fourier representation can serve as a common interface.
- Account for the impact of tokenizer design on numerical representation quality, and experiment with digit-level versus number-level tokenization for domain-specific models handling numerical data.
- When probing model internal representations, test for linear separability based on mod-2, mod-5, and mod-10 criteria to quickly assess the model's numerical representation quality.
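The mod-T probing suggestion can be sketched on toy data. This is an illustration under stated assumptions, not the paper's probe. The embeddings are synthetic, the names `periodic_embedding`, `cosine_only_embedding`, and `mod_probe_accuracy` are hypothetical, and a nearest-class-mean classifier (whose decision boundaries are hyperplanes, so it is a linear classifier) stands in for whatever linear probe one would normally train. The cosine-only control has a single Fourier frequency but maps n and 10 - n to the same value, which illustrates why Fourier sparsity alone does not guarantee mod-T separability.

```python
import math
import random


def periodic_embedding(n, rng, periods=(2, 5, 10), noise=0.05):
    """Toy embedding with a noisy (cos, sin) pair for each period."""
    row = []
    for t in periods:
        angle = 2 * math.pi * n / t
        row.append(math.cos(angle) + rng.gauss(0, noise))
        row.append(math.sin(angle) + rng.gauss(0, noise))
    return row


def cosine_only_embedding(n, rng, period=10, noise=0.05):
    """Control: one Fourier frequency, but n and period - n get the same value."""
    return [math.cos(2 * math.pi * n / period) + rng.gauss(0, noise)]


def class_means(vectors, labels):
    """Average the vectors belonging to each label."""
    sums, counts = {}, {}
    for vec, lab in zip(vectors, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}


def predict(means, vec):
    """Assign vec to the label whose mean is closest in squared distance."""
    def sq_dist(mean):
        return sum((a - b) ** 2 for a, b in zip(mean, vec))
    return min(means, key=lambda lab: sq_dist(means[lab]))


def mod_probe_accuracy(embed, modulus, rng, n_train=70, n_total=100):
    """Fit class means on numbers below n_train, score on the remaining ones."""
    train = range(n_train)
    test = range(n_train, n_total)
    means = class_means(
        [embed(n, rng) for n in train], [n % modulus for n in train]
    )
    hits = sum(predict(means, embed(n, rng)) == n % modulus for n in test)
    return hits / len(test)


if __name__ == "__main__":
    rng = random.Random(0)
    for modulus in (2, 5, 10):
        print("periodic, mod", modulus, mod_probe_accuracy(periodic_embedding, modulus, rng))
    print("cosine-only, mod 10", mod_probe_accuracy(cosine_only_embedding, 10, rng))
```

With a real model, `embed` would look up the hidden state or embedding for the token representing `n`. The train/test split by magnitude is one arbitrary choice; a shuffled split would serve equally well for a quick check.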
Related Papers
Did Claude increase bugs in rsync?
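An article that uses real data and statistical analysis to test the social media claim that bugs increased in the rsync project after Claude AI was introduced. It concludes that there is no statistical evidence that releases made after Claude's introduction are unusually buggy relative to the historical distribution.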
I built a vulnerable app and spent $1,500 seeing if LLMs could hack it
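The author built an app with Firebase vulnerabilities and tested whether major LLMs such as GPT-5.5, Claude, and Deepseek could hack it autonomously. GPT-5.5 led by a wide margin with a 70% success rate, while Claude scored low regardless of capability because of its security refusal policy.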
Clustered Self-Assessment: A Simple yet Effective Method for Uncertainty Quantification in Large Language Models
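A technique in which an LLM groups multiple answers into semantic clusters, turns them into a multiple-choice question, and grades itself to produce a numeric estimate of how confident it is in an answer.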
SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction
A security benchmark showing that planting malicious payloads in the 'Skill packages' used by AI agents compromises even the latest models up to 86% of the time.
MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems
A debugging framework that automatically finds why LLM memory systems such as RAG and Mem0 produce wrong answers.
DeepSWE: A contamination-free benchmark for long-horizon coding agents
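A coding-agent benchmark built from scratch to address the data contamination and verification errors of the existing SWE-bench. GPT-5.5 ranks first at 70%, and performance gaps between models show up much more clearly.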