SEAL: A Training-Free Way to Calibrate LLM Chain-of-Thought Reasoning

SEAL: Steerable Reasoning Calibration of Large Language Models for Free

Apr 7, 2025 • Runjin Chen, Zhenyu (Allen) Zhang, Junyuan Hong +2

TL;DR Highlight

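A technique that corrects the habit of reasoning models such as DeepSeek-R1 to needlessly "re-check" and "change direction" by intervening in latent space, raising accuracy while cutting token usage.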

Who Should Read

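ML engineers who are deploying reasoning models such as DeepSeek-R1 or QwQ to production and want to reduce token cost and response latency. Developers looking for a way to improve reasoning quality without fine-tuning the model.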

Core Mechanics

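Chain-of-Thought (CoT) reasoning is classified into three thought types: "execution", which works through the problem step by step; "reflection", which re-examines earlier steps; and "transition", which switches to a different approach.
Incorrect answers contain far more reflection and transition tokens: correct samples average 1,534 tokens, while incorrect samples average 7,586 tokens (on Math-500).
The three thought types are clearly separated in latent space at the model's deeper layers (middle to late layers), so they can be manipulated with vector arithmetic.
SEAL extracts a steering vector of the form "execution vector minus reflection/transition vector" offline from about 1,000 training samples, then adds it to the hidden state at the end of each thought during inference.
A steering vector extracted from math data also transfers to other domains such as code generation (LiveCodeBench) and calendar planning, so it does not need to be rebuilt for each task.
On DeepSeek-R1-Distill-Qwen-1.5B, Math-500 Hard accuracy rises by 14.1 percentage points while token usage falls by 28.8%.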
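
The three-way split of thoughts can be illustrated with a simple keyword heuristic. The sketch below is only an approximation under stated assumptions, not the authors' implementation: the keyword lists and the helper names `classify_thought` and `thought_profile` are invented for this example. Only the `\n\n` thought delimiter and cues such as "Wait" and "Alternatively" come from this summary.

```python
from collections import Counter

# Illustrative cue lists (assumptions); a real pipeline may use more cues.
REFLECTION_PREFIXES = ("wait",)
REFLECTION_PHRASES = ("let me check", "let me verify")
TRANSITION_PREFIXES = ("alternatively",)
TRANSITION_PHRASES = ("another approach",)


def classify_thought(thought):
    """Label one thought as 'reflection', 'transition', or 'execution'."""
    text = thought.strip().lower()
    if text.startswith(REFLECTION_PREFIXES) or any(p in text for p in REFLECTION_PHRASES):
        return "reflection"
    if text.startswith(TRANSITION_PREFIXES) or any(p in text for p in TRANSITION_PHRASES):
        return "transition"
    return "execution"


def thought_profile(trace):
    """Split a reasoning trace on blank lines and count each thought type."""
    thoughts = [t for t in trace.split("\n\n") if t.strip()]
    return Counter(classify_thought(t) for t in thoughts)


trace = (
    "First, expand (x + 1)^2 to get x^2 + 2x + 1.\n\n"
    "Wait, let me check the middle term again.\n\n"
    "Alternatively, I could substitute x = 2 directly.\n\n"
    "So the value at x = 2 is 9."
)
profile = thought_profile(trace)
print(dict(profile))
```

A profile dominated by reflection and transition thoughts is the pattern the summary associates with incorrect answers.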

Evidence

Math-500, DeepSeek-R1-Distill-Qwen-1.5B: accuracy 67.0% → 78.0% (+11.0 points), tokens 4,526 → 3,154 (-30.3%)
Math-500 Hard, DeepSeek-R1-Distill-Qwen-1.5B: accuracy 54.2% → 68.3% (+14.1 points), tokens -28.8%
LiveCodeBench with the math steering vector transferred, DeepSeek-R1-Distill-Qwen-1.5B: accuracy 18.5% → 28.5% (+10.0 points), tokens 8,205 → 6,923 (-15.6%)
Compared with simple token truncation (think budget = 3,500): truncation reaches 85.0% accuracy while SEAL reaches 89.4%, at a similar output length of about 2,600 tokens.
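
The figures above mix two kinds of change: accuracy moves in percentage points, while token counts move in relative percent. The short check below recomputes both from the reported before/after values. The helper names `relative_reduction` and `point_gain` are made up for this illustration.

```python
def relative_reduction(before, after):
    """Percent reduction relative to the baseline value."""
    return (before - after) / before * 100.0


def point_gain(before, after):
    """Absolute change in percentage points."""
    return after - before


# Values copied from the Evidence list.
math500_tokens = relative_reduction(4526, 3154)
livecodebench_tokens = relative_reduction(8205, 6923)
math500_acc = point_gain(67.0, 78.0)
math500_hard_acc = point_gain(54.2, 68.3)

print(f"Math-500 tokens: -{math500_tokens:.1f}%")
print(f"LiveCodeBench tokens: -{livecodebench_tokens:.1f}%")
print(f"Math-500 accuracy: +{math500_acc:.1f} points")
print(f"Math-500 Hard accuracy: +{math500_hard_acc:.1f} points")
```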

How to Apply

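If you are using a DeepSeek-R1-Distill or QwQ-32B-Preview model, get the code from the VITA-Group/SEAL GitHub repository, extract the steering vector offline from 1,000 Math-500 training samples, and at inference time add it to the hidden state at layer 20 (1.5B/7B) or layer 55 (32B) with α = 1.0.
For tasks outside math (code generation, planning, and so on), the vector extracted from Math-500 can be reused as is; no per-domain re-extraction is needed.
Reflection and transition thoughts are identified by keyword matching ("Wait", "Alternatively", "let me check", and so on), so in a new domain the classifier can be swapped for GPT-4o-based labeling, which keeps the steering-vector extraction pipeline flexible.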
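
The intervention described above can be pictured on toy data. The following is a minimal sketch, assuming hidden states are plain Python lists and that the `\n\n` delimiter appears as a single token (real tokenizers may split or merge it differently). The `STEERING_LAYER` table and the `steer_at_boundaries` helper are hypothetical names; only the layer indices (20 and 55) and α = 1.0 come from the text.

```python
# Layer choices as reported in the summary; the dict itself is illustrative.
STEERING_LAYER = {"1.5B": 20, "7B": 20, "32B": 55}
ALPHA = 1.0


def steer_at_boundaries(hidden_states, tokens, steering_vector,
                        alpha=ALPHA, delimiter="\n\n"):
    """Return a copy of hidden_states with alpha * steering_vector
    added only at positions whose token is the thought delimiter."""
    steered = []
    for h, tok in zip(hidden_states, tokens):
        if tok == delimiter:
            steered.append([x + alpha * v for x, v in zip(h, steering_vector)])
        else:
            steered.append(list(h))  # unchanged copy
    return steered


# Toy example: five token positions with 2-dimensional hidden states.
tokens = ["x", "=", "2", "\n\n", "Wait"]
hidden = [[0.0, 0.0] for _ in tokens]
vector = [0.5, -1.0]

steered = steer_at_boundaries(hidden, tokens, vector)
print(STEERING_LAYER["7B"], steered)
```

Restricting the update to thought boundaries leaves every other position untouched, which matches the "at the end of each thought" description rather than shifting all hidden states.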

Code Example

# Core SEAL logic, summarized (pseudo-code).
# `model.generate`, `model.get_hidden_state`, and `model.generate_with_hook`
# are placeholder helpers, not calls from a real library.
import torch

# Stage 1 (offline): extract the steering vector
def extract_steering_vector(model, train_samples, target_layer=20):
    exec_reps, refl_trans_reps = [], []

    for sample in train_samples:  # ~1,000 samples
        output = model.generate(sample)
        thoughts = output.split('\n\n')  # thoughts are separated by a blank line

        for thought in thoughts:
            if not thought.strip():
                continue  # skip empty segments
            # Representation of the '\n\n' token that ends this thought
            # (assumed to be returned as a torch.Tensor).
            hidden = model.get_hidden_state(thought, layer=target_layer)

            # Keyword-based classification
            lowered = thought.lower()
            if thought.startswith('Wait') or 'let me check' in lowered:
                refl_trans_reps.append(hidden)  # reflection
            elif thought.startswith('Alternatively') or 'another approach' in lowered:
                refl_trans_reps.append(hidden)  # transition
            else:
                exec_reps.append(hidden)  # execution

    h_exec = torch.stack(exec_reps).mean(dim=0)
    h_refl_trans = torch.stack(refl_trans_reps).mean(dim=0)
    # Key formula: execution mean minus reflection/transition mean
    return h_exec - h_refl_trans

# Stage 2 (online): intervene during decoding
def decode_with_seal(model, question, steering_vector, alpha=1.0, layer=20):
    # At the '\n\n' token that ends each thought, modify the hidden state:
    #   H_new = H_original + alpha * steering_vector
    # Later tokens are generated from the modified H_new.
    # The hook is assumed to fire only at those '\n\n' positions.
    return model.generate_with_hook(
        question,
        hook_layer=layer,
        hook_fn=lambda h: h + alpha * steering_vector,
    )
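
The pseudo-code above relies on model helpers that are not defined here, so it cannot be run directly. As a runnable toy, the difference-of-means step can be checked on hand-made 2-D vectors. The names `mean_vector` and `difference_of_means` and the sample values are assumptions for illustration only.

```python
def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def difference_of_means(exec_reps, refl_trans_reps):
    """Steering vector = mean(execution) - mean(reflection/transition)."""
    h_exec = mean_vector(exec_reps)
    h_refl_trans = mean_vector(refl_trans_reps)
    return [e - r for e, r in zip(h_exec, h_refl_trans)]


# Toy representations: two execution thoughts, two reflection/transition thoughts.
exec_reps = [[1.0, 0.0], [3.0, 2.0]]        # mean is [2.0, 1.0]
refl_trans_reps = [[0.0, 4.0], [2.0, 2.0]]  # mean is [1.0, 3.0]

toy_vector = difference_of_means(exec_reps, refl_trans_reps)
print(toy_vector)
```

Adding this vector to a hidden state moves it away from the reflection/transition cluster and toward the execution cluster, which is the intended steering direction.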

Terminology

Chain-of-Thought (CoT): Instead of answering right away, the model writes out its working at length, as in "Step 1: ..., Step 2: ...". It is like writing down the solution steps when solving a math problem.

Latent Space: The abstract space inside the model where tokens exist as numeric vectors. Items with similar meaning sit close together in this space.

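Steering Vector: A vector that is added to or subtracted from the model's internal representation (hidden state) to steer its behavior. Like a ship's rudder, it turns the direction of the model's thinking.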

Hidden State: The intermediate representation a Transformer holds internally for each token at each layer. It can be seen as the model's "thinking" state.

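t-SNE: A technique for visualizing high-dimensional data on a 2D plane. It places similar points close together and dissimilar points far apart, so cluster structure can be checked by eye.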

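KV Cache: Memory in which a Transformer stores the computed results for tokens it has already processed. The longer the sequence, the more memory it consumes, which slows generation.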

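Logits Penalty: A method that lowers the probability of generating specific words, for example by reducing the score of "Wait" so that it is less likely to appear. An existing alternative that is less effective than SEAL.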
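
To make the logits-penalty baseline concrete, here is a small sketch under stated assumptions: the penalty value, the toy vocabulary, and the function names `apply_logits_penalty` and `softmax` are chosen for illustration and do not reflect the settings used in the paper.

```python
import math


def apply_logits_penalty(logits, penalized_tokens, penalty):
    """Subtract a fixed penalty from the logits of the listed tokens."""
    return {
        tok: (score - penalty if tok in penalized_tokens else score)
        for tok, score in logits.items()
    }


def softmax(logits):
    """Convert a token -> logit mapping into probabilities."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


# Toy next-token logits (assumed values).
logits = {"Wait": 2.0, "So": 2.0, "Therefore": 1.0}

before = softmax(logits)
after = softmax(apply_logits_penalty(logits, {"Wait"}, penalty=3.0))
print(before["Wait"], after["Wait"])
```

Unlike a steering vector, this approach only suppresses particular surface tokens and does not touch the hidden state that leads the model toward reflection in the first place.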

Original Abstract

Large Language Models (LLMs), such as OpenAI's o1-series have demonstrated compelling capabilities for complex reasoning tasks via the extended chain-of-thought (CoT) reasoning mechanism. However, recent studies reveal substantial redundancy in the CoT reasoning traces, which not only increases inference latency but also negatively impacts model performance by diverting attention to unnecessary reasoning paths. To address this issue, we investigate the internal reasoning structures of LLMs and categorize them into three primary thought types: execution, reflection, and transition thoughts. Moreover, our analysis reveals that excessive reflection and transition thoughts are strongly correlated with failure cases and these thought categories exhibit clear separation in the latent space. Based on these, we introduce SEAL (Steerable reasoning calibration), a training-free approach that seamlessly calibrates the CoT process, improving accuracy while demonstrating significant efficiency gains. SEAL consists of an offline stage for extracting the reasoning steering vector in the latent space, followed by an on-the-fly calibration of the reasoning trace through representation intervention using the steering vector. Notably, the steering vector exhibits strong transferability across various tasks. Extensive experiments across multiple models (DeepSeek-R1-Distill and QwQ-32B-Preview) and benchmarks (Math500, GSM8K, LiveCodeBench) validate the effectiveness of SEAL, up to a 11% improvement in accuracy while reducing reasoning tokens by 11.8% to 50.4%. Our code is publicly available at https://github.com/VITA-Group/SEAL.