Taught Claude to talk like a caveman to use 75% less tokens.
TL;DR Highlight
Going by its title, this post describes a prompt technique that compresses Claude's response style into terse, caveman-like speech and reportedly cuts token usage by 75%. It may be useful to developers who want to lower API costs.
Who Should Read
Developers who want to reduce token costs while using the Claude API, or developers operating chatbots/automation pipelines that require optimized response length.
Core Mechanics
- The title claims that you can reduce the number of tokens in Claude's response by up to 75% by instructing it to 'speak like a caveman'.
- The original page is inaccessible, so the exact prompt and methodology cannot be confirmed. The technique presumably strips unnecessary modifiers, complete sentence structures, and polite expressions.
- Because API usage is billed per token, shorter responses directly lower output costs, so substantial savings can be expected in production environments that handle a large volume of requests.
- Because the original content cannot be confirmed, details such as the exact prompt, experimental conditions, and measurement methods are unknown.
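To make the cost link concrete, here is a rough sketch of the arithmetic. Every number in it is an assumption chosen for illustration: the price is a placeholder rather than an actual Claude rate, the request volume and response length are invented, and the 75% figure is the title's unverified claim. `outputCostUsd` is a hypothetical helper.

```javascript
// Hypothetical figures for illustration only; substitute your provider's real rates.
const PRICE_PER_MILLION_OUTPUT_TOKENS_USD = 10; // assumed, NOT an actual Claude price

// Estimate output-token spend for a batch of requests.
function outputCostUsd(requestCount, avgOutputTokens, pricePerMillionUsd) {
  return (requestCount * avgOutputTokens * pricePerMillionUsd) / 1_000_000;
}

const requestCount = 1_000_000;   // assumed monthly volume
const baselineTokens = 400;       // assumed average response length
const claimedReduction = 0.75;    // headline claim from the post title, unverified
const compressedTokens = baselineTokens * (1 - claimedReduction);

const baselineCost = outputCostUsd(requestCount, baselineTokens, PRICE_PER_MILLION_OUTPUT_TOKENS_USD);
const compressedCost = outputCostUsd(requestCount, compressedTokens, PRICE_PER_MILLION_OUTPUT_TOKENS_USD);

console.log({ baselineCost, compressedCost, saved: baselineCost - compressedCost });
// { baselineCost: 4000, compressedCost: 1000, saved: 3000 }
```

Only the output side shrinks in this model. Input tokens are unaffected, and the compression instruction itself adds a few input tokens to every request.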
Evidence
- "(No comment information)"
How to Apply
- Add the instruction 'Respond as briefly and concisely as possible, focusing only on keywords' to the system prompt in your Claude API pipeline, and measure the changes in response quality and token count.
- In pipelines whose output is not read directly by humans, such as internal tools or automation scripts, drastically compressed responses are acceptable, so run token reduction experiments in these environments first.
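One way to run the measurement described above is to send the same user message with and without the compression instruction, then compare the `usage` object of each response. The sketch below assumes the Anthropic Messages API request shape (`model`, `max_tokens`, `system`, `messages`) and its `usage.input_tokens` / `usage.output_tokens` response fields. The model ID is a placeholder, `buildRequest` and `outputReduction` are hypothetical helpers, and the sample usage numbers are invented stand-ins for real responses.

```javascript
const COMPRESSION_INSTRUCTION =
  "Respond as briefly and concisely as possible, focusing only on keywords.";

// Build a Messages API request body. Pass systemPrompt = null for the baseline run.
function buildRequest(userText, systemPrompt) {
  const body = {
    model: "YOUR_MODEL_ID", // placeholder, replace with a real model ID
    max_tokens: 1024,
    messages: [{ role: "user", content: userText }],
  };
  if (systemPrompt) body.system = systemPrompt;
  return body;
}

// Fraction of output tokens saved by the compressed run relative to the baseline.
function outputReduction(baselineUsage, compressedUsage) {
  if (baselineUsage.output_tokens <= 0) {
    throw new RangeError("baseline must contain at least one output token");
  }
  return 1 - compressedUsage.output_tokens / baselineUsage.output_tokens;
}

const question = "Should I cache this API response?";
const baselineRequest = buildRequest(question, null);
const compressedRequest = buildRequest(question, COMPRESSION_INSTRUCTION);

// Invented usage values standing in for the `usage` field of two real responses.
const sampleBaselineUsage = { input_tokens: 20, output_tokens: 200 };
const sampleCompressedUsage = { input_tokens: 38, output_tokens: 50 };

console.log(outputReduction(sampleBaselineUsage, sampleCompressedUsage)); // 0.75
```

Token counts say nothing about whether the shorter answer is still correct, so pair this comparison with a quality check on a fixed set of test questions.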
Code Example
// System prompt example (compression instruction for token reduction)
const systemPrompt = `
Respond in minimal words. No full sentences. No pleasantries.
Keywords only. Like caveman speech.
Example: Instead of 'The answer to your question is yes, you should use X',
just say: 'yes. use X.'
`;
Terminology
Token: The smallest unit of text an LLM processes, approximately 3/4 of an English word on average. API fees are charged per input and output token, so generating fewer tokens directly reduces costs.
System Prompt: Instructions that set the AI's role, tone, and behavior before the conversation begins. They apply to every turn of the conversation.
Token Reduction: An optimization technique that lowers the number of tokens used to generate a response, typically by removing unnecessary explanations, greetings, and complete sentence structures.
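The "3/4 of an English word" rule of thumb can be turned into a rough estimator and applied to the before/after pair from the Code Example section. This is a minimal sketch that assumes whitespace-separated words are a fair proxy. It is not a real tokenizer, `estimateTokens` is a hypothetical helper, and actual counts from the API will differ.

```javascript
// Rule of thumb: one token is roughly 3/4 of an English word.
const WORDS_PER_TOKEN = 0.75;

// Very rough token estimate from a whitespace word count; NOT a real tokenizer.
function estimateTokens(text) {
  const wordCount = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.ceil(wordCount / WORDS_PER_TOKEN);
}

const verbose = "The answer to your question is yes, you should use X";
const terse = "yes. use X.";

const verboseTokens = estimateTokens(verbose); // 11 words -> 15
const terseTokens = estimateTokens(terse);     // 3 words -> 4
const estimatedSaving = 1 - terseTokens / verboseTokens;

console.log(
  `${verboseTokens} -> ${terseTokens} tokens (~${Math.round(estimatedSaving * 100)}% fewer)`
);
// 15 -> 4 tokens (~73% fewer)
```

On this single pair the estimate lands near the title's 75% figure, but one sentence is not evidence for the claim. Savings on real workloads depend on how much of each response is filler to begin with.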