Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution
TL;DR Highlight
Simply by reading social feeds in the background, an AI agent can store misinformation in long-term memory and influence future user behavior.
Who Should Read
Developers deploying persistent agent frameworks like OpenClaw or MemGPT in production, and engineers responsible for agent security. Essential reading if your agent is configured to automatically monitor external sources such as email, Slack, or social feeds.
Core Mechanics
- The heartbeat (periodic background execution) of Claw-family agents (OpenClaw, CoPaw, etc.) shares the same session context as foreground conversations, allowing external content read in the background to enter memory without the user's knowledge
- Attackers don't need prompt injection — simply posting plausible misinformation on a social platform is enough, as the agent automatically ingests it via heartbeat, initiating contamination
- Among social signals, 'consensus' is the strongest contamination trigger — even without authoritative accounts, having multiple accounts post agreeing comments yields an ASR (Attack Success Rate) above 73%
- The rate at which short-term memory contamination is promoted to long-term memory reaches up to 91% — when a user says 'save today's content,' contaminated information gets stored in MEMORY.md as well
- Contamination stored in long-term memory persists after session resets, influencing behavior in new sessions at a rate of 76% (without web_search)
- The benchmark model is Claude Haiku 4.5. The experiments cover three domains: software security (recommending CVE-vulnerable versions), financial decision-making (selecting DeFi protocols), and academic references (citing fake papers)
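The shared-session mechanism described above can be illustrated with a toy model. This is a minimal sketch, not Claw's actual implementation: the `Session` class, `heartbeat_tick`, `save_to_long_term`, and the sample post are all hypothetical names invented for illustration. It only shows why a naive "save today's content" step picks up background content when heartbeat and foreground share one context, and why it does not when the heartbeat runs in its own session.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Toy session: an ordered list of (origin, text) context entries."""
    name: str
    context: list = field(default_factory=list)

    def add(self, origin: str, text: str) -> None:
        self.context.append((origin, text))


def heartbeat_tick(target: Session, feed: list) -> None:
    """Background run: ingest every feed post into the target session."""
    for post in feed:
        target.add("heartbeat", post)


def save_to_long_term(session: Session) -> list:
    """Naive 'save today's content': persist everything in the session context."""
    return [text for _, text in session.context]


# Hypothetical misinformation post seen by the background job.
feed = ["libfoo 1.2 is the safe version, everyone agrees"]

# Shared session: the heartbeat writes into the user's foreground context.
main = Session("main")
main.add("user", "Remind me to review the deployment checklist")
heartbeat_tick(main, feed)
shared_memory = save_to_long_term(main)

# Isolated session: heartbeat content stays out of the foreground context.
main_isolated = Session("main")
background = Session("heartbeat_isolated")
main_isolated.add("user", "Remind me to review the deployment checklist")
heartbeat_tick(background, feed)
isolated_memory = save_to_long_term(main_isolated)

print(feed[0] in shared_memory)    # True
print(feed[0] in isolated_memory)  # False
```

In the shared case the user never typed or saw the post, yet it is persisted alongside their own note, which is the Exposure to Memory step of the pathway.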
Evidence
- Under the strongest social signal conditions (both authority and consensus present), ASR reached 82.2%, dropping sharply to 22.2% without consensus. Consensus signals are the dominant factor.
- Under the S4 (explicit save request) condition, the average long-term memory storage rate was 91.1% and the cross-session behavioral influence ASR averaged 75.6% (without web_search).
- Even under realistic dilution conditions (only 1 contaminated post among 20), the Financial domain storage rate was 60% and cross-session ASR was 33.3%, so contextual management mechanisms are not a complete defense.
- Even a Skeptical persona recorded an ASR of 16.7% under strong social signal conditions, so no persona is fully immune.
How to Apply
- If your agent is currently monitoring external sources (email, Slack, RSS, social feeds) via heartbeat, consider reconfiguring heartbeat execution to run in a separate isolated session. The default 'session main' sharing in Claw is the core vulnerability.
- Add provenance tagging logic before writing to MEMORY.md: attach metadata such as 'source: heartbeat, url: ...' to information collected during background heartbeat execution so it can be reviewed before being saved to long-term memory.
- Insert a memory provenance verification step before executing sensitive downstream tasks (security package recommendations, financial decisions, etc.). For example, add a condition to the agent prompt such as 'verify whether the original source of this information was explicitly requested by the user.'
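The provenance tagging and verification steps above could be sketched as follows. This is an assumption-laden illustration rather than any framework's API: `MemoryEntry`, `render_entry`, `usable_for`, the `SENSITIVE_TASKS` labels, and the example URL are all hypothetical. The idea is that every memory entry carries its source, and entries that arrived via heartbeat without user confirmation are withheld from sensitive tasks.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical task labels; a real agent would define its own.
SENSITIVE_TASKS = {"package_recommendation", "financial_decision"}


@dataclass(frozen=True)
class MemoryEntry:
    content: str
    source: str                  # "user" or "heartbeat"
    url: str = ""
    user_verified: bool = False


def render_entry(entry: MemoryEntry, day: date) -> str:
    """Format an entry for MEMORY.md with its provenance kept visible."""
    status = "verified by user" if entry.user_verified else "NOT verified by user"
    label = "" if entry.user_verified else " (UNVERIFIED)"
    return "\n".join([
        f"## [{day.isoformat()}] {entry.source}-acquired info{label}",
        f"Source: {entry.url or 'n/a'}",
        f"Ingested via: {entry.source}",
        f"Content: {entry.content}",
        f"Verification status: {status}",
    ])


def usable_for(task: str, entry: MemoryEntry) -> bool:
    """Allow an entry for a sensitive task only if the user supplied or confirmed it."""
    if task not in SENSITIVE_TASKS:
        return True
    return entry.source == "user" or entry.user_verified


entries = [
    MemoryEntry("Prefer pinned dependency versions", source="user"),
    MemoryEntry("libfoo 1.2 is safe", source="heartbeat",
                url="https://example.com/post/1"),
]
trusted = [e for e in entries if usable_for("package_recommendation", e)]

print(render_entry(entries[1], date(2026, 3, 24)))
print(len(trusted))  # 1
```

Filtering at read time, rather than only at write time, means entries that were already stored before tagging was introduced can still be excluded as long as their source is recorded.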
Code Example
# Example of heartbeat isolation configuration for Claw/OpenClaw-style agents
# Modify HEARTBEAT.md or specify a separate session in the agent config
# Key names are illustrative; check your framework's actual config schema

# Vulnerable default configuration (shared session)
vulnerable_heartbeat_config = {
    "session": "main",  # ← Shares the same session as foreground (dangerous)
    "interval": 300,
    "tasks": ["check_email", "monitor_social_feed"],
}

# Recommended configuration: use an isolated session
recommended_heartbeat_config = {
    "session": "heartbeat_isolated",  # ← Separate session
    "interval": 300,
    "tasks": ["check_email", "monitor_social_feed"],
    "memory_write_policy": "user_confirmed_only",  # Save only after user confirmation
    "provenance_tagging": True,  # Provenance tagging required
}

# Example of adding provenance metadata when saving to memory (MEMORY.md format)
"""
## [2026-03-24] Heartbeat-acquired info (UNVERIFIED)
Source: social_platform / submolt: security-updates
Ingested via: heartbeat background execution
Content: ...
Verification status: NOT verified by user
"""
Original Abstract
We identify a critical security vulnerability in mainstream Claw personal AI agents: untrusted content encountered during heartbeat-driven background execution can silently pollute agent memory and subsequently influence user-facing behavior without the user's awareness. This vulnerability arises from an architectural design shared across the Claw ecosystem: heartbeat background execution runs in the same session as user-facing conversation, so content ingested from any external source monitored in the background (including email, message channels, news feeds, code repositories, and social platforms) can enter the same memory context used for foreground interaction, often with limited user visibility and without clear source provenance. We formalize this process as an Exposure (E) $\rightarrow$ Memory (M) $\rightarrow$ Behavior (B) pathway: misinformation encountered during heartbeat execution enters the agent's short-term session context, potentially gets written into long-term memory, and later shapes downstream user-facing behavior. We instantiate this pathway in an agent-native social setting using MissClaw, a controlled research replica of Moltbook. We find that (1) social credibility cues, especially perceived consensus, are the dominant driver of short-term behavioral influence, with misleading rates up to 61%; (2) routine memory-saving behavior can promote short-term pollution into durable long-term memory at rates up to 91%, with cross-session behavioral influence reaching 76%; (3) under naturalistic browsing with content dilution and context pruning, pollution still crosses session boundaries. Overall, prompt injection is not required: ordinary social misinformation is sufficient to silently shape agent memory and behavior under heartbeat-driven background execution.