Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution

Mar 24, 2026 • Yechao Zhang, Shiqian Zhao, Jie Zhang +5

TL;DR Highlight

Simply by reading social feeds in the background, an AI agent can store misinformation in long-term memory, where it later shapes the agent's user-facing behavior without the user noticing.

Who Should Read

Developers deploying persistent agent frameworks like OpenClaw or MemGPT in production, and engineers responsible for agent security. Essential reading if your agent is configured to automatically monitor external sources such as email, Slack, or social feeds.

Core Mechanics

The heartbeat (periodic background execution) of Claw-family agents (OpenClaw, CoPaw, etc.) shares the same session context as foreground conversations, allowing external content read in the background to enter memory without the user's knowledge
Attackers don't need prompt injection — simply posting plausible misinformation on a social platform is enough, as the agent automatically ingests it via heartbeat, initiating contamination
Among social signals, 'consensus' is the strongest contamination trigger — even without authoritative accounts, having multiple accounts post agreeing comments yields an ASR (Attack Success Rate) above 73%
The rate at which short-term memory contamination is promoted to long-term memory reaches up to 91% — when a user says 'save today's content,' contaminated information gets stored in MEMORY.md as well
Contamination stored in long-term memory persists after session resets, influencing behavior in new sessions at a rate of 76% (without web_search)
The benchmark model is Claude Haiku 4.5, and the experiments cover three domains: software security (recommending CVE-vulnerable versions), financial decision-making (selecting DeFi protocols), and academic references (citing fake papers)
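
The shared-session mechanism described above can be illustrated with a toy model. This is a minimal sketch, not Claw's actual implementation: the `Session` class, the function names, and the example post are all hypothetical, and the "save" step is deliberately naive so the effect of session sharing is visible.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Toy session: an ordered list of (origin, text) context items."""
    name: str
    context: list = field(default_factory=list)


def heartbeat_tick(session: Session, feed_posts: list) -> None:
    """Background run: everything read from the feed lands in the session context."""
    for post in feed_posts:
        session.context.append(("heartbeat", post))


def user_says(session: Session, text: str) -> None:
    """Foreground turn: the user's message enters the session context."""
    session.context.append(("user", text))


def save_todays_content(session: Session, long_term_memory: list) -> None:
    """Naive 'save today's content': copies the whole context with no origin check."""
    for _origin, text in session.context:
        long_term_memory.append(text)


def run(shared: bool) -> list:
    """Return what ends up in long-term memory for a shared or isolated heartbeat."""
    main = Session("main")
    background = main if shared else Session("heartbeat_isolated")
    memory: list = []
    # Hypothetical planted post with a consensus cue.
    heartbeat_tick(background, ["libfoo 1.2 is the safe version (5 accounts agree)"])
    user_says(main, "Remind me to review the deployment checklist")
    save_todays_content(main, memory)
    return memory


print(run(shared=True))
print(run(shared=False))
```

With a shared session the planted post is saved alongside the user's own note; with an isolated heartbeat session only the user's note reaches memory.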

Evidence

"Under the strongest social signal conditions (both authority and consensus present), ASR reached 82.2%, dropping sharply to 22.2% without consensus — consensus signals are the dominant factor. Under the S4 (explicit save request) condition, the average long-term memory storage rate was 91.1% and the cross-session behavioral influence ASR averaged 75.6% (without web_search). Even under realistic dilution conditions (only 1 contaminated post among 20), the Financial domain storage rate was 60% and cross-session ASR was 33.3% — contextual management mechanisms are not a complete defense. Even a Skeptical persona recorded an ASR of 16.7% under strong social signal conditions — no persona is fully immune."

How to Apply

"If your agent is currently monitoring external sources (email, Slack, RSS, social feeds) via heartbeat, consider reconfiguring heartbeat execution to run in a separate isolated session — the default 'session main' sharing in Claw is the core vulnerability. Add provenance tagging logic before writing to MEMORY.md — attach metadata such as 'source: heartbeat, url: ...' to information collected during background heartbeat execution so it can be reviewed before being saved to long-term memory. Insert a memory provenance verification step before executing sensitive downstream tasks (security package recommendations, financial decisions, etc.) — add a condition to the agent prompt such as 'verify whether the original source of this information was explicitly requested by the user.'"

Code Example

# Example of heartbeat isolation configuration for Claw/OpenClaw-style agents.
# Modify HEARTBEAT.md or specify a separate session in the agent config.
# NOTE: the key names and values below are illustrative; check your framework's
# documentation for the options it actually supports.

# Vulnerable default configuration (shared session)
vulnerable_heartbeat_config = {
    "session": "main",  # ← Shares the same session as foreground (dangerous)
    "interval": 300,
    "tasks": ["check_email", "monitor_social_feed"],
}

# Recommended configuration: use an isolated session
isolated_heartbeat_config = {
    "session": "heartbeat_isolated",  # ← Separate session
    "interval": 300,
    "tasks": ["check_email", "monitor_social_feed"],
    "memory_write_policy": "user_confirmed_only",  # Save only after user confirmation
    "provenance_tagging": True,  # Provenance tagging required
}

# Example of adding provenance metadata when saving to memory (MEMORY.md format)
MEMORY_ENTRY_EXAMPLE = """
## [2026-03-24] Heartbeat-acquired info (UNVERIFIED)
Source: social_platform / submolt: security-updates
Ingested via: heartbeat background execution
Content: ...
Verification status: NOT verified by user
"""
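
The MEMORY.md entry format shown above can be generated and checked programmatically. This is a small sketch, assuming that exact field layout; the helper names `render_heartbeat_entry` and `is_unverified` are hypothetical and not part of any Claw tooling.

```python
from datetime import date


def render_heartbeat_entry(day: date, source: str, content: str,
                           verified: bool = False) -> str:
    """Render one heartbeat-acquired item as a provenance-tagged MEMORY.md block."""
    label = "" if verified else " (UNVERIFIED)"
    status = "verified by user" if verified else "NOT verified by user"
    return (
        f"## [{day.isoformat()}] Heartbeat-acquired info{label}\n"
        f"Source: {source}\n"
        "Ingested via: heartbeat background execution\n"
        f"Content: {content}\n"
        f"Verification status: {status}\n"
    )


def is_unverified(block: str) -> bool:
    """True if the block still carries the 'NOT verified' status line."""
    return "Verification status: NOT verified by user" in block


entry = render_heartbeat_entry(
    date(2026, 3, 24),
    "social_platform / submolt: security-updates",
    "example claim read during heartbeat",  # placeholder content
)
print(entry)
```

A check like `is_unverified` gives downstream code a simple way to skip or flag entries that no user has reviewed yet.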

Terminology

heartbeat: A mechanism by which an agent periodically wakes up without a user request to automatically check email, social feeds, and more. Similar to how a smartphone checks for notifications in the background.

persistent agent: An AI agent that retains memory and state even when a conversation is interrupted. Unlike a typical chatbot that starts fresh each time, it remembers previous session content and operates continuously.

short-term memory: The agent's working context, maintained only within the current session. Conceptually like a session cookie that disappears when the browser session ends.

long-term memory: Persistent memory saved to a file (such as MEMORY.md) that persists after a session ends and is referenced in future sessions. Similar to how writing in a diary lets you remember things later.

ASR (Attack Success Rate): The rate at which an attack succeeds, that is, the proportion of cases in which the agent believed the planted misinformation and acted on it.

E→M→B pathway: The attack chain of Exposure → Memory (storage) → Behavior (influence). Misinformation is shown → stored in memory → influences later behavior.

prompt injection: An attack that hides malicious instructions in ordinary text to trick an AI model into following them. This paper demonstrates that contamination is possible even without such explicit attacks.

provenance: The origin and path of information, recorded as metadata that lets you trace where a piece of information came from. Without it, fake information collected in the background can masquerade as the agent's 'own knowledge.'

Original Abstract

We identify a critical security vulnerability in mainstream Claw personal AI agents: untrusted content encountered during heartbeat-driven background execution can silently pollute agent memory and subsequently influence user-facing behavior without the user's awareness. This vulnerability arises from an architectural design shared across the Claw ecosystem: heartbeat background execution runs in the same session as user-facing conversation, so content ingested from any external source monitored in the background (including email, message channels, news feeds, code repositories, and social platforms) can enter the same memory context used for foreground interaction, often with limited user visibility and without clear source provenance. We formalize this process as an Exposure (E) $\rightarrow$ Memory (M) $\rightarrow$ Behavior (B) pathway: misinformation encountered during heartbeat execution enters the agent's short-term session context, potentially gets written into long-term memory, and later shapes downstream user-facing behavior. We instantiate this pathway in an agent-native social setting using MissClaw, a controlled research replica of Moltbook. We find that (1) social credibility cues, especially perceived consensus, are the dominant driver of short-term behavioral influence, with misleading rates up to 61%; (2) routine memory-saving behavior can promote short-term pollution into durable long-term memory at rates up to 91%, with cross-session behavioral influence reaching 76%; (3) under naturalistic browsing with content dilution and context pruning, pollution still crosses session boundaries. Overall, prompt injection is not required: ordinary social misinformation is sufficient to silently shape agent memory and behavior under heartbeat-driven background execution.