From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents
TL;DR Highlight
A 31-page survey that unifies LLM agent workflows as Agentic Computation Graphs (ACGs) and proposes a taxonomy of static vs. dynamic optimization (IBM + RPI joint research)
Who Should Read
Engineers and researchers designing and optimizing LLM agent systems; teams managing cost-quality tradeoffs in multi-agent workflows
Core Mechanics
- Introduces the Agentic Computation Graph (ACG), distinguishing reusable templates, run-specific realized graphs, and execution traces to enable unified comparison across methods
- Classifies methods as static (structure fixed before deployment) or dynamic (structure determined at runtime), using two descriptors: GDT (Graph Determination Time) and GPM (Graph Plasticity Mode)
- Practical recipe: start with a static scaffold and node-level optimization → add graph search when structural failures appear → use dynamic selection/generation for heterogeneous tasks → reserve in-execution editing for interactive environments
- Verifiers deliver the highest value when cheap and semantically meaningful — unit tests, schema checks, and executability checks are prime examples
- Workflow structure (edges, verifiers, branching logic) is a higher-leverage intervention than adding more agents
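The "cheap and semantically meaningful" verifiers above can be made concrete with a minimal sketch. The function names and the specific checks are illustrative choices, not taken from the survey; they show why schema and executability checks are cheap gates: both run in microseconds and need no LLM judge.

```python
import json


def schema_check(output: str, required_keys: set[str]) -> bool:
    """Cheap semantic verifier: is the output valid JSON with the expected fields?"""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and required_keys <= data.keys()


def executability_check(code: str) -> bool:
    """Cheap verifier for code-producing nodes: does the snippet at least parse?"""
    try:
        compile(code, "<agent-output>", "exec")
        return True
    except SyntaxError:
        return False


# Example: gate a workflow edge on these checks before passing output downstream.
ok = schema_check('{"answer": "42", "rationale": "..."}', {"answer", "rationale"})
bad = executability_check("def f(:")  # malformed code fails fast, no LLM judge needed
```

In a realized graph, such verifiers would sit on edges between nodes, routing failed outputs back for retry instead of propagating them.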
Evidence
- 39 core papers, 7 adjacent works, and 31 background resources, plus 27 workflow-evaluation assets, systematically catalogued
- AFlow, ADAS, DSPy, G-Designer, DyFlow, MetaGen, and others are aligned on a unified comparison card (GDT / GPM / feedback signal / update mechanism)
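The comparison card can be pictured as a small record type. A minimal sketch follows; the enum values and the placements of AFlow and DyFlow are illustrative guesses at how such a card might look, not the survey's actual assignments.

```python
from dataclasses import dataclass
from enum import Enum


class GDT(Enum):
    """Graph Determination Time: when the workflow structure is fixed."""
    PRE_DEPLOYMENT = "static"         # reusable scaffold fixed before deployment
    PER_RUN = "dynamic-selection"     # structure chosen or generated per input
    IN_EXECUTION = "dynamic-editing"  # structure revised while running


class GPM(Enum):
    """Graph Plasticity Mode: how much of the structure can change."""
    FIXED = "fixed"
    SELECTED = "selected-from-pool"
    GENERATED = "freely-generated"


@dataclass
class ComparisonCard:
    """One row of a survey-style comparison card (field names are illustrative)."""
    method: str
    gdt: GDT
    gpm: GPM
    feedback_signal: str   # e.g. task metric, verifier signal, trace-derived
    update_mechanism: str  # e.g. search, prompt optimization, LLM rewriting


# Hypothetical positioning of two methods on the card:
aflow = ComparisonCard("AFlow", GDT.PRE_DEPLOYMENT, GPM.GENERATED,
                       "task metric", "tree search over workflow variants")
dyflow = ComparisonCard("DyFlow", GDT.PER_RUN, GPM.GENERATED,
                        "verifier signal", "runtime planning")
```

Filling in one such card per method is what makes otherwise incomparable papers line up side by side.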
How to Apply
- Before designing an agent system, position your approach as static or dynamic using the ACG taxonomy; this prevents over-engineering with unnecessary dynamism
- When debugging failures, inspect traces first: structural failures (wrong node, wrong information path) require graph fixes, not prompt fixes
- Browse the IBM curated list at github.com/IBM/awesome-agentic-workflow-optimization for a quick survey of related work
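The trace-first debugging advice can be sketched as a crude triage function. The trace schema and helper names below are hypothetical, invented for illustration: the point is that comparing the realized path against the intended one separates structural failures from node-level ones.

```python
from dataclasses import dataclass


@dataclass
class TraceEvent:
    """One step of an execution trace (hypothetical schema)."""
    node: str              # which component ran
    inputs_from: list      # upstream nodes whose outputs it consumed
    succeeded: bool


def classify_failure(trace: list[TraceEvent], expected_path: list[str]) -> str:
    """Crude triage: did the run even visit the expected nodes in order?

    If the realized path diverges from the expected one, the failure is
    structural (fix the graph); otherwise suspect node-level issues (fix
    prompts or tools at the failing node).
    """
    realized = [e.node for e in trace]
    if realized != expected_path:
        return "structural: fix edges/routing, not prompts"
    failing = [e.node for e in trace if not e.succeeded]
    if failing:
        return f"node-level: inspect prompts/tools at {failing}"
    return "no failure detected in trace"
```

A real system would diff information paths (`inputs_from`) as well as node order, but even this coarse check routes debugging effort to the right layer.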
Terminology
Original Abstract
Large language model (LLM)-based systems are becoming increasingly popular for solving tasks by constructing executable workflows that interleave LLM calls, information retrieval, tool use, code execution, memory updates, and verification. This survey reviews recent methods for designing and optimizing such workflows, which we treat as agentic computation graphs (ACGs). We organize the literature based on when workflow structure is determined, where structure refers to which components or agents are present, how they depend on each other, and how information flows between them. This lens distinguishes static methods, which fix a reusable workflow scaffold before deployment, from dynamic methods, which select, generate, or revise the workflow for a particular run before or during execution. We further organize prior work along three dimensions: when structure is determined, what part of the workflow is optimized, and which evaluation signals guide optimization (e.g., task metrics, verifier signals, preferences, or trace-derived feedback). We also distinguish reusable workflow templates, run-specific realized graphs, and execution traces, separating reusable design choices from the structures actually deployed in a given run and from realized runtime behavior. Finally, we outline a structure-aware evaluation perspective that complements downstream task metrics with graph-level properties, execution cost, robustness, and structural variation across inputs. Our goal is to provide a clear vocabulary, a unified framework for positioning new methods, a more comparable view of the existing body of literature, and a more reproducible evaluation standard for future work in workflow optimization for LLM agents.