LLM-Based Multi-Agent Systems: A Survey of Current Status and Challenges

Large Language Model based Multi-Agents: A Survey of Progress and Challenges

Jan 21, 2024 • Taicheng Guo, Xiuying Chen, Yaqi Wang +5

TL;DR Highlight

A survey that lays out, in one place, the architecture, communication patterns, and applications of multi-agent systems in which multiple LLMs cooperate.

Who Should Read

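Backend and AI developers who are adopting or designing with multi-agent frameworks such as AutoGen, MetaGPT, or CrewAI. Especially useful for teams extending an LLM agent from a single-agent to a multi-agent architecture.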

Core Mechanics

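Multi-agent systems can be classified along four axes: the agent-environment interface (Sandbox, Physical, or None), agent profiling, communication structure, and capability acquisition.
Communication structures fall into four types: Layered, Decentralized, Centralized, and Shared Message Pool. MetaGPT uses the Shared Message Pool approach.
Agent roles (profiles) are assigned in one of three ways: Pre-defined (specified by the designer), Model-Generated (produced automatically by an LLM), or Data-Derived (built from a dataset).
Agents improve their capabilities through three strategies: Memory (storing and retrieving past records), Self-Evolution (revising their own goals and strategies), and Dynamic Generation (creating new agents at run time).
Applications divide broadly into Problem Solving (software development, robotics, science experiments, debate) and World Simulation (society, economy, games, policy, disease propagation).
Key challenges: hallucinations propagating between agents, optimizing collective intelligence, and the compute cost and orchestration complexity of running many GPT-4-class LLMs.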
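
To make the Shared Message Pool structure more concrete, here is a minimal in-memory sketch of the publish/subscribe idea. It is not MetaGPT's implementation; the class names (`Message`, `SharedMessagePool`), the topic strings, and the agent names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Message:
    topic: str    # hypothetical topic label, e.g. "requirements" or "code"
    sender: str   # name of the publishing agent
    content: str


@dataclass
class SharedMessagePool:
    """Hypothetical pool: agents publish here and read only subscribed topics."""
    messages: list = field(default_factory=list)
    subscriptions: dict = field(default_factory=lambda: defaultdict(set))

    def subscribe(self, agent: str, topic: str) -> None:
        self.subscriptions[agent].add(topic)

    def publish(self, message: Message) -> None:
        self.messages.append(message)

    def read(self, agent: str) -> list:
        # Return messages on the agent's subscribed topics, excluding its own.
        topics = self.subscriptions[agent]
        return [m for m in self.messages if m.topic in topics and m.sender != agent]


pool = SharedMessagePool()
pool.subscribe("Engineer", "requirements")
pool.subscribe("Tester", "code")
pool.publish(Message("requirements", "PM", "Build a CLI todo app"))
pool.publish(Message("code", "Engineer", "def add_task(title): ..."))

engineer_inbox = pool.read("Engineer")  # sees only the PM's requirements
tester_inbox = pool.read("Tester")      # sees only the Engineer's code
print([m.content for m in engineer_inbox])
print([m.content for m in tester_inbox])
```

The point of the design is that no agent addresses another directly: each one filters the shared pool by subscription, which keeps agents from being flooded with messages they do not need.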

Evidence

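The S3 system simulated social networks of 8,563 and 17,945 agents, reproducing how public opinion on gender discrimination and nuclear energy spreads.
Agent4Rec used 1,000 generative agents built from MovieLens-1M data to reproduce real users' recommendation behavior and the filter-bubble effect.
MetaGPT evaluates multi-agent collaborative code generation on the HumanEval and MBPP benchmarks, and encodes SOPs to reduce hallucination structurally.
Multi-agent Debate (Du et al., 2023) showed experimentally, on six reasoning and factuality tasks including GSM8K and StrategyQA, improved factuality over a single agent.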

How to Apply

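To automate software development, use MetaGPT or AutoGen to arrange PM → developer → tester roles in a Layered communication structure; this allows more complex task decomposition and verification than a single agent can manage.
To raise the accuracy of LLM answers, apply the Debate paradigm, in which several agents rebut one another until they reach consensus. It can be effective for accuracy-critical tasks such as medical diagnosis, code review, and math problem solving.
When you need user-behavior simulation or a substitute for A/B testing, generate agent profiles from real data (the Data-Derived approach) and simulate their interactions in a Sandbox environment, which lets you validate before an actual deployment.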
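
As a rough illustration of Data-Derived profiling, the sketch below aggregates a tiny, made-up rating log into per-user profiles and turns each profile into a system message for a simulated user agent. The data, the field names, and the helpers `build_profiles` and `profile_to_system_message` are assumptions for this example and are not taken from the survey or from Agent4Rec.

```python
from collections import Counter, defaultdict

# Hypothetical interaction log: (user_id, genre, rating on a 1-5 scale).
ratings = [
    ("u1", "sci-fi", 5), ("u1", "sci-fi", 4), ("u1", "romance", 2),
    ("u2", "romance", 5), ("u2", "drama", 4), ("u2", "sci-fi", 1),
]


def build_profiles(rows, like_threshold=4):
    """Summarize each user's liked genres and average rating."""
    liked = defaultdict(Counter)   # genres rated at or above the threshold
    scores = defaultdict(list)     # every rating the user gave
    for user, genre, rating in rows:
        scores[user].append(rating)
        if rating >= like_threshold:
            liked[user][genre] += 1
    return {
        user: {
            "favorite_genres": [g for g, _ in liked[user].most_common(2)],
            "avg_rating": round(sum(values) / len(values), 2),
        }
        for user, values in scores.items()
    }


def profile_to_system_message(user, profile):
    """Render a profile as a prompt for a simulated-user agent."""
    genres = ", ".join(profile["favorite_genres"]) or "no strong preference"
    return (
        f"You are simulated user {user}. You tend to enjoy: {genres}. "
        f"Your average rating is {profile['avg_rating']} out of 5."
    )


profiles = build_profiles(ratings)
prompts = {u: profile_to_system_message(u, p) for u, p in profiles.items()}
print(prompts["u1"])
```

Each generated prompt could then be passed as the `system_message` of an agent in whichever framework you use, so that the simulated population mirrors the distribution in the source data.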

Code Example

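# Example: a simple multi-agent debate built with AutoGen
# (written against the pyautogen 0.2-style API; newer AutoGen releases use a different package layout)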
import autogen

config_list = [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]

llm_config = {"config_list": config_list}

# Define agent roles (Pre-defined profiling)
proponent = autogen.AssistantAgent(
    name="Proponent",
    system_message="You are a debater who supports the given claim. Defend it with evidence.",
    llm_config=llm_config,
)

opponent = autogen.AssistantAgent(
    name="Opponent",
    system_message="You are a debater who opposes the given claim. Rebut it logically.",
    llm_config=llm_config,
)

judge = autogen.AssistantAgent(
    name="Judge",
    system_message="You are an impartial judge. Listen to both sides and state the final point of consensus.",
    llm_config=llm_config,
)

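# The user proxy only posts the topic; it never asks a human and never runs code.
user_proxy = autogen.UserProxyAgent(
    name="UserProxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=6,
    code_execution_config=False,  # disable code execution (otherwise Docker may be required)
)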

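# Start the debate (Centralized structure: the GroupChatManager relays every message)
# With round_robin order, 4 rounds should cover the topic message plus one turn each
# for Proponent, Opponent, and Judge, so the chat ends on the Judge's verdict.
groupchat = autogen.GroupChat(
    agents=[user_proxy, proponent, opponent, judge],
    messages=[],
    max_round=4,
    speaker_selection_method="round_robin",
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(
    manager,
    message="Debate topic: Can LLMs replace human doctors?"
)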
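
The AutoGen example needs an API key to run. To see the debate mechanics in isolation, the following framework-free sketch replaces each LLM call with a deterministic stub. The stub's rule (adopt the peers' answer only when a strict majority of peers agree) is an assumption made for this illustration and is not the method of Du et al.; `make_stub_agent` and `debate` are hypothetical names.

```python
from collections import Counter


def make_stub_agent(initial_answer):
    """Stand-in for an LLM call. Adopts the peers' most common answer only
    if a strict majority of peers gave it; otherwise keeps its own answer."""
    def respond(question, own_answer, peer_answers):
        if own_answer is None:          # first round: no peers to read yet
            return initial_answer
        answer, votes = Counter(peer_answers).most_common(1)[0]
        return answer if votes > len(peer_answers) / 2 else own_answer
    return respond


def debate(question, agents, rounds=2):
    """Run several revision rounds, then return the majority answer and the history."""
    answers = [agent(question, None, []) for agent in agents]
    history = [list(answers)]
    for _ in range(rounds):
        # Every agent sees the previous round's answers from all other agents.
        answers = [
            agent(question, answers[i], answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
        history.append(list(answers))
    final = Counter(answers).most_common(1)[0][0]
    return final, history


agents = [make_stub_agent("18"), make_stub_agent("18"), make_stub_agent("20")]
final, history = debate("What is 3 * 6?", agents)
print(final)
print(history)
```

In a real system each stub would be replaced by a model call whose prompt includes the other agents' previous answers; the loop structure and the final aggregation step stay the same.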

Terminology

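Multi-Agent System: A system in which several AI agents each take on a role and cooperate or debate. It resembles a company where a PM, developers, and QA divide the work among themselves.

Agent Profiling: The process of giving each agent a specific role, personality, and area of expertise. It is similar to giving an actor a character brief.

Hallucination: The phenomenon of an LLM confidently generating content that is not true. In a multi-agent setting, one agent's false information can propagate to other agents.

Shared Message Pool: A communication pattern in which agents post messages to a common board and subscribe only to the messages they need. It works like a Slack channel where everyone posts and each person reads only what interests them.

Self-Evolution: An agent's ability to revise its own goals or strategies in response to feedback. It is similar to a person changing their approach after a mistake.

Theory of Mind (ToM): The ability to infer the intentions or thoughts of another agent (or person), that is, to understand why that agent acted the way it did.

SOP (Standard Operating Procedure): A standard work procedure. MetaGPT encodes SOPs into prompts to guide agents toward systematic collaboration.

Collective Intelligence: The intelligence of a group of agents as a whole, exceeding the sum of the individual agents' abilities. Like an ant colony building a complex structure, it is problem-solving ability beyond the level of any single individual.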

Original Abstract

Large Language Models (LLMs) have achieved remarkable success across a wide array of tasks. Due to their notable capabilities in planning and reasoning, LLMs have been utilized as autonomous agents for the automatic execution of various tasks. Recently, LLM-based agent systems have rapidly evolved from single-agent planning or decision-making to operating as multi-agent systems, enhancing their ability in complex problem-solving and world simulation. To offer an overview of this dynamic field, we present this survey to offer an in-depth discussion on the essential aspects and challenges of LLM-based multi-agent (LLM-MA) systems. Our objective is to provide readers with an in-depth understanding of these key points: the domains and settings where LLM-MA systems operate or simulate; the profiling and communication methods of these agents; and the means by which these agents develop their skills. For those interested in delving into this field, we also summarize the commonly used datasets or benchmarks. To keep researchers updated on the latest studies, we maintain an open-source GitHub repository (github.com/taichengguo/LLM_MultiAgents_Survey_Papers), dedicated to outlining the research of LLM-MA research.