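Hermes + AutoGen: Using Multiple Agents to Collaborate on Complex Tasks

However capable a single model is, it has limits. Ask one model to act as programmer, product manager and test engineer all at once, and it usually plays none of the roles well.

But what if several agents each take one role, then discuss and check each other's work? That is the idea behind multi-agent collaboration, and Microsoft's AutoGen framework is built for exactly this.

In this article we use AutoGen with a local Hermes model to build a multi-agent collaboration system.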

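AutoGen's core concepts

Before getting hands-on, a few key concepts are worth understanding:

Agent: an AI entity with a specific role and set of capabilities. Each agent has its own system prompt, the tools it may use, and its own behavior pattern.

ConversableAgent: the most basic agent type in AutoGen, able to take part in a conversation. The name comes from the older 0.2-style API; the autogen-agentchat code in this article does not use it directly.

AssistantAgent: a preconfigured assistant agent, driven by an LLM by default.

UserProxyAgent: an agent that stands in for the user. It can let a real person step in, and in the older 0.2-style API it can also execute code and call tools.

GroupChat: several agents discussing a problem in the same chat group, much like a group chat. In the code below this role is played by the team classes RoundRobinGroupChat and SelectorGroupChat.

Task: the job the agents need to complete.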

Environment setup

Install AutoGen:

pip install autogen-agentchat "autogen-ext[openai]"

Make sure Hermes is running locally, through either Ollama or vLLM. This article uses Ollama.

ollama pull hermes3:8b
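Before wiring up any agents, it can help to confirm that the model was actually pulled. The sketch below is a minimal check that assumes Ollama's native `/api/tags` endpoint on the default port 11434; the helper names (`model_names`, `is_model_available`, `fetch_tags`) are made up for this illustration and are not part of AutoGen or Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_payload: dict) -> list:
    """Extract model names from the JSON document returned by /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def is_model_available(tags_payload: dict, wanted: str) -> bool:
    """Return True if `wanted` (e.g. "hermes3:8b") is among the local models."""
    return wanted in model_names(tags_payload)


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> dict:
    """Query the running Ollama server; raises URLError if it is unreachable."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `is_model_available(fetch_tags(), "hermes3:8b")` should return True once the pull has finished.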

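Basics: a two-party conversation

The simplest scenario: the task you pass in plays the user's side, and a single assistant agent replies until it considers the job done: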

import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Configure the model client to point at the local Ollama server
model_client = OpenAIChatCompletionClient(
    model="hermes3:8b",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "family": "unknown"
    }
)

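# Create the assistant agent
assistant = AssistantAgent(
    name="coder",
    model_client=model_client,
    system_message="""You are an experienced Python developer.
Your job is to write high-quality code that meets the user's requirements.
Code requirements:
- Clear comments
- Error handling
- PEP 8 compliant
When you consider the task complete, append TERMINATE to the end of your reply.""",
)

# Termination condition: stop once a message contains TERMINATE
termination = TextMentionTermination("TERMINATE")

# Create the team
team = RoundRobinGroupChat(
    participants=[assistant],
    termination_condition=termination,
)

# Run
async def main():
    result = await team.run(
        task="Write a Python function that implements an LRU cache supporting get and put operations in O(1) time"
    )
    print(result)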

asyncio.run(main())
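Printing the whole result object is noisy, and the final reply still carries the TERMINATE marker. A small helper like the one below can tidy that up. It is a sketch only: `strip_marker` is a hypothetical name, and it assumes the last message's content is a plain string ending with the marker.

```python
def strip_marker(text: str, marker: str = "TERMINATE") -> str:
    """Remove a trailing termination marker and any whitespace around it."""
    cleaned = text.rstrip()
    if cleaned.endswith(marker):
        # Only a marker at the very end is removed; earlier mentions are kept.
        cleaned = cleaned[: -len(marker)].rstrip()
    return cleaned
```

Inside `main()` this could be used as `print(strip_marker(result.messages[-1].content))`. The result object also exposes a `stop_reason` field that says why the run ended.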

Going further: multi-agent collaboration

Getting several roles to work together is where AutoGen really shines.

Code development + review workflow

import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination, MaxMessageTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="hermes3:8b",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "family": "unknown"
    }
)

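# Programmer agent
programmer = AssistantAgent(
    name="programmer",
    model_client=model_client,
    system_message="""You are a Python programmer.
Your job is to write code that meets the requirements.
When you receive code review feedback, revise the code and resubmit it.
The code must be complete and directly runnable.""",
)

# Code reviewer agent
reviewer = AssistantAgent(
    name="reviewer",
    model_client=model_client,
    system_message="""You are a strict code reviewer.
Your job is to review the code written by programmer and check:
1. Logical correctness
2. Edge-case handling
3. Code style and readability
4. Performance issues
5. Security risks

If the code has problems, point them out specifically and suggest fixes.
If the code is fine or has reached acceptable quality, reply APPROVE with your reasoning and end with TERMINATE""",
)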

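# Termination: a message mentions TERMINATE, or the conversation reaches 10 messages
termination = TextMentionTermination("TERMINATE") | MaxMessageTermination(10)

# Round-robin team chat
team = RoundRobinGroupChat(
    participants=[programmer, reviewer],
    termination_condition=termination,
)

async def main():
    result = await team.run(
        task="Write a thread-safe singleton implementation that supports lazy initialization and constructor arguments"
    )

    # Print the full conversation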
    for msg in result.messages:
        print(f"\n{'='*60}")
        print(f"[{msg.source}]:")
        print(msg.content)

asyncio.run(main())

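When you run it, the flow looks like this:

1. programmer writes the first version of the code
2. reviewer reviews it and raises issues
3. programmer revises based on the feedback
4. reviewer reviews again
5. This repeats until reviewer is satisfied or the message limit is reached

The whole process is automated and needs no human intervention.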
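To make the turn-taking and the two stop conditions concrete, here is a toy simulation in plain Python. It is not AutoGen's implementation, only a simplified model of it: the "agents" are scripted functions rather than model calls, and it assumes that the initial task message counts toward the message limit.

```python
from itertools import cycle


def run_round_robin(task, agents, stop_word="TERMINATE", max_messages=10):
    """Simulate a fixed-order chat; returns (history, stop_reason)."""
    if not agents:
        raise ValueError("at least one agent is required")
    history = [("user", task)]          # the task is the first message
    for name in cycle(agents):          # programmer, reviewer, programmer, ...
        if len(history) >= max_messages:
            return history, "max_messages"
        text = agents[name](history)    # stand-in for one model call
        history.append((name, text))
        if stop_word in text:
            return history, "text_mention"


def scripted(replies):
    """Build a fake agent that returns the given replies in order."""
    it = iter(replies)
    return lambda history: next(it)


demo_agents = {
    "programmer": scripted(["code v1", "code v2"]),
    "reviewer": scripted(["missing lock around init", "APPROVE TERMINATE"]),
}
demo_history, demo_reason = run_round_robin("thread-safe singleton", demo_agents)
for speaker, text in demo_history:
    print(f"[{speaker}] {text}")
print("stopped because:", demo_reason)
```

In this run the reviewer's second reply contains the stop word, so the chat ends after five messages, well before the limit of ten.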

Three-role collaboration: product manager + developer + tester

import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination, MaxMessageTermination
from autogen_agentchat.teams import SelectorGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="hermes3:8b",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "family": "unknown"
    }
)

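# Product manager
pm = AssistantAgent(
    name="product_manager",
    model_client=model_client,
    system_message="""You are the product manager. Your responsibilities:
1. Analyze the user's requirements and break them into concrete features
2. Define acceptance criteria
3. Coordinate the work of the developer and the tester
4. Confirm that the final output meets the requirements

First break the requirements into clear features and acceptance criteria, then hand them to the developer.""",
)

# Developer
dev = AssistantAgent(
    name="developer",
    model_client=model_client,
    system_message="""You are a full-stack developer. Your responsibilities:
1. Write code according to the product manager's requirements document
2. The code must be complete and runnable
3. Fix bugs based on the tester's feedback
4. Explain the reasons behind your technical choices""",
)

# Test engineer
tester = AssistantAgent(
    name="tester",
    model_client=model_client,
    system_message="""You are the test engineer. Your responsibilities:
1. Design test cases from the product manager's acceptance criteria
2. Review the developer's code and find bugs
3. Verify that the fixed code passes the tests
4. When all tests pass and the acceptance criteria are met, reply ALL_TESTS_PASSED TERMINATE""",
)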

termination = TextMentionTermination("TERMINATE") | MaxMessageTermination(15)

# Use SelectorGroupChat so the model decides who speaks next
team = SelectorGroupChat(
    participants=[pm, dev, tester],
    model_client=model_client,
    termination_condition=termination,
)

async def main():
    result = await team.run(
        task="Build a command-line to-do manager that can add, delete, mark as done and list all items, storing its data in a local JSON file"
    )

    for msg in result.messages:
        print(f"\n[{msg.source}]:")
        print(msg.content[:500])  # Truncate to 500 characters to keep the output short

asyncio.run(main())

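SelectorGroupChat differs from RoundRobinGroupChat in one way: the latter has agents speak in a fixed rotation, while the former lets the model decide who speaks when. This is closer to how a real team works: the product manager speaks when requirements need discussing, the developer takes over when code needs writing, and the tester steps in when it is time to test.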
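A small local model may not always pick a sensible next speaker. As far as I know, SelectorGroupChat accepts an optional `selector_func` argument: a function that receives the messages so far and returns an agent name, or None to fall back to model-based selection. The sketch below pins down two obvious hand-offs and leaves the rest to the model. The function name `pick_next_speaker` and the rules themselves are illustrative assumptions, and the demo uses stand-in message objects rather than real AutoGen messages.

```python
from types import SimpleNamespace
from typing import Optional, Sequence


def pick_next_speaker(messages: Sequence) -> Optional[str]:
    """Return the next speaker's name, or None to let the model choose."""
    if not messages:
        return None
    last = messages[-1].source
    if last == "user":
        return "product_manager"   # a new task always starts with the PM
    if last == "developer":
        return "tester"            # code is always followed by testing
    return None                    # everything else: defer to the model


# Stand-in messages; real ones would come from the running team.
demo = [SimpleNamespace(source="user", content="build a to-do CLI")]
print(pick_next_speaker(demo))  # product_manager
```

It would be passed when building the team, for example `SelectorGroupChat(participants=[pm, dev, tester], model_client=model_client, termination_condition=termination, selector_func=pick_next_speaker)`.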

Giving agents tools

Agents can do more than chat; they can also perform real actions:

import asyncio
import subprocess
import sys
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_core.tools import FunctionTool
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="hermes3:8b",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "family": "unknown"
    }
)

# Define the tools
def execute_python(code: str) -> str:
    """Run Python code in a subprocess and return its output."""
    try:
        result = subprocess.run(
            # sys.executable is the interpreter running this script,
            # which avoids depending on a "python" command being on PATH
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=30
        )
        output = result.stdout
        if result.returncode != 0:
            output += f"\nError: {result.stderr}"
        return output if output else "Code ran successfully with no output"
    except subprocess.TimeoutExpired:
        return "Execution timed out (30-second limit)"
    except Exception as e:
        return f"Execution failed: {str(e)}"

def read_file(filepath: str) -> str:
    """Read and return the contents of a file."""
    try:
        with open(filepath, "r", encoding="utf-8") as f:
            return f.read()
    except Exception as e:
        return f"Read failed: {str(e)}"

def write_file(filepath: str, content: str) -> str:
    """Write content to a file, overwriting it if it exists."""
    try:
        with open(filepath, "w", encoding="utf-8") as f:
            f.write(content)
        return f"File written: {filepath}"
    except Exception as e:
        return f"Write failed: {str(e)}"

# Wrap the functions as AutoGen tools
python_tool = FunctionTool(execute_python, description="Run Python code and return the result")
read_tool = FunctionTool(read_file, description="Read the contents of the file at the given path")
write_tool = FunctionTool(write_file, description="Write content to the file at the given path")

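# Create an agent that has tools
coding_agent = AssistantAgent(
    name="coding_agent",
    model_client=model_client,
    tools=[python_tool, read_tool, write_tool],
    system_message="""You are a coding assistant that can execute code.
You can:
1. Write Python code and run it
2. Read and write files
3. Debug the code based on the execution results

Workflow: write the code, run it and check the result, fix any errors, and repeat until it runs correctly.
Reply TERMINATE when the task is complete""",
)

termination = TextMentionTermination("TERMINATE")

team = RoundRobinGroupChat(
    participants=[coding_agent],
    termination_condition=termination,
)

async def main():
    result = await team.run(
        task="Write a script that counts the total number of lines in all .py files in the current directory and lists them sorted by file size"
    )
    for msg in result.messages:
        print(f"\n[{msg.source}]: {msg.content[:300]}")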

asyncio.run(main())
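The file tools above will read or overwrite any path the model asks for, which is risky when the model decides the arguments. One possible safeguard, sketched here, is to confine file access to a single workspace directory. The helper name `resolve_inside` and the idea of a workspace folder are assumptions added for illustration, not something AutoGen provides.

```python
from pathlib import Path


def resolve_inside(base_dir: str, filepath: str) -> Path:
    """Resolve `filepath` relative to `base_dir`, rejecting anything outside it."""
    base = Path(base_dir).resolve()
    # An absolute `filepath` replaces `base` here and is then caught below.
    target = (base / filepath).resolve()
    if target != base and base not in target.parents:
        raise ValueError(f"path escapes workspace: {filepath}")
    return target


def safe_read_file(base_dir: str, filepath: str) -> str:
    """Variant of read_file that only reads inside the workspace."""
    try:
        return resolve_inside(base_dir, filepath).read_text(encoding="utf-8")
    except Exception as e:
        return f"Read failed: {e}"
```

The same check can guard `write_file`. Running model-written code through `execute_python` carries a similar risk, so a container or a throwaway virtual machine is worth considering for anything beyond experiments.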

Case study: automated technical documentation

Now let's apply multi-agent collaboration to a practical scenario:

import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination, MaxMessageTermination
from autogen_agentchat.teams import SelectorGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="hermes3:8b",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "family": "unknown"
    }
)

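# Researcher: gathers information and organizes the key points
researcher = AssistantAgent(
    name="researcher",
    model_client=model_client,
    system_message="""You are a technical researcher. Your tasks:
1. List the topics that need to be covered for the subject
2. Provide accurate technical details for each topic
3. Produce a clear document outline
4. Make sure the information is accurate and complete""",
)

# Writer: drafts the content
writer = AssistantAgent(
    name="writer",
    model_client=model_client,
    system_message="""You are a technical writer. Your tasks:
1. Write from the outline and key points provided by the researcher
2. Use plain, easy-to-understand language
3. Include practical code examples
4. Keep the structure clear and the paragraphs short""",
)

# Editor: reviews and proofreads
editor = AssistantAgent(
    name="editor",
    model_client=model_client,
    system_message="""You are a technical editor. Your tasks:
1. Check the document's technical accuracy
2. Check its logical coherence
3. Improve wording and layout
4. Give specific revision suggestions
When the document meets the quality bar, reply APPROVED TERMINATE""",
)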

termination = TextMentionTermination("TERMINATE") | MaxMessageTermination(12)

team = SelectorGroupChat(
    participants=[researcher, writer, editor],
    model_client=model_client,
    termination_condition=termination,
)

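async def main():
    result = await team.run(
        task="Write a technical tutorial on Python decorators for developers who know basic Python but are new to decorators"
    )

    # Print the full conversation, ending with the final result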
    for msg in result.messages:
        print(f"\n{'='*60}")
        print(f"[{msg.source}]:")
        print(msg.content)

asyncio.run(main())
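The loop above prints every message, but in practice you usually want only the finished article, which is the writer's most recent draft rather than the editor's closing remark. A helper along these lines can pull it out. It is a sketch: `last_text_from` is a hypothetical name, and it relies only on messages having `source` and `content` attributes, as the printing loop above already does.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def last_text_from(messages: Iterable, source: str) -> Optional[str]:
    """Return the content of the most recent text message sent by `source`."""
    for msg in reversed(list(messages)):
        content = getattr(msg, "content", None)
        # Skip anything that is not plain text, such as tool-call events.
        if getattr(msg, "source", None) == source and isinstance(content, str):
            return content
    return None


# Stand-in conversation for demonstration only.
demo_messages = [
    SimpleNamespace(source="writer", content="draft 1"),
    SimpleNamespace(source="editor", content="tighten the intro"),
    SimpleNamespace(source="writer", content="draft 2"),
    SimpleNamespace(source="editor", content="APPROVED TERMINATE"),
]
print(last_text_from(demo_messages, "writer"))  # draft 2
```

In `main()` this would be called as `last_text_from(result.messages, "writer")`.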

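Performance and cost considerations

Running multiple agents on a local model has one drawback: every time an agent speaks, that is one model inference. Three agents discussing for 10 rounds means 30 inference calls, and SelectorGroupChat typically adds further model calls to choose each next speaker.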
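The arithmetic is simple enough to put in a helper when comparing team layouts. The function below is a rough estimate under stated assumptions: every agent speaks once per round, and a selector-based team spends one extra model call per turn on choosing the speaker. Tool use and retries are ignored, and `estimate_calls` is a made-up name.

```python
def estimate_calls(num_agents: int, rounds: int, selector: bool = False) -> int:
    """Rough count of model calls for a chat of `rounds` full rounds."""
    agent_calls = num_agents * rounds               # one call per agent turn
    selector_calls = agent_calls if selector else 0  # assumed: one pick per turn
    return agent_calls + selector_calls


print(estimate_calls(3, 10))                 # 30
print(estimate_calls(3, 10, selector=True))  # 60
```

Under these assumptions, moving the same three agents from round-robin to selector-based chat doubles the number of inferences.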

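A few optimization tips:

1. Limit the length of the conversation

termination = MaxMessageTermination(8) # at most 8 messages

2. Trim the system prompt

The system prompt is sent with every call, so the shorter it is, the fewer resources it uses. Drop the decorative descriptions and keep only the core instructions.

3. Cut unnecessary agents

If two agents can do the job, don't use three. Conversation quality does not scale with the number of agents.

4. Use a faster inference backend

Ollama is good enough but not the fastest. For heavy use, switching to vLLM can give a noticeable speedup.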
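Because both backends expose an OpenAI-compatible endpoint, switching should mostly come down to changing the client settings. The sketch below keeps those settings in one place. The vLLM values are assumptions: port 8000 is vLLM's default, "EMPTY" is a placeholder key, and the model name must match whatever model your vLLM server was started with (the Hugging Face id shown is only an example).

```python
# Connection settings per backend; the vLLM entry is an assumed example.
BACKENDS = {
    "ollama": {
        "model": "hermes3:8b",
        "base_url": "http://localhost:11434/v1",
        "api_key": "ollama",
    },
    "vllm": {
        "model": "NousResearch/Hermes-3-Llama-3.1-8B",  # must match the served model
        "base_url": "http://localhost:8000/v1",
        "api_key": "EMPTY",
    },
}

# Same capability flags as used throughout this article.
MODEL_INFO = {
    "vision": False,
    "function_calling": True,
    "json_output": True,
    "family": "unknown",
}


def client_kwargs(backend: str) -> dict:
    """Build keyword arguments for OpenAIChatCompletionClient."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    return {**BACKENDS[backend], "model_info": dict(MODEL_INFO)}
```

The client would then be created with `OpenAIChatCompletionClient(**client_kwargs("vllm"))`, leaving the agent and team code unchanged.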

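Comparison with CrewAI

AutoGen and CrewAI are both multi-agent frameworks, but their design philosophies differ:

AutoGen: lower-level and more flexible, suited to cases that need fine-grained control over how agents interact
CrewAI: higher-level and easier to use, suited to quickly assembling a standardized agent team

If you need complex conversation patterns between agents (nested conversations or conditional branching, for example), AutoGen is the better fit. If your scenario is fairly standard (one agent researches, one writes, one reviews), CrewAI is quicker to get started with.

Both frameworks have users in the cocoloop community, and each has its loyal fans. Try both and see which suits your scenario better.

Hermes has one big advantage in multi-agent setups: its role-playing ability is strong, so agents given different system prompts really do show different "personalities" and "professional leanings". This directly affects how well multi-agent collaboration works, and not every open-source model can do it.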
