TODO: needs to be understood through hands-on practice

Agents Supported by LlamaIndex

  • Function Calling Agents - These work with AI models that can call specific functions.

  • ReAct Agents - These work with any AI that exposes a chat or text endpoint and can handle complex reasoning tasks (see the sketch after this list for how the first two types are instantiated).

  • Advanced Custom Agents - These use more complex methods to handle more complex tasks and workflows, such as LLMCompiler and Chain-of-Abstraction.
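
A minimal sketch of how the first two agent types are instantiated (the add tool and the Qwen model here mirror the examples later in this section):

from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI


def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b


llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")

# FunctionAgent relies on the LLM's native function/tool-calling API
fn_agent = FunctionAgent(tools=[add], llm=llm)

# ReActAgent drives tool use through ReAct-style prompting,
# so it works with any chat or text LLM
react_agent = ReActAgent(tools=[add], llm=llm)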

Initializing Agents

!pip install llama-index llama-index-vector-stores-chroma llama-index-llms-huggingface-api llama-index-embeddings-huggingface -U -q

Initialize an Agent using AgentWorkflow.

# Log in to use the serverless Inference API
from huggingface_hub import login
login()

from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.core.agent.workflow import AgentWorkflow, ToolCallResult, AgentStream


def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract two numbers"""
    return a - b


def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b


def divide(a: int, b: int) -> float:
    """Divide two numbers"""
    return a / b


llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")

agent = AgentWorkflow.from_tools_or_functions(
    tools_or_functions=[subtract, multiply, divide, add],
    llm=llm,
    system_prompt="You are a math agent that can add, subtract, multiply, and divide numbers using provided tools.",
)

Then you can run the agent:

handler = agent.run("What is (2 + 2) * 2?")
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp

Agents That Remember Context

By default, agents are stateless and keep no memory of past interactions. Passing a Context object lets the agent remember previous conversation turns:

from llama_index.core.workflow import Context

ctx = Context(agent)

response = await agent.run("My name is Bob.", ctx=ctx)
response = await agent.run("What was my name again?", ctx=ctx)
response
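
The context can also be serialized and restored between sessions. A minimal sketch, assuming the workflow JsonSerializer (objects stored in the context must be JSON-serializable for this to work):

from llama_index.core.workflow import JsonSerializer

# Serialize the context to a plain dict (e.g., to persist as JSON)
ctx_dict = ctx.to_dict(serializer=JsonSerializer())

# Later: rebuild the context for the same agent and continue the conversation
restored_ctx = Context.from_dict(agent, ctx_dict, serializer=JsonSerializer())
response = await agent.run("Do you still remember my name?", ctx=restored_ctx)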

Agents in LlamaIndex are asynchronous: agent.run(...) returns an awaitable, which is why the calls above use Python's await operator.
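
Top-level await works in notebooks; in a plain Python script you would drive the agent from an event loop. A minimal sketch:

import asyncio


async def main():
    response = await agent.run("What is (2 + 2) * 2?")
    print(response)


asyncio.run(main())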

Creating RAG Agents with QueryEngineTools

import chromadb

from llama_index.core import VectorStoreIndex
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.tools import QueryEngineTool
from llama_index.vector_stores.chroma import ChromaVectorStore

# Create a vector store
db = chromadb.PersistentClient(path="./alfred_chroma_db")
chroma_collection = db.get_or_create_collection("alfred")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Create a query engine
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=embed_model
)
query_engine = index.as_query_engine(llm=llm)
query_engine_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="personas",
    description="descriptions for various types of personas",
    return_direct=False,
)

# Create a RAG agent
query_engine_agent = AgentWorkflow.from_tools_or_functions(
    tools_or_functions=[query_engine_tool],
    llm=llm,
    system_prompt="You are a helpful assistant that has access to a database containing persona descriptions. ",
)
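
This assumes the alfred collection has already been populated with persona documents. If it is still empty, a minimal ingestion sketch (the document text here is a made-up placeholder):

from llama_index.core import Document, StorageContext, VectorStoreIndex

# Attach the Chroma vector store so ingested documents land in it
storage_context = StorageContext.from_defaults(vector_store=vector_store)

docs = [Document(text="A science fiction author persona ...")]  # placeholder
VectorStoreIndex.from_documents(
    docs, storage_context=storage_context, embed_model=embed_model
)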

Show the thought process and tool calls:

handler = query_engine_agent.run(
    "Search the database for 'science fiction' and return some persona descriptions."
)
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp

Creating Multi-agent Systems

The AgentWorkflow class also directly supports multi-agent systems. By giving each agent a name and a description, the workflow maintains a single active speaker, and each agent can hand off control to another agent.

By narrowing the scope of each agent, we can improve their overall accuracy when responding to user messages.

Agents in LlamaIndex can also be used directly as tools for other agents:

from llama_index.core.agent.workflow import (
    AgentWorkflow,
    ReActAgent,
)

# Define some tools
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

def subtract(a: int, b: int) -> int:
    """Subtract two numbers."""
    return a - b

# Create agent configs
# NOTE: we can use FunctionAgent or ReActAgent here.
# FunctionAgent works for LLMs with a function calling API.
# ReActAgent works for any LLM.
calculator_agent = ReActAgent(
    name="calculator",
    description="Performs basic arithmetic operations",
    system_prompt="You are a calculator assistant. Use your tools for any math operation.",
    tools=[add, subtract],  # plain Python functions used as tools
    llm=llm,
)

query_agent = ReActAgent(
    name="info_lookup",
    description="Looks up information about XYZ",
    system_prompt="Use your tool to query a RAG system to answer information about XYZ",
    tools=[query_engine_tool],  # the RAG query engine tool defined above
    llm=llm,
)

# Create and run the workflow
agent = AgentWorkflow(agents=[calculator_agent, query_agent], root_agent="calculator")

# Run the system
handler = agent.run(user_msg="Can you add 5 and 3?")


# Show the thought process and tool calls:
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp
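
Since calculator is the root agent, a question outside its scope should make it hand off to info_lookup; the exact routing depends on the LLM. A usage sketch:

handler = agent.run(user_msg="Find a persona description related to science fiction.")
resp = await handler
resp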