TODO: revisit this after hands-on practice.
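Agents supported by LlamaIndex
Function Calling Agents - work with AI models that can call specific functions.
ReAct Agents - work with any AI model that exposes a chat or text endpoint, and can handle complex reasoning tasks.
Advanced Custom Agents - use more sophisticated methods to handle more complex tasks and workflows, for example LLMCompiler or Chain-of-abstraction.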
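To make the ReAct idea concrete, here is a minimal, library-free sketch of the thought / action / observation loop. The `scripted_llm` function is a hypothetical stand-in for a real text endpoint, and the `Action: tool(args)` line format is an assumption made for illustration; it is not LlamaIndex's actual prompt format or implementation.

```python
import re

# Hypothetical tools the agent may call.
TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

def scripted_llm(transcript: str) -> str:
    """Stand-in for a text endpoint: picks a reply from the number of observations so far."""
    observations = transcript.count("Observation:")
    if observations == 0:
        return "Thought: add first.\nAction: add(2, 2)"
    if observations == 1:
        return "Thought: now multiply.\nAction: multiply(4, 2)"
    return "Thought: done.\nAnswer: 8"

def react_loop(question: str, llm, max_steps: int = 5) -> str:
    """Alternate between model replies and tool observations until an answer appears."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += "\n" + reply
        answer = re.search(r"Answer:\s*(.+)", reply)
        if answer:
            return answer.group(1).strip()
        action = re.search(r"Action:\s*(\w+)\((.*)\)", reply)
        if action is None:
            break
        name, raw_args = action.groups()
        args = [int(x) for x in raw_args.split(",")]
        # Feed the tool result back so the next reply can build on it.
        transcript += f"\nObservation: {TOOLS[name](*args)}"
    raise RuntimeError("no answer within max_steps")

result = react_loop("What is (2 + 2) * 2?", scripted_llm)
print(result)
```

Because the loop only needs text in and text out, this style of agent does not depend on the model having a dedicated function-calling API.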
Initializing Agents
!pip install llama-index llama-index-vector-stores-chroma llama-index-llms-huggingface-api llama-index-embeddings-huggingface -U -q
Initialize an agent with AgentWorkflow.
# Log in to use the serverless API
from huggingface_hub import login
login()
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.core.agent.workflow import AgentWorkflow, ToolCallResult, AgentStream
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

def subtract(a: int, b: int) -> int:
    """Subtract two numbers"""
    return a - b

def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b

# True division returns a float, so the return type is float, not int.
def divide(a: int, b: int) -> float:
    """Divide two numbers"""
    return a / b

llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
agent = AgentWorkflow.from_tools_or_functions(
    tools_or_functions=[subtract, multiply, divide, add],
    llm=llm,
    system_prompt="You are a math agent that can add, subtract, multiply, and divide numbers using provided tools.",
)
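The functions above are passed in as plain Python callables; what the model gets to work with is a description of each tool. The sketch below uses only the standard library to show the kind of metadata that can be derived from a typed function with a docstring. The `describe_tool` helper is hypothetical and written for illustration; LlamaIndex's own tool wrapper builds its schema differently in detail.

```python
import inspect

def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

def describe_tool(fn) -> dict:
    """Hypothetical helper: collect a function's name, docstring, and parameter types."""
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }

tool_info = describe_tool(add)
print(tool_info)
```

This suggests why descriptive function names, type hints, and docstrings are worth the effort when writing tools.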
Now the agent can run inference:
handler = agent.run("What is (2 + 2) * 2?")
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp
Agents that remember context
By default an agent is stateless and keeps no context between runs. Passing a Context object lets the agent remember earlier messages.
from llama_index.core.workflow import Context
ctx = Context(agent)
response = await agent.run("My name is Bob.", ctx=ctx)
response = await agent.run("What was my name again?", ctx=ctx)
response
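The difference between a stateless and a stateful run can be pictured with a toy example. `ToyAgent` and `ToyContext` below are hypothetical stand-ins, not LlamaIndex classes; they only show that passing the same context object to successive runs lets earlier messages influence later answers.

```python
from typing import List, Optional

class ToyContext:
    """Hypothetical container that keeps the message history between runs."""
    def __init__(self) -> None:
        self.history: List[str] = []

class ToyAgent:
    """Hypothetical agent: answers a name question from whatever history it can see."""
    def run(self, message: str, ctx: Optional[ToyContext] = None) -> str:
        # Without a shared context, every run starts from an empty history.
        history = ctx.history if ctx is not None else []
        history.append(message)
        if "name again" in message:
            for past in history:
                if past.startswith("My name is "):
                    name = past[len("My name is "):].rstrip(".")
                    return f"Your name is {name}."
            return "I don't know your name."
        return "Noted."

toy_agent = ToyAgent()

# Stateless: the second call cannot see the first message.
toy_agent.run("My name is Bob.")
stateless_reply = toy_agent.run("What was my name again?")

# Stateful: both calls share one context object.
toy_ctx = ToyContext()
toy_agent.run("My name is Bob.", ctx=toy_ctx)
stateful_reply = toy_agent.run("What was my name again?", ctx=toy_ctx)

print(stateless_reply)
print(stateful_reply)
```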
Agents in LlamaIndex are asynchronous, which is why these calls use Python's await operator.
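Top-level `await`, as used in these snippets, works in a notebook because an event loop is already running there. In a plain Python script the same calls would need to live inside a coroutine started with `asyncio.run`. The sketch below shows that pattern; `fake_agent_run` is a hypothetical stand-in for `agent.run(...)` so the example runs without a model.

```python
import asyncio

async def fake_agent_run(message: str) -> str:
    """Hypothetical stand-in for an awaitable agent call."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"echo: {message}"

async def main() -> str:
    # Inside a coroutine, `await` is allowed.
    return await fake_agent_run("What is (2 + 2) * 2?")

# Entry point for a regular script (not needed in a notebook).
reply = asyncio.run(main())
print(reply)
```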
Creating RAG agents with QueryEngineTools
import chromadb
from llama_index.core import VectorStoreIndex
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.tools import QueryEngineTool
from llama_index.vector_stores.chroma import ChromaVectorStore
# Create a vector store
db = chromadb.PersistentClient(path="./alfred_chroma_db")
chroma_collection = db.get_or_create_collection("alfred")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
# Create a query engine
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=embed_model
)
query_engine = index.as_query_engine(llm=llm)
query_engine_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="personas",
    description="descriptions for various types of personas",
    return_direct=False,
)

# Create a RAG agent
query_engine_agent = AgentWorkflow.from_tools_or_functions(
    tools_or_functions=[query_engine_tool],
    llm=llm,
    system_prompt="You are a helpful assistant that has access to a database containing persona descriptions. ",
)
Show the thinking and reasoning process:
handler = query_engine_agent.run(
    "Search the database for 'science fiction' and return some persona descriptions."
)
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp
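Creating multi-agent systems
The AgentWorkflow class also supports multi-agent systems directly. Each agent is given a name and a description; the system maintains a single active speaker, and each agent can hand off the task to another agent.
Narrowing the scope of each agent helps improve overall accuracy when responding to user messages.
Agents in LlamaIndex can also be used directly as tools for other agents. The example below defines two ReAct agents and runs them in a single workflow: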
from llama_index.core.agent.workflow import (
    AgentWorkflow,
    ReActAgent,
)

# Define some tools
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

def subtract(a: int, b: int) -> int:
    """Subtract two numbers."""
    return a - b

# Create agent configs
# NOTE: we can use FunctionAgent or ReActAgent here.
# FunctionAgent works for LLMs with a function calling API.
# ReActAgent works for any LLM.
calculator_agent = ReActAgent(
    name="calculator",
    description="Performs basic arithmetic operations",
    system_prompt="You are a calculator assistant. Use your tools for any math operation.",
    tools=[add, subtract],  # plain functions passed as tools
    llm=llm,
)
query_agent = ReActAgent(
    name="info_lookup",
    description="Looks up information about XYZ",
    system_prompt="Use your tool to query a RAG system to answer information about XYZ",
    tools=[query_engine_tool],  # the query engine tool defined earlier
    llm=llm,
)
# Create and run the workflow
agent = AgentWorkflow(agents=[calculator_agent, query_agent], root_agent="calculator")
# Run the system
handler = agent.run(user_msg="Can you add 5 and 3?")
# Show the thinking and reasoning process:
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp
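The "single active speaker with handoff" idea described in this section can be pictured with a toy router. Everything below (`ToyAgent`, `run_workflow`, the keyword-based `can_handle` check) is hypothetical and written for illustration; in a real AgentWorkflow the handoff decision comes from the LLM, guided by each agent's name and description, rather than from keyword rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class ToyAgent:
    name: str
    can_handle: Callable[[str], bool]  # hypothetical stand-in for the LLM's judgment
    answer: Callable[[str], str]

def run_workflow(
    agents: List[ToyAgent], root_agent: str, message: str
) -> Tuple[str, List[str]]:
    """Start with the root agent and hand off until some agent can handle the message."""
    by_name: Dict[str, ToyAgent] = {a.name: a for a in agents}
    active = by_name[root_agent]
    speakers = [active.name]
    tried = {active.name}
    while not active.can_handle(message):
        # Hand off to the next agent that has not been tried yet.
        remaining = [a for a in agents if a.name not in tried]
        if not remaining:
            raise RuntimeError("no agent can handle this message")
        active = remaining[0]
        tried.add(active.name)
        speakers.append(active.name)
    return active.answer(message), speakers

calculator = ToyAgent(
    name="calculator",
    can_handle=lambda m: "add" in m,
    answer=lambda m: "5 + 3 = 8",
)
info_lookup = ToyAgent(
    name="info_lookup",
    can_handle=lambda m: "XYZ" in m,
    answer=lambda m: "Here is what I found about XYZ.",
)

reply, speakers = run_workflow(
    [calculator, info_lookup], "calculator", "Tell me about XYZ"
)
print(speakers, reply)
```

Only one agent is active at a time, and the list of speakers records the handoff from the root agent to the one that finally answers.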