8. Agents with Long-Term Memory

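Explore how to build a retrieval-augmented conversational agent.

In previous chapters we saw how powerful retrieval augmentation and conversational agents can be. They become even more compelling when used together.

Conversational agents can struggle with data freshness, domain-specific knowledge, or access to internal documents. Coupling an agent with a retrieval augmentation tool addresses these problems.

On the other hand, using "naive" retrieval augmentation without an agent means we retrieve context on every query. That is not always ideal either, because not every query needs access to external knowledge.

Combining the two approaches gives us the best of both. In this notebook we will learn how to do that.

Before we begin, we need to install the libraries used in this notebook.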

pip install -qU \
    openai==1.6.1 \
    pinecone-client==3.1.0 \
    langchain==0.1.1 \
    langchain-community==0.0.13 \
    tiktoken==0.5.2 \
    datasets==2.12.0

Building the Knowledge Base

We start by building the knowledge base. We will use a ready-made dataset, the Stanford Question-Answering Dataset (SQuAD), hosted on Hugging Face Datasets. We download it as follows:

from datasets import load_dataset

data = load_dataset('squad', split='train')
data

The dataset contains duplicate contexts, which we can remove as follows:

data = data.to_pandas()
data.head()

data.drop_duplicates(subset='context', keep='first', inplace=True)
data.head()
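To make the effect of drop_duplicates(subset='context', keep='first') concrete, here is a minimal sketch in plain Python. The rows are made-up stand-ins for SQuAD records, and drop_duplicate_contexts is a hypothetical helper, not part of pandas or the datasets library.

```python
# Toy rows standing in for SQuAD records (made-up values, not real data).
rows = [
    {"id": "a1", "context": "Paragraph one.", "question": "Q1"},
    {"id": "a2", "context": "Paragraph one.", "question": "Q2"},
    {"id": "b1", "context": "Paragraph two.", "question": "Q3"},
]


def drop_duplicate_contexts(records):
    """Keep only the first record seen for each distinct context."""
    seen = set()
    unique = []
    for record in records:
        if record["context"] not in seen:
            seen.add(record["context"])
            unique.append(record)
    return unique


unique_rows = drop_duplicate_contexts(rows)
print([r["id"] for r in unique_rows])  # ['a1', 'b1']
```

Several questions in SQuAD share one context paragraph, so keeping one row per context avoids embedding and storing the same text more than once.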

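Initializing the Embedding Model and Vector DB

We will use OpenAI's text-embedding-ada-002 model, initialized through LangChain, together with the Pinecone vector database. We start by initializing the embedding model, for which we need an OpenAI API key.

(Note that OpenAI is a paid service, so running the rest of this notebook will incur a small cost.)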

import os
from getpass import getpass
from langchain.embeddings.openai import OpenAIEmbeddings

# get API key from top-right dropdown on OpenAI website
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") or getpass("Enter your OpenAI API key: ")
model_name = 'text-embedding-ada-002'

embed = OpenAIEmbeddings(
    model=model_name,
    openai_api_key=OPENAI_API_KEY
)

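Next we create the vector database that will store our vectors. For this we need a free Pinecone API key, which can be found under the "API keys" button in the left-hand navigation bar of the Pinecone console.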

from pinecone import Pinecone

# initialize connection to pinecone (get API key at app.pinecone.io)
api_key = os.getenv("PINECONE_API_KEY") or getpass("Enter your Pinecone API key: ")

# configure client
pc = Pinecone(api_key=api_key)

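Now we set up our index specification, which lets us define the cloud provider and region where the index will be deployed. A list of all available providers and regions is given in the Pinecone documentation.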

from pinecone import ServerlessSpec

spec = ServerlessSpec(
    cloud="aws", region="us-east-1"
)

To create the index, we set dimension to the dimensionality of Ada-002 (1536) and use a metric compatible with Ada-002 (either cosine or dotproduct). We also pass our spec when initializing the index.

import time

index_name = "langchain-retrieval-agent"
existing_indexes = [
    index_info["name"] for index_info in pc.list_indexes()
]

# check if index already exists (it shouldn't if this is first time)
if index_name not in existing_indexes:
    # if does not exist, create index
    pc.create_index(
        index_name,
        dimension=1536,  # dimensionality of ada 002
        metric='dotproduct',
        spec=spec
    )
    # wait for index to be initialized
    while not pc.describe_index(index_name).status['ready']:
        time.sleep(1)

# connect to index
index = pc.Index(index_name)
time.sleep(1)
# view index stats
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}

We can see that total_vector_count is 0 for the new Pinecone index, since we haven't added any vectors yet.
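Why either cosine or dotproduct works as the metric can be checked with a small sketch. It assumes the embeddings are unit-length, which OpenAI documents for its embedding models. The two-dimensional vectors below are made-up values for illustration, not real embeddings.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the lengths."""
    return dot(a, b) / (norm(a) * norm(b))


def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]


u = normalize([3.0, 4.0])
v = normalize([1.0, 2.0])

# For unit-length vectors both lengths are 1, so the two scores coincide.
print(math.isclose(dot(u, v), cosine(u, v)))  # True
```

Under that assumption both metrics rank results identically, so the choice mostly comes down to preference.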

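Indexing

We could perform the indexing task with a LangChain vector store object. However, doing it directly through the Pinecone Python client is faster. We will upsert in batches of 100 or more.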

from tqdm.auto import tqdm

batch_size = 100

for i in tqdm(range(0, len(data), batch_size)):
    # get end of batch
    i_end = min(len(data), i+batch_size)
    batch = data.iloc[i:i_end]
    # build the metadata dict for each record in this batch
    metadatas = [{
        'title': record['title'],
        'text': record['context']
    } for _, record in batch.iterrows()]
    # get the list of contexts / documents
    documents = batch['context'].tolist()
    # create document embeddings
    embeds = embed.embed_documents(documents)
    # get IDs
    ids = batch['id'].tolist()
    # add everything to pinecone as (id, vector, metadata) tuples
    index.upsert(vectors=list(zip(ids, embeds, metadatas)))
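The batching arithmetic in the loop above can be isolated in a small sketch. batched is a hypothetical helper written for this illustration, and the 250-item list is a made-up stand-in for the dataset.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        # the last batch may be shorter than batch_size
        end = min(len(items), start + batch_size)
        yield items[start:end]


sizes = [len(batch) for batch in batched(list(range(250)), 100)]
print(sizes)  # [100, 100, 50]
```

Sending records in batches keeps each embedding request and each upsert call at a manageable size instead of making one call per record.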

With everything indexed, we can check the number of vectors in our index:

index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 18891}},
 'total_vector_count': 18891}

Creating a Vector Store and Querying

Now that the index is built, we can switch back to LangChain. We initialize a vector store using the same index we just built:

from langchain.vectorstores import Pinecone

text_field = "text"  # the metadata field that contains our text

# initialize the vector store object; pass the Embeddings object itself,
# since passing a callable such as embed.embed_query is deprecated
vectorstore = Pinecone(
    index, embed, text_field
)

With the vector store in place, we can use the similarity_search method to run a semantic search (with no generation component).

query = "when was the college of engineering in the University of Notre Dame established?"

vectorstore.similarity_search(
    query,  # our search query
    k=3  # return 3 most relevant docs
)

[Document(page_content="In 1919 Father James Burns became president of Notre Dame, and in three years he produced an academic revolution that brought the school up to national standards by adopting the elective system and moving away from the university's traditional scholastic and classical emphasis. By contrast, the Jesuit colleges, bastions of academic conservatism, were reluctant to move to a system of electives. Their graduates were shut out of Harvard Law School for that reason. Notre Dame continued to grow over the years, adding more colleges, programs, and sports teams. By 1921, with the addition of the College of Commerce, Notre Dame had grown from a small college to a university with five colleges and a professional law school. The university continued to expand and add new residence halls and buildings with each subsequent president.", metadata={'title': 'University_of_Notre_Dame'}),
 Document(page_content='The College of Engineering was established in 1920, however, early courses in civil and mechanical engineering were a part of the College of Science since the 1870s. Today the college, housed in the Fitzpatrick, Cushing, and Stinson-Remick Halls of Engineering, includes five departments of study – aerospace and mechanical engineering, chemical and biomolecular engineering, civil engineering and geological sciences, computer science and engineering, and electrical engineering – with eight B.S. degrees offered. Additionally, the college offers five-year dual degree programs with the Colleges of Arts and Letters and of Business awarding additional B.A. and Master of Business Administration (MBA) degrees, respectively.', metadata={'title': 'University_of_Notre_Dame'}),
 Document(page_content='Since 2005, Notre Dame has been led by John I. Jenkins, C.S.C., the 17th president of the university. Jenkins took over the position from Malloy on July 1, 2005. In his inaugural address, Jenkins described his goals of making the university a leader in research that recognizes ethics and building the connection between faith and studies. During his tenure, Notre Dame has increased its endowment, enlarged its student body, and undergone many construction projects on campus, including Compton Family Ice Arena, a new architecture hall, additional residence halls, and the Campus Crossroads, a $400m enhancement and expansion of Notre Dame Stadium.', metadata={'title': 'University_of_Notre_Dame'})]

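The results look good: the second document states that the College of Engineering was established in 1920. Next, let's see how to integrate this into a conversational agent.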
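Conceptually, a semantic search like the one above embeds the query and returns the k stored vectors that score highest against it. The following is a brute-force sketch of that idea in plain Python. It is not how Pinecone implements search, and the two-dimensional vectors and metadata are made up for illustration.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, records, k=3):
    """Score every (id, vector, metadata) record and return the k best."""
    scored = (
        (dot(query_vec, vec), rec_id, meta)
        for rec_id, vec, meta in records
    )
    # sort by score only, so metadata dicts are never compared
    return heapq.nlargest(k, scored, key=lambda item: item[0])


records = [
    ("doc-a", [1.0, 0.0], {"text": "about engineering"}),
    ("doc-b", [0.0, 1.0], {"text": "about sports"}),
    ("doc-c", [0.6, 0.8], {"text": "about both"}),
]
hits = top_k([1.0, 0.0], records, k=2)
print([rec_id for _, rec_id, _ in hits])  # ['doc-a', 'doc-c']
```

A vector database replaces the exhaustive scoring with an index built for fast nearest-neighbour lookup, but the inputs and outputs have the same shape.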

Initializing the Conversational Agent

To initialize the conversational agent we need a chat LLM, conversational memory, and a RetrievalQA chain. We create them as follows:

from langchain.chat_models import ChatOpenAI
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.chains import RetrievalQA

# chat completion llm
llm = ChatOpenAI(
    openai_api_key=OPENAI_API_KEY,
    model_name='gpt-3.5-turbo',
    temperature=0.0
)
# conversational memory
conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=5,
    return_messages=True
)
# retrieval qa chain
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)
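The k=5 passed to ConversationBufferWindowMemory means only the most recent five exchanges are kept in the chat history. A minimal sketch of that windowing behaviour, using a hypothetical WindowMemory class that is not LangChain's implementation:

```python
from collections import deque


class WindowMemory:
    """Toy stand-in for a windowed chat memory: keeps the last k exchanges."""

    def __init__(self, k):
        # a deque with maxlen drops the oldest item once it is full
        self.turns = deque(maxlen=k)

    def save(self, human, ai):
        self.turns.append((human, ai))

    def history(self):
        return list(self.turns)


memory = WindowMemory(k=2)
memory.save("q1", "a1")
memory.save("q2", "a2")
memory.save("q3", "a3")
print(memory.history())  # [('q2', 'a2'), ('q3', 'a3')]
```

A bounded window keeps the prompt from growing without limit, at the cost of forgetting older turns.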

With the chain in place, we can get an answer using the run method:

qa.run(query)

'The College of Engineering at the University of Notre Dame was established in 1920.'
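The "stuff" chain type used above places all retrieved documents into a single prompt for the LLM. A rough sketch of that idea follows. build_stuff_prompt and its prompt wording are assumptions made for this illustration and are not LangChain's actual template. The two documents are shortened excerpts of the search results shown earlier.

```python
def build_stuff_prompt(question, documents):
    """Concatenate all retrieved documents into one prompt (the "stuff" idea)."""
    context = "\n\n".join(documents)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


docs = [
    "The College of Engineering was established in 1920.",
    "Notre Dame continued to grow over the years.",
]
prompt = build_stuff_prompt(
    "When was the College of Engineering established?", docs
)
print(prompt)
```

This works well when the retrieved documents are short enough to fit in the model's context window together.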

This is not yet ready for our conversational agent, however. For that we need to turn the retrieval chain into a tool, which we do as follows:

from langchain.agents import Tool

tools = [
    Tool(
        name='Knowledge Base',
        func=qa.run,
        description=(
            'use this tool when answering general knowledge queries to get '
            'more information about the topic'
        )
    )
]
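A tool is essentially a name, a callable, and a description. The description matters because it is the text the LLM reads when deciding whether to use the tool. Here is a sketch of that structure, where SimpleTool, render_tool_list and the rendered format are hypothetical and do not reproduce LangChain's internals:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical minimal tool: a name, a function, and a description."""
    name: str
    func: Callable[[str], str]
    description: str


def render_tool_list(tools):
    """Render tool names and descriptions as text for inclusion in a prompt."""
    return "\n".join(f"- {tool.name}: {tool.description}" for tool in tools)


kb = SimpleTool(
    name="Knowledge Base",
    func=lambda q: f"(answer for: {q})",  # stand-in for qa.run
    description="use this tool when answering general knowledge queries",
)
tool_text = render_tool_list([kb])
print(tool_text)
```

A vague description makes it harder for the model to tell when the tool applies, so it is worth wording it carefully.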

Now we can initialize the agent as follows:

from langchain.agents import initialize_agent

agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method='generate',
    memory=conversational_memory
)

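With that, our retrieval-augmented conversational agent is ready, and we can start using it.

Using the Conversational Agent

To make a query, we simply call the agent directly: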

agent(query)

> Entering new AgentExecutor chain...
{
    "action": "Knowledge Base",
    "action_input": "When was the College of Engineering in the University of Notre Dame established?"
}
Observation: The College of Engineering at the University of Notre Dame was established in 1920.
Thought:{
    "action": "Final Answer",
    "action_input": "The College of Engineering at the University of Notre Dame was established in 1920."
}

> Finished chain.
{'input': 'when was the college of engineering in the University of Notre Dame established?',
 'chat_history': [],
 'output': 'The College of Engineering at the University of Notre Dame was established in 1920.'}
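The trace above shows the model emitting JSON objects with an "action" and an "action_input". The following is a simplified sketch of how one such action could be dispatched. handle_action and the canned knowledge_base function are hypothetical and only illustrate the pattern, not the AgentExecutor's actual code.

```python
import json


def knowledge_base(query):
    """Hypothetical stand-in for qa.run; returns a canned string."""
    return f"(retrieved answer for: {query})"


TOOLS = {"Knowledge Base": knowledge_base}


def handle_action(raw_json, tools):
    """Return (is_final, text) for one JSON action emitted by the model."""
    action = json.loads(raw_json)
    name, arg = action["action"], action["action_input"]
    if name == "Final Answer":
        # the model is done; arg is the reply for the user
        return True, arg
    # otherwise run the named tool; its output becomes the observation
    return False, tools[name](arg)


step1 = handle_action(
    '{"action": "Knowledge Base", "action_input": "Notre Dame"}', TOOLS
)
step2 = handle_action(
    '{"action": "Final Answer", "action_input": "14"}', TOOLS
)
print(step1)
print(step2)
```

In the real agent the observation is fed back to the model, which repeats this loop until it produces a final answer or reaches max_iterations.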

That looks great. What happens if we ask something that is not a general-knowledge question?

agent("what is 2 * 7?")

> Entering new AgentExecutor chain...
{
    "action": "Final Answer",
    "action_input": "The product of 2 multiplied by 7 is 14."
}

> Finished chain.
{'input': 'what is 2 * 7?',
 'chat_history': [HumanMessage(content='when was the college of engineering in the University of Notre Dame established?'),
  AIMessage(content='The College of Engineering at the University of Notre Dame was established in 1920.')],
 'output': 'The product of 2 multiplied by 7 is 14.'}

Perfect, the agent recognizes that it does not need its knowledge base tool for this question. Let's try a few more questions.

agent("can you tell me some facts about the University of Notre Dame?")

> Entering new AgentExecutor chain...
{
    "action": "Knowledge Base",
    "action_input": "University of Notre Dame"
}
Observation: The University of Notre Dame is a Catholic research university located in South Bend, Indiana, in the United States. It is known for its strong academic programs, including undergraduate colleges in Arts and Letters, Science, Engineering, Business, and the Architecture School. The university also has a graduate program with over 50 master's, doctoral, and professional degree programs. Notre Dame is recognized as one of the top universities in the United States and has a strong alumni network. It is also known for its iconic landmarks, such as the Golden Dome and the Basilica. The university is committed to research and has various institutes dedicated to different fields of study. Notre Dame is also home to the Notre Dame Global Adaptation Index, which ranks countries based on their vulnerability to climate change.
Thought:{
    "action": "Final Answer",
    "action_input": "The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs in various fields, including Arts and Letters, Science, Engineering, Business, and Architecture. Notre Dame is known for its academic excellence, iconic landmarks like the Golden Dome and the Basilica, and its commitment to research. It is also home to the Notre Dame Global Adaptation Index, which ranks countries based on their vulnerability to climate change."
}

> Finished chain.
{'input': 'can you tell me some facts about the University of Notre Dame?',
 'chat_history': [HumanMessage(content='when was the college of engineering in the University of Notre Dame established?'),
  AIMessage(content='The College of Engineering at the University of Notre Dame was established in 1920.'),
  HumanMessage(content='what is 2 * 7?'),
  AIMessage(content='The product of 2 multiplied by 7 is 14.')],
 'output': 'The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs in various fields, including Arts and Letters, Science, Engineering, Business, and Architecture. Notre Dame is known for its academic excellence, iconic landmarks like the Golden Dome and the Basilica, and its commitment to research. It is also home to the Notre Dame Global Adaptation Index, which ranks countries based on their vulnerability to climate change.'}

agent("can you summarize these facts in two short sentences")

> Entering new AgentExecutor chain...
{
    "action": "Final Answer",
    "action_input": "The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs and is known for its iconic landmarks and commitment to research."
}

> Finished chain.
{'input': 'can you summarize these facts in two short sentences',
 'chat_history': [HumanMessage(content='when was the college of engineering in the University of Notre Dame established?'),
  AIMessage(content='The College of Engineering at the University of Notre Dame was established in 1920.'),
  HumanMessage(content='what is 2 * 7?'),
  AIMessage(content='The product of 2 multiplied by 7 is 14.'),
  HumanMessage(content='can you tell me some facts about the University of Notre Dame?'),
  AIMessage(content='The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs in various fields, including Arts and Letters, Science, Engineering, Business, and Architecture. Notre Dame is known for its academic excellence, iconic landmarks like the Golden Dome and the Basilica, and its commitment to research. It is also home to the Notre Dame Global Adaptation Index, which ranks countries based on their vulnerability to climate change.')],
 'output': 'The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs and is known for its iconic landmarks and commitment to research.'}

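Great! We can also ask questions that refer back to earlier turns of the conversation, and the agent can use the conversation history as a source of information.

That's all for this example of building a retrieval-augmented conversational agent with OpenAI and Pinecone (an excellent combination) together with LangChain. When we are done, we delete the Pinecone index to save resources: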

pc.delete_index(index_name)

https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/08-langchain-retrieval-agent.ipynb
