admin管理员组

文章数量:1590379

分类目录:《大模型从入门到应用》总目录

LangChain系列文章:

  • 基础知识
  • 快速入门
    • 安装与环境配置
    • 链(Chains)、代理(Agent:)和记忆(Memory)
    • 快速开发聊天模型
  • 模型(Models)
    • 基础知识
    • 大型语言模型(LLMs)
      • 基础知识
      • LLM的异步API、自定义LLM包装器、虚假LLM和人类输入LLM(Human Input LLM)
      • 缓存LLM的调用结果
      • 加载与保存LLM类、流式传输LLM与Chat Model响应和跟踪tokens使用情况
    • 聊天模型(Chat Models)
      • 基础知识
      • 使用少量示例和响应流式传输
    • 文本嵌入模型
      • Aleph Alpha、Amazon Bedrock、Azure OpenAI、Cohere等
      • Embaas、Fake Embeddings、Google Vertex AI PaLM等
  • 提示(Prompts)
    • 基础知识
    • 提示模板
      • 基础知识
      • 连接到特征存储
      • 创建自定义提示模板和含有Few-Shot示例的提示模板
      • 部分填充的提示模板和提示合成
      • 序列化提示信息
    • 示例选择器(Example Selectors)
    • 输出解析器(Output Parsers)
  • 记忆(Memory)
    • 基础知识
    • 记忆的类型
      • 会话缓存记忆、会话缓存窗口记忆和实体记忆
      • 对话知识图谱记忆、对话摘要记忆和会话摘要缓冲记忆
      • 对话令牌缓冲存储器和基于向量存储的记忆
    • 将记忆添加到LangChain组件中
    • 自定义对话记忆与自定义记忆类
    • 聊天消息记录
    • 记忆的存储与应用
  • 索引(Indexes)
    • 基础知识
    • 文档加载器(Document Loaders)
    • 文本分割器(Text Splitters)
    • 向量存储器(Vectorstores)
    • 检索器(Retrievers)
  • 链(Chains)
    • 基础知识
    • 通用功能
      • 自定义Chain和Chain的异步API
      • LLMChain和RouterChain
      • SequentialChain和TransformationChain
      • 链的保存(序列化)与加载(反序列化)
    • 链与索引
      • 文档分析和基于文档的聊天
      • 问答的基础知识
      • 图问答(Graph QA)和带来源的问答(Q&A with Sources)
      • 检索式问答
      • 文本摘要(Summarization)、HyDE和向量数据库的文本生成
  • 代理(Agents)
    • 基础知识
    • 代理类型
    • 自定义代理(Custom Agent)
    • 自定义MRKL代理
    • 带有ChatModel的LLM聊天自定义代理和自定义多操作代理(Custom MultiAction Agent)
    • 工具
      • 基础知识
      • 自定义工具(Custom Tools)
      • 多输入工具和工具输入模式
      • 人工确认工具验证和Tools作为OpenAI函数
    • 工具包(Toolkit)
    • 代理执行器(Agent Executor)
      • 结合使用Agent和VectorStore
      • 使用Agents的异步API和创建ChatGPT克隆
      • 处理解析错误、访问中间步骤和限制最大迭代次数
      • 为代理程序设置超时时间和限制最大迭代次数和为代理程序和其工具添加共享内存
    • 计划与执行
  • 回调函数(Callbacks)

对话令牌缓冲存储器ConversationTokenBufferMemory

ConversationTokenBufferMemory在内存中保留了最近的一些对话交互,并使用标记长度来确定何时刷新交互,而不是交互数量。

from langchain.memory import ConversationTokenBufferMemory
from langchain.llms import OpenAI
llm = OpenAI()
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=10)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
memory.load_memory_variables({})

输出:

{‘history’: ‘Human: not much you\nAI: not much’}

我们还可以将历史记录作为消息列表获取,如果我们正在使用聊天模型,将非常有用:

memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=10, return_messages=True)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
在链式模型中的应用

让我们通过一个例子来演示如何在链式模型中使用它,同样设置verbose=True,以便我们可以看到提示信息。

from langchain.chains import ConversationChain
conversation_with_summary = ConversationChain(
    llm=llm, 
    # We set a very low max_token_limit for the purposes of testing.
    memory=ConversationTokenBufferMemory(llm=OpenAI(), max_token_limit=60),
    verbose=True
)
conversation_with_summary.predict(input="Hi, what's up?")

日志输出:

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi, what's up?
AI:

> Finished chain.

输出:

" Hi there! I'm doing great, just enjoying the day. How about you?"

输入:

conversation_with_summary.predict(input="Just working on writing some documentation!")

日志输出:

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, what's up?
AI:  Hi there! I'm doing great, just enjoying the day. How about you?
Human: Just working on writing some documentation!
AI:

> Finished chain.

输出:

    ' Sounds like a productive day! What kind of documentation are you writing?'

输入:

conversation_with_summary.predict(input="For LangChain! Have you heard of it?")

日志输出:

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, what's up?
AI:  Hi there! I'm doing great, just enjoying the day. How about you?
Human: Just working on writing some documentation!
AI:  Sounds like a productive day! What kind of documentation are you writing?
Human: For LangChain! Have you heard of it?
AI:

> Finished chain.

输出:

    " Yes, I have heard of LangChain! It is a decentralized language-learning platform that connects native speakers and learners in real time. Is that the documentation you're writing about?"

输入:

# 我们可以看到这里缓冲区被更新了
conversation_with_summary.predict(input="Haha nope, although a lot of people confuse it for that")

日志输出:

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: For LangChain! Have you heard of it?
AI:  Yes, I have heard of LangChain! It is a decentralized language-learning platform that connects native speakers and learners in real time. Is that the documentation you're writing about?
Human: Haha nope, although a lot of people confuse it for that
AI:

> Finished chain.

输出:

" Oh, I see. Is there another language learning platform you're referring to?"

基于向量存储的记忆VectorStoreRetrieverMemory

VectorStoreRetrieverMemory将内存存储在VectorDB中,并在每次调用时查询最重要的前 K K K个文档。与大多数其他Memory类不同,它不明确跟踪交互的顺序。在这种情况下,“文档”是先前的对话片段。这对于提及AI在对话中早些时候得知的相关信息非常有用。

from datetime import datetime
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.memory import VectorStoreRetrieverMemory
from langchain.chains import ConversationChain
from langchain.prompts import PromptTemplate
初始化VectorStore

根据我们选择的存储方式,此步骤可能会有所不同,我们可以查阅相关的VectorStore文档以获取更多详细信息。

import faiss

from langchain.docstore import InMemoryDocstore
from langchain.vectorstores import FAISS

embedding_size = 1536 # Dimensions of the OpenAIEmbeddings
index = faiss.IndexFlatL2(embedding_size)
embedding_fn = OpenAIEmbeddings().embed_query
vectorstore = FAISS(embedding_fn, index, InMemoryDocstore({}), {})
创建VectorStoreRetrieverMemory

记忆对象是从VectorStoreRetriever实例化的。

# In actual usage, you would set `k` to be a higher value, but we use k=1 to show that the vector lookup still returns the semantically relevant information
retriever = vectorstore.as_retriever(search_kwargs=dict(k=1))
memory = VectorStoreRetrieverMemory(retriever=retriever)

# When added to an agent, the memory object can save pertinent information from conversations or used tools
memory.save_context({"input": "My favorite food is pizza"}, {"output": "thats good to know"})
memory.save_context({"input": "My favorite sport is soccer"}, {"output": "..."})
memory.save_context({"input": "I don't the Celtics"}, {"output": "ok"}) # 
# Notice the first result returned is the memory pertaining to tax help, which the language model deems more semantically relevant
# to a 1099 than the other documents, despite them both containing numbers.
print(memory.load_memory_variables({"prompt": "what sport should i watch?"})["history"])

输出:

input: My favorite sport is soccer
output: ...
在对话链中使用

让我们通过一个示例来演示,在此示例中我们继续设置verbose=True以便查看提示。

llm = OpenAI(temperature=0) # Can be any valid LLM
_DEFAULT_TEMPLATE = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Relevant pieces of previous conversation:
{history}

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: {input}
AI:"""
PROMPT = PromptTemplate(
    input_variables=["history", "input"], template=_DEFAULT_TEMPLATE
)
conversation_with_summary = ConversationChain(
    llm=llm, 
    prompt=PROMPT,
    # We set a very low max_token_limit for the purposes of testing.
    memory=memory,
    verbose=True
)
conversation_with_summary.predict(input="Hi, my name is Perry, what's up?")

日志输出:

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Relevant pieces of previous conversation:
input: My favorite food is pizza
output: thats good to know

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: Hi, my name is Perry, what's up?
AI:

> Finished chain.

输出:

" Hi Perry, I'm doing well. How about you?"

输入:

# Here, the basketball related content is surfaced
conversation_with_summary.predict(input="what's my favorite sport?")

日志输出:

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Relevant pieces of previous conversation:
input: My favorite sport is soccer
output: ...

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: what's my favorite sport?
AI:

> Finished chain.

输出:

  ' You told me earlier that your favorite sport is soccer.'

输入:

# Even though the language model is stateless, since relavent memory is fetched, it can "reason" about the time.
# Timestamping memories and data is useful in general to let the agent determine temporal relevance
conversation_with_summary.predict(input="Whats my favorite food")

日志输出:

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Relevant pieces of previous conversation:
input: My favorite food is pizza
output: thats good to know

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: Whats my favorite food
AI:

> Finished chain.

输出:

  ' You said your favorite food is pizza.'

输入:

# The memories from the conversation are automatically stored,
# since this query best matches the introduction chat above,
# the agent is able to 'remember' the user's name.
conversation_with_summary.predict(input="What's my name?")

日志输出:

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Relevant pieces of previous conversation:
input: Hi, my name is Perry, what's up?
response:  Hi Perry, I'm doing well. How about you?

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: What's my name?
AI:

> Finished chain.

输出:

' Your name is Perry.'

参考文献:
[1] LangChain官方网站:https://www.langchain/
[2] LangChain 🦜️🔗 中文网,跟着LangChain一起学LLM/GPT开发:https://www.langchain/
[3] LangChain中文网 - LangChain 是一个用于开发由语言模型驱动的应用程序的框架:http://wwwlangchain/

本文标签: 记忆令牌向量入门模型