How to add memory to chatbots

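A key feature of chatbots is their ability to use the content of previous conversation turns as context. This state management can take several forms, including:

  • Simply stuffing previous messages into the chat model prompt.
  • The above, but trimming old messages to reduce the amount of distracting information the model has to deal with.
  • More complex modifications, such as synthesizing summaries for long-running conversations.

We'll go into more detail on a few of these techniques below.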

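An earlier version of this how-to guide built a chatbot using RunnableWithMessageHistory. You can find that version of the guide in the v0.2 docs.

Compared with RunnableWithMessageHistory, the LangGraph implementation offers a number of advantages, including the ability to persist arbitrary components of an application's state (not just messages).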

Setup

You'll need to install a few packages, select your chat model, and set its environment variable.

yarn add @langchain/core @langchain/langgraph

Let's set up a chat model that we'll use for the examples below.

Pick your chat model:

Install dependencies

yarn add @langchain/groq 

Add environment variables

GROQ_API_KEY=your-api-key

Instantiate the model

import { ChatGroq } from "@langchain/groq";

// Named `llm` because the examples below call `prompt.pipe(llm)` and `llm.invoke(...)`.
const llm = new ChatGroq({
  model: "llama-3.3-70b-versatile",
  temperature: 0,
});

Message passing

The simplest form of memory is passing chat history messages into a chain. Here's an example:

import { HumanMessage, AIMessage } from "@langchain/core/messages";
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "You are a helpful assistant. Answer all questions to the best of your ability.",
  ],
  new MessagesPlaceholder("messages"),
]);

const chain = prompt.pipe(llm);

await chain.invoke({
  messages: [
    new HumanMessage(
      "Translate this sentence from English to French: I love programming."
    ),
    new AIMessage("J'adore la programmation."),
    new HumanMessage("What did you just say?"),
  ],
});
AIMessage {
"id": "chatcmpl-ABSxUXVIBitFRBh9MpasB5jeEHfCA",
"content": "I said \"J'adore la programmation,\" which means \"I love programming\" in French.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 18,
"promptTokens": 58,
"totalTokens": 76
},
"finish_reason": "stop",
"system_fingerprint": "fp_e375328146"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 58,
"output_tokens": 18,
"total_tokens": 76
}
}

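We can see that by passing the previous conversation into a chain, the model can use it as context to answer questions. This is the basic concept underpinning chatbot memory; the rest of the guide demonstrates convenient techniques for passing or reformatting messages.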
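To make the bookkeeping behind explicit message passing concrete, here is a minimal, dependency-free sketch in which the caller owns the history array and appends every turn by hand. `ChatMessage`, `fakeModel`, and `sendTurn` are hypothetical names used only for illustration (they are not LangChain APIs), and `fakeModel` is synchronous for brevity, whereas a real model call is asynchronous.

```typescript
// Hypothetical message shape, mirroring the { role, content } objects used in this guide.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Stand-in for a real chat model: it reports how many earlier messages it was shown.
const fakeModel = (messages: ChatMessage[]): ChatMessage => ({
  role: "assistant",
  content: `I can see ${messages.length - 1} earlier message(s).`,
});

// The caller manages the log: append the user turn, call the model with
// everything so far, then append the reply so the next turn can see it.
const sendTurn = (log: ChatMessage[], userInput: string): ChatMessage => {
  log.push({ role: "user", content: userInput });
  const reply = fakeModel(log);
  log.push(reply);
  return reply;
};

const turns: ChatMessage[] = [];
sendTurn(turns, "Translate this sentence from English to French: I love programming.");
const second = sendTurn(turns, "What did you just say?");

console.log(second.content);
```

On the second turn the stand-in model receives the first exchange as well as the new question, which is exactly what the chain above relies on.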

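Automatic history management

The previous example passes messages to the chain (and model) explicitly. This is a completely acceptable approach, but it does require managing new messages outside the chain. LangChain also provides a way to build applications with memory using LangGraph's persistence. You can enable persistence in a LangGraph application by providing a checkpointer when compiling the graph.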

import {
  START,
  END,
  MessagesAnnotation,
  StateGraph,
  MemorySaver,
} from "@langchain/langgraph";

// Define the function that calls the model
const callModel = async (state: typeof MessagesAnnotation.State) => {
  const systemPrompt =
    "You are a helpful assistant. " +
    "Answer all questions to the best of your ability.";
  const messages = [
    { role: "system", content: systemPrompt },
    ...state.messages,
  ];
  const response = await llm.invoke(messages);
  return { messages: response };
};

const workflow = new StateGraph(MessagesAnnotation)
  // Define the node and edge
  .addNode("model", callModel)
  .addEdge(START, "model")
  .addEdge("model", END);

// Add simple in-memory checkpointer
const memory = new MemorySaver();
const app = workflow.compile({ checkpointer: memory });

Here we pass only the latest input to the conversation and let LangGraph keep track of the conversation history using the checkpointer:

await app.invoke(
  {
    messages: [
      {
        role: "user",
        content: "Translate to French: I love programming.",
      },
    ],
  },
  {
    configurable: { thread_id: "1" },
  }
);
{
messages: [
HumanMessage {
"id": "227b82a9-4084-46a5-ac79-ab9a3faa140e",
"content": "Translate to French: I love programming.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxVrvztgnasTeMSFbpZQmyYqjJZ",
"content": "J'adore la programmation.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 5,
"promptTokens": 35,
"totalTokens": 40
},
"finish_reason": "stop",
"system_fingerprint": "fp_52a7f40b0b"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 35,
"output_tokens": 5,
"total_tokens": 40
}
}
]
}
await app.invoke(
  {
    messages: [
      {
        role: "user",
        content: "What did I just ask you?",
      },
    ],
  },
  {
    configurable: { thread_id: "1" },
  }
);
{
messages: [
HumanMessage {
"id": "1a0560a4-9dcb-47a1-b441-80717e229706",
"content": "Translate to French: I love programming.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxVrvztgnasTeMSFbpZQmyYqjJZ",
"content": "J'adore la programmation.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 5,
"promptTokens": 35,
"totalTokens": 40
},
"finish_reason": "stop",
"system_fingerprint": "fp_52a7f40b0b"
},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "4f233a7d-4b08-4f53-bb60-cf0141a59721",
"content": "What did I just ask you?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxVs5QnlPfbihTOmJrCVg1Dh7Ol",
"content": "You asked me to translate \"I love programming\" into French.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 13,
"promptTokens": 55,
"totalTokens": 68
},
"finish_reason": "stop",
"system_fingerprint": "fp_9f2bfdaa89"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 55,
"output_tokens": 13,
"total_tokens": 68
}
}
]
}
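
Conceptually, the checkpointer stores state per thread_id, which is why each call above only needs to supply the newest message. The sketch below imitates that behaviour with a plain Map. `ThreadStore` and its methods are hypothetical and are not how MemorySaver is implemented; the point is only that a different thread id starts from an empty history.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical in-memory store keyed by thread id (an illustration, not LangGraph's API).
class ThreadStore {
  private threads = new Map<string, ChatMessage[]>();

  // Append new messages to a thread and return that thread's full history.
  append(threadId: string, newMessages: ChatMessage[]): ChatMessage[] {
    const existing = this.threads.get(threadId) ?? [];
    const updated = [...existing, ...newMessages];
    this.threads.set(threadId, updated);
    return updated;
  }

  // Read a thread's history; unknown threads are empty.
  get(threadId: string): ChatMessage[] {
    return this.threads.get(threadId) ?? [];
  }
}

const store = new ThreadStore();
store.append("1", [{ role: "user", content: "Translate to French: I love programming." }]);
store.append("1", [{ role: "assistant", content: "J'adore la programmation." }]);
const threadOne = store.append("1", [{ role: "user", content: "What did I just ask you?" }]);
const threadTwo = store.get("2");

console.log(threadOne.length, threadTwo.length);
```

Thread "1" accumulates all three messages, while thread "2" has none because nothing was ever stored under that id.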

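Modifying chat history

Modifying stored chat messages can help your chatbot handle a variety of situations. Here are some examples:

Trimming messages

LLMs and chat models have limited context windows, and even if you're not directly hitting those limits, you may want to limit the amount of distracting information the model has to deal with. One solution is to trim the history messages before passing them to the model. Let's use an example history with the app we declared above: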

const demoEphemeralChatHistory = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
];

await app.invoke(
  {
    messages: [
      ...demoEphemeralChatHistory,
      { role: "user", content: "What's my name?" },
    ],
  },
  {
    configurable: { thread_id: "2" },
  }
);
{
messages: [
HumanMessage {
"id": "63057c3d-f980-4640-97d6-497a9f83ddee",
"content": "Hey there! I'm Nemo.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "c9f0c20a-8f55-4909-b281-88f2a45c4f05",
"content": "Hello!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "fd7fb3a0-7bc7-4e84-99a9-731b30637b55",
"content": "How are you today?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "09b0debb-1d4a-4856-8821-b037f5d96ecf",
"content": "Fine thanks!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "edc13b69-25a0-40ac-81b3-175e65dc1a9a",
"content": "What's my name?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxWKCTdRuh2ZifXsvFHSo5z5I0J",
"content": "Your name is Nemo! How can I assist you today, Nemo?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 14,
"promptTokens": 63,
"totalTokens": 77
},
"finish_reason": "stop",
"system_fingerprint": "fp_a5d11b2ef2"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 63,
"output_tokens": 14,
"total_tokens": 77
}
}
]
}

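We can see the app remembers the preloaded name.

But let's say we have a very small context window, and we want to trim the messages passed to the model down to only the two most recent ones. We can use the built-in trimMessages utility to trim messages based on their token count before they reach our prompt. In this case we'll count each message as 1 "token" and keep only the last two messages: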

import {
  START,
  END,
  MessagesAnnotation,
  StateGraph,
  MemorySaver,
} from "@langchain/langgraph";
import { trimMessages } from "@langchain/core/messages";

// Define trimmer
// count each message as 1 "token" (tokenCounter: (msgs) => msgs.length) and keep only the last two messages
const trimmer = trimMessages({
  strategy: "last",
  maxTokens: 2,
  tokenCounter: (msgs) => msgs.length,
});

// Define the function that calls the model
const callModel2 = async (state: typeof MessagesAnnotation.State) => {
  const trimmedMessages = await trimmer.invoke(state.messages);
  const systemPrompt =
    "You are a helpful assistant. " +
    "Answer all questions to the best of your ability.";
  const messages = [
    { role: "system", content: systemPrompt },
    ...trimmedMessages,
  ];
  const response = await llm.invoke(messages);
  return { messages: response };
};

const workflow2 = new StateGraph(MessagesAnnotation)
  // Define the node and edge
  .addNode("model", callModel2)
  .addEdge(START, "model")
  .addEdge("model", END);

// Add simple in-memory checkpointer
const app2 = workflow2.compile({ checkpointer: new MemorySaver() });

Let's call this new app and check the response:

await app2.invoke(
  {
    messages: [
      ...demoEphemeralChatHistory,
      { role: "user", content: "What is my name?" },
    ],
  },
  {
    configurable: { thread_id: "3" },
  }
);
{
messages: [
HumanMessage {
"id": "0d9330a0-d9d1-4aaf-8171-ca1ac6344f7c",
"content": "What is my name?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "3a24e88b-7525-4797-9fcd-d751a378d22c",
"content": "Fine thanks!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "276039c8-eba8-4c68-b015-81ec7704140d",
"content": "How are you today?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "2ad4f461-20e1-4982-ba3b-235cb6b02abd",
"content": "Hello!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"id": "52213cae-953a-463d-a4a0-a7368c9ee4db",
"content": "Hey there! I'm Nemo.",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxWe9BRDl1pmzkNIDawWwU3hvKm",
"content": "I'm sorry, but I don't have access to personal information about you unless you've shared it with me during our conversation. How can I assist you today?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 30,
"promptTokens": 39,
"totalTokens": 69
},
"finish_reason": "stop",
"system_fingerprint": "fp_3537616b13"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 39,
"output_tokens": 30,
"total_tokens": 69
}
}
]
}

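We can see that trimMessages was called and only the two most recent messages were passed to the model. In this case, that means the model forgot the name we gave it.

Check out our how-to guide on trimming messages for more.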
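
One way to picture the "last" strategy is to walk the history from the newest message backwards and keep messages while the running token count stays within budget. The following is a simplified, dependency-free sketch of that idea. `keepLast` is a hypothetical helper, not the library's implementation, and it ignores the other options that the real trimMessages supports.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical helper: keep the longest suffix of `messages` whose
// token count (as measured by `tokenCounter`) fits within `maxTokens`.
const keepLast = (
  messages: ChatMessage[],
  maxTokens: number,
  tokenCounter: (msgs: ChatMessage[]) => number
): ChatMessage[] => {
  let kept: ChatMessage[] = [];
  for (let i = messages.length - 1; i >= 0; i--) {
    const candidate = [messages[i], ...kept];
    if (tokenCounter(candidate) > maxTokens) break; // adding this message would exceed the budget
    kept = candidate;
  }
  return kept;
};

const chat: ChatMessage[] = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
  { role: "user", content: "What is my name?" },
];

// Count each message as one "token", as in the example above.
const trimmed = keepLast(chat, 2, (msgs) => msgs.length);
console.log(trimmed.map((m) => m.content));
```

With a budget of two, only "Fine thanks!" and "What is my name?" survive, so the message that introduced the name never reaches the model.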

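Summary memory

We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our app. Let's recreate our chat history: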

const demoEphemeralChatHistory2 = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
];

Now, let's update the model-calling function to distill previous interactions into a summary:

import {
  START,
  END,
  MessagesAnnotation,
  StateGraph,
  MemorySaver,
} from "@langchain/langgraph";
import { RemoveMessage } from "@langchain/core/messages";

// Define the function that calls the model
const callModel3 = async (state: typeof MessagesAnnotation.State) => {
  const systemPrompt =
    "You are a helpful assistant. " +
    "Answer all questions to the best of your ability. " +
    "The provided chat history includes a summary of the earlier conversation.";
  const systemMessage = { role: "system", content: systemPrompt };
  const messageHistory = state.messages.slice(0, -1); // exclude the most recent user input

  // Summarize the messages if the chat history reaches a certain size
  if (messageHistory.length >= 4) {
    const lastHumanMessage = state.messages[state.messages.length - 1];
    // Invoke the model to generate conversation summary
    const summaryPrompt =
      "Distill the above chat messages into a single summary message. " +
      "Include as many specific details as you can.";
    const summaryMessage = await llm.invoke([
      ...messageHistory,
      { role: "user", content: summaryPrompt },
    ]);

    // Delete messages that we no longer want to show up
    const deleteMessages = state.messages.map(
      (m) => new RemoveMessage({ id: m.id })
    );
    // Re-add user message
    const humanMessage = { role: "user", content: lastHumanMessage.content };
    // Call the model with summary & response
    const response = await llm.invoke([
      systemMessage,
      summaryMessage,
      humanMessage,
    ]);
    return {
      messages: [summaryMessage, humanMessage, response, ...deleteMessages],
    };
  } else {
    const response = await llm.invoke([systemMessage, ...state.messages]);
    return { messages: response };
  }
};

const workflow3 = new StateGraph(MessagesAnnotation)
  // Define the node and edge
  .addNode("model", callModel3)
  .addEdge(START, "model")
  .addEdge("model", END);

// Add simple in-memory checkpointer
const app3 = workflow3.compile({ checkpointer: new MemorySaver() });

Let's see if it remembers the name we gave it:

await app3.invoke(
  {
    messages: [
      ...demoEphemeralChatHistory2,
      { role: "user", content: "What did I say my name was?" },
    ],
  },
  {
    configurable: { thread_id: "4" },
  }
);
{
messages: [
AIMessage {
"id": "chatcmpl-ABSxXjFDj6WRo7VLSneBtlAxUumPE",
"content": "Nemo greeted the assistant and asked how it was doing, to which the assistant responded that it was fine.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 22,
"promptTokens": 60,
"totalTokens": 82
},
"finish_reason": "stop",
"system_fingerprint": "fp_e375328146"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 60,
"output_tokens": 22,
"total_tokens": 82
}
},
HumanMessage {
"id": "8b1309b7-c09e-47fb-9ab3-34047f6973e3",
"content": "What did I say my name was?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"id": "chatcmpl-ABSxYAQKiBsQ6oVypO4CLFDsi1HRH",
"content": "You mentioned that your name is Nemo.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 8,
"promptTokens": 73,
"totalTokens": 81
},
"finish_reason": "stop",
"system_fingerprint": "fp_52a7f40b0b"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 73,
"output_tokens": 8,
"total_tokens": 81
}
}
]
}

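Note that invoking the app again will keep accumulating the history until it reaches the specified number of messages (four in our case). At that point we will generate another summary from the initial summary plus the new messages, and so on.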
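
The accumulate-then-summarize cycle can be sketched without a model. In the dependency-free sketch below, `fakeSummarize` is a hypothetical stand-in for the summarization LLM call and `compact` is a hypothetical helper; the threshold of four mirrors the `messageHistory.length >= 4` check in callModel3.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

const SUMMARY_THRESHOLD = 4;

// Stand-in for the summarization LLM call: it records how many messages it condensed.
const fakeSummarize = (msgs: ChatMessage[]): ChatMessage => ({
  role: "assistant",
  content: `Summary of ${msgs.length} message(s)`,
});

// Given stored state whose last entry is the newest user input, replace the
// earlier messages with one summary once at least SUMMARY_THRESHOLD have accumulated.
const compact = (state: ChatMessage[]): ChatMessage[] => {
  const earlier = state.slice(0, -1);
  const latest = state[state.length - 1];
  if (earlier.length < SUMMARY_THRESHOLD) return state; // keep accumulating
  return [fakeSummarize(earlier), latest];
};

const shortThread: ChatMessage[] = [
  { role: "user", content: "Hi" },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "What's up?" },
];
const longThread: ChatMessage[] = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
  { role: "user", content: "What did I say my name was?" },
];

console.log(compact(shortThread).length, compact(longThread).length);
```

The short thread is returned unchanged, while the long thread collapses to a summary plus the latest question. On later turns that summary counts as one of the earlier messages, so it is folded into the next summary once the threshold is reached again.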

