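Migrating off ConversationTokenBufferMemory

Follow this guide if you are trying to migrate off one of the old memory classes listed below:

Memory type: ConversationTokenBufferMemory
Description: Keeps only the most recent messages in the conversation, under the constraint that the total number of tokens in the conversation does not exceed a certain limit.

ConversationTokenBufferMemory applies additional processing on top of the raw conversation history to trim it to a size that fits inside the context window of a chat model.

This processing can be accomplished using LangChain's built-in trimMessages function.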
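To make the trimming idea concrete before introducing the library call, here is a minimal, library-free sketch of the same behavior: keep the system message, keep the most recent messages that fit a token budget, and start the window on a human message. The `Msg` type, the `trimToBudget` function, and the four-characters-per-token estimate are assumptions made for illustration. They are not part of LangChain, and a real tokenizer will give different counts.

```typescript
// Minimal message shape for this sketch (not the LangChain BaseMessage type).
type Role = "system" | "human" | "ai";
interface Msg {
  role: Role;
  content: string;
}

// Hypothetical token estimate: roughly one token per four characters.
const estimateTokens = (m: Msg): number => Math.ceil(m.content.length / 4);

function trimToBudget(messages: Msg[], maxTokens: number): Msg[] {
  // Keep the leading system message, if any, and charge it to the budget.
  const system =
    messages.length > 0 && messages[0].role === "system" ? messages[0] : undefined;
  const rest = system ? messages.slice(1) : messages;
  let budget = maxTokens - (system ? estimateTokens(system) : 0);

  // Walk backwards from the newest message, keeping whole messages that fit.
  const kept: Msg[] = [];
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i]);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(rest[i]);
  }

  // Drop leading messages until the window starts on a human message.
  while (kept.length > 0 && kept[0].role !== "human") kept.shift();

  return system ? [system, ...kept] : kept;
}

const history: Msg[] = [
  { role: "system", content: "be brief" },
  { role: "human", content: "first question" },
  { role: "ai", content: "first answer" },
  { role: "human", content: "second question" },
];

// With a budget of 10 estimated tokens, only the system message and the
// final human message survive.
const trimmed = trimToBudget(history, 10);
console.log(trimmed.map((m) => m.content));
```

The final `while` loop plays the role that `startOn: "human"` plays in trimMessages: it prevents the window from opening on an AI reply whose question was cut off.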

info

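We will start by exploring a straightforward method that applies the processing logic to the entire conversation history.

While this approach is easy to implement, it has a downside: as the conversation grows, so does the latency, since the same logic is re-applied to all previous exchanges at each turn.

More advanced strategies focus on incrementally updating the conversation history to avoid redundant processing.

For instance, the LangGraph how-to guide on summarization demonstrates how to maintain a running summary of the conversation while discarding older messages, ensuring they are not re-processed during later turns.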
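The incremental strategy can be sketched without any library. In the sketch below, each new message is appended to a short list of recent messages, and only the overflow is folded into a running summary, so older messages are never read again. The `ConversationState` shape, the `addMessage` function, and `naiveSummarizer` are hypothetical names for illustration. In practice the summarizer would be a chat model call rather than string concatenation.

```typescript
interface Msg {
  role: "human" | "ai";
  content: string;
}

interface ConversationState {
  summary: string; // running summary of everything already folded in
  recent: Msg[]; // messages that have not been summarized yet
}

// Hypothetical stand-in for an LLM summarization call.
type Summarizer = (previousSummary: string, dropped: Msg[]) => string;

const naiveSummarizer: Summarizer = (prev, dropped) =>
  [prev, ...dropped.map((m) => `${m.role}: ${m.content}`)]
    .filter((s) => s.length > 0)
    .join(" | ");

function addMessage(
  state: ConversationState,
  message: Msg,
  keepRecent: number,
  summarize: Summarizer
): ConversationState {
  const recent = [...state.recent, message];
  if (recent.length <= keepRecent) {
    return { summary: state.summary, recent };
  }
  // Fold only the overflow into the summary; older messages are never revisited.
  const overflow = recent.slice(0, recent.length - keepRecent);
  return {
    summary: summarize(state.summary, overflow),
    recent: recent.slice(recent.length - keepRecent),
  };
}

let state: ConversationState = { summary: "", recent: [] };
for (const content of ["a", "b", "c"]) {
  state = addMessage(state, { role: "human", content }, 2, naiveSummarizer);
}
console.log(state.summary, state.recent.map((m) => m.content));
```

Each call does work proportional to the number of new messages rather than to the length of the whole conversation.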

Set up

Dependencies

yarn add @langchain/openai @langchain/core @langchain/langgraph uuid zod

Environment variables

process.env.OPENAI_API_KEY = "YOUR_OPENAI_API_KEY";

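Reimplementing ConversationTokenBufferMemory logic

Here, we will use trimMessages to keep the system message and the most recent messages in the conversation, under the constraint that the total number of tokens in the conversation does not exceed a certain limit.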

import {
AIMessage,
HumanMessage,
SystemMessage,
} from "@langchain/core/messages";

const messages = [
new SystemMessage("you're a good assistant, you always respond with a joke."),
new HumanMessage("i wonder why it's called langchain"),
new AIMessage(
'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
),
new HumanMessage("and who is harrison chasing anyways"),
new AIMessage(
"Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
),
new HumanMessage("why is 42 always the answer?"),
new AIMessage(
"Because it's the only number that's constantly right, even when it doesn't add up!"
),
new HumanMessage("What did the cow say?"),
];
import { trimMessages } from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";

const selectedMessages = await trimMessages(messages, {
// Please see API reference for trimMessages for other ways to specify a token counter.
tokenCounter: new ChatOpenAI({ model: "gpt-4o" }),
maxTokens: 80, // <-- token limit
// The startOn is specified
// to make sure we do not generate a sequence where
// a ToolMessage that contains the result of a tool invocation
// appears before the AIMessage that requested a tool invocation
// as this will cause some chat models to raise an error.
startOn: "human",
strategy: "last",
includeSystem: true, // <-- Keep the system message
});

for (const msg of selectedMessages) {
console.log(msg);
}
SystemMessage {
"content": "you're a good assistant, you always respond with a joke.",
"additional_kwargs": {},
"response_metadata": {}
}
HumanMessage {
"content": "and who is harrison chasing anyways",
"additional_kwargs": {},
"response_metadata": {}
}
AIMessage {
"content": "Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
}
HumanMessage {
"content": "why is 42 always the answer?",
"additional_kwargs": {},
"response_metadata": {}
}
AIMessage {
"content": "Because it's the only number that's constantly right, even when it doesn't add up!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
}
HumanMessage {
"content": "What did the cow say?",
"additional_kwargs": {},
"response_metadata": {}
}

Modern usage with LangGraph

The example below shows how to use LangGraph to add simple conversation pre-processing logic.

note

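If you want to avoid running the computation on the entire conversation history each time, you can follow the how-to guide on summarization, which demonstrates how to discard older messages so that they are not re-processed during later turns.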

import { v4 as uuidv4 } from "uuid";
import { ChatOpenAI } from "@langchain/openai";
import {
StateGraph,
MessagesAnnotation,
END,
START,
MemorySaver,
} from "@langchain/langgraph";
import { trimMessages } from "@langchain/core/messages";

// Define a chat model
const model = new ChatOpenAI({ model: "gpt-4o" });

// Define the function that calls the model
const callModel = async (
state: typeof MessagesAnnotation.State
): Promise<Partial<typeof MessagesAnnotation.State>> => {
const selectedMessages = await trimMessages(state.messages, {
tokenCounter: (messages) => messages.length, // Simple message count instead of token count
maxTokens: 5, // Allow up to 5 messages
strategy: "last",
startOn: "human",
includeSystem: true,
allowPartial: false,
});

const response = await model.invoke(selectedMessages);

// With LangGraph, we're able to return a single message, and LangGraph will concatenate
// it to the existing list
return { messages: [response] };
};

// Define a new graph
const workflow = new StateGraph(MessagesAnnotation)
// Define the single node that calls the model
.addNode("model", callModel)
.addEdge(START, "model")
.addEdge("model", END);

const app = workflow.compile({
// Adding memory is straightforward in LangGraph!
// Just pass a checkpointer to the compile method.
checkpointer: new MemorySaver(),
});

// The thread id is a unique key that identifies this particular conversation
// ---
// NOTE: this must be `thread_id` and not `threadId` as the LangGraph internals expect `thread_id`
// ---
const thread_id = uuidv4();
const config = { configurable: { thread_id }, streamMode: "values" as const };

const inputMessage = {
role: "user",
content: "hi! I'm bob",
};
for await (const event of await app.stream(
{ messages: [inputMessage] },
config
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}

// Here, let's confirm that the AI remembers our name!
const followUpMessage = {
role: "user",
content: "what was my name?",
};

// ---
// NOTE: You must pass the same thread id to continue the conversation
// we do that here by passing the same `config` object to the `.stream` call.
// ---
for await (const event of await app.stream(
{ messages: [followUpMessage] },
config
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
hi! I'm bob
Hello, Bob! How can I assist you today?
what was my name?
You mentioned that your name is Bob. How can I help you today?

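Usage with a pre-built LangGraph agent

This example shows usage of an Agent Executor with a pre-built agent constructed using the createReactAgent function.

If you are using one of the old LangChain pre-built agents, you should be able to replace that code with the new LangGraph pre-built agent, which leverages the native tool calling capabilities of chat models and will likely work better out of the box.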

import { z } from "zod";
import { v4 as uuidv4 } from "uuid";
import { BaseMessage, trimMessages } from "@langchain/core/messages";
import { tool } from "@langchain/core/tools";
import { ChatOpenAI } from "@langchain/openai";
import { MemorySaver } from "@langchain/langgraph";
import { createReactAgent } from "@langchain/langgraph/prebuilt";

const getUserAge = tool(
(name: string): string => {
// This is a placeholder for the actual implementation
if (name.toLowerCase().includes("bob")) {
return "42 years old";
}
return "41 years old";
},
{
name: "get_user_age",
description: "Use this tool to find the user's age.",
schema: z.string().describe("the name of the user"),
}
);

const memory = new MemorySaver();
const model2 = new ChatOpenAI({ model: "gpt-4o" });

const stateModifier = async (
messages: BaseMessage[]
): Promise<BaseMessage[]> => {
// Trim the conversation history before it is passed to the model.
return trimMessages(messages, {
tokenCounter: (msgs) => msgs.length, // <-- .length will simply count the number of messages rather than tokens
maxTokens: 5, // <-- allow up to 5 messages.
strategy: "last",
// The startOn is specified
// to make sure we do not generate a sequence where
// a ToolMessage that contains the result of a tool invocation
// appears before the AIMessage that requested a tool invocation
// as this will cause some chat models to raise an error.
startOn: "human",
includeSystem: true, // <-- Keep the system message
allowPartial: false,
});
};

const app2 = createReactAgent({
llm: model2,
tools: [getUserAge],
checkpointSaver: memory,
messageModifier: stateModifier,
});

// The thread id is a unique key that identifies
// this particular conversation.
// We'll just generate a random uuid here.
const threadId2 = uuidv4();
const config2 = {
configurable: { thread_id: threadId2 },
streamMode: "values" as const,
};

// Tell the AI that our name is Bob, and ask it to use a tool to confirm
// that it's capable of working like an agent.
const inputMessage2 = {
role: "user",
content: "hi! I'm bob. What is my age?",
};

for await (const event of await app2.stream(
{ messages: [inputMessage2] },
config2
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}

// Confirm that the chat bot has access to previous conversation
// and can respond to the user saying that the user's name is Bob.
const followUpMessage2 = {
role: "user",
content: "do you remember my name?",
};

for await (const event of await app2.stream(
{ messages: [followUpMessage2] },
config2
)) {
const lastMessage = event.messages[event.messages.length - 1];
console.log(lastMessage.content);
}
hi! I'm bob. What is my age?

42 years old
Hi Bob! You are 42 years old.
do you remember my name?
Yes, your name is Bob! If there's anything else you'd like to know or discuss, feel free to ask.

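LCEL: Add a preprocessing step

The simplest way to add complex conversation management is to introduce a pre-processing step in front of the chat model and pass the full conversation history to that step.

This approach is conceptually simple and works in many situations. For example, if you are using RunnableWithMessageHistory, wrap the chat model with the pre-processor instead of wrapping the chat model directly.

The obvious downside of this approach is that latency starts to increase as the conversation history grows, for two reasons:

  1. As the conversation gets longer, more data may need to be fetched from whatever store you use for the conversation history (if it is not kept in memory).
  2. The pre-processing logic ends up doing a lot of redundant computation, repeating work already done at previous steps of the conversation.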
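
As a rough illustration of the second point, the sketch below counts how many messages a pre-processor touches over a conversation, assuming each turn adds exactly one human message and one AI reply. Both function names are made up for this illustration, and the count ignores per-message cost such as tokenization.

```typescript
// Full-history preprocessing: at every turn the preprocessor reads every
// message accumulated so far.
function messagesProcessedFullHistory(turns: number): number {
  let total = 0;
  let historyLength = 0;
  for (let t = 0; t < turns; t++) {
    historyLength += 1; // new human message arrives
    total += historyLength; // preprocessor re-reads the entire history
    historyLength += 1; // AI reply is appended
  }
  return total;
}

// Incremental preprocessing: only messages added since the previous turn are
// read, i.e. the previous AI reply (if any) plus the new human message.
function messagesProcessedIncremental(turns: number): number {
  let total = 0;
  for (let t = 0; t < turns; t++) {
    total += t === 0 ? 1 : 2;
  }
  return total;
}

const fullCost = messagesProcessedFullHistory(10);
const incrementalCost = messagesProcessedIncremental(10);
console.log({ fullCost, incrementalCost });
```

Under these assumptions the full-history count grows quadratically with the number of turns, while the incremental count grows linearly.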
caution

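If you want to use a chat model's tool calling capabilities, remember to bind the tools to the model before adding the history pre-processing step to it!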

import { ChatOpenAI } from "@langchain/openai";
import {
AIMessage,
HumanMessage,
SystemMessage,
BaseMessage,
trimMessages,
} from "@langchain/core/messages";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const model3 = new ChatOpenAI({ model: "gpt-4o" });

const whatDidTheCowSay = tool(
(): string => {
return "foo";
},
{
name: "what_did_the_cow_say",
description: "Check to see what the cow said.",
schema: z.object({}),
}
);

const messageProcessor = trimMessages({
tokenCounter: (msgs) => msgs.length, // <-- .length will simply count the number of messages rather than tokens
maxTokens: 5, // <-- allow up to 5 messages.
strategy: "last",
// The startOn is specified
// to make sure we do not generate a sequence where
// a ToolMessage that contains the result of a tool invocation
// appears before the AIMessage that requested a tool invocation
// as this will cause some chat models to raise an error.
startOn: "human",
includeSystem: true, // <-- Keep the system message
allowPartial: false,
});

// Note that we bind tools to the model first!
const modelWithTools = model3.bindTools([whatDidTheCowSay]);

const modelWithPreprocessor = messageProcessor.pipe(modelWithTools);

const fullHistory = [
new SystemMessage("you're a good assistant, you always respond with a joke."),
new HumanMessage("i wonder why it's called langchain"),
new AIMessage(
'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
),
new HumanMessage("and who is harrison chasing anyways"),
new AIMessage(
"Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
),
new HumanMessage("why is 42 always the answer?"),
new AIMessage(
"Because it's the only number that's constantly right, even when it doesn't add up!"
),
new HumanMessage("What did the cow say?"),
];

// We pass it explicitly to the modelWithPreprocessor for illustrative purposes.
// If you're using `RunnableWithMessageHistory` the history will be automatically
// read from the source that you configure.
const result = await modelWithPreprocessor.invoke(fullHistory);
console.log(result);
AIMessage {
"id": "chatcmpl-AB6uzWscxviYlbADFeDlnwIH82Fzt",
"content": "",
"additional_kwargs": {
"tool_calls": [
{
"id": "call_TghBL9dzqXFMCt0zj0VYMjfp",
"type": "function",
"function": "[Object]"
}
]
},
"response_metadata": {
"tokenUsage": {
"completionTokens": 16,
"promptTokens": 95,
"totalTokens": 111
},
"finish_reason": "tool_calls",
"system_fingerprint": "fp_a5d11b2ef2"
},
"tool_calls": [
{
"name": "what_did_the_cow_say",
"args": {},
"type": "tool_call",
"id": "call_TghBL9dzqXFMCt0zj0VYMjfp"
}
],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 95,
"output_tokens": 16,
"total_tokens": 111
}
}

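If you need to implement more efficient logic and want to keep using RunnableWithMessageHistory for now, the way to achieve this is to subclass BaseChatMessageHistory and define appropriate logic for addMessages (logic that rewrites the history rather than simply appending to it).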
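
The rewrite-on-add idea can be sketched as follows. To stay self-contained, this sketch does not extend the real BaseChatMessageHistory. `StoredMessage`, `rewriteHistory`, and `TrimmingChatHistory` are hypothetical names, and the class only mirrors the general shape of a history with `getMessages` and `addMessages` methods. A real subclass would need to implement whatever the base class requires in your installed version.

```typescript
interface StoredMessage {
  role: "system" | "human" | "ai";
  content: string;
}

// Pure helper: merge incoming messages into the stored history and keep only
// the first system message plus the most recent messages, up to maxMessages.
function rewriteHistory(
  existing: StoredMessage[],
  incoming: StoredMessage[],
  maxMessages: number
): StoredMessage[] {
  const combined = [...existing, ...incoming];
  const system = combined.filter((m) => m.role === "system").slice(0, 1);
  const others = combined.filter((m) => m.role !== "system");
  const room = Math.max(maxMessages - system.length, 0);
  // Guard against slice(-0), which would return the whole array.
  const tail = room > 0 ? others.slice(-room) : [];
  return [...system, ...tail];
}

class TrimmingChatHistory {
  private messages: StoredMessage[] = [];
  private readonly maxMessages: number;

  constructor(maxMessages: number) {
    this.maxMessages = maxMessages;
  }

  async getMessages(): Promise<StoredMessage[]> {
    return [...this.messages];
  }

  // Rewrites the stored history instead of only appending to it, so later
  // reads never see (or re-process) the discarded messages.
  async addMessages(newMessages: StoredMessage[]): Promise<void> {
    this.messages = rewriteHistory(this.messages, newMessages, this.maxMessages);
  }
}

const rewritten = rewriteHistory(
  [],
  [
    { role: "system", content: "s" },
    { role: "human", content: "a" },
    { role: "ai", content: "b" },
    { role: "human", content: "c" },
    { role: "ai", content: "d" },
  ],
  3
);
console.log(rewritten.map((m) => m.content));
```

Because the trimming happens at write time, each read returns an already bounded history, which avoids repeating the pre-processing work on every turn.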

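Unless you have a good reason to implement this solution, you should use LangGraph instead.

Next steps

Explore persistence with LangGraph:

Add persistence with simple LCEL (favor LangGraph for more complex use cases):

Working with message history: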

