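Build a Chatbot
This tutorial previously used RunnableWithMessageHistory to build a chatbot. That version of the tutorial is still available in the v0.2 docs.
The LangGraph implementation offers several advantages over RunnableWithMessageHistory, including the ability to persist arbitrary components of an application's state (not only messages).
Overview
We'll go over how to design and implement a chatbot powered by a large language model (LLM). This chatbot will be able to hold a conversation and remember previous interactions.
Note that the chatbot we build will only use the language model to converse. You may also be interested in these related concepts:
- Conversational RAG: enable a chatbot experience over an external data source
- Agents: build a chatbot that can take actions
This tutorial covers the basics, which will help with both of those more advanced topics, but feel free to skip straight to them if you prefer.
Setup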
Jupyter Notebook
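This guide (like most other guides in the documentation) uses Jupyter notebooks and assumes the reader does too. Jupyter notebooks are well suited to learning how to work with LLM systems because things often go wrong (unexpected output, an API being down, and so on), and stepping through a guide in an interactive environment is a good way to understand them better.
This and other tutorials are perhaps most conveniently run in a Jupyter notebook. For setup, see the Jupyter installation instructions.
Installation
For this tutorial we need @langchain/core and @langchain/langgraph, plus uuid for generating thread IDs: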
- npm
- yarn
- pnpm
npm i @langchain/core @langchain/langgraph uuid
yarn add @langchain/core @langchain/langgraph uuid
pnpm add @langchain/core @langchain/langgraph uuid
For more details, see our Installation guide.
LangSmith
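Many of the applications you build with LangChain will contain multiple steps with multiple LLM calls. As these applications grow more complex, it becomes crucial to be able to inspect exactly what is going on inside your chain or agent. The best way to do this is with LangSmith.
After you sign up for LangSmith, make sure to set your environment variables to start logging traces: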
process.env.LANGSMITH_TRACING = "true";
process.env.LANGSMITH_API_KEY = "...";
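Before running the rest of the tutorial, it can help to confirm that these variables are actually set. The following is a minimal sketch, not part of LangSmith: `missingEnvVars` is a hypothetical helper, and the environment is passed in as a plain object so the check can be tried without touching `process.env`.

```typescript
// Hypothetical helper: report which of the required variables are unset or empty.
// `env` is passed in explicitly; in a real script you would pass process.env.
function missingEnvVars(
  required: string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((key) => !env[key]);
}

// Example environment with an empty API key.
const exampleEnv: Record<string, string | undefined> = {
  LANGSMITH_TRACING: "true",
  LANGSMITH_API_KEY: "",
};

const missingVars = missingEnvVars(
  ["LANGSMITH_TRACING", "LANGSMITH_API_KEY"],
  exampleEnv,
);
console.log(missingVars); // [ "LANGSMITH_API_KEY" ]
```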
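Quickstart
First, let's learn how to use a language model by itself. LangChain supports many different language models that you can use interchangeably. Pick whichever one you want to use below.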
Pick your chat model:
- Groq
- OpenAI
- Anthropic
- Google Gemini
- FireworksAI
- MistralAI
- VertexAI
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/groq
yarn add @langchain/groq
pnpm add @langchain/groq
Add environment variables
GROQ_API_KEY=your-api-key
Instantiate the model
import { ChatGroq } from "@langchain/groq";
const llm = new ChatGroq({
model: "llama-3.3-70b-versatile",
temperature: 0
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
Add environment variables
OPENAI_API_KEY=your-api-key
Instantiate the model
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
model: "gpt-4o-mini",
temperature: 0
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/anthropic
yarn add @langchain/anthropic
pnpm add @langchain/anthropic
Add environment variables
ANTHROPIC_API_KEY=your-api-key
Instantiate the model
import { ChatAnthropic } from "@langchain/anthropic";
const llm = new ChatAnthropic({
model: "claude-3-5-sonnet-20240620",
temperature: 0
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/google-genai
yarn add @langchain/google-genai
pnpm add @langchain/google-genai
Add environment variables
GOOGLE_API_KEY=your-api-key
Instantiate the model
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
const llm = new ChatGoogleGenerativeAI({
model: "gemini-2.0-flash",
temperature: 0
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/community
yarn add @langchain/community
pnpm add @langchain/community
Add environment variables
FIREWORKS_API_KEY=your-api-key
Instantiate the model
import { ChatFireworks } from "@langchain/community/chat_models/fireworks";
const llm = new ChatFireworks({
model: "accounts/fireworks/models/llama-v3p1-70b-instruct",
temperature: 0
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/mistralai
yarn add @langchain/mistralai
pnpm add @langchain/mistralai
Add environment variables
MISTRAL_API_KEY=your-api-key
Instantiate the model
import { ChatMistralAI } from "@langchain/mistralai";
const llm = new ChatMistralAI({
model: "mistral-large-latest",
temperature: 0
});
Install dependencies
- npm
- yarn
- pnpm
npm i @langchain/google-vertexai
yarn add @langchain/google-vertexai
pnpm add @langchain/google-vertexai
Add environment variables
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
Instantiate the model
import { ChatVertexAI } from "@langchain/google-vertexai";
const llm = new ChatVertexAI({
model: "gemini-1.5-flash",
temperature: 0
});
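Let's first use the model directly. ChatModels are instances of LangChain "Runnables", which means they expose a standard interface for interacting with them. To simply call the model, we can pass a list of messages to the .invoke method.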
await llm.invoke([{ role: "user", content: "Hi im bob" }]);
AIMessage {
"id": "chatcmpl-AekDrrCyaBauLYHuVv3dkacxW2G1J",
"content": "Hi Bob! How can I help you today?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 10,
"completionTokens": 10,
"totalTokens": 20
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 10,
"completion_tokens": 10,
"total_tokens": 20,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_6fc10e10eb"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 10,
"input_tokens": 10,
"total_tokens": 20,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
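Most of the time only a few fields of this object matter: the reply text in `content` and the token counts in `usage_metadata`. The sketch below reads those fields from a plain object shaped like the printed output. `ReplyLike` and `describeReply` are hypothetical names introduced for illustration, and the sketch assumes `content` is a plain string, which is the case in the output above.

```typescript
// Minimal structural type covering only the fields read below.
// It mirrors the printed output above; it is not the library's AIMessage class.
type ReplyLike = {
  content: string;
  usage_metadata?: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };
};

// Format the reply text together with its token usage, if reported.
function describeReply(reply: ReplyLike): string {
  const usage = reply.usage_metadata;
  const tokens = usage
    ? `${usage.input_tokens} in / ${usage.output_tokens} out`
    : "usage not reported";
  return `${reply.content} (${tokens})`;
}

// Values copied from the printed output above.
const sampleReply: ReplyLike = {
  content: "Hi Bob! How can I help you today?",
  usage_metadata: { input_tokens: 10, output_tokens: 10, total_tokens: 20 },
};
console.log(describeReply(sampleReply));
// Hi Bob! How can I help you today? (10 in / 10 out)
```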
The model on its own has no concept of state. For example, if you ask a follow-up question:
await llm.invoke([{ role: "user", content: "Whats my name" }]);
AIMessage {
"id": "chatcmpl-AekDuOk1LjOdBVLtuCvuHjAs5aoad",
"content": "I'm sorry, but I don't have access to personal information about users unless you've shared it with me in this conversation. How can I assist you today?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 10,
"completionTokens": 30,
"totalTokens": 40
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 10,
"completion_tokens": 30,
"total_tokens": 40,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_6fc10e10eb"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 30,
"input_tokens": 10,
"total_tokens": 40,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
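Let's take a look at the example LangSmith trace.
We can see that the model does not take the previous conversation turn into context, so it cannot answer the question. This makes for a terrible chatbot experience!
To get around this, we need to pass the entire conversation history into the model. Let's see what happens when we do that: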
await llm.invoke([
{ role: "user", content: "Hi! I'm Bob" },
{ role: "assistant", content: "Hello Bob! How can I assist you today?" },
{ role: "user", content: "What's my name?" },
]);
AIMessage {
"id": "chatcmpl-AekDyJdj6y9IREyNIf3tkKGRKhN1Z",
"content": "Your name is Bob! How can I help you today, Bob?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 33,
"completionTokens": 14,
"totalTokens": 47
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 33,
"completion_tokens": 14,
"total_tokens": 47,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_6fc10e10eb"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 14,
"input_tokens": 33,
"total_tokens": 47,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
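And now we can see that we get a good response!
This is the basic idea underpinning a chatbot's ability to interact conversationally. So how do we best implement this?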
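To make the idea concrete before introducing any persistence layer, here is a hand-rolled sketch that keeps the running conversation in an array and resends all of it on every turn. `fakeModel` is a hypothetical, synchronous stand-in for a chat model (a real `llm.invoke` call is asynchronous and needs an API key). It is stateless and only "knows" a name if a user message in its input contains "I'm <name>".

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical stand-in for a chat model: it only sees the messages
// passed to this particular call and keeps nothing between calls.
function fakeModel(input: ChatMessage[]): ChatMessage {
  const asksName = /name/i.test(input[input.length - 1].content);
  let known: string | undefined;
  for (const msg of input) {
    const match = msg.role === "user" ? msg.content.match(/I'm (\w+)/) : null;
    if (match) known = match[1];
  }
  const content = asksName
    ? (known ? `Your name is ${known}.` : "I don't know your name.")
    : (known ? `Hello ${known}!` : "Hello!");
  return { role: "assistant", content };
}

// Keep the running conversation ourselves and resend all of it each turn.
const chatLog: ChatMessage[] = [];
function sendTurn(text: string): string {
  chatLog.push({ role: "user", content: text });
  const reply = fakeModel(chatLog);
  chatLog.push(reply);
  return reply.content;
}

console.log(sendTurn("Hi! I'm Bob")); // Hello Bob!
console.log(sendTurn("What's my name?")); // Your name is Bob.

// Without the accumulated history the same question cannot be answered.
const isolated = fakeModel([{ role: "user", content: "What's my name?" }]);
console.log(isolated.content); // I don't know your name.
```

Managing this array by hand works for a single conversation, but it does not survive restarts or separate one user's conversation from another's.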
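Message persistence
LangGraph implements a built-in persistence layer, making it ideal for chat applications that support multiple conversational turns.
Wrapping our chat model in a minimal LangGraph application lets us persist the message history automatically, which simplifies the development of multi-turn applications.
LangGraph comes with a simple in-memory checkpointer, which we use below.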
import {
START,
END,
MessagesAnnotation,
StateGraph,
MemorySaver,
} from "@langchain/langgraph";
// Define the function that calls the model
const callModel = async (state: typeof MessagesAnnotation.State) => {
const response = await llm.invoke(state.messages);
return { messages: response };
};
// Define a new graph
const workflow = new StateGraph(MessagesAnnotation)
// Define the node and edge
.addNode("model", callModel)
.addEdge(START, "model")
.addEdge("model", END);
// Add memory
const memory = new MemorySaver();
const app = workflow.compile({ checkpointer: memory });
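We now need to create a config that we pass to the runnable on every call. This config contains information that is not directly part of the input but is still useful. In this case, we want to include a thread_id. It should look like this: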
import { v4 as uuidv4 } from "uuid";
const config = { configurable: { thread_id: uuidv4() } };
This lets us support multiple conversation threads with a single application, a common requirement when your application has multiple users.
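Conceptually, the thread_id acts as a key under which each conversation's state is stored. The following is a minimal sketch of that idea using a plain Map. It is not how LangGraph's checkpointer is implemented, and `threads` and `appendToThread` are hypothetical names. Fixed IDs stand in for the UUIDs used in this tutorial so that the example has no dependencies.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical in-memory store: one message list per thread_id.
const threads = new Map<string, ChatMessage[]>();

// Append new messages to the thread named in the config and
// return the full message list for that thread.
function appendToThread(
  config: { configurable: { thread_id: string } },
  incoming: ChatMessage[],
): ChatMessage[] {
  const id = config.configurable.thread_id;
  const saved = threads.get(id) ?? [];
  const updated = [...saved, ...incoming];
  threads.set(id, updated);
  return updated;
}

const threadA = { configurable: { thread_id: "thread-a" } };
const threadB = { configurable: { thread_id: "thread-b" } };

appendToThread(threadA, [{ role: "user", content: "Hi! I'm Bob." }]);
const messagesA = appendToThread(threadA, [
  { role: "user", content: "What's my name?" },
]);
const messagesB = appendToThread(threadB, [
  { role: "user", content: "What's my name?" },
]);

console.log(messagesA.length); // 2
console.log(messagesB.length); // 1
```

The second thread starts empty, so a model called with its messages would have no way to know the name given in the first thread.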
We can then invoke the application:
const input = [
{
role: "user",
content: "Hi! I'm Bob.",
},
];
const output = await app.invoke({ messages: input }, config);
// The output contains all messages in the state.
// This will log the last message in the conversation.
console.log(output.messages[output.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekEFPclmrO7YfAe7J0zUAanS4ifx",
"content": "Hi Bob! How can I assist you today?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 12,
"completionTokens": 10,
"totalTokens": 22
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 12,
"completion_tokens": 10,
"total_tokens": 22,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_6fc10e10eb"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 10,
"input_tokens": 12,
"total_tokens": 22,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
const input2 = [
{
role: "user",
content: "What's my name?",
},
];
const output2 = await app.invoke({ messages: input2 }, config);
console.log(output2.messages[output2.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekEJgCfLodGCcuLgLQdJevH7CpCJ",
"content": "Your name is Bob! How can I help you today, Bob?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 34,
"completionTokens": 14,
"totalTokens": 48
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 34,
"completion_tokens": 14,
"total_tokens": 48,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_6fc10e10eb"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 14,
"input_tokens": 34,
"total_tokens": 48,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
Great! Our chatbot now remembers things about us. If we change the config to reference a different thread_id, we can see that it starts the conversation fresh.
const config2 = { configurable: { thread_id: uuidv4() } };
const input3 = [
{
role: "user",
content: "What's my name?",
},
];
const output3 = await app.invoke({ messages: input3 }, config2);
console.log(output3.messages[output3.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekELvPXLtjOKgLN63mQzZwvyo12J",
"content": "I'm sorry, but I don't have access to personal information about individuals unless you share it with me. How can I assist you today?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 11,
"completionTokens": 27,
"totalTokens": 38
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 11,
"completion_tokens": 27,
"total_tokens": 38,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_39a40c96a0"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 27,
"input_tokens": 11,
"total_tokens": 38,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
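However, we can always go back to the original conversation, because the checkpointer still holds its state under the original thread_id: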
const output4 = await app.invoke({ messages: input2 }, config);
console.log(output4.messages[output4.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekEQ8Z5JmYquSfzPsCWv1BDTKZSh",
"content": "Your name is Bob. Is there something specific you would like to talk about?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 60,
"completionTokens": 16,
"totalTokens": 76
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 60,
"completion_tokens": 16,
"total_tokens": 76,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_39a40c96a0"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 16,
"input_tokens": 60,
"total_tokens": 76,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
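This is how we can support a chatbot having conversations with many users!
So far, all we've done is add a simple persistence layer around the model. We can make the chatbot more sophisticated and personalized by adding a prompt template.
Prompt templates
Prompt templates help turn raw user information into a format that the LLM can work with. In this case, the raw user input is just a message, which we pass straight to the LLM. Let's now make that a bit more complicated. First, let's add a system message with some custom instructions, while still taking messages as input. Next, we'll add more inputs besides the messages.
To add a system message, we will create a ChatPromptTemplate. We pass all the messages in through a placeholder entry, which plays the role of a MessagesPlaceholder.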
import { ChatPromptTemplate } from "@langchain/core/prompts";
const promptTemplate = ChatPromptTemplate.fromMessages([
[
"system",
"You talk like a pirate. Answer all questions to the best of your ability.",
],
["placeholder", "{messages}"],
]);
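As a rough mental model, invoking this template produces the system message followed by whatever messages were supplied for the placeholder. The sketch below imitates that with plain objects. `renderPrompt` is a hypothetical helper, not the ChatPromptTemplate implementation, and it ignores variable substitution entirely.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical helper: put the system instruction first, then splice the
// conversation in where the {messages} placeholder sits.
function renderPrompt(systemText: string, messages: ChatMessage[]): ChatMessage[] {
  return [{ role: "system", content: systemText }, ...messages];
}

const rendered = renderPrompt(
  "You talk like a pirate. Answer all questions to the best of your ability.",
  [{ role: "user", content: "Hi! I'm Jim." }],
);
console.log(rendered.map((m) => m.role)); // [ "system", "user" ]
```

Because the system message is added at render time rather than stored in the history, it is applied on every call.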
We can now update our application to incorporate this template:
import {
START,
END,
MessagesAnnotation,
StateGraph,
MemorySaver,
} from "@langchain/langgraph";
// Define the function that calls the model
const callModel2 = async (state: typeof MessagesAnnotation.State) => {
const prompt = await promptTemplate.invoke(state);
const response = await llm.invoke(prompt);
// Update message history with response:
return { messages: [response] };
};
// Define a new graph
const workflow2 = new StateGraph(MessagesAnnotation)
// Define the (single) node in the graph
.addNode("model", callModel2)
.addEdge(START, "model")
.addEdge("model", END);
// Add memory
const app2 = workflow2.compile({ checkpointer: new MemorySaver() });
We invoke the application in the same way:
const config3 = { configurable: { thread_id: uuidv4() } };
const input4 = [
{
role: "user",
content: "Hi! I'm Jim.",
},
];
const output5 = await app2.invoke({ messages: input4 }, config3);
console.log(output5.messages[output5.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekEYAQVqh9OFZRGdzGiPz33WPf1v",
"content": "Ahoy, Jim! A pleasure to meet ye, matey! What be on yer mind this fine day?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 32,
"completionTokens": 23,
"totalTokens": 55
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 32,
"completion_tokens": 23,
"total_tokens": 55,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_39a40c96a0"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 23,
"input_tokens": 32,
"total_tokens": 55,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
const input5 = [
{
role: "user",
content: "What is my name?",
},
];
const output6 = await app2.invoke({ messages: input5 }, config3);
console.log(output6.messages[output6.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekEbrpFI3K8BxemHZ5fG4xF2tT8x",
"content": "Ye be callin' yerself Jim, if I heard ye right, savvy? What else can I do fer ye, me hearty?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 68,
"completionTokens": 29,
"totalTokens": 97
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 68,
"completion_tokens": 29,
"total_tokens": 97,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_6fc10e10eb"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 29,
"input_tokens": 68,
"total_tokens": 97,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
Awesome! Let's now make our prompt a little more complicated. Assume the prompt template now looks something like this:
const promptTemplate2 = ChatPromptTemplate.fromMessages([
[
"system",
"You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
],
["placeholder", "{messages}"],
]);
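The `{language}` token in the system message is a template variable that gets filled in from the inputs when the template is invoked. The following sketch shows that substitution on a single string. `fillTemplate` is a hypothetical helper written for illustration; in particular, leaving unknown variables untouched is a choice made here and should not be read as the library's behavior.

```typescript
// Hypothetical helper: replace each {key} in the text with its value.
// Unknown keys are left untouched so a missing input is easy to spot.
function fillTemplate(text: string, inputs: Record<string, string>): string {
  return text.replace(/\{(\w+)\}/g, (whole: string, key: string) =>
    Object.prototype.hasOwnProperty.call(inputs, key) ? inputs[key] : whole,
  );
}

const systemText = fillTemplate(
  "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
  { language: "Spanish" },
);
console.log(systemText);
// You are a helpful assistant. Answer all questions to the best of your ability in Spanish.
```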
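Note that we have added a new language input to the prompt. Our application now has two parameters: the input messages and language. We should update our application's state to reflect this change: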
import {
START,
END,
StateGraph,
MemorySaver,
MessagesAnnotation,
Annotation,
} from "@langchain/langgraph";
// Define the State
const GraphAnnotation = Annotation.Root({
...MessagesAnnotation.spec,
language: Annotation<string>(),
});
// Define the function that calls the model
const callModel3 = async (state: typeof GraphAnnotation.State) => {
const prompt = await promptTemplate2.invoke(state);
const response = await llm.invoke(prompt);
return { messages: [response] };
};
const workflow3 = new StateGraph(GraphAnnotation)
.addNode("model", callModel3)
.addEdge(START, "model")
.addEdge("model", END);
const app3 = workflow3.compile({ checkpointer: new MemorySaver() });
const config4 = { configurable: { thread_id: uuidv4() } };
const input6 = {
messages: [
{
role: "user",
content: "Hi im bob",
},
],
language: "Spanish",
};
const output7 = await app3.invoke(input6, config4);
console.log(output7.messages[output7.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekF4R7ioefFo6PmOYo3YuCbGpROq",
"content": "¡Hola, Bob! ¿Cómo puedo ayudarte hoy?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 32,
"completionTokens": 11,
"totalTokens": 43
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 32,
"completion_tokens": 11,
"total_tokens": 43,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_39a40c96a0"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 11,
"input_tokens": 32,
"total_tokens": 43,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
Note that the entire state is persisted, so we can omit parameters such as language when they do not need to change:
const input7 = {
messages: [
{
role: "user",
content: "What is my name?",
},
],
};
const output8 = await app3.invoke(input7, config4);
console.log(output8.messages[output8.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekF8yN7H81ITccWlBzSahmduP69T",
"content": "Tu nombre es Bob. ¿En qué puedo ayudarte, Bob?",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 56,
"completionTokens": 13,
"totalTokens": 69
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 56,
"completion_tokens": 13,
"total_tokens": 69,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_6fc10e10eb"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 13,
"input_tokens": 56,
"total_tokens": 69,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
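The behavior seen in the two calls above can be pictured as a merge between the saved state and each new input: messages accumulate, while language keeps its last supplied value. The sketch below is an assumption-laden illustration of that rule with plain objects. `ChatState` and `mergeState` are hypothetical names, and the model's replies, which the real graph also appends, are left out.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };
type ChatState = { messages: ChatMessage[]; language: string };

// Hypothetical merge rule: new messages are appended to the saved list,
// while `language` is replaced only when the caller supplies it.
function mergeState(
  saved: ChatState,
  update: { messages: ChatMessage[]; language?: string },
): ChatState {
  return {
    messages: [...saved.messages, ...update.messages],
    language: update.language ?? saved.language,
  };
}

const emptyState: ChatState = { messages: [], language: "" };

// First call supplies both messages and language.
const afterFirst = mergeState(emptyState, {
  messages: [{ role: "user", content: "Hi im bob" }],
  language: "Spanish",
});

// Second call omits language, so the saved value is kept.
const afterSecond = mergeState(afterFirst, {
  messages: [{ role: "user", content: "What is my name?" }],
});

console.log(afterSecond.language); // Spanish
console.log(afterSecond.messages.length); // 2
```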
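To help you understand what's happening internally, check out this LangSmith trace.
Managing conversation history
One important concept to understand when building chatbots is how to manage conversation history. Left unmanaged, the list of messages grows without bound and can overflow the context window of the LLM. It is therefore important to add a step that limits the size of the messages you pass in.
Importantly, this step should run before the prompt template is applied but after previous messages have been loaded from the message history.
We can do this by adding a simple step in front of the prompt that modifies the messages key appropriately, and running that step inside the graph node that calls the model.
LangChain comes with a few built-in helpers for managing a list of messages. In this case we'll use the trimMessages helper to reduce how many messages we send to the model. The trimmer lets us specify how many tokens to keep, along with other parameters such as whether to always keep the system message and whether to allow partial messages: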
import {
SystemMessage,
HumanMessage,
AIMessage,
trimMessages,
} from "@langchain/core/messages";
const trimmer = trimMessages({
maxTokens: 10,
strategy: "last",
tokenCounter: (msgs) => msgs.length,
includeSystem: true,
allowPartial: false,
startOn: "human",
});
const messages = [
new SystemMessage("you're a good assistant"),
new HumanMessage("hi! I'm bob"),
new AIMessage("hi!"),
new HumanMessage("I like vanilla ice cream"),
new AIMessage("nice"),
new HumanMessage("whats 2 + 2"),
new AIMessage("4"),
new HumanMessage("thanks"),
new AIMessage("no problem!"),
new HumanMessage("having fun?"),
new AIMessage("yes!"),
];
await trimmer.invoke(messages);
[
SystemMessage {
"content": "you're a good assistant",
"additional_kwargs": {},
"response_metadata": {}
},
HumanMessage {
"content": "I like vanilla ice cream",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"content": "nice",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"content": "whats 2 + 2",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"content": "4",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"content": "thanks",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"content": "no problem!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
},
HumanMessage {
"content": "having fun?",
"additional_kwargs": {},
"response_metadata": {}
},
AIMessage {
"content": "yes!",
"additional_kwargs": {},
"response_metadata": {},
"tool_calls": [],
"invalid_tool_calls": []
}
]
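To see why exactly these nine messages survive, it helps to walk through the selection by hand. The sketch below is a simplified reimplementation of a "last" strategy under the same settings, where each message costs one unit because the token counter returns the number of messages. `keepLast` is a hypothetical function written for this walkthrough, not the trimMessages implementation, and plain role strings stand in for the message classes.

```typescript
type ChatMessage = { role: "system" | "human" | "ai"; content: string };

// Simplified sketch: keep the system message, then as many trailing
// messages as fit in the budget, then make the window start on a human turn.
function keepLast(messages: ChatMessage[], maxCount: number): ChatMessage[] {
  const system = messages[0]?.role === "system" ? [messages[0]] : [];
  const rest = messages.slice(system.length);
  // The system message uses one unit of the budget.
  const budget = Math.max(maxCount - system.length, 0);
  let kept = budget > 0 ? rest.slice(-budget) : [];
  // Drop leading messages until the kept window starts on a human message.
  while (kept.length > 0 && kept[0].role !== "human") {
    kept = kept.slice(1);
  }
  return [...system, ...kept];
}

const sampleMessages: ChatMessage[] = [
  { role: "system", content: "you're a good assistant" },
  { role: "human", content: "hi! I'm bob" },
  { role: "ai", content: "hi!" },
  { role: "human", content: "I like vanilla ice cream" },
  { role: "ai", content: "nice" },
  { role: "human", content: "whats 2 + 2" },
  { role: "ai", content: "4" },
  { role: "human", content: "thanks" },
  { role: "ai", content: "no problem!" },
  { role: "human", content: "having fun?" },
  { role: "ai", content: "yes!" },
];

const trimmedSample = keepLast(sampleMessages, 10);
console.log(trimmedSample.length); // 9
console.log(trimmedSample[1].content); // I like vanilla ice cream
```

With a budget of 10, the system message plus the last nine messages would fit, but that window would begin with the AI reply "hi!". Dropping it so the window starts on a human message leaves nine messages in total, and the introduction containing the name is gone.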
To use it in our chain, we just need to run the trimmer before we pass the messages input to our prompt.
const callModel4 = async (state: typeof GraphAnnotation.State) => {
const trimmedMessage = await trimmer.invoke(state.messages);
const prompt = await promptTemplate2.invoke({
messages: trimmedMessage,
language: state.language,
});
const response = await llm.invoke(prompt);
return { messages: [response] };
};
const workflow4 = new StateGraph(GraphAnnotation)
.addNode("model", callModel4)
.addEdge(START, "model")
.addEdge("model", END);
const app4 = workflow4.compile({ checkpointer: new MemorySaver() });
Now if we try asking the model our name, it won't know it, because we trimmed that part of the chat history:
const config5 = { configurable: { thread_id: uuidv4() } };
const input8 = {
messages: [...messages, new HumanMessage("What is my name?")],
language: "English",
};
const output9 = await app4.invoke(input8, config5);
console.log(output9.messages[output9.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekHyVN7f0Pnuyc2RHVL8CxKmFfMQ",
"content": "I don't know your name. You haven't shared it yet!",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 97,
"completionTokens": 12,
"totalTokens": 109
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 97,
"completion_tokens": 12,
"total_tokens": 109,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_6fc10e10eb"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 12,
"input_tokens": 97,
"total_tokens": 109,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
But if we ask about information that is within the last few messages, it remembers:
const config6 = { configurable: { thread_id: uuidv4() } };
const input9 = {
messages: [...messages, new HumanMessage("What math problem did I ask?")],
language: "English",
};
const output10 = await app4.invoke(input9, config6);
console.log(output10.messages[output10.messages.length - 1]);
AIMessage {
"id": "chatcmpl-AekI1jwlErzHuZ3BhAxr97Ct818Pp",
"content": "You asked what 2 + 2 equals.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 99,
"completionTokens": 10,
"totalTokens": 109
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 99,
"completion_tokens": 10,
"total_tokens": 109,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"system_fingerprint": "fp_6fc10e10eb"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 10,
"input_tokens": 99,
"total_tokens": 109,
"input_token_details": {
"audio": 0,
"cache_read": 0
},
"output_token_details": {
"audio": 0,
"reasoning": 0
}
}
}
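The trimmer in this tutorial counts each message as one token (tokenCounter: (msgs) => msgs.length), which keeps the example easy to follow but says nothing about real context-window usage. As a rough sketch, a counter of the following shape could estimate size from message text instead. The figures of about four characters per token and three tokens of overhead per message are assumptions chosen for illustration; real tokenizers differ by model, and `approxTokens` is a hypothetical name.

```typescript
type ChatMessage = { role: string; content: string };

// Assumed heuristic: about four characters per token, plus a small
// fixed overhead per message. Real tokenizers differ by model.
function approxTokens(messages: ChatMessage[]): number {
  return messages.reduce(
    (total, msg) => total + Math.ceil(msg.content.length / 4) + 3,
    0,
  );
}

const estimate = approxTokens([{ role: "human", content: "whats 2 + 2" }]);
console.log(estimate); // 6
```

An estimate like this is only useful for keeping a safety margin; when exact limits matter, count with the tokenizer that belongs to the model in use.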
If you take a look at LangSmith, you can see exactly what is happening under the hood in the LangSmith trace.
Next steps
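Now that you understand the basics of how to create a chatbot in LangChain, you may also be interested in the more advanced topics listed in the overview: Conversational RAG and Agents.
If you want to dive deeper into specifics, the following are also worth checking out: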
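- Streaming: streaming is crucial for chat applications
- How to add message history: a deeper dive into everything related to message history
- How to manage large message history: more techniques for managing a large chat history
- LangGraph main docs: more detail on building applications with LangGraph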