
How to stream chat model responses

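All chat models implement the Runnable interface, which comes with default implementations of the standard Runnable methods (i.e. invoke, batch, stream, streamEvents). This guide covers how to use these methods to stream output from chat models.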

tip

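The default implementation does not support token-by-token streaming. Instead, it returns an AsyncGenerator that yields all of the model output in a single chunk. This fallback exists so that any model can be swapped in for any other model, since they all support the same standard interface.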

Whether output can be streamed token by token depends on whether the provider has implemented token-by-token streaming support.

You can see which integrations support token-by-token streaming here.
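To make the single-chunk fallback concrete, here is a minimal sketch of the idea. It does not reproduce LangChain's actual implementation: `InvokeOnlyModel`, `fallbackStream`, `echoModel`, and `collectChunks` are hypothetical names invented for this illustration, and the model output is simplified to a plain string.

```typescript
// Hypothetical stand-in for a model that only implements invoke().
interface InvokeOnlyModel {
  invoke(input: string): Promise<string>;
}

// Sketch of a fallback stream(): call invoke() once and yield the
// entire result as a single chunk.
async function* fallbackStream(
  model: InvokeOnlyModel,
  input: string
): AsyncGenerator<string> {
  yield await model.invoke(input);
}

// Toy model used only for this illustration.
const echoModel: InvokeOnlyModel = {
  invoke: async (input) => `echo: ${input}`,
};

// Drain any async iterable of strings into an array.
async function collectChunks(
  stream: AsyncIterable<string>
): Promise<string[]> {
  const chunks: string[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return chunks;
}

async function demoFallback(): Promise<void> {
  const chunks = await collectChunks(fallbackStream(echoModel, "hi"));
  // The caller still uses `for await`, but receives exactly one chunk.
  console.log(chunks.length, chunks[0]);
}

void demoFallback();
```

Because the caller iterates in the same way in both cases, code written against a streaming provider keeps working against a non-streaming one. It just receives the whole output at once.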

Streaming

In the example below, we use --- to help visualize the delimiter between tokens.

Pick your chat model:

Install dependencies

yarn add @langchain/groq 

Add environment variables

GROQ_API_KEY=your-api-key

Instantiate the model

import { ChatGroq } from "@langchain/groq";

const model = new ChatGroq({
  model: "llama-3.3-70b-versatile",
  temperature: 0,
});

const stream = await model.stream(
  "Write me a 1 verse song about goldfish on the moon"
);

for await (const chunk of stream) {
  console.log(`${chunk.content}\n---`);
}

---
Here's
---
a one
---
-
---
verse song about goldfish on
---
the moon:

Verse
---
:
Swimming
---
through the stars
---
,
---
in
---
a cosmic
---
lag
---
oon
---

Little
---
golden
---
scales
---
,
---
reflecting the moon
---

No
---
gravity to
---
hold them,
---
they
---
float with
---
glee
Goldfish
---
astron
---
auts, on a lunar
---
sp
---
ree
---

Bub
---
bles rise
---
like
---
com
---
ets, in the
---
star
---
ry night
---

Their fins like
---
tiny
---
rockets, a
---
w
---
ondrous sight
Who
---
knew
---
these
---
small
---
creatures
---
,
---
could con
---
quer space?
---

Goldfish on the moon,
---
with
---
such
---
fis
---
hy grace
---

---

---
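In practice you often want the complete text as well as the individual pieces. The following sketch shows one way to accumulate streamed chunks into a single string. To stay self-contained, it uses a hypothetical `fakeTokenStream` generator in place of `await model.stream(...)`, and it assumes each chunk's `content` is a plain string; the `TextChunk` interface and `joinStream` helper are invented for this illustration.

```typescript
// Minimal chunk shape assumed for this sketch; real chunks carry more fields.
interface TextChunk {
  content: string;
}

// Hypothetical stand-in for `await model.stream(...)`.
async function* fakeTokenStream(): AsyncGenerator<TextChunk> {
  for (const piece of ["Here's", " a one", "-", "verse song"]) {
    yield { content: piece };
  }
}

// Concatenate the content of every chunk as it arrives.
async function joinStream(
  stream: AsyncIterable<TextChunk>
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.content;
  }
  return text;
}

async function demoJoin(): Promise<void> {
  const full = await joinStream(fakeTokenStream());
  console.log(full); // prints "Here's a one-verse song"
}

void demoJoin();
```

The same loop works with a real stream as long as the chunk content is a string. Some providers may return structured content instead, in which case the concatenation step would need to be adapted.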

Stream events

Chat models also support the standard streamEvents() method, which streams more granular events from within a chain.

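This method is useful when you are streaming output from a larger LLM application that contains multiple steps (for example, a chain composed of a prompt, a chat model, and a parser):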

const eventStream = await model.streamEvents(
  "Write me a 1 verse song about goldfish on the moon",
  {
    version: "v2",
  }
);

const events = [];
for await (const event of eventStream) {
  events.push(event);
}

events.slice(0, 3);
[
{
event: "on_chat_model_start",
data: { input: "Write me a 1 verse song about goldfish on the moon" },
name: "ChatAnthropic",
tags: [],
run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
metadata: {
ls_provider: "openai",
ls_model_name: "claude-3-5-sonnet-20240620",
ls_model_type: "chat",
ls_temperature: 1,
ls_max_tokens: 2048,
ls_stop: undefined
}
},
{
event: "on_chat_model_stream",
run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
name: "ChatAnthropic",
tags: [],
metadata: {
ls_provider: "openai",
ls_model_name: "claude-3-5-sonnet-20240620",
ls_model_type: "chat",
ls_temperature: 1,
ls_max_tokens: 2048,
ls_stop: undefined
},
data: {
chunk: AIMessageChunk {
lc_serializable: true,
lc_kwargs: {
content: "",
additional_kwargs: [Object],
tool_calls: [],
invalid_tool_calls: [],
tool_call_chunks: [],
response_metadata: {}
},
lc_namespace: [ "langchain_core", "messages" ],
content: "",
name: undefined,
additional_kwargs: {
id: "msg_01JaaH9ZUXg7bUnxzktypRak",
type: "message",
role: "assistant",
model: "claude-3-5-sonnet-20240620"
},
response_metadata: {},
id: undefined,
tool_calls: [],
invalid_tool_calls: [],
tool_call_chunks: [],
usage_metadata: undefined
}
}
},
{
event: "on_chat_model_stream",
run_id: "d60a87d6-acf0-4ae1-bf27-e570aa101960",
name: "ChatAnthropic",
tags: [],
metadata: {
ls_provider: "openai",
ls_model_name: "claude-3-5-sonnet-20240620",
ls_model_type: "chat",
ls_temperature: 1,
ls_max_tokens: 2048,
ls_stop: undefined
},
data: {
chunk: AIMessageChunk {
lc_serializable: true,
lc_kwargs: {
content: "Here's",
additional_kwargs: {},
tool_calls: [],
invalid_tool_calls: [],
tool_call_chunks: [],
response_metadata: {}
},
lc_namespace: [ "langchain_core", "messages" ],
content: "Here's",
name: undefined,
additional_kwargs: {},
response_metadata: {},
id: undefined,
tool_calls: [],
invalid_tool_calls: [],
tool_call_chunks: [],
usage_metadata: undefined
}
}
}
]
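Since the event stream mixes several event types, a common pattern is to filter on the `event` field and read only the chunks you care about. The sketch below illustrates that filtering under simplifying assumptions: `StreamEventLike` is a reduced, hypothetical version of the event shape shown above, `fakeEventStream` stands in for `model.streamEvents(...)`, and chunk content is assumed to be a plain string.

```typescript
// Reduced event shape assumed for this sketch; real events carry more fields.
interface StreamEventLike {
  event: string;
  name: string;
  data: { chunk?: { content: string } };
}

// Hypothetical stand-in for `model.streamEvents(..., { version: "v2" })`.
async function* fakeEventStream(): AsyncGenerator<StreamEventLike> {
  yield { event: "on_chat_model_start", name: "FakeModel", data: {} };
  yield {
    event: "on_chat_model_stream",
    name: "FakeModel",
    data: { chunk: { content: "" } },
  };
  yield {
    event: "on_chat_model_stream",
    name: "FakeModel",
    data: { chunk: { content: "Here's" } },
  };
  yield { event: "on_chat_model_end", name: "FakeModel", data: {} };
}

// Keep only streamed chat model chunks and join their text.
async function textFromEvents(
  events: AsyncIterable<StreamEventLike>
): Promise<string> {
  let text = "";
  for await (const e of events) {
    if (e.event === "on_chat_model_stream" && e.data.chunk) {
      text += e.data.chunk.content;
    }
  }
  return text;
}

async function demoEvents(): Promise<void> {
  console.log(await textFromEvents(fakeEventStream())); // prints "Here's"
}

void demoEvents();
```

In a multi-step chain, the same loop could also check `e.name` or `e.tags` to separate output from different components.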

Next steps

You have now seen a few ways to stream chat model responses. Next, check out this guide to learn more about streaming with other LangChain modules.

