
Fallbacks

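When working with language models, you may run into issues from the underlying APIs, such as rate limits or downtime. As you move LLM applications into production, it becomes increasingly important to have contingencies for errors. That is why we introduced the concept of fallbacks.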

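Crucially, fallbacks can be applied not only at the LLM level but also at the level of a whole runnable. This matters because different models often require different prompts. If your call to OpenAI fails, you don't just want to send the same prompt to Anthropic; you probably want to use a different prompt template.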

Handling LLM API errors

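This is perhaps the most common use case for fallbacks. A request to an LLM API can fail for a variety of reasons: the API could be down, you could have hit a rate limit, or any number of other things.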

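IMPORTANT: By default, many of LangChain's LLM wrappers catch errors and retry. When working with fallbacks, you will most likely want to turn those retries off (the examples on this page set maxRetries: 0). Otherwise, the first wrapper will keep retrying rather than failing.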

:::tip See the general instructions section on installing integration packages. :::

npm install @langchain/anthropic @langchain/openai @langchain/core
import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";

// Use a fake model name that will always throw an error
const fakeOpenAIModel = new ChatOpenAI({
  model: "potato!",
  maxRetries: 0,
});

const anthropicModel = new ChatAnthropic({});

const modelWithFallback = fakeOpenAIModel.withFallbacks([anthropicModel]);

const result = await modelWithFallback.invoke("What is your name?");

console.log(result);

/*
AIMessage {
  content: ' My name is Claude. I was created by Anthropic.',
  additional_kwargs: {}
}
*/
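
To make the note on automatic retries concrete, here is a minimal dependency-free sketch of how retries and fallbacks interact. The helpers `callWithRetries`, `callWithFallbacks`, and `countPrimaryCalls` are hypothetical names invented for this illustration; they are not LangChain APIs and do not reflect how LangChain implements `withFallbacks` internally.

```typescript
// Hypothetical helpers for illustration only; these are not LangChain APIs.
type Call<T> = () => Promise<T>;

// Run `call`, retrying up to `maxRetries` additional times before giving up.
async function callWithRetries<T>(call: Call<T>, maxRetries: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      return await call();
    } catch (e) {
      lastError = e;
    }
  }
  throw lastError;
}

// Try each candidate in order and return the first successful result.
// If every candidate fails, rethrow the last error seen.
async function callWithFallbacks<T>(candidates: Call<T>[]): Promise<T> {
  let lastError: unknown = new Error("No candidates provided");
  for (const candidate of candidates) {
    try {
      return await candidate();
    } catch (e) {
      lastError = e;
    }
  }
  throw lastError;
}

// Count how many times an always-failing primary is called
// before the fallback finally gets a chance to run.
async function countPrimaryCalls(
  maxRetries: number
): Promise<{ calls: number; answer: string }> {
  let calls = 0;
  const primary: Call<string> = async () => {
    calls += 1;
    throw new Error("rate limited");
  };
  const backup: Call<string> = async () => "backup answer";
  const answer = await callWithFallbacks([
    () => callWithRetries(primary, maxRetries),
    backup,
  ]);
  return { calls, answer };
}
```

In this sketch, `countPrimaryCalls(0)` reaches the backup after a single failed call, while `countPrimaryCalls(2)` makes three failed calls first. That extra latency is the reason for turning retries off on the primary model.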


Fallbacks for RunnableSequences

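We can also create fallbacks for RunnableSequences, where the fallbacks are themselves sequences. Here we do that with two different models: ChatOpenAI, and then the plain OpenAI model (which is not a chat model). Because OpenAI is not a chat model, you will likely want a different prompt.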

import { ChatOpenAI, OpenAI } from "@langchain/openai";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate, PromptTemplate } from "@langchain/core/prompts";

const chatPrompt = ChatPromptTemplate.fromMessages<{ animal: string }>([
  [
    "system",
    "You're a nice assistant who always includes a compliment in your response",
  ],
  ["human", "Why did the {animal} cross the road?"],
]);

// Use a fake model name that will always throw an error
const fakeOpenAIChatModel = new ChatOpenAI({
  model: "potato!",
  maxRetries: 0,
});

const prompt =
PromptTemplate.fromTemplate(`Instructions: You should always include a compliment in your response.

Question: Why did the {animal} cross the road?

Answer:`);

const openAILLM = new OpenAI({});

const outputParser = new StringOutputParser();

const badChain = chatPrompt.pipe(fakeOpenAIChatModel).pipe(outputParser);

const goodChain = prompt.pipe(openAILLM).pipe(outputParser);

const chain = badChain.withFallbacks([goodChain]);

const result = await chain.invoke({
  animal: "dragon",
});

console.log(result);

/*
I don't know, but I'm sure it was an impressive sight. You must have a great imagination to come up with such an interesting question!
*/


Handling long inputs

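One of the big limiting factors of LLMs is their context window. Sometimes you can count and track the length of prompts before sending them to an LLM, but in situations where that is hard or complicated, you can fall back to a model with a longer context length.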

import { ChatOpenAI } from "@langchain/openai";

// Use a model with a shorter context window
const shorterLlm = new ChatOpenAI({
  model: "gpt-3.5-turbo",
  maxRetries: 0,
});

const longerLlm = new ChatOpenAI({
  model: "gpt-3.5-turbo-16k",
});

const modelWithFallback = shorterLlm.withFallbacks([longerLlm]);

const input = `What is the next number: ${"one, two, ".repeat(3000)}`;

try {
  await shorterLlm.invoke(input);
} catch (e) {
  // Length error
  console.log(e);
}

const result = await modelWithFallback.invoke(input);

console.log(result);

/*
AIMessage {
  content: 'The next number is one.',
  name: undefined,
  additional_kwargs: { function_call: undefined }
}
*/
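
For cases where the prompt length can be tracked ahead of time, the proactive alternative mentioned at the start of this section might look like the sketch below. Everything in it is an assumption made for illustration: `estimateTokens` uses a rough four-characters-per-token heuristic instead of a real tokenizer, and the model names and context sizes are placeholder values, not real limits.

```typescript
// Hypothetical sketch: names, limits, and the token heuristic are illustrative.
interface ModelOption {
  name: string;
  contextTokens: number; // total context window, in tokens
}

// Rough estimate only: assumes ~4 characters per token.
// A real tokenizer would give different (more accurate) counts.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Pick the smallest model whose context window fits the prompt
// plus some room reserved for the model's output.
function pickModel(
  prompt: string,
  options: ModelOption[],
  reservedForOutput = 256
): ModelOption {
  const needed = estimateTokens(prompt) + reservedForOutput;
  const sorted = [...options].sort((a, b) => a.contextTokens - b.contextTokens);
  const match = sorted.find((option) => option.contextTokens >= needed);
  if (!match) {
    throw new Error(`Prompt needs ~${needed} tokens; no model is large enough`);
  }
  return match;
}

const exampleOptions: ModelOption[] = [
  { name: "small-context-model", contextTokens: 4096 },
  { name: "large-context-model", contextTokens: 16384 },
];

const shortPrompt = "What is the next number: one, two, ";
const longPrompt = `What is the next number: ${"one, two, ".repeat(3000)}`;

console.log(pickModel(shortPrompt, exampleOptions).name);
console.log(pickModel(longPrompt, exampleOptions).name);
```

Because the estimate is only approximate, a check like this complements a fallback and does not replace it: a prompt near the limit can still fail on the smaller model.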


Fallback to a better model

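Often we ask models to output in a specific format (such as JSON). Models like GPT-3.5 can do this, but sometimes fail. This naturally points to fallbacks: we can try a faster and cheaper model first, and if parsing fails, fall back to GPT-4.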

import { z } from "zod";
import { OpenAI, ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { StructuredOutputParser } from "@langchain/core/output_parsers";

const prompt = PromptTemplate.fromTemplate(
  `Return a JSON object containing the following value wrapped in an "input" key. Do not return anything else:\n{input}`
);

const badModel = new OpenAI({
  maxRetries: 0,
  model: "gpt-3.5-turbo-instruct",
});

const normalModel = new ChatOpenAI({
  model: "gpt-4",
});

const outputParser = StructuredOutputParser.fromZodSchema(
  z.object({
    input: z.string(),
  })
);

const badChain = prompt.pipe(badModel).pipe(outputParser);

const goodChain = prompt.pipe(normalModel).pipe(outputParser);

try {
  const result = await badChain.invoke({
    input: "testing0",
  });
} catch (e) {
  console.log(e);
/*
OutputParserException [Error]: Failed to parse. Text: "

{ "name" : " Testing0 ", "lastname" : " testing ", "fullname" : " testing ", "role" : " test ", "telephone" : "+1-555-555-555 ", "email" : " [email protected] ", "role" : " test ", "text" : " testing0 is different than testing ", "role" : " test ", "immediate_affected_version" : " 0.0.1 ", "immediate_version" : " 1.0.0 ", "leading_version" : " 1.0.0 ", "version" : " 1.0.0 ", "finger prick" : " no ", "finger prick" : " s ", "text" : " testing0 is different than testing ", "role" : " test ", "immediate_affected_version" : " 0.0.1 ", "immediate_version" : " 1.0.0 ", "leading_version" : " 1.0.0 ", "version" : " 1.0.0 ", "finger prick" :". Error: SyntaxError: Unexpected end of JSON input
*/
}

const chain = badChain.withFallbacks([goodChain]);

const result = await chain.invoke({
  input: "testing",
});

console.log(result);

/*
{ input: 'testing' }
*/
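
The fallback in this example is triggered by the parser rather than by the API: the cheaper model returns text, and the error only appears when that text fails validation. The dependency-free sketch below illustrates that idea. `parseInputObject` and `firstValid` are hypothetical stand-ins written for this illustration; they are not the `StructuredOutputParser` or `withFallbacks` implementations.

```typescript
// Hypothetical stand-ins for illustration; not LangChain APIs.
type InputObject = { input: string };

// Validate raw model text: must be JSON with a string "input" key.
function parseInputObject(text: string): InputObject {
  const parsed: unknown = JSON.parse(text); // throws SyntaxError on malformed JSON
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error(`Expected a JSON object, got: ${text}`);
  }
  const value = (parsed as Record<string, unknown>).input;
  if (typeof value !== "string") {
    throw new Error(`Expected a string "input" key, got: ${text}`);
  }
  return { input: value };
}

// Call each producer in order and return the first output that validates.
// Later producers are only called when earlier ones fail validation.
function firstValid(producers: Array<() => string>): InputObject {
  let lastError: unknown = new Error("No producers provided");
  for (const produce of producers) {
    try {
      return parseInputObject(produce());
    } catch (e) {
      lastError = e;
    }
  }
  throw lastError;
}

// Simulated outputs: the cheap model returns truncated JSON,
// the stronger model returns the requested shape.
const cheapModel = () => '{ "name": "Testing0", ';
const strongModel = () => '{"input": "testing"}';

console.log(firstValid([cheapModel, strongModel]));
```

The validation has to sit inside the unit that carries the fallback. In the LangChain example above, that is why `withFallbacks` is called on the whole chain (prompt, model, and output parser) and not on the model alone.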
