How to cache chat model responses

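Prerequisites

This guide assumes familiarity with chat models.

LangChain provides an optional caching layer for chat models. This is useful for two reasons:

It can save you money by reducing the number of API calls you make to the LLM provider, if you often request the same completion multiple times.

It can speed up your application by reducing the number of API calls you make to the LLM provider.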
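
To make the two benefits concrete, here is a minimal, dependency-free sketch of the idea behind a response cache. It is not LangChain's implementation; `withCache` and `fakeProviderCall` are hypothetical names used only for illustration. Identical prompts reach the simulated provider only once.

```typescript
// Conceptual sketch only: not LangChain's implementation.
// `withCache` and `fakeProviderCall` are hypothetical names.

type ModelCall = (prompt: string) => Promise<string>;

// Wrap a model call so that repeated prompts are answered from a Map.
function withCache(call: ModelCall): ModelCall {
  const store = new Map<string, string>();
  return async (prompt: string): Promise<string> => {
    const cached = store.get(prompt);
    if (cached !== undefined) {
      return cached; // Cache hit: no provider call, no extra cost.
    }
    const fresh = await call(prompt);
    store.set(prompt, fresh);
    return fresh;
  };
}

// Stand-in for a billed, slow API request.
let providerCalls = 0;
const fakeProviderCall: ModelCall = async (prompt) => {
  providerCalls += 1;
  return `response to: ${prompt}`;
};

const cachedCall = withCache(fakeProviderCall);
```

Calling `cachedCall` twice with the same prompt increments `providerCalls` only once, which is where both the cost saving and the speedup come from.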

import { ChatOpenAI } from "@langchain/openai";

// To make the effect of caching more obvious, we use a slower model.
const model = new ChatOpenAI({
  model: "gpt-4",
  cache: true,
});

In-memory cache

The default cache is stored in memory. This means that if you restart your application, the cache will be cleared.

console.time();

// The first call is not yet cached, so it should take longer
const res = await model.invoke("Tell me a joke!");
console.log(res);

console.timeEnd();

/*
  AIMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "Why don't scientists trust atoms?\n\nBecause they make up everything!",
      additional_kwargs: { function_call: undefined, tool_calls: undefined }
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "Why don't scientists trust atoms?\n\nBecause they make up everything!",
    name: undefined,
    additional_kwargs: { function_call: undefined, tool_calls: undefined }
  }
  default: 2.224s
*/

console.time();

// The second call is served from the cache, so it is faster
const res2 = await model.invoke("Tell me a joke!");
console.log(res2);

console.timeEnd();
/*
  AIMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "Why don't scientists trust atoms?\n\nBecause they make up everything!",
      additional_kwargs: { function_call: undefined, tool_calls: undefined }
    },
    lc_namespace: [ 'langchain_core', 'messages' ],
    content: "Why don't scientists trust atoms?\n\nBecause they make up everything!",
    name: undefined,
    additional_kwargs: { function_call: undefined, tool_calls: undefined }
  }
  default: 181.98ms
*/
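
The second call is served from the cache only because it repeats the first request exactly. LangChain's cache interface looks up entries by the prompt together with a string describing the model configuration. The sketch below imitates that idea with a hypothetical `cacheKey` helper; the key format is an assumption made for illustration, not the library's actual format.

```typescript
// Hypothetical helper: the key format is an illustration, not LangChain's.
interface ModelConfig {
  model: string;
  temperature?: number;
}

function cacheKey(prompt: string, config: ModelConfig): string {
  // Serialize fields in a fixed order so equal configs yield equal keys.
  const configPart = JSON.stringify([config.model, config.temperature ?? null]);
  return `${configPart}::${prompt}`;
}

const store = new Map<string, string>();
store.set(cacheKey("Tell me a joke!", { model: "gpt-4" }), "cached joke");

// Same prompt and same config: the stored entry is found.
const hit = store.get(cacheKey("Tell me a joke!", { model: "gpt-4" }));

// Different wording, or a different model: no entry is found.
const missPrompt = store.get(cacheKey("Tell me a joke", { model: "gpt-4" }));
const missModel = store.get(cacheKey("Tell me a joke!", { model: "gpt-4o-mini" }));
```

Under this scheme, even a small change to the prompt or the model settings produces a different key, so the request goes to the provider again.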

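Caching with Redis

LangChain also provides a Redis-based cache. This is useful if you want to share the cache across multiple processes or servers. To use it, you'll need to install the ioredis package along with the LangChain packages shown below: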

npm install ioredis @langchain/community @langchain/core

yarn add ioredis @langchain/community @langchain/core

pnpm add ioredis @langchain/community @langchain/core

Then, you can pass a cache option when you instantiate the LLM. For example:

import { ChatOpenAI } from "@langchain/openai";
import { Redis } from "ioredis";
import { RedisCache } from "@langchain/community/caches/ioredis";

const client = new Redis("redis://localhost:6379");

const cache = new RedisCache(client, {
  ttl: 60, // Optional key expiration value
});

const model = new ChatOpenAI({ model: "gpt-4o-mini", cache });

const response1 = await model.invoke("Do something random!");
console.log(response1);
/*
  AIMessage {
    content: "Sure! I'll generate a random number for you: 37",
    additional_kwargs: {}
  }
*/

const response2 = await model.invoke("Do something random!");
console.log(response2);
/*
  AIMessage {
    content: "Sure! I'll generate a random number for you: 37",
    additional_kwargs: {}
  }
*/

await client.disconnect();

API Reference:

ChatOpenAI from @langchain/openai
RedisCache from @langchain/community/caches/ioredis
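
The `ttl` option in the example above sets a key expiration, so an entry stops being served once it is older than the configured lifetime. With Redis, expiry is handled by the server. The sketch below only imitates the observable behavior in plain TypeScript, using a hypothetical `TtlStore` class and an injected clock, and assuming the TTL is expressed in seconds.

```typescript
// Hypothetical in-process imitation of key expiry; Redis does this server-side.
// Assumption: the TTL is given in seconds.
class TtlStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private ttlSeconds: number;
  private now: () => number;

  // The clock is injectable so expiry can be demonstrated without waiting.
  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlSeconds = ttlSeconds;
    this.now = now;
  }

  set(key: string, value: string): void {
    const expiresAt = this.now() + this.ttlSeconds * 1000;
    this.entries.set(key, { value, expiresAt });
  }

  get(key: string): string | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // Expired: behave as a cache miss.
      return undefined;
    }
    return entry.value;
  }
}

// Simulated clock, in milliseconds.
let fakeTime = 0;
const ttlStore = new TtlStore(60, () => fakeTime);
ttlStore.set("Do something random!", "37");

fakeTime = 59000;
const beforeExpiry = ttlStore.get("Do something random!");

fakeTime = 60000;
const afterExpiry = ttlStore.get("Do something random!");
```

A short TTL keeps cached answers fresh at the cost of more provider calls; a long TTL does the opposite.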

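Caching with Upstash Redis

LangChain provides an Upstash Redis-based cache. Like the Redis-based cache, it is useful for sharing the cache across multiple processes or servers. The Upstash Redis client uses HTTP and supports edge environments. To use it, you'll need to install the @upstash/redis package: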

npm install @upstash/redis

yarn add @upstash/redis

pnpm add @upstash/redis

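You'll also need an Upstash account and a Redis database to connect to. Once you've done that, retrieve your REST URL and REST token.

Then, you can pass a cache option when you instantiate the LLM. For example: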

import { ChatOpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// See https://docs.upstash.com/redis/howto/connectwithupstashredis#quick-start for connection options
const cache = new UpstashRedisCache({
  config: {
    url: "UPSTASH_REDIS_REST_URL",
    token: "UPSTASH_REDIS_REST_TOKEN",
  },
  ttl: 3600,
});

const model = new ChatOpenAI({ model: "gpt-4o-mini", cache });

API Reference:

ChatOpenAI from @langchain/openai
UpstashRedisCache from @langchain/community/caches/upstash_redis
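
The `url` and `token` strings in the example above are placeholders for your own REST URL and REST token. One way to supply real values is to read them from environment variables and fail early when one is missing. The `requireEnv` helper below is hypothetical and takes the environment as a parameter so the sketch stays self-contained; in a Node.js application you would typically pass `process.env`.

```typescript
// Hypothetical helper, not part of LangChain or @upstash/redis.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Made-up example values; in Node.js you would pass process.env instead.
const exampleEnv: Env = {
  UPSTASH_REDIS_REST_URL: "https://example.upstash.io",
  UPSTASH_REDIS_REST_TOKEN: "example-token",
};

// Same shape as the `config` object passed to the cache in the example above.
const upstashConfig = {
  url: requireEnv(exampleEnv, "UPSTASH_REDIS_REST_URL"),
  token: requireEnv(exampleEnv, "UPSTASH_REDIS_REST_TOKEN"),
};
```

Failing at startup with a clear message is usually easier to diagnose than an authentication error on the first cache lookup.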

You can also pass in a previously created @upstash/redis client instance directly:

import { Redis } from "@upstash/redis";
import https from "https";

import { ChatOpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// const client = new Redis({
//   url: process.env.UPSTASH_REDIS_REST_URL!,
//   token: process.env.UPSTASH_REDIS_REST_TOKEN!,
//   agent: new https.Agent({ keepAlive: true }),
// });

// Or simply call Redis.fromEnv() to automatically load the UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN environment variables.
const client = Redis.fromEnv({
  agent: new https.Agent({ keepAlive: true }),
});

const cache = new UpstashRedisCache({ client });
const model = new ChatOpenAI({ model: "gpt-4o-mini", cache });

API Reference:

ChatOpenAI from @langchain/openai
UpstashRedisCache from @langchain/community/caches/upstash_redis

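Caching with Vercel KV

LangChain provides a Vercel KV-based cache. Like the Redis-based cache, it is useful for sharing the cache across multiple processes or servers. The Vercel KV client uses HTTP and supports edge environments. To use it, you'll need to install the @vercel/kv package: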

npm install @vercel/kv

yarn add @vercel/kv

pnpm add @vercel/kv

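You'll also need a Vercel account and a KV database to connect to. Once you've done that, retrieve your REST URL and REST token.

Then, you can pass a cache option when you instantiate the LLM. For example: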

import { ChatOpenAI } from "@langchain/openai";
import { VercelKVCache } from "@langchain/community/caches/vercel_kv";
import { createClient } from "@vercel/kv";

// See https://vercel.com/docs/storage/vercel-kv/kv-reference#createclient-example for connection options
const cache = new VercelKVCache({
  client: createClient({
    url: "VERCEL_KV_API_URL",
    token: "VERCEL_KV_API_TOKEN",
  }),
  ttl: 3600,
});

const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  cache,
});

API Reference:

ChatOpenAI from @langchain/openai
VercelKVCache from @langchain/community/caches/vercel_kv

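Caching with Cloudflare KV

Info: This integration is only supported in Cloudflare Workers.

If you're deploying your project as a Cloudflare Worker, you can use LangChain's Cloudflare KV-backed LLM cache.

For information on how to set up KV in Cloudflare, see the official documentation.

Note: If you are using TypeScript, you may need to install the type definitions if they aren't already present: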

npm install -S @cloudflare/workers-types

yarn add @cloudflare/workers-types

pnpm add @cloudflare/workers-types

import type { KVNamespace } from "@cloudflare/workers-types";

import { ChatOpenAI } from "@langchain/openai";
import { CloudflareKVCache } from "@langchain/cloudflare";

export interface Env {
  KV_NAMESPACE: KVNamespace;
  OPENAI_API_KEY: string;
}

export default {
  async fetch(_request: Request, env: Env) {
    try {
      const cache = new CloudflareKVCache(env.KV_NAMESPACE);
      const model = new ChatOpenAI({
        cache,
        model: "gpt-3.5-turbo",
        apiKey: env.OPENAI_API_KEY,
      });
      const response = await model.invoke("How are you today?");
      return new Response(JSON.stringify(response), {
        headers: { "content-type": "application/json" },
      });
    } catch (err: any) {
      console.log(err.message);
      return new Response(err.message, { status: 500 });
    }
  },
};

API Reference:

ChatOpenAI from @langchain/openai
CloudflareKVCache from @langchain/cloudflare

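Caching on the file system

Danger: This cache is not recommended for production use. It is only intended for local development.

LangChain provides a simple file system cache. By default, the cache is stored in a temporary directory, but you can specify a custom directory if you want.

import { LocalFileCache } from "langchain/cache/file_system";
const cache = await LocalFileCache.create();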

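Next steps

You've now learned how to cache model responses to save time and money.

Next, check out the other how-to guides on chat models, such as how to get a model to return structured output or how to create your own custom chat model.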
