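How to cache model responses

LangChain provides an optional caching layer for LLMs. This is useful for two reasons:

It can save you money by reducing the number of API calls you make to the LLM provider, if you often request the same completion multiple times.
It can speed up your application by reducing the number of API calls you make to the LLM provider.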

npm install @langchain/openai @langchain/core

yarn add @langchain/openai @langchain/core

pnpm add @langchain/openai @langchain/core

import { OpenAI } from "@langchain/openai";

const model = new OpenAI({
  model: "gpt-3.5-turbo-instruct",
  cache: true,
});
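
Conceptually, a response cache is a lookup table that is consulted before each provider call. The following is a minimal sketch of that idea, not LangChain's actual implementation: `withCache`, `Completer`, and `fakeProvider` are hypothetical names, and the calls are synchronous for brevity, whereas real model calls are asynchronous.

```typescript
// Conceptual sketch only; this is not how LangChain implements its cache.
type Completer = (prompt: string) => string;

// Wraps a completer so that repeated prompts are served from a Map.
function withCache(complete: Completer): Completer {
  const store = new Map<string, string>();
  return (prompt: string): string => {
    const hit = store.get(prompt);
    if (hit !== undefined) return hit; // cache hit: no provider call
    const result = complete(prompt); // cache miss: call the provider
    store.set(prompt, result);
    return result;
  };
}

// Hypothetical stand-in for a provider call; it counts how often it runs.
let providerCalls = 0;
const fakeProvider: Completer = (prompt: string): string => {
  providerCalls += 1;
  return `completion for: ${prompt}`;
};

const cachedComplete = withCache(fakeProvider);
cachedComplete("Tell me a long joke");
cachedComplete("Tell me a long joke");
console.log(providerCalls); // 1
```

The second call never reaches the provider, which is where both the cost saving and the speed-up come from.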

In-memory cache

The default cache is stored in memory. This means that if you restart your application, the cache will be cleared.

console.time();

// The first time, the response is not yet in the cache, so the call should take longer
const res = await model.invoke("Tell me a long joke");

console.log(res);

console.timeEnd();

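/*
  A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.

  The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."

  Intrigued, the man asks what the tasks are.

  The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."

  The man thinks for a moment and then confidently says, "I'll do it."

  He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and, after a few minutes of struggling, emerges with the pitbull's tooth in hand.

  The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out smiling and the old lady is giggling with delight.

  The bartender hands the man the jar of money and asks, "How

  default: 4.187s
*/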

console.time();

// The second time, the same prompt is already in the cache, so the call is faster
const res2 = await model.invoke("Tell me a long joke");

console.log(res2);

console.timeEnd();

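/*
  A man walks into a bar and sees a jar filled with money on the counter. Curious, he asks the bartender about it.

  The bartender explains, "We have a challenge for our customers. If you can complete three tasks, you win all the money in the jar."

  Intrigued, the man asks what the tasks are.

  The bartender replies, "First, you have to drink a whole bottle of tequila without making a face. Second, there's a pitbull out back with a sore tooth. You have to pull it out. And third, there's an old lady upstairs who has never had an orgasm. You have to give her one."

  The man thinks for a moment and then confidently says, "I'll do it."

  He grabs the bottle of tequila and downs it in one gulp, without flinching. He then heads to the back and, after a few minutes of struggling, emerges with the pitbull's tooth in hand.

  The bar erupts in cheers and the bartender leads the man upstairs to the old lady's room. After a few minutes, the man walks out smiling and the old lady is giggling with delight.

  The bartender hands the man the jar of money and asks, "How

  default: 175.74ms
*/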
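
A cached response is only reused when a later request maps to the same cache entry. As a rough sketch, assuming entries are keyed on the prompt text together with the model settings (the `cacheKey` helper below is hypothetical and is not LangChain's key format), even a small change to the prompt produces a different key and therefore a cache miss:

```typescript
// Hypothetical key builder: combines the prompt with the model settings.
function cacheKey(prompt: string, modelSettings: Record<string, unknown>): string {
  // Sort the setting names so that property order does not change the key.
  const sortedSettings = Object.keys(modelSettings)
    .sort()
    .map((name) => [name, modelSettings[name]]);
  return JSON.stringify([prompt, sortedSettings]);
}

const settings = { model: "gpt-3.5-turbo-instruct", temperature: 0 };

const first = cacheKey("Tell me a long joke", settings);
const sameRequest = cacheKey("Tell me a long joke", {
  temperature: 0,
  model: "gpt-3.5-turbo-instruct",
});
const differentPrompt = cacheKey("Tell me a joke", settings);

console.log(first === sameRequest); // true
console.log(first === differentPrompt); // false
```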

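Caching with Momento

LangChain also provides a Momento-based cache. Momento is a distributed, serverless cache that requires no setup or infrastructure maintenance. Because Momento is compatible with Node.js, browsers, and edge environments, make sure you install the package that matches your environment.

To install for Node.js: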

npm install @gomomento/sdk

yarn add @gomomento/sdk

pnpm add @gomomento/sdk

To install for browsers or edge workers:

npm install @gomomento/sdk-web

yarn add @gomomento/sdk-web

pnpm add @gomomento/sdk-web

Next, you'll need to sign up and create an API key. Once you've done that, pass a cache option when you instantiate the LLM, like this:

import { OpenAI } from "@langchain/openai";
import {
  CacheClient,
  Configurations,
  CredentialProvider,
} from "@gomomento/sdk";
import { MomentoCache } from "@langchain/community/caches/momento";

// See https://github.com/momentohq/client-sdk-javascript for connection options
const client = new CacheClient({
  configuration: Configurations.Laptop.v1(),
  credentialProvider: CredentialProvider.fromEnvironmentVariable({
    environmentVariableName: "MOMENTO_API_KEY",
  }),
  defaultTtlSeconds: 60 * 60 * 24,
});
const cache = await MomentoCache.fromProps({
  client,
  cacheName: "langchain",
});

const model = new OpenAI({ cache });

API Reference:

OpenAI from @langchain/openai
MomentoCache from @langchain/community/caches/momento
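
The `defaultTtlSeconds` option in the example above gives cached entries a default time to live (here, 24 hours). The sketch below illustrates time-to-live expiry in general terms; `TtlCache` and its injectable clock are hypothetical and say nothing about how Momento implements expiry internally.

```typescript
// Illustrative TTL cache; the clock is injectable so expiry can be shown without waiting.
class TtlCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private ttlSeconds: number;
  private now: () => number;

  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlSeconds = ttlSeconds;
    this.now = now;
  }

  set(key: string, value: string): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlSeconds * 1000 });
  }

  get(key: string): string | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: behave as a cache miss
      return undefined;
    }
    return entry.value;
  }
}

let fakeTimeMs = 0;
const ttlCache = new TtlCache(60 * 60 * 24, () => fakeTimeMs);
ttlCache.set("prompt", "cached answer");

fakeTimeMs = 60 * 60 * 1000; // one hour later
console.log(ttlCache.get("prompt")); // "cached answer"

fakeTimeMs = 25 * 60 * 60 * 1000; // 25 hours later
console.log(ttlCache.get("prompt")); // undefined
```

A shorter TTL keeps responses fresher at the price of more provider calls; a longer one does the opposite.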

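Caching with Redis

LangChain also provides a Redis-based cache. This is useful if you want to share the cache across multiple processes or servers. To use it, you'll need to install the ioredis package: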

npm install ioredis

yarn add ioredis

pnpm add ioredis

Then, you can pass a cache option when you instantiate the LLM. For example:

import { OpenAI } from "@langchain/openai";
import { RedisCache } from "@langchain/community/caches/ioredis";
import { Redis } from "ioredis";

// See https://github.com/redis/ioredis for connection options
const client = new Redis({});

const cache = new RedisCache(client);

const model = new OpenAI({ cache });
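
To illustrate why an external store helps when several processes are involved, here is a small sketch in which two workers point at the same backing store while a third keeps its own. `KeyValueStore`, `MapStore`, and `WorkerCache` are hypothetical names, the store is a plain in-process `Map` standing in for a Redis server, and the calls are synchronous for brevity.

```typescript
// Hypothetical minimal store interface; a Redis client would play this role.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// In-process stand-in for an external store.
class MapStore implements KeyValueStore {
  private data = new Map<string, string>();
  get(key: string): string | undefined {
    return this.data.get(key);
  }
  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Each worker has its own cache object but delegates storage to the store it is given.
class WorkerCache {
  private store: KeyValueStore;
  constructor(store: KeyValueStore) {
    this.store = store;
  }
  lookup(prompt: string): string | undefined {
    return this.store.get(prompt);
  }
  update(prompt: string, response: string): void {
    this.store.set(prompt, response);
  }
}

const sharedStore = new MapStore();
const workerA = new WorkerCache(sharedStore);
const workerB = new WorkerCache(sharedStore);
const isolatedWorker = new WorkerCache(new MapStore());

workerA.update("Tell me a long joke", "a cached joke");
console.log(workerB.lookup("Tell me a long joke")); // "a cached joke"
console.log(isolatedWorker.lookup("Tell me a long joke")); // undefined
```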

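Caching with Upstash Redis

LangChain also provides an Upstash Redis-based cache. Like the Redis-based cache, it is useful if you want to share the cache across multiple processes or servers. The Upstash Redis client uses HTTP and supports edge environments. To use it, you'll need to install the @upstash/redis package: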

npm install @upstash/redis

yarn add @upstash/redis

pnpm add @upstash/redis

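You'll also need an Upstash account and a Redis database to connect to. Once you've done that, retrieve your REST URL and REST token.

Then, you can pass a cache option when you instantiate the LLM. For example: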

import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// See https://docs.upstash.com/redis/howto/connectwithupstashredis#quick-start for connection options
const cache = new UpstashRedisCache({
  config: {
    url: "UPSTASH_REDIS_REST_URL",
    token: "UPSTASH_REDIS_REST_TOKEN",
  },
  ttl: 3600,
});

const model = new OpenAI({ cache });

API Reference:

OpenAI from @langchain/openai
UpstashRedisCache from @langchain/community/caches/upstash_redis
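
The `url` and `token` strings in the example above are placeholders for your own values. One possible way to supply them, sketched here under the assumption that they live in environment variables named `UPSTASH_REDIS_REST_URL` and `UPSTASH_REDIS_REST_TOKEN`, is a small helper that fails fast when a value is missing. `requireEnv` and `upstashConfigFromEnv` are hypothetical helpers, not part of LangChain or the Upstash SDK.

```typescript
// The environment is passed in explicitly; in Node.js you would pass process.env.
type EnvVars = Record<string, string | undefined>;

// Returns the variable's value, or throws a descriptive error if it is unset or empty.
function requireEnv(env: EnvVars, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Builds an object with the same shape as the `config` option shown above.
function upstashConfigFromEnv(env: EnvVars): { url: string; token: string } {
  return {
    url: requireEnv(env, "UPSTASH_REDIS_REST_URL"),
    token: requireEnv(env, "UPSTASH_REDIS_REST_TOKEN"),
  };
}

// Example with made-up values.
const exampleConfig = upstashConfigFromEnv({
  UPSTASH_REDIS_REST_URL: "https://example.invalid",
  UPSTASH_REDIS_REST_TOKEN: "example-token",
});
console.log(exampleConfig.url); // "https://example.invalid"
```

The resulting object could then be passed as the `config` option in place of the placeholder strings.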

You can also directly pass in a previously created @upstash/redis client instance:

import { Redis } from "@upstash/redis";
import https from "https";

import { OpenAI } from "@langchain/openai";
import { UpstashRedisCache } from "@langchain/community/caches/upstash_redis";

// const client = new Redis({
//   url: process.env.UPSTASH_REDIS_REST_URL!,
//   token: process.env.UPSTASH_REDIS_REST_TOKEN!,
//   agent: new https.Agent({ keepAlive: true }),
// });

// Or simply call Redis.fromEnv() to automatically load the UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN environment variables.
const client = Redis.fromEnv({
  agent: new https.Agent({ keepAlive: true }),
});

const cache = new UpstashRedisCache({ client });
const model = new OpenAI({ cache });

API Reference:

OpenAI from @langchain/openai
UpstashRedisCache from @langchain/community/caches/upstash_redis

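Caching with Vercel KV

LangChain also provides a Vercel KV-based cache. Like the Redis-based cache, it is useful if you want to share the cache across multiple processes or servers. The Vercel KV client uses HTTP and supports edge environments. To use it, you'll need to install the @vercel/kv package: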

npm install @vercel/kv

yarn add @vercel/kv

pnpm add @vercel/kv

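You'll also need a Vercel account and a KV database to connect to. Once you've done that, retrieve your REST URL and REST token.

Then, you can pass a cache option when you instantiate the LLM. For example: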

import { OpenAI } from "@langchain/openai";
import { VercelKVCache } from "@langchain/community/caches/vercel_kv";
import { createClient } from "@vercel/kv";

// See https://vercel.com/docs/storage/vercel-kv/kv-reference#createclient-example for connection options
const cache = new VercelKVCache({
  client: createClient({
    url: "VERCEL_KV_API_URL",
    token: "VERCEL_KV_API_TOKEN",
  }),
  ttl: 3600,
});

const model = new OpenAI({ cache });

API Reference:

OpenAI from @langchain/openai
VercelKVCache from @langchain/community/caches/vercel_kv

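Caching with Cloudflare KV

Info: This integration is only supported in Cloudflare Workers.

If you're deploying your project as a Cloudflare Worker, you can use LangChain's Cloudflare KV-based LLM cache.

For information on how to set up KV in Cloudflare, see the official documentation.

Note: If you are using TypeScript, you may need to install the type definitions if they aren't already present: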

npm install -S @cloudflare/workers-types

yarn add @cloudflare/workers-types

pnpm add @cloudflare/workers-types

import type { KVNamespace } from "@cloudflare/workers-types";

import { OpenAI } from "@langchain/openai";
import { CloudflareKVCache } from "@langchain/cloudflare";

export interface Env {
  KV_NAMESPACE: KVNamespace;
  OPENAI_API_KEY: string;
}

export default {
  async fetch(_request: Request, env: Env) {
    try {
      const cache = new CloudflareKVCache(env.KV_NAMESPACE);
      const model = new OpenAI({
        cache,
        model: "gpt-3.5-turbo-instruct",
        apiKey: env.OPENAI_API_KEY,
      });
      const response = await model.invoke("How are you today?");
      return new Response(JSON.stringify(response), {
        headers: { "content-type": "application/json" },
      });
    } catch (err: any) {
      console.log(err.message);
      return new Response(err.message, { status: 500 });
    }
  },
};

API Reference:

OpenAI from @langchain/openai
CloudflareKVCache from @langchain/cloudflare

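Caching on the file system

Danger: This cache is not recommended for production use. It is intended only for local development.

LangChain provides a simple file system cache.
By default the cache is stored in a temporary directory, but you can specify a custom directory if you want.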

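// Import path assumed from the `langchain` package layout; verify it against your installed version.
import { LocalFileCache } from "langchain/cache/file_system";

const cache = await LocalFileCache.create();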
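
To make the idea of a file-backed cache concrete, the sketch below stores each response as a small JSON file named after a hash of the prompt, so entries survive a restart as long as the directory is kept. `TinyFileCache` is a hypothetical illustration built on Node.js built-in modules; it is not how `LocalFileCache` is implemented, and it assumes any custom directory already exists.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Illustrative file-backed cache: one JSON file per prompt.
class TinyFileCache {
  readonly dir: string;

  // With no argument, a fresh temporary directory is created.
  constructor(dir?: string) {
    this.dir = dir ?? mkdtempSync(join(tmpdir(), "llm-cache-"));
  }

  // Hash the prompt so that any prompt text maps to a safe file name.
  private pathFor(prompt: string): string {
    const digest = createHash("sha256").update(prompt).digest("hex");
    return join(this.dir, `${digest}.json`);
  }

  lookup(prompt: string): string | undefined {
    const file = this.pathFor(prompt);
    if (!existsSync(file)) return undefined;
    return JSON.parse(readFileSync(file, "utf8")) as string;
  }

  update(prompt: string, response: string): void {
    writeFileSync(this.pathFor(prompt), JSON.stringify(response), "utf8");
  }
}

const fileCache = new TinyFileCache();
fileCache.update("Tell me a long joke", "a cached joke");

// A second instance pointed at the same directory simulates an application restart.
const afterRestart = new TinyFileCache(fileCache.dir);
console.log(afterRestart.lookup("Tell me a long joke")); // "a cached joke"
```

Unlike the in-memory cache, the entries here outlive the process, which is convenient during local development but offers no eviction, locking, or sharing across machines.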

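Next steps

You've now learned how to cache model responses to save time and money.

Next, check out the other how-to guides on LLMs, such as how to create your own custom LLM class.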
