Xata

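Xata is a serverless data platform based on PostgreSQL. It provides a type-safe TypeScript/JavaScript SDK for interacting with your database, and a UI for managing your data.

Xata has a native vector type, which can be added to any table and supports similarity search. LangChain can insert vectors directly into Xata and run nearest-neighbor queries against them, so you can use any of the LangChain Embeddings integrations with Xata.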
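
To make "nearest-neighbor query" concrete, here is a minimal in-memory sketch of a brute-force search. It is only an illustration of the idea: the helper names (`cosineSimilarity`, `nearestNeighbors`) and the choice of cosine similarity as the metric are assumptions, and this does not describe how Xata implements similarity search internally.

```typescript
// Illustrative only: brute-force nearest-neighbor search over in-memory vectors.
// All names below are hypothetical and not part of Xata or LangChain.
type StoredVector = { id: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function nearestNeighbors(
  query: number[],
  rows: StoredVector[],
  k: number
): { id: string; score: number }[] {
  return rows
    .map((row) => ({ id: row.id, score: cosineSimilarity(query, row.embedding) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, k);
}

// Toy three-dimensional vectors; real embeddings have many more dimensions.
const rows: StoredVector[] = [
  { id: "a", embedding: [1, 0, 0] },
  { id: "b", embedding: [0, 1, 0] },
  { id: "c", embedding: [0.9, 0.1, 0] },
];
const top = nearestNeighbors([1, 0, 0], rows, 2);
console.log(top.map((r) => r.id)); // ids "a" then "c"
```

A vector store performs the same ranking on the server side, so the application only sends the query vector and receives the top `k` rows.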

Setup

Install the Xata CLI

npm install @xata.io/cli -g

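Create a database to be used as a vector store

In the Xata UI, create a new database. You can name it whatever you want, but this example uses langchain.
Create a table. Again, you can name it anything, but this example uses vectors. Add the following columns via the UI:

  • content of type "Text". This is used to store the Document.pageContent values.
  • embedding of type "Vector". Use the dimension of the model you plan to use (for example, 1536 for OpenAI).
  • Any other columns you want to use as metadata. They are populated from the Document.metadata object. For example, if the Document.metadata object has a title property, you can create a title column in the table and it will be populated automatically.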
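
The column layout above can be read as a mapping from a document to a table row. The following sketch shows that mapping under stated assumptions: `documentToRow` and `EMBEDDING_DIMENSION` are hypothetical names introduced for illustration, and the code is not the integration's actual implementation.

```typescript
// Hypothetical helper that mirrors the column layout described above.
// It is a sketch of the mapping, not the code used by the integration.
type Doc = { pageContent: string; metadata: Record<string, unknown> };
type Row = Record<string, unknown> & { content: string; embedding: number[] };

// Assumption: the "embedding" column was created with 1536 dimensions.
const EMBEDDING_DIMENSION = 1536;

function documentToRow(
  doc: Doc,
  embedding: number[],
  dimension: number = EMBEDDING_DIMENSION
): Row {
  if (embedding.length !== dimension) {
    throw new Error(`Expected ${dimension} dimensions, got ${embedding.length}`);
  }
  // Metadata keys become columns; content and embedding are written last
  // so that a metadata key with the same name cannot overwrite them.
  return { ...doc.metadata, content: doc.pageContent, embedding };
}

const row = documentToRow(
  { pageContent: "Xata includes similarity search", metadata: { title: "Intro" } },
  new Array<number>(EMBEDDING_DIMENSION).fill(0)
);
console.log(Object.keys(row)); // title, content, embedding
```

Checking the vector length before writing is a cheap way to catch a mismatch between the embedding model and the column dimension early.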

Initialize the project

In your project, run:

xata init

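Then choose the database you created above. This also generates a xata.ts or xata.js file that defines the client you can use to interact with the database. For more details on using the Xata JavaScript/TypeScript SDK, see the Xata getting started docs.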

Usage

:::tip See the section with general instructions on installing integration packages. :::

npm install @langchain/openai @langchain/community @langchain/core

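Example: Q&A chatbot using OpenAI and Xata as vector store

This example uses the VectorDBQAChain to search the documents stored in Xata and then pass them as context to the OpenAI model, in order to answer the question asked by the user.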

import { XataVectorSearch } from "@langchain/community/vectorstores/xata";
import { OpenAIEmbeddings, OpenAI } from "@langchain/openai";
import { BaseClient } from "@xata.io/client";
import { VectorDBQAChain } from "langchain/chains";
import { Document } from "@langchain/core/documents";

// First, follow set-up instructions at
// https://js.langchain.com/docs/modules/data_connection/vectorstores/integrations/xata

// if you use the generated client, you don't need this function.
// Just import getXataClient from the generated xata.ts instead.
const getXataClient = () => {
  if (!process.env.XATA_API_KEY) {
    throw new Error("XATA_API_KEY not set");
  }

  if (!process.env.XATA_DB_URL) {
    throw new Error("XATA_DB_URL not set");
  }
  const xata = new BaseClient({
    databaseURL: process.env.XATA_DB_URL,
    apiKey: process.env.XATA_API_KEY,
    branch: process.env.XATA_BRANCH || "main",
  });
  return xata;
};

export async function run() {
  const client = getXataClient();

  const table = "vectors";
  const embeddings = new OpenAIEmbeddings();
  const store = new XataVectorSearch(embeddings, { client, table });

  // Add documents
  const docs = [
    new Document({
      pageContent: "Xata is a Serverless Data platform based on PostgreSQL",
    }),
    new Document({
      pageContent:
        "Xata offers a built-in vector type that can be used to store and query vectors",
    }),
    new Document({
      pageContent: "Xata includes similarity search",
    }),
  ];

  const ids = await store.addDocuments(docs);

  // eslint-disable-next-line no-promise-executor-return
  await new Promise((r) => setTimeout(r, 2000));

  const model = new OpenAI();
  const chain = VectorDBQAChain.fromLLM(model, store, {
    k: 1,
    returnSourceDocuments: true,
  });
  const response = await chain.invoke({ query: "What is Xata?" });

  console.log(JSON.stringify(response, null, 2));

  await store.delete({ ids });
}
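
The flow described for this example (retrieve the most similar documents, then pass them to the model as context) can be sketched without any external services. Everything below is an assumption made for illustration: `buildPrompt`, `answer`, the stub retriever, the stub model, and the prompt wording are hypothetical, and the prompt that VectorDBQAChain actually uses is not reproduced here.

```typescript
// Sketch of a retrieve-then-answer flow with stubbed dependencies.
// Helper names and prompt wording are hypothetical.
type Doc = { pageContent: string };
type Retriever = (query: string, k: number) => Promise<Doc[]>;
type Llm = (prompt: string) => Promise<string>;

function buildPrompt(question: string, context: Doc[]): string {
  const contextText = context.map((d) => d.pageContent).join("\n\n");
  return [
    "Use the following context to answer the question.",
    "",
    contextText,
    "",
    `Question: ${question}`,
    "Answer:",
  ].join("\n");
}

async function answer(
  question: string,
  retrieve: Retriever,
  llm: Llm,
  k: number = 1
): Promise<{ text: string; sourceDocuments: Doc[] }> {
  const sourceDocuments = await retrieve(question, k); // step 1: similarity search
  const text = await llm(buildPrompt(question, sourceDocuments)); // step 2: ask the model
  return { text, sourceDocuments };
}

// Stubs standing in for the vector store and the model.
const stubRetrieve: Retriever = async (_query, k) =>
  [{ pageContent: "Xata is a Serverless Data platform based on PostgreSQL" }].slice(0, k);
const stubLlm: Llm = async (prompt) => `(${prompt.length} prompt characters received)`;

answer("What is Xata?", stubRetrieve, stubLlm).then((result) => {
  console.log(result.sourceDocuments.length, result.text);
});
```

With `k: 1`, only the single closest document is placed in the context, which keeps the prompt short but makes the answer depend entirely on that one match.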

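Example: Similarity search with a metadata filter

This example shows how to implement semantic search using LangChain.js and Xata. Before running it, make sure to add an author column of type String to the vectors table in Xata.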

import { XataVectorSearch } from "@langchain/community/vectorstores/xata";
import { OpenAIEmbeddings } from "@langchain/openai";
import { BaseClient } from "@xata.io/client";
import { Document } from "@langchain/core/documents";

// First, follow set-up instructions at
// https://js.langchain.com/docs/modules/data_connection/vectorstores/integrations/xata
// Also, add a column named "author" to the "vectors" table.

// if you use the generated client, you don't need this function.
// Just import getXataClient from the generated xata.ts instead.
const getXataClient = () => {
  if (!process.env.XATA_API_KEY) {
    throw new Error("XATA_API_KEY not set");
  }

  if (!process.env.XATA_DB_URL) {
    throw new Error("XATA_DB_URL not set");
  }
  const xata = new BaseClient({
    databaseURL: process.env.XATA_DB_URL,
    apiKey: process.env.XATA_API_KEY,
    branch: process.env.XATA_BRANCH || "main",
  });
  return xata;
};

export async function run() {
  const client = getXataClient();
  const table = "vectors";
  const embeddings = new OpenAIEmbeddings();
  const store = new XataVectorSearch(embeddings, { client, table });
  // Add documents
  const docs = [
    new Document({
      pageContent: "Xata works great with Langchain.js",
      metadata: { author: "Xata" },
    }),
    new Document({
      pageContent: "Xata works great with Langchain",
      metadata: { author: "Langchain" },
    }),
    new Document({
      pageContent: "Xata includes similarity search",
      metadata: { author: "Xata" },
    }),
  ];
  const ids = await store.addDocuments(docs);

  // eslint-disable-next-line no-promise-executor-return
  await new Promise((r) => setTimeout(r, 2000));

  // author is applied as pre-filter to the similarity search
  const results = await store.similaritySearchWithScore("xata works great", 6, {
    author: "Langchain",
  });

  console.log(JSON.stringify(results, null, 2));

  await store.delete({ ids });
}
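
The comment in the example says the author filter is applied as a pre-filter. As a rough in-memory illustration of what that means, the sketch below first drops rows that fail the filter and only then ranks the remainder. The names (`matches`, `searchWithPreFilter`), the two-dimensional toy embeddings, and the use of a dot product as the score are all assumptions for illustration, not the store's actual behavior or scoring.

```typescript
// Illustrative pre-filter: filter on metadata first, then rank by similarity.
// Names, toy embeddings, and the dot-product score are hypothetical.
type Row = { content: string; metadata: Record<string, string>; embedding: number[] };

function matches(row: Row, filter: Record<string, string>): boolean {
  return Object.entries(filter).every(([key, value]) => row.metadata[key] === value);
}

function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

function searchWithPreFilter(
  query: number[],
  rows: Row[],
  k: number,
  filter: Record<string, string> = {}
): Array<[Row, number]> {
  return rows
    .filter((row) => matches(row, filter)) // pre-filter: non-matching rows are never scored
    .map((row): [Row, number] => [row, dotProduct(query, row.embedding)])
    .sort((x, y) => y[1] - x[1])
    .slice(0, k);
}

const rows: Row[] = [
  { content: "Xata works great with Langchain.js", metadata: { author: "Xata" }, embedding: [1, 0] },
  { content: "Xata works great with Langchain", metadata: { author: "Langchain" }, embedding: [0.8, 0.6] },
  { content: "Xata includes similarity search", metadata: { author: "Xata" }, embedding: [0, 1] },
];

const results = searchWithPreFilter([1, 0], rows, 6, { author: "Langchain" });
console.log(results.length); // 1: only one row has author "Langchain", even though k is 6
```

Because the filter runs before ranking, asking for 6 results can return fewer when fewer rows satisfy the filter.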
