
MariaDB

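Compatibility

Only available on Node.js.

Requires MariaDB 11.7 or later.

This guide provides a quick overview for getting started with the MariaDB vector store. For detailed documentation of all MariaDBStore features and configurations, head to the API reference.

Overview

Integration details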

Class | Package | Python support | Latest package version
MariaDBStore | @langchain/community | | NPM - Version

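Setup

To use the MariaDB vector store, you need a MariaDB instance running version 11.7 or later, plus the mariadb connector installed as a peer dependency.

This guide also uses OpenAI embeddings, which require the @langchain/openai integration package. You can use any other supported embedding model if you prefer.

We'll also use the uuid package to generate IDs in the required format.

:::tip See the section on general instructions for installing integration packages. :::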

yarn add @langchain/community @langchain/openai @langchain/core mariadb uuid

Setting up an instance

Create a file named docker-compose.yml with the following content:

# Run this command to start the database:
# docker-compose up --build
version: "3"
services:
  db:
    hostname: 127.0.0.1
    image: mariadb/mariadb:11.7-rc
    ports:
      - 3306:3306
    restart: always
    environment:
      - MARIADB_DATABASE=api
      - MARIADB_USER=myuser
      - MARIADB_PASSWORD=ChangeMe
      - MARIADB_ROOT_PASSWORD=ChangeMe
    volumes:
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql

Then, from the same directory, run docker compose up to start the container.

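Credentials

To connect to your MariaDB instance, you need the corresponding credentials. For the full list of supported options, see the mariadb documentation.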
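As a minimal sketch of keeping credentials out of source code, the connection settings could be read from environment variables. The helper name `connectionOptionsFromEnv` and the `MARIADB_HOST` and `MARIADB_PORT` variable names are assumptions made for this illustration, and reusing the `MARIADB_USER`, `MARIADB_PASSWORD`, and `MARIADB_DATABASE` names from the Docker setup on the Node.js side is likewise just a convention chosen here. The defaults mirror the values used in this guide.

```typescript
// Hypothetical helper, not part of @langchain/community or the mariadb connector.
interface ConnectionOptions {
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
}

function connectionOptionsFromEnv(
  env: Record<string, string | undefined>
): ConnectionOptions {
  // Refuse to fall back to a default password.
  const password = env.MARIADB_PASSWORD;
  if (!password) {
    throw new Error("MARIADB_PASSWORD is not set");
  }

  // Environment variables are strings, so the port has to be parsed and checked.
  const port = Number(env.MARIADB_PORT ?? "3306");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid MARIADB_PORT: ${env.MARIADB_PORT}`);
  }

  return {
    host: env.MARIADB_HOST ?? "127.0.0.1",
    port,
    user: env.MARIADB_USER ?? "myuser",
    password,
    database: env.MARIADB_DATABASE ?? "api",
  };
}
```

Calling `connectionOptionsFromEnv(process.env)` would then yield an object that can be supplied wherever this guide writes the host, port, user, password, and database by hand.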

If you are using OpenAI embeddings for this guide, you also need to set your OpenAI key:

process.env.OPENAI_API_KEY = "YOUR_API_KEY";

If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

// process.env.LANGCHAIN_TRACING_V2="true"
// process.env.LANGCHAIN_API_KEY="your-api-key"

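Instantiation

To instantiate the vector store, call the static .initialize() method. It checks whether the table named by tableName in the passed config exists and, if it does not, creates it with the required columns.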

import { OpenAIEmbeddings } from "@langchain/openai";

import {
  DistanceStrategy,
  MariaDBStore,
} from "@langchain/community/vectorstores/mariadb";
import { PoolConfig } from "mariadb";

const config = {
  connectionOptions: {
    type: "mariadb",
    host: "127.0.0.1",
    port: 3306,
    user: "myuser",
    password: "ChangeMe",
    database: "api",
  } as PoolConfig,
  distanceStrategy: "EUCLIDEAN" as DistanceStrategy,
};
const vectorStore = await MariaDBStore.initialize(
  new OpenAIEmbeddings(),
  config
);

Manage vector store

Add items to vector store

import { v4 as uuidv4 } from "uuid";
import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { source: "https://example.com" },
};

const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { source: "https://example.com" },
};

const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { source: "https://example.com" },
};

const document4: Document = {
  pageContent: "The 2024 Olympics are in Paris",
  metadata: { source: "https://example.com" },
};

const documents = [document1, document2, document3, document4];

const ids = [uuidv4(), uuidv4(), uuidv4(), uuidv4()];

// ids are not mandatory, but that's for the example
await vectorStore.addDocuments(documents, { ids: ids });
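Because IDs are optional, it can help to centralize the choice between caller-supplied and generated IDs. The sketch below is one possible way to do that and is not the library's own logic: `assignIds` and the simplified `Doc` shape are hypothetical names, the rule that explicit IDs must match the documents one-to-one is an assumption of this sketch, and it swaps the uuid package for Node's built-in `randomUUID`, which also produces version 4 UUIDs.

```typescript
import { randomUUID } from "node:crypto";

// Simplified stand-in for the Document type used in this guide.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: returns one ID per document.
function assignIds(docs: Doc[], ids?: string[]): string[] {
  if (ids !== undefined) {
    // Assumption of this sketch: explicit IDs pair up with documents by position.
    if (ids.length !== docs.length) {
      throw new Error(
        `Expected ${docs.length} ids but received ${ids.length}`
      );
    }
    return ids;
  }
  // No IDs supplied: generate a fresh version 4 UUID for each document.
  return docs.map(() => randomUUID());
}
```

The returned array could be passed as the `ids` option shown above, and keeping it around makes later deletions by ID straightforward.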

Delete items from vector store

const id4 = ids[ids.length - 1];

await vectorStore.delete({ ids: [id4] });

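Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.

Query directly

A simple similarity search can be performed as follows: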

const similaritySearchResults = await vectorStore.similaritySearch(
  "biology",
  2,
  // Filter on a metadata field that the documents added above actually carry.
  { source: "https://example.com" }
);
for (const doc of similaritySearchResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}

* The powerhouse of the cell is the mitochondria [{"source":"https://example.com"}]
* Mitochondria are made out of lipids [{"source":"https://example.com"}]

The filter syntax above can be more complex:

// name = 'martin' OR firstname = 'john'
const res = await vectorStore.similaritySearch("biology", 2, {
  $or: [{ name: "martin" }, { firstname: "john" }],
});
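To make the meaning of such a filter concrete, here is a small in-memory sketch of how an equality filter and an `$or` filter could be evaluated against a document's metadata. It is an illustration of the semantics only, not the library's implementation (which runs the filter in the database); the `matches` function and the type names are hypothetical, and treating several fields in one object as an implicit AND is an assumption of this sketch.

```typescript
type Scalar = string | number | boolean;
type Metadata = Record<string, Scalar>;

interface OrFilter {
  $or: Filter[];
}
type FieldFilter = Record<string, Scalar>;
type Filter = OrFilter | FieldFilter;

function isOrFilter(filter: Filter): filter is OrFilter {
  return Array.isArray((filter as OrFilter).$or);
}

// Returns true when the metadata satisfies the filter.
function matches(metadata: Metadata, filter: Filter): boolean {
  if (isOrFilter(filter)) {
    // $or: at least one sub-filter must match.
    return filter.$or.some((sub) => matches(metadata, sub));
  }
  // Plain object: every listed field must equal the given value.
  return Object.entries(filter).every(
    ([field, expected]) => metadata[field] === expected
  );
}
```

With the filter from the example above, metadata of `{ name: "martin" }` or `{ firstname: "john" }` would match, while `{ name: "alice" }` would not.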

If you want to run a similarity search and also receive the corresponding scores, you can run:

const similaritySearchWithScoreResults =
  await vectorStore.similaritySearchWithScore("biology", 2);

for (const [doc, score] of similaritySearchWithScoreResults) {
  console.log(
    `* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(
      doc.metadata
    )}]`
  );
}

* [SIM=0.835] The powerhouse of the cell is the mitochondria [{"source":"https://example.com"}]
* [SIM=0.852] Mitochondria are made out of lipids [{"source":"https://example.com"}]
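The store in this guide is configured with the EUCLIDEAN distance strategy. As a rough sketch of what that ranking means, the code below computes Euclidean distances in memory and keeps the k closest items. It assumes the reported score is a distance, so that smaller values mean closer matches; that reading, the brute-force scan, and the names `euclideanDistance` and `nearest` are all assumptions of this illustration rather than a description of how MariaDB executes the search.

```typescript
// Straight-line distance between two vectors of equal dimension.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let sumOfSquares = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}

// Brute-force nearest-neighbour search: score every item, sort ascending, keep k.
function nearest<T>(
  query: number[],
  items: { value: T; vector: number[] }[],
  k: number
): [T, number][] {
  return items
    .map((item): [T, number] => [
      item.value,
      euclideanDistance(query, item.vector),
    ])
    .sort((x, y) => x[1] - y[1])
    .slice(0, k);
}
```

Under this reading, the first result in the output above (0.835) is slightly closer to the query than the second (0.852).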

Query by turning into retriever

You can also transform the vector store into a retriever for easier use in your chains.

const retriever = vectorStore.asRetriever({
  // Optional filter
  // filter: filter,
  k: 2,
});
await retriever.invoke("biology");

[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Mitochondria are made out of lipids',
    metadata: { source: 'https://example.com' },
    id: undefined
  }
]

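Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the RAG tutorials and how-to guides in the LangChain documentation.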

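Advanced: reusing connections

You can reuse connections by creating a connection pool and then creating new MariaDBStore instances directly via the constructor.

Note that you should call .initialize() at least once before using the constructor, so that the database tables are set up properly.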

import { OpenAIEmbeddings } from "@langchain/openai";
import { MariaDBStore } from "@langchain/community/vectorstores/mariadb";
import mariadb from "mariadb";

// First, follow set-up instructions at
// https://js.langchain.com/docs/modules/indexes/vector_stores/integrations/mariadb

const reusablePool = mariadb.createPool({
  host: "127.0.0.1",
  port: 3306,
  user: "myuser",
  password: "ChangeMe",
  database: "api",
});

const originalConfig = {
  pool: reusablePool,
  tableName: "testlangchainjs",
  collectionName: "sample",
  collectionTableName: "collections",
  columns: {
    idColumnName: "id",
    vectorColumnName: "vect",
    contentColumnName: "content",
    metadataColumnName: "metadata",
  },
};

// Set up the DB.
// Can skip this step if you've already initialized the DB.
// await MariaDBStore.initialize(new OpenAIEmbeddings(), originalConfig);
const mariadbStore = new MariaDBStore(new OpenAIEmbeddings(), originalConfig);

await mariadbStore.addDocuments([
  { pageContent: "what's this", metadata: { a: 2 } },
  { pageContent: "Cat drinks milk", metadata: { a: 1 } },
]);

const results = await mariadbStore.similaritySearch("water", 1);

console.log(results);

/*
[ Document { pageContent: 'Cat drinks milk', metadata: { a: 1 } } ]
*/

// Same pool and table, different collection.
// Column names must match the ones the table was created with.
const mariadbStore2 = new MariaDBStore(new OpenAIEmbeddings(), {
  pool: reusablePool,
  tableName: "testlangchainjs",
  collectionTableName: "collections",
  collectionName: "some_other_collection",
  columns: {
    idColumnName: "id",
    vectorColumnName: "vect",
    contentColumnName: "content",
    metadataColumnName: "metadata",
  },
});

const results2 = await mariadbStore2.similaritySearch("water", 1);

console.log(results2);

/*
[]
*/

await reusablePool.end();

Closing connections

When you are finished, make sure to close the connection to avoid excessive resource consumption:

await vectorStore.end();

API reference

For detailed documentation of all MariaDBStore features and configurations, head to the API reference.

