BM25
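BM25, also known as Okapi BM25, is a ranking function used in information retrieval systems to estimate the relevance of documents to a given search query.
You can use it as part of your retrieval pipeline as a post-processing step to re-rank documents after retrieving an initial set of documents from another source.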
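For reference, the textbook form of the Okapi BM25 score is sketched below. This is the commonly cited formulation, not necessarily the exact variant used by this retriever. The symbols are introduced here for illustration: f(q_i, D) is the frequency of query term q_i in document D, |D| is the document length in tokens, avgdl is the average document length in the collection, N is the number of documents, n(q_i) is the number of documents containing q_i, and k_1 and b are free tuning parameters.

```latex
\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1 \left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}

\mathrm{IDF}(q_i) = \ln\!\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right)
```

The length term in the denominator means that, for the same term frequency, a shorter document scores higher than a longer one.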
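Setup
The BM25Retriever is exported from @langchain/community.
You'll need to install it like this:
:::tip See the section on general instructions for installing integration packages. :::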
- npm: npm i @langchain/community @langchain/core
- yarn: yarn add @langchain/community @langchain/core
- pnpm: pnpm add @langchain/community @langchain/core
This retriever uses code from this implementation of Okapi BM25.
Usage
You can now create a new retriever with previously retrieved documents:
import { BM25Retriever } from "@langchain/community/retrievers/bm25";
const retriever = BM25Retriever.fromDocuments(
[
{ pageContent: "Buildings are made out of brick", metadata: {} },
{ pageContent: "Buildings are made out of wood", metadata: {} },
{ pageContent: "Buildings are made out of stone", metadata: {} },
{ pageContent: "Cars are made out of metal", metadata: {} },
{ pageContent: "Cars are made out of plastic", metadata: {} },
{ pageContent: "mitochondria is the powerhouse of the cell", metadata: {} },
{ pageContent: "mitochondria is made of lipids", metadata: {} },
],
{ k: 4 }
);
// Will return the 4 documents reranked by the BM25 algorithm
await retriever.invoke("mitochondria");
[
{ pageContent: 'mitochondria is made of lipids', metadata: {} },
{
pageContent: 'mitochondria is the powerhouse of the cell',
metadata: {}
},
{ pageContent: 'Buildings are made out of brick', metadata: {} },
{ pageContent: 'Buildings are made out of wood', metadata: {} }
]
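The output above includes two "Buildings" documents that do not contain the query term. A plausible explanation is that every non-matching document scores zero, so the remaining slots up to k are filled in input order. The following is a minimal from-scratch sketch of BM25 scoring that illustrates this. It is not the code used by BM25Retriever. The helper names (tokenize, bm25Scores), the simple regex tokenizer, the IDF variant, and the defaults k1 = 1.2 and b = 0.75 are all assumptions made for illustration.

```typescript
// Minimal BM25 sketch for illustration only; NOT the BM25Retriever internals.
// Assumed: lowercase regex tokenizer, k1 = 1.2, b = 0.75, smoothed IDF.
const tokenize = (text: string): string[] =>
  text.toLowerCase().split(/\W+/).filter((t) => t.length > 0);

function bm25Scores(
  docs: string[],
  query: string,
  k1 = 1.2,
  b = 0.75
): number[] {
  const tokenized = docs.map(tokenize);
  // Average document length in tokens.
  const avgLen =
    tokenized.reduce((sum, toks) => sum + toks.length, 0) / docs.length;
  const queryTerms = tokenize(query);

  return tokenized.map((toks) => {
    let score = 0;
    for (const term of queryTerms) {
      // Term frequency in this document.
      const tf = toks.filter((t) => t === term).length;
      if (tf === 0) continue; // a missing term contributes nothing
      // Number of documents that contain the term.
      const df = tokenized.filter((d) => d.includes(term)).length;
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      // Longer-than-average documents are penalized, shorter ones boosted.
      const lengthNorm = 1 - b + b * (toks.length / avgLen);
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return score;
  });
}

const corpus = [
  "Buildings are made out of brick",
  "Buildings are made out of wood",
  "Buildings are made out of stone",
  "Cars are made out of metal",
  "Cars are made out of plastic",
  "mitochondria is the powerhouse of the cell",
  "mitochondria is made of lipids",
];

const scores = bm25Scores(corpus, "mitochondria");

// Sort indices by descending score. Array.prototype.sort is stable, so
// documents tied at zero keep their original input order.
const topK = scores
  .map((score, index) => ({ score, index }))
  .sort((x, y) => y.score - x.score)
  .slice(0, 4)
  .map(({ index }) => corpus[index]);

console.log(topK);
```

Under these assumptions, the shorter "mitochondria is made of lipids" ranks ahead of "mitochondria is the powerhouse of the cell" because of length normalization, and the two zero-score "Buildings" documents fill the remaining slots, which matches the ordering shown above. If you only want documents that actually match the query, consider a smaller k or filtering the results yourself.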
Related
- Retriever conceptual guide
- Retriever how-to guides