IBM watsonx.ai
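Overview
This guide will help you get started with the Watsonx document compressor. For detailed documentation of all of its features and configuration options, see the API reference.
Integration details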
| Class | Package | PY support |
|---|---|---|
| WatsonxRerank | @langchain/community | ✅ |
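Setup
To access IBM WatsonxAI models, you need to create an IBM watsonx.ai account, obtain an API key or another type of credential, and install the @langchain/community integration package.
Credentials
Go to IBM Cloud, sign up for IBM watsonx.ai, and generate an API key, or use one of the other authentication methods shown below.
IAM authentication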
export WATSONX_AI_AUTH_TYPE=iam
export WATSONX_AI_APIKEY=<YOUR-APIKEY>
Bearer token authentication
export WATSONX_AI_AUTH_TYPE=bearertoken
export WATSONX_AI_BEARER_TOKEN=<YOUR-BEARER-TOKEN>
IBM watsonx.ai software authentication
export WATSONX_AI_AUTH_TYPE=cp4d
export WATSONX_AI_USERNAME=<YOUR_USERNAME>
export WATSONX_AI_PASSWORD=<YOUR_PASSWORD>
export WATSONX_AI_URL=<URL>
Once these values are set as environment variables and the object is initialized, authentication happens automatically.
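Before initializing an object, it can help to confirm that the variables for the chosen authentication type are actually present. The following is a minimal sketch, not part of @langchain/community: `missingAuthVars` and `REQUIRED_VARS` are hypothetical names, the required variables are taken from the export lines above, and treating an unset `WATSONX_AI_AUTH_TYPE` as missing is an assumption of this sketch rather than documented library behavior.

```typescript
// Hypothetical helper (not part of @langchain/community): reports which of the
// environment variables listed above are missing for the selected auth type.
type Env = Record<string, string | undefined>;

// Variables required per auth type, taken from the export lines above.
const REQUIRED_VARS: Record<string, string[]> = {
  iam: ["WATSONX_AI_APIKEY"],
  bearertoken: ["WATSONX_AI_BEARER_TOKEN"],
  cp4d: ["WATSONX_AI_USERNAME", "WATSONX_AI_PASSWORD", "WATSONX_AI_URL"],
};

function missingAuthVars(env: Env): string[] {
  const authType = env.WATSONX_AI_AUTH_TYPE;
  // Assumption of this sketch: an unset auth type is reported as missing.
  if (!authType) return ["WATSONX_AI_AUTH_TYPE"];
  const required = REQUIRED_VARS[authType];
  if (!required) {
    throw new Error(`Unsupported WATSONX_AI_AUTH_TYPE: ${authType}`);
  }
  // Keep only the names that are unset or empty.
  return required.filter((name) => !env[name]);
}

// Example with an in-memory object; a real script might pass process.env instead.
const missing = missingAuthVars({
  WATSONX_AI_AUTH_TYPE: "cp4d",
  WATSONX_AI_USERNAME: "alice",
});
console.log(missing); // [ 'WATSONX_AI_PASSWORD', 'WATSONX_AI_URL' ]
```

An empty result means every variable the sketch expects for that authentication type is set.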
You can also authenticate by passing these values as parameters to a new instance.
IAM authentication
import { WatsonxLLM } from "@langchain/community/llms/ibm";

const props = {
  version: "YYYY-MM-DD",
  serviceUrl: "<SERVICE_URL>",
  projectId: "<PROJECT_ID>",
  watsonxAIAuthType: "iam",
  watsonxAIApikey: "<YOUR-APIKEY>",
};
const instance = new WatsonxLLM(props);
Bearer token authentication
import { WatsonxLLM } from "@langchain/community/llms/ibm";

const props = {
  version: "YYYY-MM-DD",
  serviceUrl: "<SERVICE_URL>",
  projectId: "<PROJECT_ID>",
  watsonxAIAuthType: "bearertoken",
  watsonxAIBearerToken: "<YOUR-BEARER-TOKEN>",
};
const instance = new WatsonxLLM(props);
IBM watsonx.ai software authentication
import { WatsonxLLM } from "@langchain/community/llms/ibm";

const props = {
  version: "YYYY-MM-DD",
  serviceUrl: "<SERVICE_URL>",
  projectId: "<PROJECT_ID>",
  watsonxAIAuthType: "cp4d",
  watsonxAIUsername: "<YOUR-USERNAME>",
  watsonxAIPassword: "<YOUR-PASSWORD>",
  watsonxAIUrl: "<URL>",
};
const instance = new WatsonxLLM(props);
If you want automated tracing of individual queries, you can also set your LangSmith API key by uncommenting the lines below:
// process.env.LANGSMITH_API_KEY = "<YOUR API KEY HERE>";
// process.env.LANGSMITH_TRACING = "true";
Installation
This document compressor lives in the @langchain/community package:
:::tip See the section on general instructions for installing integration packages. :::
- npm: npm i @langchain/community @langchain/core
- yarn: yarn add @langchain/community @langchain/core
- pnpm: pnpm add @langchain/community @langchain/core
Instantiation
Now we can instantiate our compressor:
import { WatsonxRerank } from "@langchain/community/document_compressors/ibm";

const watsonxRerank = new WatsonxRerank({
  version: "2024-05-31",
  serviceUrl: process.env.WATSONX_AI_SERVICE_URL,
  projectId: process.env.WATSONX_AI_PROJECT_ID,
  model: "cross-encoder/ms-marco-minilm-l-12-v2",
});
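Usage
First, set up a basic RAG (retrieval-augmented generation) ingest pipeline with embeddings, a text splitter, and a vector store. We will use it to retrieve some documents for a chosen query and then rerank them: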
import { readFileSync } from "node:fs";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { WatsonxEmbeddings } from "@langchain/community/embeddings/ibm";
import { CharacterTextSplitter } from "@langchain/textsplitters";

// Embedding model used to index the document chunks.
const embeddings = new WatsonxEmbeddings({
  version: "YYYY-MM-DD",
  serviceUrl: process.env.API_URL,
  projectId: "<PROJECT_ID>",
  spaceId: "<SPACE_ID>",
  model: "ibm/slate-125m-english-rtrvr",
});

// Split the source text into chunks of up to 400 characters, with no overlap.
const textSplitter = new CharacterTextSplitter({
  chunkSize: 400,
  chunkOverlap: 0,
});

const query = "What did the president say about Ketanji Brown Jackson";
const text = readFileSync("state_of_the_union.txt", "utf8");

// Chunk the text, index the chunks in memory, and retrieve matches for the query.
const docs = await textSplitter.createDocuments([text]);
const vectorStore = await MemoryVectorStore.fromDocuments(docs, embeddings);
const vectorStoreRetriever = vectorStore.asRetriever();

const result = await vectorStoreRetriever.invoke(query);
console.log(result);
[
Document {
pageContent: 'And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.',
metadata: { loc: [Object] },
id: undefined
},
Document {
pageContent: 'I spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. \n' +
'\n' +
'I’ve worked on these issues a long time. \n' +
'\n' +
'I know what works: Investing in crime preventionand community police officers who’ll walk the beat, who’ll know the neighborhood, and who can restore trust and safety.',
metadata: { loc: [Object] },
id: undefined
},
Document {
pageContent: 'We are the only nation on Earth that has always turned every crisis we have faced into an opportunity. \n' +
'\n' +
'The only nation that can be defined by a single word: possibilities. \n' +
'\n' +
'So on this night, in our 245th year as a nation, I have come to report on the State of the Union. \n' +
'\n' +
'And my report is this: the State of the Union is strong—because you, the American people, are strong.',
metadata: { loc: [Object] },
id: undefined
},
Document {
pageContent: 'And I’m taking robust action to make sure the pain of our sanctions is targeted at Russia’s economy. And I will use every tool at our disposal to protect American businesses and consumers. \n' +
'\n' +
'Tonight, I can announce that the United States has worked with 30 other countries to release 60 Million barrels of oil from reserves around the world.',
metadata: { loc: [Object] },
id: undefined
}
]
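The retriever above returns the chunks whose embeddings are most similar to the query embedding. As a rough illustration of that idea only, the sketch below ranks toy two-dimensional vectors by cosine similarity. It is not MemoryVectorStore's implementation: `cosineSimilarity`, `topK`, and the sample vectors are hypothetical, and the choice of cosine similarity as the metric is an assumption.

```typescript
// Illustrative only: a toy similarity search over hand-written vectors.
interface EmbeddedChunk {
  text: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query vector and keep the k best matches.
function topK(queryVector: number[], chunks: EmbeddedChunk[], k: number) {
  return chunks
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(queryVector, chunk.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const toyChunks: EmbeddedChunk[] = [
  { text: "a", vector: [1, 0] },
  { text: "b", vector: [0, 1] },
  { text: "c", vector: [1, 1] },
];
const nearest = topK([1, 0], toyChunks, 2);
console.log(nearest.map((n) => n.text)); // [ 'a', 'c' ]
```

A first-pass ranking like this compares only embedding vectors, which is why a cross-encoder reranker is then applied to the retrieved candidates.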
Pass the retrieved documents to the reranker to receive a relevance score for each one:
import { WatsonxRerank } from "@langchain/community/document_compressors/ibm";

const reranker = new WatsonxRerank({
  version: "2024-05-31",
  serviceUrl: process.env.WATSONX_AI_SERVICE_URL,
  projectId: process.env.WATSONX_AI_PROJECT_ID,
  model: "cross-encoder/ms-marco-minilm-l-12-v2",
});
const compressed = await reranker.rerank(result, query);
console.log(compressed);
[
{ index: 0, relevanceScore: 0.726995587348938 },
{ index: 1, relevanceScore: 0.5758284330368042 },
{ index: 2, relevanceScore: 0.5479092597961426 },
{ index: 3, relevanceScore: 0.5468723773956299 }
]
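The output above contains only indices and scores, so the scores have to be joined back to the documents before they can be used. The following sketch shows one way to do that, assuming each `index` refers to a position in the array that was passed to `rerank`. `Doc`, `RerankResult`, and `topRanked` are local, hypothetical names rather than imports from the library, and the sample scores are made up.

```typescript
// Minimal local stand-ins for the shapes shown in this section; these are
// illustrative types, not imports from @langchain/core.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}
interface RerankResult {
  index: number;
  relevanceScore: number;
}

// Hypothetical helper: attach each score to the document it refers to,
// sort by descending score, and keep the top `topN` entries.
function topRanked(
  docs: Doc[],
  results: RerankResult[],
  topN: number
): Array<{ doc: Doc; relevanceScore: number }> {
  return results
    .filter((r) => r.index >= 0 && r.index < docs.length) // skip out-of-range indices
    .map((r) => ({ doc: docs[r.index], relevanceScore: r.relevanceScore }))
    .sort((a, b) => b.relevanceScore - a.relevanceScore)
    .slice(0, topN);
}

// Made-up sample data for illustration.
const sampleDocs: Doc[] = [
  { pageContent: "alpha", metadata: {} },
  { pageContent: "beta", metadata: {} },
  { pageContent: "gamma", metadata: {} },
];
const sampleResults: RerankResult[] = [
  { index: 2, relevanceScore: 0.9 },
  { index: 0, relevanceScore: 0.4 },
  { index: 1, relevanceScore: 0.7 },
];
const best = topRanked(sampleDocs, sampleResults, 2);
console.log(best.map((b) => b.doc.pageContent)); // [ 'gamma', 'beta' ]
```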
Alternatively, to have the documents returned together with their scores, use the .compressDocuments() method as shown below.
const compressedWithResults = await reranker.compressDocuments(result, query);
console.log(compressedWithResults);
[
Document {
pageContent: 'And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.',
metadata: { loc: [Object], relevanceScore: 0.726995587348938 },
id: undefined
},
Document {
pageContent: 'I spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. \n' +
'\n' +
'I’ve worked on these issues a long time. \n' +
'\n' +
'I know what works: Investing in crime preventionand community police officers who’ll walk the beat, who’ll know the neighborhood, and who can restore trust and safety.',
metadata: { loc: [Object], relevanceScore: 0.5758284330368042 },
id: undefined
},
Document {
pageContent: 'We are the only nation on Earth that has always turned every crisis we have faced into an opportunity. \n' +
'\n' +
'The only nation that can be defined by a single word: possibilities. \n' +
'\n' +
'So on this night, in our 245th year as a nation, I have come to report on the State of the Union. \n' +
'\n' +
'And my report is this: the State of the Union is strong—because you, the American people, are strong.',
metadata: { loc: [Object], relevanceScore: 0.5479092597961426 },
id: undefined
},
Document {
pageContent: 'And I’m taking robust action to make sure the pain of our sanctions is targeted at Russia’s economy. And I will use every tool at our disposal to protect American businesses and consumers. \n' +
'\n' +
'Tonight, I can announce that the United States has worked with 30 other countries to release 60 Million barrels of oil from reserves around the world.',
metadata: { loc: [Object], relevanceScore: 0.5468723773956299 },
id: undefined
}
]
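Because each returned document carries its score in `metadata.relevanceScore`, a common follow-up step is to drop low-scoring documents before passing them on. This is a minimal sketch under that assumption: `ScoredDoc` and `filterByScore` are hypothetical local names, the scores are copied from the output above, and the 0.55 cutoff is arbitrary.

```typescript
// Local stand-in for the document shape printed above (not a library type).
interface ScoredDoc {
  pageContent: string;
  metadata: { relevanceScore?: number; [key: string]: unknown };
}

// Hypothetical helper: keep only documents whose score meets the threshold.
// Documents without a numeric relevanceScore are dropped.
function filterByScore(docs: ScoredDoc[], minScore: number): ScoredDoc[] {
  return docs.filter((d) => {
    const score = d.metadata.relevanceScore;
    return typeof score === "number" && score >= minScore;
  });
}

// Scores copied from the output above; page contents are shortened placeholders.
const scored: ScoredDoc[] = [
  { pageContent: "doc 0", metadata: { relevanceScore: 0.726995587348938 } },
  { pageContent: "doc 1", metadata: { relevanceScore: 0.5758284330368042 } },
  { pageContent: "doc 2", metadata: { relevanceScore: 0.5479092597961426 } },
  { pageContent: "doc 3", metadata: { relevanceScore: 0.5468723773956299 } },
];
const kept = filterByScore(scored, 0.55); // arbitrary cutoff for illustration
console.log(kept.map((d) => d.pageContent)); // [ 'doc 0', 'doc 1' ]
```

A suitable threshold depends on the reranking model and the data, so it is best chosen by inspecting scores on representative queries.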
API reference
For detailed documentation of all features and configuration options of the Watsonx document compressor, see the API reference.

