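Stagehand Toolkit

The Stagehand Toolkit gives your AI agent the following capabilities:

navigate(): Navigate to a specific URL.
act(): Perform browser automation actions such as clicking, typing, and navigating.
extract(): Extract structured data from web pages using Zod schemas.
observe(): Get a list of possible actions and elements on the current page.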

Setup

Install the required packages:

npm install @browserbasehq/stagehand @langchain/langgraph @langchain/community @langchain/core @langchain/openai

Create a Stagehand instance

If you plan to run the browser locally, you also need to install Playwright's browser dependencies.

npx playwright install

Set your model provider credentials:

For OpenAI:

export OPENAI_API_KEY="your-openai-api-key"

For Anthropic:

export ANTHROPIC_API_KEY="your-anthropic-api-key"
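
A missing key usually surfaces only when the first model call fails, so it can help to check the variables up front. The following is a minimal sketch, not part of Stagehand or LangChain: the `missingEnvVars` helper, the `Env` type, and the `exampleEnv` object are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper (not part of Stagehand or LangChain): returns the
// names of required environment variables that are unset or blank.
type Env = Record<string, string | undefined>;

function missingEnvVars(required: string[], env: Env): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with an in-memory environment; in a Node.js script you would
// pass `process.env` instead.
const exampleEnv: Env = { OPENAI_API_KEY: "your-openai-api-key" };
const missing = missingEnvVars(
  ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"],
  exampleEnv
);
// `missing` now holds only "ANTHROPIC_API_KEY", which exampleEnv lacks.
```

In practice you would list only the key for the provider you actually use and throw an error when the returned array is not empty.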

Usage, Standalone, Local Browser

import { StagehandToolkit } from "@langchain/community/agents/toolkits/stagehand";
import { ChatOpenAI } from "@langchain/openai";
import { Stagehand } from "@browserbasehq/stagehand";

// Specify your Browserbase credentials.
process.env.BROWSERBASE_API_KEY = "";
process.env.BROWSERBASE_PROJECT_ID = "";

// Specify your OpenAI API key.
process.env.OPENAI_API_KEY = "";

const stagehand = new Stagehand({
  env: "LOCAL",
  headless: false,
  verbose: 2,
  debugDom: true,
  enableCaching: false,
});

// Create a Stagehand toolkit that exposes all of Stagehand's available actions.
const stagehandToolkit = await StagehandToolkit.fromStagehand(stagehand);

const navigateTool = stagehandToolkit.tools.find(
  (t) => t.name === "stagehand_navigate"
);
if (!navigateTool) {
  throw new Error("Navigate tool not found");
}
await navigateTool.invoke("https://www.google.com");

const actionTool = stagehandToolkit.tools.find(
  (t) => t.name === "stagehand_act"
);
if (!actionTool) {
  throw new Error("Action tool not found");
}
await actionTool.invoke('Search for "OpenAI"');

const observeTool = stagehandToolkit.tools.find(
  (t) => t.name === "stagehand_observe"
);
if (!observeTool) {
  throw new Error("Observe tool not found");
}
const result = await observeTool.invoke(
  "What actions can be performed on the current page?"
);
const observations = JSON.parse(result);

// Handle the observations as needed
console.log(observations);

// `expect` comes from a test runner such as Jest; outside a test, replace it with your own check.
const currentUrl = stagehand.page.url();
expect(currentUrl).toContain("google.com/search?q=OpenAI");
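
The example above repeats the same find-then-throw pattern for each tool. One way to factor it out is sketched below. The `requireTool` helper and the `NamedTool` interface are hypothetical names, not part of the toolkit; the sketch assumes only that each tool exposes a `name` property, as the lookups above do.

```typescript
// Minimal shape this sketch relies on: each toolkit tool exposes a `name`.
interface NamedTool {
  name: string;
}

// Hypothetical helper: look a tool up by name and fail loudly if it is absent.
function requireTool<T extends NamedTool>(
  tools: readonly T[],
  name: string
): T {
  const tool = tools.find((t) => t.name === name);
  if (!tool) {
    const available = tools.map((t) => t.name).join(", ");
    throw new Error(`Tool "${name}" not found. Available tools: ${available}`);
  }
  return tool;
}

// Stand-in tools so the sketch runs without a browser; with the toolkit
// you would pass `stagehandToolkit.tools` instead.
const fakeTools: NamedTool[] = [
  { name: "stagehand_navigate" },
  { name: "stagehand_act" },
];
const navigate = requireTool(fakeTools, "stagehand_navigate");
```

Listing the available tool names in the error message makes a misspelled tool name easier to diagnose.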

Usage in a LangGraph Agent

import { Stagehand } from "@browserbasehq/stagehand";
import {
  StagehandActTool,
  StagehandNavigateTool,
} from "@langchain/community/agents/toolkits/stagehand";
import { ChatOpenAI } from "@langchain/openai";
import { createReactAgent } from "@langchain/langgraph/prebuilt";

async function main() {
  // Initialize Stagehand once and pass it to the tools
  const stagehand = new Stagehand({
    env: "LOCAL",
    enableCaching: true,
  });

  const actTool = new StagehandActTool(stagehand);
  const navigateTool = new StagehandNavigateTool(stagehand);

  // Initialize the model
  const model = new ChatOpenAI({
    modelName: "gpt-4",
    temperature: 0,
  });

  // Create the agent using langgraph
  const agent = createReactAgent({
    llm: model,
    tools: [actTool, navigateTool],
  });

  // Execute the agent using streams
  const inputs1 = {
    messages: [
      {
        role: "user",
        content: "Navigate to https://www.google.com",
      },
    ],
  };

  const stream1 = await agent.stream(inputs1, {
    streamMode: "values",
  });

  for await (const { messages } of stream1) {
    const msg =
      messages && messages.length > 0
        ? messages[messages.length - 1]
        : undefined;
    if (msg?.content) {
      console.log(msg.content);
    } else if (msg?.tool_calls && msg.tool_calls.length > 0) {
      console.log(msg.tool_calls);
    } else {
      console.log(msg);
    }
  }

  const inputs2 = {
    messages: [
      {
        role: "user",
        content: "Search for 'OpenAI'",
      },
    ],
  };

  const stream2 = await agent.stream(inputs2, {
    streamMode: "values",
  });

  for await (const { messages } of stream2) {
    const msg =
      messages && messages.length > 0
        ? messages[messages.length - 1]
        : undefined;
    if (msg?.content) {
      console.log(msg.content);
    } else if (msg?.tool_calls && msg.tool_calls.length > 0) {
      console.log(msg.tool_calls);
    } else {
      console.log(msg);
    }
  }
}

main();

API Reference:

StagehandActTool from @langchain/community/agents/toolkits/stagehand
StagehandNavigateTool from @langchain/community/agents/toolkits/stagehand
ChatOpenAI from @langchain/openai
createReactAgent from @langchain/langgraph/prebuilt
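
The agent example repeats the same message-printing logic for both streams. A small helper can hold that logic in one place. This is a sketch under stated assumptions: `describeLastMessage` and `MessageLike` are hypothetical names, and the interface models only the two fields the loops read, not the full LangChain message type.

```typescript
// Only the fields the printing loops read; real LangChain messages carry more.
interface MessageLike {
  content?: unknown;
  tool_calls?: unknown[];
}

// Hypothetical helper mirroring the loop body: prefer the message content,
// then any tool calls, and otherwise fall back to the raw message.
function describeLastMessage(messages: MessageLike[] | undefined): unknown {
  const msg =
    messages && messages.length > 0
      ? messages[messages.length - 1]
      : undefined;
  if (msg?.content) {
    return msg.content;
  }
  if (msg?.tool_calls && msg.tool_calls.length > 0) {
    return msg.tool_calls;
  }
  return msg;
}

// Stand-in data so the sketch runs without a model or a browser.
const fromText = describeLastMessage([{ content: "Navigated to Google" }]);
const fromToolCall = describeLastMessage([
  { content: "", tool_calls: [{ name: "stagehand_navigate" }] },
]);
```

Inside each `for await` loop, the body would then reduce to `console.log(describeLastMessage(messages));`.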

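Usage on Browserbase - Remote Headless Browser

If you want to run the browser remotely, you can use the Browserbase platform.

You need to set the BROWSERBASE_API_KEY environment variable to your Browserbase API key.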

export BROWSERBASE_API_KEY="your-browserbase-api-key"

You also need to set BROWSERBASE_PROJECT_ID to your Browserbase project ID.

export BROWSERBASE_PROJECT_ID="your-browserbase-project-id"

Then initialize the Stagehand instance with the BROWSERBASE environment.

const stagehand = new Stagehand({
  env: "BROWSERBASE",
});
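
If the same script should run both locally and on Browserbase, the `env` value can be derived from whether the two Browserbase variables are set. This is a minimal sketch of that idea, not something the toolkit provides: `chooseStagehandEnv`, `Env`, and `StagehandEnv` are hypothetical names, and falling back to "LOCAL" when either variable is missing is an assumption about the behavior you want.

```typescript
type Env = Record<string, string | undefined>;
type StagehandEnv = "LOCAL" | "BROWSERBASE";

// Hypothetical helper: pick "BROWSERBASE" only when both credentials are
// present and non-empty; otherwise fall back to a local browser.
function chooseStagehandEnv(env: Env): StagehandEnv {
  const hasKey = Boolean(env["BROWSERBASE_API_KEY"]);
  const hasProject = Boolean(env["BROWSERBASE_PROJECT_ID"]);
  return hasKey && hasProject ? "BROWSERBASE" : "LOCAL";
}

// In-memory examples; in a Node.js script you would pass `process.env`.
const remote = chooseStagehandEnv({
  BROWSERBASE_API_KEY: "your-browserbase-api-key",
  BROWSERBASE_PROJECT_ID: "your-browserbase-project-id",
});
const local = chooseStagehandEnv({
  BROWSERBASE_API_KEY: "your-browserbase-api-key",
});
// `remote` is "BROWSERBASE"; `local` is "LOCAL" because the project ID is missing.
```

The result can then be passed to the constructor, for example `new Stagehand({ env: chooseStagehandEnv(process.env) })`.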
