Vibium MCP Tools Reference

The Vibium MCP server exposes a browser toolset that an AI agent calls to navigate, read, interact with, and screenshot a real Chrome browser. Vibium is AI-native browser automation built on WebDriver BiDi: a single Go binary that auto-downloads Chrome for Testing and ships a Model Context Protocol (MCP) server. It was created by Jason Huggins, co-creator of Selenium and Appium. The tools cover the session lifecycle, navigation, page reading, element finding, interaction, screenshots, waiting, and tab management. Each tool advertises a JSON Schema inputSchema so the model knows exactly which arguments are required. With Node, run npm install vibium and start the server with npx -y vibium mcp; a Python client is available via pip install vibium. This reference documents the tool categories and their arguments, and shows how to list the live catalog for your exact version, since the set grows with each release.

How do you list the live MCP tool catalog?

The authoritative list for your installed version comes straight from the server. MCP hosts discover tools on startup by sending a tools/list request; you can do the same from a terminal:

echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | npx -y vibium mcp

Each entry in the response has a name, a description, and an inputSchema (JSON Schema). The schema is what the LLM reads to call the tool correctly: it lists parameter names, types, and which parameters are required. Always treat this output as ground truth; the tables below document the stable, commonly available tools.
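
To make the response shape concrete, here is a minimal Python sketch that reads a tools/list response and reports which arguments each tool requires. The sample response is illustrative only: its shape follows the MCP tools/list result, but the two entries are abbreviated stand-ins rather than real server output, and the function name required_arguments is hypothetical.

```python
import json

# Illustrative response only: the shape follows the MCP tools/list result,
# but these two entries are abbreviated stand-ins, not real server output.
SAMPLE_RESPONSE = """
{"jsonrpc": "2.0", "id": 1, "result": {"tools": [
  {"name": "browser_navigate",
   "description": "Navigate to a URL",
   "inputSchema": {"type": "object",
                   "properties": {"url": {"type": "string"}},
                   "required": ["url"]}},
  {"name": "browser_quit",
   "description": "Close the browser session",
   "inputSchema": {"type": "object", "properties": {}}}
]}}
"""


def required_arguments(raw_response):
    """Map each tool name to the argument names its schema marks as required."""
    tools = json.loads(raw_response)["result"]["tools"]
    return {
        tool["name"]: list(tool.get("inputSchema", {}).get("required", []))
        for tool in tools
    }


if __name__ == "__main__":
    for name, required in required_arguments(SAMPLE_RESPONSE).items():
        print(f"{name}: required={required}")
```

In practice you would feed the function the line captured from the server instead of the sample string.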

What are the session lifecycle tools?

These tools open and close the browser. An agent calls browser_launch first and browser_quit last; everything else happens in between.

Tool	Description	Key arguments
`browser_launch`	Start a browser session (visible by default)	`headless` (boolean, optional)
`browser_quit`	Close the browser session	—

Launch is visible by default, which is ideal while developing an agent so you can watch it work. Pass headless: true for CI or background runs. For host-level configuration, see set up Vibium MCP in Claude Code.
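
The launch-first, quit-last ordering can be sketched as a small Python helper that wraps any list of steps in the two lifecycle calls and serializes them as MCP tools/call requests. This is a sketch under assumptions: the helper names tool_call and session are hypothetical, and an MCP host would normally complete the initialize handshake before sending any of these messages.

```python
import json
from itertools import count


def tool_call(request_id, name, arguments=None):
    """Serialize one MCP tools/call request as a single JSON line."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    })


def session(steps, headless=False):
    """Wrap (tool, arguments) steps so browser_launch runs first and browser_quit last."""
    ids = count(1)
    calls = [("browser_launch", {"headless": headless}), *steps, ("browser_quit", {})]
    return [tool_call(next(ids), name, args) for name, args in calls]


if __name__ == "__main__":
    # Headless run, as suggested above for CI or background use.
    for line in session([("browser_navigate", {"url": "https://example.com"})], headless=True):
        print(line)
```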

What are the navigation and page-reading tools?

These tools move between pages and let the agent perceive what is on screen. Reading the page is how the model decides its next action.

Tool	Description	Key arguments
`browser_navigate`	Navigate to a URL	`url` (string, required)
`browser_get_url`	Get the current page URL	—
`browser_get_title`	Get the current page title	—
`browser_get_text`	Get text content of the page or an element	`selector` (string, optional)
`browser_get_html`	Get HTML (innerHTML or outerHTML)	`selector`, `outer` (optional)

Prefer browser_get_text over browser_get_html when feeding context to the model — text is cheaper and less noisy. The equivalent script-level commands are documented in navigate with go and get text.
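
When you handle tool results yourself, the text usually has to be pulled out of the MCP result envelope before it reaches the model. The sketch below assumes the server returns the standard MCP result shape (a content list of typed parts plus an optional isError flag); whether every Vibium tool answers in exactly this form is an assumption, and text_of is a hypothetical helper name.

```python
def text_of(response):
    """Join the text parts of a tools/call response, raising if the call failed."""
    if "error" in response:
        # Protocol-level failure (for example, an unknown method or bad params).
        raise RuntimeError(response["error"].get("message", "JSON-RPC error"))
    result = response["result"]
    text = "\n".join(
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):
        # Tool-level failure reported inside an otherwise successful response.
        raise RuntimeError(text or "tool call failed")
    return text


if __name__ == "__main__":
    # Illustrative response, not captured from a real server.
    sample = {"jsonrpc": "2.0", "id": 3,
              "result": {"content": [{"type": "text", "text": "Example Domain"}]}}
    print(text_of(sample))
```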

What are the element-finding tools?

These tools locate elements so the agent can act on them. Vibium supports both CSS selectors and semantic targeting.

Tool	Description	Key arguments
`browser_find`	Find one element; returns tag, text, bounding box	`selector` (string, required)
`browser_find_all`	Find all matching elements	`selector` (string, required)

In the Python and JS clients, find() also accepts semantic keyword arguments — role, text, label, placeholder, testid — which align with the accessibility tree. That semantic matching is what makes agents resilient when class names or markup change. See find an element for the full matcher set.

What are the interaction tools?

These tools perform actions a user would: clicking, typing, hovering, scrolling, pressing keys, and choosing dropdown options. Vibium auto-waits for actionability before each one, so the agent rarely needs explicit waits.

Tool	Description	Key arguments
`browser_click`	Click an element by CSS selector	`selector` (string, required)
`browser_type`	Type text into an element	`selector`, `text` (required)
`browser_hover`	Hover over an element	`selector` (string, required)
`browser_scroll`	Scroll the page or an element	`selector`, direction/amount (optional)
`browser_keys`	Press a key or key combination	`keys` (string, required)
`browser_select`	Select a dropdown option	`selector`, `value`/`label` (required)

Auto-waiting means a click waits until the target is visible, stable, receiving events, and enabled. The script equivalents live in click and type text.
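
Because of auto-waiting, an interaction flow can be a plain sequence of calls with no wait steps in between. The following sketch serializes such a flow as newline-delimited JSON, the framing used by MCP's stdio transport. The selectors, the "Enter" key name, and the helper name to_ndjson are assumptions for illustration; check the live inputSchema for the real argument formats.

```python
import json


def to_ndjson(steps, start_id=1):
    """Serialize (tool, arguments) pairs as newline-delimited tools/call requests."""
    lines = []
    for offset, (name, arguments) in enumerate(steps):
        lines.append(json.dumps({
            "jsonrpc": "2.0",
            "id": start_id + offset,
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }))
    return "\n".join(lines) + "\n"


# Hypothetical search flow: the selectors and the "Enter" key name are assumptions.
SEARCH_FLOW = [
    ("browser_type", {"selector": "input[name=q]", "text": "webdriver bidi"}),
    ("browser_keys", {"keys": "Enter"}),
    ("browser_click", {"selector": "a.result"}),
]

if __name__ == "__main__":
    print(to_ndjson(SEARCH_FLOW, start_id=10), end="")
```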

What are the screenshot and wait tools?

Screenshots give the agent (and you) visual proof, browser_evaluate runs JavaScript in the page, and the wait tool handles the rare case where the agent needs to block on a specific element state.

Tool	Description	Key arguments
`browser_screenshot`	Capture a screenshot	`selector`, `filename` (optional)
`browser_evaluate`	Execute JavaScript in the page	`script` (string, required)
`browser_wait`	Wait for an element state	`selector`, `state` (required)

By default, screenshots save to ~/Pictures/Vibium/ (macOS/Linux) or Pictures\Vibium\ on Windows; pass --screenshot-dir when registering the server to change this, or --screenshot-dir "" to return inline base64 only. Use browser_evaluate sparingly — for reading values the other tools cannot reach. See screenshot and wait for an element.
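
If you opt for inline base64, the client has to decode the image itself. The sketch below assumes the screenshot comes back as a standard MCP image content part (type "image" with a base64 data field) and that the bytes are PNG; both points are assumptions to verify against a real response, and the helper names are hypothetical.

```python
import base64
from pathlib import Path


def decode_images(result):
    """Return the raw bytes of every image part in a tools/call result."""
    images = []
    for part in result.get("content", []):
        if part.get("type") == "image":
            images.append(base64.b64decode(part["data"]))
    return images


def save_images(result, directory, stem="shot"):
    """Write each decoded image to directory and return the paths written."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, data in enumerate(decode_images(result)):
        # Assumption: the screenshot is a PNG, so a .png suffix is used.
        path = directory / f"{stem}-{index}.png"
        path.write_bytes(data)
        paths.append(path)
    return paths
```

Text parts in the same result are ignored, so the helper can be applied to a result that mixes a status message with the image.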

What are the tab management tools?

These tools handle multiple tabs, which agents need for flows that open new windows (auth popups, "open in new tab" links, comparison shopping).

Tool	Description	Key arguments
`browser_new_tab`	Open a new tab	`url` (optional)
`browser_list_tabs`	List all open tabs	—
`browser_switch_tab`	Switch to a tab by index or URL	`index` or `url`
`browser_close_tab`	Close a tab	`index` (optional)

How do you confirm a tool's exact arguments?

Inspect the tool's inputSchema from the live tools/list response. The schema is the contract: it names every parameter, its type, and which are required — exactly what the model uses to construct a valid call.

echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
  | npx -y vibium mcp | jq '.result.tools[] | {name, inputSchema}'

If a tool call fails, the schema is the first place to check — a missing required field or a wrong type is the usual cause. For deeper diagnosis, see how to debug the Vibium MCP server.
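
The two usual failure causes, a missing required field and a wrong type, can be caught before the call is sent. This is a deliberately small hand-rolled check, not a full JSON Schema validator: it only looks at the top-level required list and simple type keywords. The schema in the usage example is an assumed shape for browser_type, not copied from the server.

```python
# Python types accepted for each simple JSON Schema type keyword.
JSON_TYPES = {
    "string": str,
    "boolean": bool,
    "integer": int,
    "number": (int, float),
    "object": dict,
    "array": list,
}


def check_arguments(schema, arguments):
    """Return a list of problems found when checking arguments against an inputSchema."""
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    properties = schema.get("properties", {})
    for name, value in arguments.items():
        expected = properties.get(name, {}).get("type")
        if not isinstance(expected, str) or expected not in JSON_TYPES:
            continue  # undeclared property or a type this sketch does not check
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and expected in ("integer", "number")
        if bool_as_number or not isinstance(value, JSON_TYPES[expected]):
            problems.append(f"{name}: expected {expected}, got {type(value).__name__}")
    return problems


if __name__ == "__main__":
    # Assumed schema shape for browser_type; read the real one from tools/list.
    schema = {
        "type": "object",
        "properties": {"selector": {"type": "string"}, "text": {"type": "string"}},
        "required": ["selector", "text"],
    }
    print(check_arguments(schema, {"selector": "#q"}))
```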

Frequently asked questions

What tools does the Vibium MCP server expose?

Vibium's built-in MCP server exposes a browser toolset covering the session lifecycle (launch, quit), navigation, reading the page (text, HTML, URL, title), finding elements, interaction (click, type, hover, scroll, keys, select), screenshots, waiting, and tab management — enough for an agent to complete real web tasks.

How do I list the exact MCP tools Vibium provides?

Send a tools/list JSON-RPC request to the server: echo a request into 'npx -y vibium mcp' and read the result. Each tool returns its name, description, and inputSchema, which tells the agent the exact arguments and which are required.

What arguments do Vibium MCP tools take?

Each tool defines a JSON Schema inputSchema. For example, browser_navigate takes a url string, browser_click takes a CSS selector, and browser_type takes a selector plus text. The schema marks required versus optional fields, so the LLM calls each tool correctly.

Vibium was created by Jason Huggins. This is an independent tutorial; see the official Vibium site and GitHub repo for canonical docs.

Related guides

Agentic Web Testing with Vibium

How to Build an AI Agent That Browses the Web with Vibium

How to Debug the Vibium MCP Server

How to Give Claude Browser Access with Vibium