
How to Use Vibium with LangChain

Use Vibium with LangChain to give an agent a real browser — wrap Vibium's Python API as LangChain tools, or load its built-in MCP server as a toolset.

By Pramod Dutta · 5 min read · Verified with Vibium 26.2

You use Vibium with LangChain by exposing Vibium's browser actions as LangChain tools, either as thin @tool wrappers around the Python client or by loading Vibium's built-in MCP server through langchain-mcp-adapters. Vibium is AI-native browser automation on WebDriver BiDi: a single Go binary that auto-downloads Chrome for Testing and ships an MCP server. Created by Jason Huggins (co-creator of Selenium and Appium), it auto-waits for elements to be actionable, so your LangChain agent hits fewer timing bugs. Install the Python client with pip install vibium. Once wired in, a LangChain agent can navigate, read the accessibility tree, click, type, and screenshot a real Chrome browser as part of its reasoning loop. This guide covers both integration paths, their trade-offs, and runnable code using only verified Vibium APIs.

Why pair Vibium with LangChain?

LangChain orchestrates the reasoning loop — tool selection, memory, and chaining — while Vibium provides the hands: a real browser the agent can actually drive. Vibium runs on WebDriver BiDi, auto-downloads Chrome for Testing, and auto-waits for actionability, so each tool call is dependable rather than racy.

There is no dedicated langchain-vibium package, and you do not need one. Vibium's clean Python API and built-in MCP server make it straightforward to expose as LangChain tools using only documented methods.

How do you install Vibium and LangChain?

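Install the Vibium Python client plus the LangChain packages you plan to use. The agent examples in this guide import create_react_agent from langgraph, so install that package too, and set the OPENAI_API_KEY environment variable for ChatOpenAI. Chrome for Testing downloads automatically on first launch.

pip install vibium langchain langchain-openai langgraph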
# Optional, for the MCP path:
pip install langchain-mcp-adapters

You can pre-download the browser if you prefer:

vibium install
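
Before running an agent, it can help to confirm that the setup described above is complete. The following is a minimal sketch, not part of Vibium or LangChain: the helper names missing_modules and preflight are hypothetical, and the module list simply mirrors the imports used by the examples in this guide.

```python
import importlib.util
import os

# Top-level modules imported by the examples in this guide (an assumption
# based on their import lines; adjust to the packages you actually use).
REQUIRED_MODULES = ["vibium", "langchain_core", "langchain_openai", "langgraph"]


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def preflight(modules=REQUIRED_MODULES, env_var="OPENAI_API_KEY"):
    """Collect setup problems as strings; an empty list means ready to run."""
    problems = [f"missing package: {name}" for name in missing_modules(modules)]
    if not os.environ.get(env_var):
        problems.append(f"environment variable {env_var} is not set")
    return problems
```

Calling preflight() at the top of a script and printing its result gives a clear message up front instead of an ImportError or authentication failure in the middle of an agent run.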

Option 1 — How do you wrap Vibium as LangChain tools?

Wrap each Vibium action in a LangChain @tool function. The agent then calls these tools by name. Keep one shared browser instance so state (the current page, cookies) persists across calls.

from langchain_core.tools import tool
from vibium import browser_sync as browser
 
vibe = browser.launch()
 
@tool
def open_url(url: str) -> str:
    """Navigate the browser to a URL."""
    vibe.go(url)
    return f"Loaded {url}"
 
@tool
def read_page() -> str:
    """Return the page's accessibility tree (roles, names, state)."""
    return str(vibe.a11y_tree())
 
@tool
def click(text: str) -> str:
    """Click a button by its visible text."""
    vibe.find(role="button", text=text).click()
    return f"Clicked '{text}'"
 
@tool
def type_text(selector: str, value: str) -> str:
    """Type text into the element matching a CSS selector."""
    vibe.find(selector).type(value)
    return f"Typed into {selector}"
 
@tool
def get_text(selector: str) -> str:
    """Read the text content of an element."""
    return vibe.find(selector).text()

Notice the docstrings — LangChain passes them to the model as tool descriptions, so write them as if instructing the agent. The find() method accepts both CSS selectors and semantic kwargs (role, text, label, placeholder, testid), which keeps the tool surface small but expressive. See find an element for the full matcher list.
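
The advice to keep one shared browser instance can also be expressed as a small lazy holder, so that importing the tools module does not open Chrome until a tool actually runs. This is a sketch under assumptions: SharedBrowser is a hypothetical helper, not a Vibium class, and it relies only on the launch() and quit() calls already shown in this guide.

```python
import atexit
from typing import Any, Callable, Optional


class SharedBrowser:
    """Lazily launch one browser and hand the same instance to every tool."""

    def __init__(self, launch: Callable[[], Any]) -> None:
        self._launch = launch  # for Vibium, pass browser.launch here
        self._instance: Optional[Any] = None

    def get(self) -> Any:
        """Return the shared browser, launching it on first use."""
        if self._instance is None:
            self._instance = self._launch()
            atexit.register(self.close)  # safety net if the caller forgets
        return self._instance

    def close(self) -> None:
        """Quit the browser if it was launched; safe to call repeatedly."""
        if self._instance is not None:
            instance, self._instance = self._instance, None
            instance.quit()
```

With Vibium this would be created as shared = SharedBrowser(browser.launch), and each tool body would call shared.get() where the wrappers above use vibe. Passing the launch function in, rather than importing it inside the class, also lets you substitute a fake browser in unit tests.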

How do you give those tools to a LangChain agent?

Pass the tools and a chat model to LangGraph's prebuilt ReAct agent. The agent reads the page, decides the next action, and calls the matching tool until it reaches the goal.

from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
 
tools = [open_url, read_page, click, type_text, get_text]
model = ChatOpenAI(model="gpt-4o")
agent = create_react_agent(model, tools)
 
try:
    result = agent.invoke({
        "messages": [("user",
            "Go to https://example.com and tell me the page heading.")]
    })
    print(result["messages"][-1].content)
finally:
    # Close the browser even if the agent run raises.
    vibe.quit()

The agent will typically call open_url, then get_text or read_page, and then answer. Because Vibium auto-waits for elements to become actionable, the tools need no sleeps or retry loops for timing. Always call vibe.quit() when the run ends to close the browser cleanly.
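
Auto-waiting handles timing, but a tool can still fail outright, for example when the model passes a selector that matches nothing. One option is to return the failure as text so the agent can read it and try a different action. The decorator below is a hypothetical helper, and it assumes a failed Vibium call raises a Python exception; check the Vibium docs for the exact exception types.

```python
import functools
from typing import Callable


def report_errors(action: Callable[..., str]) -> Callable[..., str]:
    """Turn exceptions raised by a browser action into a text result."""

    @functools.wraps(action)  # preserve the name and docstring of the action
    def wrapper(*args, **kwargs) -> str:
        try:
            return action(*args, **kwargs)
        except Exception as exc:  # broad on purpose: any failure becomes feedback
            return f"ERROR in {action.__name__}: {type(exc).__name__}: {exc}"

    return wrapper
```

Because functools.wraps keeps the function name, docstring, and annotations, the decorator should compose with @tool when placed directly beneath it, but treat that stacking as an assumption to verify against your LangChain version.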

Option 2 — How do you use Vibium's MCP server with LangChain?

Instead of hand-writing wrappers, load Vibium's built-in MCP server and let langchain-mcp-adapters convert every Vibium tool into a LangChain tool automatically. This gives you the full catalog — browser_navigate, browser_find, browser_click, browser_type, browser_screenshot, and more — with zero wrapper code to maintain.

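import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI

async def main():
    client = MultiServerMCPClient({
        "vibium": {
            "command": "npx",
            "args": ["-y", "vibium", "mcp"],
            "transport": "stdio",
        }
    })
    # client.get_tools() opens a fresh MCP session for every tool call, which
    # over stdio starts a new server process each time. Hold one session open
    # so the navigate and screenshot calls reach the same server.
    async with client.session("vibium") as session:
        tools = await load_mcp_tools(session)

        agent = create_react_agent(ChatOpenAI(model="gpt-4o"), tools)
        result = await agent.ainvoke({
            "messages": [("user",
                "Open https://example.com, screenshot it, and report the title.")]
        })
        print(result["messages"][-1].content)

asyncio.run(main())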

The same vibium mcp command used in Claude Code and Cursor works here. Because the example starts the server through npx, Node.js must be installed on the machine. The full set of tools the adapter exposes is documented in the Vibium MCP tools reference.
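
If the full catalog is more than a given agent needs, the loaded tool list can be narrowed before it is handed to the agent. A minimal sketch: select_tools is a hypothetical helper that relies only on each LangChain tool having a name attribute, and the tool names in the usage note are the ones listed earlier in this guide.

```python
from typing import Iterable, List, Set


def select_tools(tools: Iterable, allowed: Set[str]) -> List:
    """Keep only tools whose .name is in `allowed`, preserving their order."""
    selected = [t for t in tools if t.name in allowed]
    missing = allowed - {t.name for t in selected}
    if missing:
        # Fail early if a requested tool is not offered by the server.
        raise ValueError(f"tools not offered by the server: {sorted(missing)}")
    return selected
```

For a read-only agent you might pass select_tools(tools, {"browser_navigate", "browser_find", "browser_screenshot"}) to the agent constructor. Raising on a missing name surfaces a renamed or removed tool after a Vibium upgrade rather than silently shrinking the toolset.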

Which approach should you choose?

Pick based on control versus convenience; both paths are fully workable.

| Factor | @tool wrappers | MCP via adapters |
| --- | --- | --- |
| Setup effort | Write a few functions | One client config |
| Tool coverage | Exactly what you define | Full Vibium catalog |
| Browser state | Shared in-process instance | Managed by the MCP server |
| Maintenance | You own the wrappers | Tracks Vibium releases |
| Reuse across hosts | Python only | Any MCP host (Claude Code, Cursor, custom) |

Choose @tool wrappers when you want a tight, curated action set and direct control of the browser object in one Python process. Choose the MCP server when you want the complete toolset with no wrapper upkeep and the ability to reuse the same server across multiple agents and editors.

How do you make the agent verify its work?

Have the agent read text back or capture a screenshot after key actions, so it can confirm an outcome rather than assume success. With wrappers, expose a screenshot tool; with MCP, browser_screenshot is already available.

@tool
def snapshot(path: str = "step.png") -> str:
    """Save a screenshot of the current page and return the file path."""
    png = vibe.screenshot()
    with open(path, "wb") as f:
        f.write(png)
    return path

This closes the loop: the agent acts, then checks. Combined with Vibium's actionability waits, it helps keep runs stable even on JavaScript-heavy pages. For a concrete end-to-end flow, see automate login with Vibium.
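
Reading text back is easier for the agent to act on when the comparison is done in code and the result is an unambiguous pass or fail. The helper below is a hypothetical sketch, not a Vibium or LangChain API; it compares any page text with an expected phrase, ignoring case and whitespace differences.

```python
import re


def check_text(actual: str, expected: str) -> str:
    """Report whether `expected` appears in `actual`, ignoring case and spacing."""

    def normalize(value: str) -> str:
        # Collapse runs of whitespace and compare case-insensitively.
        return re.sub(r"\s+", " ", value).strip().casefold()

    if normalize(expected) in normalize(actual):
        return f"PASS: found '{expected}'"
    # Include a bounded excerpt so the agent can see what the page shows.
    return f"FAIL: expected '{expected}', page shows '{actual.strip()[:200]}'"
```

In the wrapper approach, a verification tool could return check_text(vibe.find(selector).text(), expected), reusing the same find and text calls as the get_text tool above.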


Frequently asked questions

How do I use Vibium with LangChain?

Two ways. Wrap Vibium's Python API in LangChain @tool functions so the agent can navigate, find, click, type, and screenshot, or load Vibium's built-in MCP server through langchain-mcp-adapters to get the tools automatically. Both give a LangChain agent a real Chrome browser.

Does Vibium have an official LangChain integration?

Vibium does not ship a dedicated LangChain package, but it works cleanly with LangChain in two supported ways: thin @tool wrappers around the vibium Python client, or its built-in MCP server consumed via langchain-mcp-adapters. Both approaches use only documented Vibium APIs.

Should I use Vibium tool wrappers or the MCP server with LangChain?

Use @tool wrappers when you want tight control over a small action set and shared browser state in one process. Use the MCP server when you want the full Vibium tool catalog with zero wrapper maintenance and the same server reusable across multiple agents and hosts.

Vibium is created by Jason Huggins. This is an independent tutorial — see the official Vibium site and GitHub repo for canonical docs.
