How Vibium Works: BiDi, the Go Binary, and MCP
Vibium works through one small Go binary that sits between your code and Chrome. Your Python or JavaScript client sends commands to that binary, which launches Chrome, speaks WebDriver BiDi to it over a WebSocket, enforces auto-waiting server-side, and exposes a built-in MCP server for AI agents. One binary plays browser manager, protocol proxy, and AI gateway at once.
What are the moving parts of Vibium?
Vibium has three layers: a client library, a single Go binary, and the browser. The client is the API you write against, the binary does the real work, and Chrome is what gets driven.
| Component | Role | Interface |
|---|---|---|
| Client (Python / JS) | The developer-facing API you call | pip / npm package |
| Go binary | Browser management, BiDi proxy, auto-wait, MCP server | CLI, stdio, WebSocket |
| Chrome | The actual browser being automated | WebDriver BiDi |
The design goal is that the binary is invisible. You run pip install vibium or npm install vibium, call launch(), and everything below the client just works. See How to Install Vibium for that one-command setup.
What does the single Go binary do?
The Go binary is the heart of Vibium. Shipping it as one compiled file is why Vibium has no tangled dependency tree. It handles four jobs at once:
- Browser management. It detects or downloads Chrome for Testing and launches it with BiDi enabled.
- BiDi proxy. It runs a local WebSocket server that routes your commands to the browser and relays responses back.
- Auto-wait. It runs actionability checks before each interaction so your scripts never need manual sleeps.
- MCP server. It speaks the Model Context Protocol over stdio so AI agents can use the browser as a tool.
Because all of this lives in one binary written once in Go, every client language gets identical behavior. That is why the sync and async APIs behave the same: they call the same binary.
What is WebDriver BiDi and why does Vibium use it?
WebDriver BiDi is a W3C standard for bidirectional browser automation, sent as JSON messages over a WebSocket. Classic WebDriver, standardized in 2018, used HTTP and was fundamentally one-way: the client asks, the browser answers. BiDi adds a second direction, so the browser can push events, console logs, and network activity to the client as they happen. A command looks like this:
{"id": 1, "method": "browsingContext.navigate", "params": {"url": "https://example.com"}}
And the browser can send events unprompted:
{"method": "log.entryAdded", "params": {"level": "error", "text": "Uncaught TypeError..."}}
Vibium chose BiDi because it is standards-based (governed by the W3C, not a single vendor), cross-browser by design, and event-driven, which is exactly what powers auto-waiting and real-time data collection.
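The two message shapes above can be sketched in a few lines of Python. This is an illustration of the wire format only, not Vibium's client code: the helper names make_command and classify are hypothetical, and telling responses from events by the presence of an id is a simplification.

```python
import json
from itertools import count

# Commands carry a client-chosen numeric id so replies can be matched to them.
_next_id = count(1)


def make_command(method, params=None):
    """Serialize a BiDi-style command as a JSON text frame (hypothetical helper)."""
    return json.dumps({"id": next(_next_id), "method": method, "params": params or {}})


def classify(raw):
    """Return ("response", id) or ("event", method) for an incoming frame.

    Simplification for illustration: a frame with an id answers a command,
    and a frame with only a method is an event pushed by the browser.
    """
    msg = json.loads(raw)
    if msg.get("id") is not None:
        return ("response", msg["id"])
    if "method" in msg:
        return ("event", msg["method"])
    raise ValueError("unrecognized message")


frame = make_command("browsingContext.navigate", {"url": "https://example.com"})
print(frame)
print(classify('{"method": "log.entryAdded", "params": {"level": "error"}}'))
```

The id is what lets a client keep several commands in flight at once while events arrive in between.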
How does Vibium auto-wait for elements?
Vibium enforces actionability inside the Go binary, not in your script. Before any click or type, the binary polls the page until the element passes a set of checks, up to a 30-second default timeout. The core checks are:
- Visible — the element has non-zero size and is not hidden by CSS.
- Stable — its bounding box has not changed across two samples taken 50ms apart.
- Receives events — a hit test at its center actually lands on the element, not an overlay.
- Enabled — it is not disabled or inside a disabled fieldset.
Different actions run different subsets; a click checks visible, stable, receives-events, and enabled, while filling a field also checks that it is editable. Because this runs server-side over a local WebSocket, polling is fast and every client gets the same timing. That is why your code can simply write vibe.find("#submit").click() with no retry loop, as shown in Your First Vibium Script in Python.
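The polling idea described above can be sketched as follows. This is a minimal Python illustration, not the Go binary's implementation: probe and sample_box are hypothetical callables standing in for real page queries, and the fill subset assumes "click's checks plus editable".

```python
import time

# Check subsets per action. The "fill" entry is an assumption: the click
# checks plus an editable check.
CHECKS_BY_ACTION = {
    "click": ("visible", "stable", "receives_events", "enabled"),
    "fill": ("visible", "stable", "receives_events", "enabled", "editable"),
}


def is_stable(sample_box, gap=0.05):
    """Two bounding-box samples taken `gap` seconds apart must be equal."""
    first = sample_box()
    time.sleep(gap)
    return sample_box() == first


def wait_until_actionable(probe, action, timeout=30.0, interval=0.05):
    """Poll probe() until every check for `action` passes, else raise TimeoutError.

    probe is a hypothetical callable returning {check_name: bool}.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = probe()
        failing = [c for c in CHECKS_BY_ACTION[action] if not state.get(c, False)]
        if not failing:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{action}: still failing {failing} after {timeout}s")
        time.sleep(interval)
```

Keeping the loop in one place means every client sees the same timeout and the same failure message, which is the point of enforcing the checks server-side.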
How does the built-in MCP server fit in?
The same Go binary hosts a Model Context Protocol server over stdio, so AI agents can drive the browser with zero glue code. MCP is the standard way agents like Claude Code call external tools. Because the MCP server is built into the binary that already manages Chrome and speaks BiDi, an agent gets the same auto-waiting and element-finding behavior your scripts do. Wiring it up is a single command, covered in Vibium MCP in Claude Code. This is what makes Vibium AI-native rather than a library you have to wrap.
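To make the stdio transport concrete, here is a rough sketch of the messages an agent writes to an MCP server. MCP messages are JSON-RPC 2.0 objects sent one per line. The helper mcp_request and the tool name browser_navigate are assumptions for illustration, not confirmed Vibium tool names, and a real session begins with an initialize handshake that is omitted here.

```python
import json


def mcp_request(request_id, method, params=None):
    """Build one JSON-RPC 2.0 request line for a stdio MCP server (hypothetical helper)."""
    msg = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        msg["params"] = params
    # The stdio transport is newline-delimited: one JSON object per line.
    return json.dumps(msg) + "\n"


# Ask the server which tools it offers, then call one of them.
list_tools = mcp_request(1, "tools/list")
call_tool = mcp_request(2, "tools/call", {
    "name": "browser_navigate",  # hypothetical tool name, for illustration only
    "arguments": {"url": "https://example.com"},
})

print(list_tools, end="")
print(call_tool, end="")
```

An agent host such as Claude Code produces these messages itself, which is why no glue code is needed on your side.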
How is this different from older tools?
Vibium folds the driver, the wait logic, and the AI interface into one binary built on a modern standard. Classic Selenium needs a separate ChromeDriver you version-match by hand, and tools like Playwright drive Chromium through the Chrome DevTools Protocol (CDP), which is Chrome-specific, rather than through the W3C BiDi standard. Vibium's bet is that a standards-based, single-binary, MCP-first design fits the AI-agent era better. The full trade-off is in Vibium vs Playwright.
Where do I go next?
Now that you understand the architecture, put it to use: write your first Python or JavaScript script, choose between the sync and async APIs, or connect the MCP server to Claude Code.
Frequently asked questions
How does Vibium work under the hood?
Your Python or JavaScript client sends commands to a single Go binary. That binary launches Chrome, speaks the WebDriver BiDi protocol to it over a WebSocket, enforces auto-waiting on the server side, and also exposes a built-in MCP server so AI agents can drive the same browser.
What is WebDriver BiDi and why does Vibium use it?
WebDriver BiDi is a W3C bidirectional browser automation standard sent as JSON over a WebSocket. Unlike one-way classic WebDriver, the browser can push events to the client. Vibium uses BiDi so it is standards-based, cross-browser by design, and able to react to the page in real time.
What does the Vibium Go binary do?
The single Go binary manages the browser, proxies WebDriver BiDi commands to Chrome over a WebSocket, runs the auto-wait actionability checks, captures screenshots, and hosts an MCP server over stdio for AI agents. It is invisible to you; the client library just sends commands and gets results.
Vibium is created by Jason Huggins. This is an independent tutorial — see the official Vibium site and GitHub repo for canonical docs.