Agentic Web Testing with Vibium

Agentic web testing with Vibium: let an AI agent explore, drive, and verify your app in a real browser via Vibium's MCP server and auto-waiting actions.

By Pramod Dutta · 5 min read · Verified with Vibium 26.2

Agentic web testing with Vibium means letting an AI agent explore, drive, and verify your web app in a real browser (reading each page, deciding the next action, and checking the result) instead of running only pre-scripted steps. Vibium makes this practical because it is AI-native browser automation built on WebDriver BiDi: a single Go binary that auto-downloads Chrome for Testing and ships a built-in Model Context Protocol (MCP) server. It was created by Jason Huggins, co-creator of Selenium and Appium. The agent calls Vibium's tools (navigate, find, click, type, screenshot), and Vibium auto-waits for actionability, so steps do not race the page. Install it with pip install vibium or npm install vibium. The result is testing that adapts to the UI: an agent can smoke-test a new build, walk a checkout, or reproduce a bug from a plain-English description, then confirm what actually happened on the page.

What is agentic web testing?

Agentic web testing puts an LLM in the loop as the tester. Rather than executing a fixed list of steps, the agent runs a perceive → decide → act → verify cycle: it reads the current page, chooses the next action toward a goal, performs it, and checks the outcome before continuing.
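To make the cycle concrete, here is a minimal sketch of the loop in plain Python. It does not use Vibium at all: FakePage stands in for a real browser, decide() is a rule-based placeholder for the model, and every name in it (Action, FakePage, run_agent) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One step chosen by the agent (hypothetical shape, not a Vibium type)."""
    kind: str          # "type" or "click"
    target: str        # human-readable element name
    value: str = ""


@dataclass
class FakePage:
    """Stand-in for a browser page so the loop runs without a real browser."""
    fields: dict = field(default_factory=dict)
    signed_in: bool = False

    def perceive(self) -> dict:
        # Return a snapshot, so later changes do not mutate earlier observations.
        return {"fields": dict(self.fields), "signed_in": self.signed_in}

    def act(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.value
        elif action.kind == "click" and action.target == "Sign in":
            self.signed_in = "Email" in self.fields and "Password" in self.fields


def decide(state: dict) -> Optional[Action]:
    """Rule-based placeholder for the LLM's decision step."""
    if state["signed_in"]:
        return None
    if "Email" not in state["fields"]:
        return Action("type", "Email", "qa@example.com")
    if "Password" not in state["fields"]:
        return Action("type", "Password", "secret")
    return Action("click", "Sign in")


def run_agent(
    page: FakePage,
    goal_met: Callable[[dict], bool],
    max_steps: int = 10,
) -> Tuple[bool, List[Tuple[Action, bool]]]:
    """Run perceive -> decide -> act -> verify until the goal is met or steps run out."""
    history: List[Tuple[Action, bool]] = []
    for _ in range(max_steps):
        state = page.perceive()                  # perceive
        if goal_met(state):
            return True, history
        action = decide(state)                   # decide
        if action is None:
            break
        page.act(action)                         # act
        changed = page.perceive() != state       # verify: did the page respond?
        history.append((action, changed))
    return goal_met(page.perceive()), history
```

The max_steps cap matters in practice: without it, an agent that never reaches its goal would loop indefinitely. In a real setup the perceive and act calls would go through Vibium's tools, and decide() would be a model call.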

This differs from traditional automation, where every selector and assertion is written in advance. The agentic approach trades some determinism for adaptability — useful when the UI changes often, or when you want exploratory coverage that no one scripted by hand. Vibium supplies the browser the agent acts in.

How is agentic testing different from scripted testing?

The honest answer is that they solve different problems, and the best suites use both. Scripted tests are precise and repeatable; agentic tests are adaptive and exploratory.

| Aspect | Scripted Vibium tests | Agentic testing |
| --- | --- | --- |
| Steps | Fixed, written ahead of time | Decided at runtime by the agent |
| Reacts to UI changes | Breaks until updated | Adapts within the same goal |
| Determinism | High: same run every time | Lower: model chooses actions |
| Best for | Regression, CI gates | Exploration, smoke, changing flows |
| Maintenance | Update selectors on change | Update the goal/prompt |

Neither replaces the other. Lock down critical regressions with deterministic scripts, and point agents at the parts of the app that move fast or need exploratory eyes.

How do you set up Vibium for an agent to test your app?

Register Vibium's MCP server with your agent host so the model gets a browser as a toolset. In Claude Code:

claude mcp add vibium -- npx -y vibium mcp

Start a fresh session so tool discovery runs, then confirm it connected:

claude mcp list

Now you can ask the agent to test in plain English: "Open the staging site, sign in with the test account, add an item to the cart, and tell me whether the cart count updates." The agent chains Vibium tools to carry it out. For the full host walkthrough see set up Vibium MCP in Claude Code, and for the available tools see the Vibium MCP tools reference.

How does the agent perceive the app under test?

Give the agent a semantic view of each page with Vibium's accessibility tree. The tree exposes roles, names, and state, which is far more stable to reason over than raw HTML and mirrors what assistive technology presents to a user.

from vibium import browser_sync as browser
 
vibe = browser.launch()
vibe.go("https://staging.example.com")
 
tree = vibe.a11y_tree()
# {"role": "WebArea", "children": [
#   {"role": "textbox", "name": "Email"},
#   {"role": "textbox", "name": "Password"},
#   {"role": "button", "name": "Sign in"}, ...]}

From the tree, the agent picks targets and acts using semantic matchers — role, text, label, placeholder, testid — which keeps tests resilient when class names or markup shift. See find an element.
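As a rough illustration of picking targets, the sketch below walks a tree shaped like the sample output above and lists the interactive nodes an agent could act on. The dictionary shape (role, name, children) is taken from that sample; the INTERACTIVE_ROLES set and both function names are assumptions made for this example, not part of Vibium.

```python
from typing import Iterator, List, Tuple

# Assumed set of roles worth acting on; adjust to the app under test.
INTERACTIVE_ROLES = {"textbox", "button", "link", "checkbox", "combobox"}


def iter_nodes(node: dict) -> Iterator[dict]:
    """Depth-first walk over a tree of {"role", "name", "children"} dicts."""
    yield node
    for child in node.get("children", []):
        yield from iter_nodes(child)


def interactive_targets(tree: dict) -> List[Tuple[str, str]]:
    """Return (role, name) pairs for every interactive node, in document order."""
    return [
        (node["role"], node.get("name", ""))
        for node in iter_nodes(tree)
        if node.get("role") in INTERACTIVE_ROLES
    ]


# Sample tree with the same shape as the commented output above.
sample_tree = {
    "role": "WebArea",
    "children": [
        {"role": "textbox", "name": "Email"},
        {"role": "textbox", "name": "Password"},
        {"role": "button", "name": "Sign in"},
    ],
}

targets = interactive_targets(sample_tree)
for role, name in targets:
    print(f"{role}: {name}")
```

Handing the model a short list of (role, name) pairs like this, rather than the full tree, is one way to keep the prompt small on large pages.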

How does the agent drive and verify a flow?

The agent acts, then verifies — reading text back and capturing a screenshot so it can confirm the app behaved, not just that the click landed. Here is the act-and-verify pattern a login smoke check follows:

# Continues the session from the previous snippet: `vibe` is the launched browser.
vibe.find(role="textbox", label="Email").type("qa@example.com")
vibe.find(role="textbox", label="Password").type("secret")
vibe.find(role="button", text="Sign in").click()
 
# Verify the outcome, not just the action
heading = vibe.find("h1").text()
assert "Dashboard" in heading, f"login failed, saw: {heading}"
 
png = vibe.screenshot()          # visual evidence for the report
with open("after-login.png", "wb") as f:
    f.write(png)
 
vibe.quit()

Because Vibium auto-waits for each element to be visible, stable, and enabled before acting, the agent does not need retry loops or sleeps. The timing-related flakiness that plagues UI testing is handled inside Vibium's actions rather than in the test or the prompt. A full login example lives in automate login with Vibium.
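For intuition about what auto-waiting saves you from writing, here is a conceptual sketch of a poll-until-actionable loop. This is not Vibium's implementation, which lives inside the tool itself; wait_until and FakeButton are hypothetical names, and the single is_actionable() check stands in for the visible, stable, and enabled conditions.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout: float = 2.0,
    interval: float = 0.01,
) -> bool:
    """Poll predicate until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeButton:
    """Stand-in element that becomes actionable after a number of polls."""

    def __init__(self, ready_after_polls: int) -> None:
        self.polls = 0
        self.ready_after_polls = ready_after_polls

    def is_actionable(self) -> bool:
        # Stands in for "visible and stable and enabled" on a real element.
        self.polls += 1
        return self.polls > self.ready_after_polls


button = FakeButton(ready_after_polls=3)
if wait_until(button.is_actionable):
    print(f"actionable after {button.polls} polls")
```

A bounded timeout is the important design choice: an element that never becomes ready should produce a clear failure rather than a hung run.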

How do you capture evidence and trace what the agent did?

Capture screenshots at each milestone and, for deeper review, record a trace of the whole session so you (or another reviewer) can scrub through what the agent saw and did. Screenshots from the MCP server save to ~/Pictures/Vibium/ by default. For a frame-by-frame timeline with network and DOM snapshots, Vibium's tracing produces a trace.zip you can open in the Vibium Trace viewer — invaluable when an agentic run finds a bug and you need to show exactly how it happened. Pairing visual evidence with assertions turns an agent's run into a reviewable test artifact rather than an opaque "it worked."
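One way to pair screenshots with assertions is a small evidence log that saves each image under a milestone name and writes a JSON summary next to it. The sketch below assumes the screenshot arrives as PNG bytes, as in the earlier snippet; the EvidenceLog class, its file naming, and the report.json layout are all hypothetical choices for this example.

```python
import json
import re
from pathlib import Path
from typing import List


class EvidenceLog:
    """Collects per-milestone screenshots and pass/fail notes in one folder."""

    def __init__(self, out_dir: str) -> None:
        self.out_dir = Path(out_dir)
        self.out_dir.mkdir(parents=True, exist_ok=True)
        self.entries: List[dict] = []

    def record(self, milestone: str, png_bytes: bytes, passed: bool, note: str = "") -> Path:
        """Save one screenshot and remember the outcome of the check it documents."""
        index = len(self.entries) + 1
        # Turn "After login" into "after-login" for a filesystem-safe name.
        slug = re.sub(r"[^a-z0-9]+", "-", milestone.lower()).strip("-")
        path = self.out_dir / f"{index:02d}-{slug}.png"
        path.write_bytes(png_bytes)
        self.entries.append(
            {"milestone": milestone, "file": path.name, "passed": passed, "note": note}
        )
        return path

    def write_report(self) -> Path:
        """Write all recorded entries to report.json and return its path."""
        report = self.out_dir / "report.json"
        report.write_text(json.dumps(self.entries, indent=2))
        return report
```

In a real run you might call log.record("After login", vibe.screenshot(), passed="Dashboard" in heading) after each verification step, then attach the output folder to the test report.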

When should you use agentic web testing?

Reach for agentic testing when the value is in adaptability and exploration. It excels at smoke-testing fresh builds, walking flows whose UI changes weekly, reproducing user-reported bugs from a description, and discovering issues no scripted test anticipated. Keep deterministic Vibium scripts for the things that must never break — payment paths, auth, core regressions gating a release. A practical split: agents for breadth and change, scripts for depth and stability. As models and tooling mature, expect the agentic layer to take on more of the routine exploration while scripted suites guard the critical core. To build the agent itself, see build an AI agent that browses the web and use Vibium with LangChain.

Frequently asked questions

What is agentic web testing?

Agentic web testing uses an AI agent to explore and verify a web app instead of running only pre-scripted steps. The agent reads each page, decides the next action, and checks outcomes in a real browser — adapting to the UI rather than following a fixed, brittle script.

How does Vibium enable agentic web testing?

Vibium ships a built-in MCP server that exposes browser actions as tools, so an AI agent can navigate, find, click, type, and screenshot a real Chrome browser. Auto-waiting for actionability makes each step reliable, so the agent verifies behavior without flaky timing fixes.

Does agentic testing replace traditional automated tests?

No. Agentic testing complements scripted tests. Use deterministic Vibium scripts for stable, repeatable regression checks, and use AI agents for exploration, smoke checks, and verifying flows that change often. Together they cover both predictable and evolving parts of an app.

Vibium was created by Jason Huggins. This is an independent tutorial; see the official Vibium site and GitHub repo for canonical docs.
