
Fixing Flaky Vibium Tests: The Complete Guide

Fixing flaky Vibium tests: diagnose the root cause, replace sleeps with auto-waiting, stabilize selectors, mock the network, and gate CI on green.

By Pramod Dutta · 15 min read · Verified with Vibium 26.2

Fixing flaky Vibium tests comes down to five moves: diagnose which check is failing, delete manual sleep() calls so Vibium's auto-waiting can do its job, pin selectors and state so nothing races, mock unstable network calls, and gate CI on a genuinely green run. Vibium is AI-native browser automation built on WebDriver BiDi, shipped as a single Go binary that auto-downloads Chrome for Testing — created by Jason Huggins, co-creator of Selenium and Appium. The single most important fact about flakiness in Vibium is that it already auto-waits: before every click, fill, or type it polls the target until it is visible, stable, enabled, and actually receiving events, up to a 30-second default. That means a "flaky" Vibium test is almost never random. It is a fixed sleep that fired too early, a selector that matched a moving or duplicated node, a live API that returned different data or latency, or state leaking from a previous test. This guide walks each cause with runnable fixes in Python and JavaScript, then shows how to lock reliability in with tracing, context isolation, and CI.

What causes flaky tests in Vibium?

Flaky tests fail intermittently without any change to the code under test, and in Vibium the causes cluster into four buckets. Naming the bucket is most of the fix, because each one has a different remedy — cranking a timeout will not fix a shared-state leak, and mocking the network will not fix a bad selector.

The whole method is a five-step pipeline: make the flake reproducible, find the actual failing condition, fix that root cause, isolate the test so nothing external can perturb it, then prove it green across many CI runs. The table below maps each common cause to its real fix.

| Flakiness cause | What it looks like | Root fix |
| --- | --- | --- |
| Manual sleep() racing the page | Passes locally, fails on slow CI | Delete the sleep; let auto-waiting poll |
| Fragile selector | "Element not found" or wrong element clicked | Use role/text/testid semantic selectors |
| Moving or duplicated element | Click lands on the wrong node or misses | Target by shortest text; wait for the settled state |
| Un-mocked network call | Data or timing differs run to run | route() + fulfill() to return fixed JSON |
| Shared state between tests | Order-dependent pass/fail | Fresh browser context per test |
| Unpinned viewport | Layout-dependent assertions drift | Pin window_size / viewport |

How does Vibium's auto-waiting prevent most flakiness?

Vibium's actionability checks are the built-in mechanism that removes most timing flakiness before you write a single line of waiting code. Before it performs an action, Vibium runs a poll loop that re-checks the element until every required condition passes or the timeout expires. For a click, the element must be visible, stable (its bounding box has not moved for 50ms), enabled, and it must receive events — a hit-test at the element's center must return that element, not an overlay on top of it.
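To make the four conditions concrete, here is a minimal Python sketch of the decision the poll loop has to make. It is an illustration under stated assumptions, not Vibium's implementation: the `Snapshot` type, the `first_actionable` function, and the 25ms polling cadence in the sample data are all hypothetical. Only the four checks, the 50ms stability window, and the 30-second default come from the description above.

```python
from dataclasses import dataclass

STABLE_MS = 50  # stability window quoted in the text above


@dataclass(frozen=True)
class Snapshot:
    """One poll of a hypothetical element (not a Vibium type)."""
    t_ms: int            # time of the poll, in milliseconds
    box: tuple           # bounding box as (x, y, width, height)
    visible: bool
    enabled: bool
    hit_is_target: bool  # hit-test at the center returned this element


def first_actionable(snapshots, timeout_ms=30_000):
    """Return the time of the first poll where all four checks pass, else None."""
    last_box = None
    box_since = None  # time at which the current box was first observed
    for s in snapshots:
        if s.t_ms > timeout_ms:
            break
        if s.box != last_box:
            last_box, box_since = s.box, s.t_ms
        stable = s.t_ms - box_since >= STABLE_MS
        if s.visible and stable and s.enabled and s.hit_is_target:
            return s.t_ms
    return None


polls = [
    Snapshot(0, (0, 0, 80, 30), True, True, False),   # an overlay is on top
    Snapshot(25, (0, 40, 80, 30), True, True, True),  # box just moved
    Snapshot(50, (0, 40, 80, 30), True, True, True),  # settled for 25ms
    Snapshot(75, (0, 40, 80, 30), True, True, True),  # settled for 50ms
]
print(first_actionable(polls))  # 75
```

The action fires at the first poll where everything holds, which is the property a fixed delay cannot give you.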

This is why the classic flaky-test antidote from other tools — sprinkling sleep() everywhere — actively hurts you in Vibium. A fixed delay either waits too long (slow suite) or too short (flaky), whereas Vibium's wait resolves the instant the element is genuinely ready.

from vibium import browser_sync as browser
 
vibe = browser.launch()
vibe.go("https://example.com")
 
# No sleep needed — Vibium polls until the button is actionable, then clicks.
vibe.find('button[data-testid="submit"]').click()
 
vibe.quit()

The same behavior holds in JavaScript. Every action inherits the auto-wait, so you describe what to do, not when it is safe to do it.

const { browser } = require('vibium/sync')
 
const bro = browser.launch()
const page = bro.page()
page.go('https://example.com')
 
// Auto-waits until actionable; no timing code required.
page.find('button[data-testid="submit"]').click()
 
bro.close()

Because this logic lives server-side in the Go engine, every client — Python sync, Python async, JS sync, JS async — gets identical timing behavior. A flake that appears in one language is not a client quirk; it is a real page condition worth diagnosing.

How do I reproduce a flaky Vibium test on demand?

You cannot reliably fix a flake you cannot reproduce, so the first step is to force it to fail. Run the test in a tight loop, headless, and count the failures — a test that fails 3 times in 50 runs is far easier to debug once you can trigger it at will.

# Run a single pytest 50 times and count the failures
fails=0
for i in $(seq 1 50); do
  pytest tests/test_checkout.py::test_add_to_cart -q || { echo "FAILED on run $i"; fails=$((fails + 1)); }
done
echo "$fails/50 runs failed"

Run this against your real target and note the failure rate. A high rate points to a systemic issue (a selector, a sleep); a low, timing-sensitive rate often points to network latency or an animation. Reproduce headless, because CI runs headless and some flakes only surface there.

If the test passes 50/50 locally but fails in CI, the difference is the environment — usually slower CPU (which exposes sleeps), a different viewport, or a live network. Those are covered below.
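A failure rate also tells you how many green runs you need before you can trust a fix. The sketch below is a back-of-the-envelope calculation, not anything Vibium provides. It assumes runs are independent, and `runs_needed` is a hypothetical helper name. If a test fails with probability p per run, the chance of n clean runs in a row is (1 - p)^n, so you solve for the n that makes that chance small.

```python
import math


def runs_needed(failure_rate, confidence=0.95):
    """Consecutive green runs needed before a flake with this per-run failure
    rate would have shown up at least once with the given probability."""
    if not 0 < failure_rate < 1:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    # P(no failure in n runs) = (1 - p)^n; solve (1 - p)^n <= 1 - confidence.
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


# The "3 failures in 50 runs" flake mentioned earlier is a 6% failure rate.
print(runs_needed(3 / 50))  # 49
```

In other words, a handful of passing runs proves very little about a 6% flake. Loop the fixed test roughly as many times as you looped the broken one.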

How do I debug what actually failed?

Capture the page state at the exact moment of failure, because a flake's stack trace rarely tells you why the element was not ready. Wrap the test body so that on any exception you save a full-page screenshot and the HTML before re-raising.

from vibium import browser_sync as browser
 
vibe = browser.launch()
try:
    vibe.go("https://example.com")
    vibe.find("#submit").click()
    assert vibe.find("h1").text() == "Thanks"
except Exception:
    png = vibe.screenshot(full_page=True)
    with open("/tmp/flake.png", "wb") as f:
        f.write(png)
    with open("/tmp/flake.html", "w") as f:
        f.write(vibe.content())
    raise
finally:
    vibe.quit()

Open the screenshot and ask the diagnostic questions: Is the element even on screen? Is it greyed out (disabled)? Is a banner or spinner sitting over it? Is the data different from what the assertion expects? The answer routes you to the right section of this guide. The screenshot command returns raw PNG bytes, and content() returns the live DOM — together they freeze the failure for inspection.
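If you repeat that try/except in many tests, it can be folded into a small context manager. The following is a sketch, not an official helper: `evidence_on_failure` and `FakePage` are hypothetical names, and the only page methods it relies on are the `screenshot(full_page=True)` and `content()` calls already used in the example above. The fake page is there so the sketch runs without a browser.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def evidence_on_failure(page, out_dir="artifacts", name="flake"):
    """On any exception, save a screenshot and the HTML, then re-raise."""
    try:
        yield page
    except Exception:
        out = Path(out_dir)
        out.mkdir(parents=True, exist_ok=True)
        (out / f"{name}.png").write_bytes(page.screenshot(full_page=True))
        (out / f"{name}.html").write_text(page.content(), encoding="utf-8")
        raise


class FakePage:
    """Stand-in with the two methods used above (not a Vibium class)."""

    def screenshot(self, full_page=False):
        return b"\x89PNG fake bytes"

    def content(self):
        return "<h1>Oops</h1>"


out_dir = tempfile.mkdtemp()
try:
    with evidence_on_failure(FakePage(), out_dir=out_dir):
        raise AssertionError("simulated flake")
except AssertionError:
    pass  # the original failure still propagates to the caller

print(sorted(p.name for p in Path(out_dir).iterdir()))  # ['flake.html', 'flake.png']
```

Because the exception is re-raised, the test still fails. The helper only guarantees the evidence exists when it does.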

For a step-by-step timeline instead of a single frame, use tracing, which records actions with periodic screenshots so you can replay the run and see precisely where it diverged.

const { browser } = require('vibium/sync')
 
const bro = browser.launch()
const ctx = bro.defaultContext ? bro.defaultContext() : bro
ctx.tracing.start({ screenshots: true, snapshots: true })
 
const page = bro.page()
page.go('https://example.com')
page.find('#submit').click()
 
ctx.tracing.stop({ path: 'trace.zip' })
bro.close()

A trace turns "it fails sometimes" into "it fails at step 4 because the modal was still animating," which is a fixable statement.

How do I fix flaky selectors?

A selector that matches nothing — or matches the wrong node — is the most common non-timing flake, and the fix is to describe the element the way a user perceives it rather than by its position in the DOM. Positional CSS like div.form > button:nth-child(3) breaks the moment the markup shifts, producing an intermittent "element not found" as pages render in slightly different order.

Vibium's find() accepts semantic strategies — role, text, label, placeholder, and testid — that survive markup churn.

# Brittle: positional, breaks when the DOM is reshuffled
vibe.find("div.form > button:nth-child(3)").click()
 
# Resilient: matches by role and visible label
vibe.find(role="button", text="Submit").click()
 
# Most stable of all: pin to an explicit test id
vibe.find(testid="checkout").click()

When you pass text and several elements match, Vibium's pickBest() heuristic prefers the element with the shortest matching text — so a real <button>Submit</button> wins over a <div> that merely contains the word "Submit" in a paragraph. That single behavior eliminates a whole class of "clicked the wrong thing" flakes.
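The idea behind preferring the shortest match fits in a few lines of Python. This is an illustration of the heuristic as described, not Vibium's pickBest() code: `pick_shortest_match` is a hypothetical function, and real matching presumably weighs more than text length.

```python
def pick_shortest_match(candidates, text):
    """Among (tag, visible_text) pairs containing `text`, prefer the shortest text."""
    matches = [c for c in candidates if text in c[1]]
    if not matches:
        return None
    return min(matches, key=lambda c: len(c[1]))


candidates = [
    ("div", "Click Submit to send the form to our team."),
    ("button", "Submit"),
    ("a", "Cancel"),
]
print(pick_shortest_match(candidates, "Submit"))  # ('button', 'Submit')
```

The element whose text is closest to exactly what you asked for wins, which is usually the control itself rather than a container that happens to mention it.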

For a duplicated element (a row that appears in a list), scope the search to its container or index into the matches with first(), last(), or nth() instead of hoping the first global match is the right one. See selector best practices for the full decision tree.

How do I fix flaky clicks on overlays and moving elements?

When a click intermittently misses, Vibium's actionability checks are usually refusing to click through something — and that is correct behavior, not a bug to force past. If a cookie banner, modal, or toast covers your target, the receivesEvents check fails because a hit-test at the element's center returns the overlay. The fix is to dismiss the blocker first, then click the real element.

# Dismiss the overlay, then the real target becomes clickable
vibe.find(role="button", text="Accept").click()
vibe.find("#start-checkout").click()

If a blocker appears unpredictably between runs, dismiss it as the very first step of the flow so no later click is ever obstructed. For an element that keeps moving, the stable check waits for its bounding box to settle for 50ms — so usually you let Vibium wait. When an animation effectively never ends (a spinner, an infinite carousel), target the element that appears after motion stops and let find() auto-wait for that settled state instead. The flaky clicks guide covers each variant in depth.
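The hit-test behind the receivesEvents check is easy to picture with plain rectangles. The sketch below is a simplified model, not the engine's code: the `Box` type, the `z` ordering, and the coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A hypothetical on-screen rectangle; higher z is painted on top."""
    name: str
    x: float
    y: float
    width: float
    height: float
    z: int

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def receives_events(target, boxes):
    """True if the topmost box at the target's center is the target itself.

    `boxes` must include `target`.
    """
    cx = target.x + target.width / 2
    cy = target.y + target.height / 2
    under_point = [b for b in boxes if b.contains(cx, cy)]
    topmost = max(under_point, key=lambda b: b.z)
    return topmost is target


button = Box("start-checkout", 100, 500, 120, 40, z=1)
banner = Box("cookie-banner", 0, 450, 1280, 150, z=10)

print(receives_events(button, [button, banner]))  # False: the banner covers the center
print(receives_events(button, [button]))          # True: banner dismissed
```

A forced click at that point would have landed on the banner, which is why dismissing the blocker is the fix and bypassing the check is not.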

How do I fix flaky tests caused by network calls?

An un-mocked API call is one of the highest-impact sources of flakiness, because a live backend introduces variable latency, rate limits, and changing data — any of which can flip a test between green and red. The durable fix is to intercept the request and return a fixed response, so the UI renders identical data on every run.

Vibium supports Playwright-style network interception: route() registers a handler for matching requests, and fulfill() returns a canned response.

const { browser } = require('vibium/sync')
 
const bro = browser.launch()
const page = bro.page()
 
// Return deterministic JSON instead of hitting the real API.
page.route('**/api/cart', route => {
  route.fulfill({
    status: 200,
    contentType: 'application/json',
    body: JSON.stringify({ items: 2, total: 49.98 }),
  })
})
 
page.go('https://store.example.com/cart')
page.find({ text: '$49.98' })   // stable every run, no live backend variance
bro.close()

When you do need the real request to complete before asserting — for a call you cannot or should not mock — wait for the response explicitly rather than sleeping a guessed number of milliseconds.

// Wait for the real network call, then assert on the rendered result.
page.find('#refresh').click()
page.waitForResponse('**/api/orders')
page.find('#order-count')

Waiting on the actual response event is deterministic; a sleep(2000) is a bet that the call finishes in under two seconds — a bet that loses on a slow CI runner. See monitor network requests and wait for AJAX for the surrounding patterns.
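The difference between waiting on an event and sleeping can be shown without a browser. This Python sketch uses only the standard library. `wait_for_response` and `fake_backend` are hypothetical stand-ins for a real request, not Vibium calls.

```python
import threading


def wait_for_response(start_request, timeout=5.0):
    """Start a request, then block until it completes or the timeout passes."""
    done = threading.Event()
    result = {}

    def on_response(body):
        result["body"] = body
        done.set()  # wake the waiter the moment the response lands

    start_request(on_response)
    if not done.wait(timeout):
        raise TimeoutError(f"no response within {timeout}s")
    return result["body"]


def fake_backend(latency):
    """Hypothetical backend that answers after `latency` seconds."""
    def start(callback):
        threading.Timer(latency, callback, args=({"orders": 3},)).start()
    return start


# Fast or slow, the waiter returns as soon as the response arrives.
print(wait_for_response(fake_backend(0.05)))  # {'orders': 3}
print(wait_for_response(fake_backend(0.30)))  # {'orders': 3}
```

The timeout is only an upper bound. A fast response costs almost nothing, and a slow one is still caught, which a fixed sleep cannot promise in either direction.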

How do I wait for a condition that is not an element?

When the thing you are waiting for is a piece of page state rather than an element becoming clickable, use waitForFunction to poll an arbitrary JavaScript predicate instead of guessing a delay. Auto-waiting covers element actionability, but some flakes hinge on state that no single element expresses — a global flag a script sets, an item count reaching a value, or a spinner class being removed.

waitForFunction re-evaluates your expression in the page until it returns truthy, which turns "wait long enough for the app to finish" into "wait for exactly this condition."

const { browser } = require('vibium/sync')
 
const bro = browser.launch()
const page = bro.page()
page.go('https://app.example.com/dashboard')
 
// Wait until the app signals it has finished hydrating — not a fixed sleep.
page.waitForFunction('window.__APP_READY__ === true')
 
// Or wait until a live list has fully populated.
page.waitForFunction('document.querySelectorAll(".row").length >= 10')
 
page.find('#export').click()
bro.close()

This is the deterministic replacement for the most stubborn sleeps — the ones people add because "the page needs a second to settle." Instead of betting on a duration, you assert the precise post-condition and Vibium proceeds the moment it is true. Reach for waitForFunction only when no element-level wait fits; for the common case, letting find() auto-wait is simpler and just as reliable.
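The same poll-a-predicate pattern is useful outside the browser too, for example when a test waits on a file or a queue. Here is a minimal generic helper in Python. `wait_until` is a hypothetical name, and the 30-second default simply mirrors the action timeout mentioned earlier; it is not a documented Vibium function.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.05):
    """Poll `predicate` until it returns a truthy value; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        value = predicate()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)


rows = []


def list_is_populated():
    rows.append("row")      # simulate the app loading one more row per poll
    return len(rows) >= 10  # the post-condition we are waiting for


print(wait_until(list_is_populated, timeout=2.0, interval=0.01))  # True
print(len(rows))  # 10
```

The helper returns on the first poll where the condition holds, so the wait is exactly as long as the app needs and no longer.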

How do I stop time and animations from causing flakes?

Time-dependent UI — a countdown, a relative timestamp like "2 minutes ago," a toast that auto-dismisses, or an entrance animation — is a quiet but persistent flake source, because the exact moment your assertion runs relative to the clock changes run to run. Vibium can freeze and control the page clock so time-based UI becomes deterministic.

Installing the clock lets you pin Date.now(), freeze timers, and advance time on demand, so a "5 seconds remaining" label reads the same on every run regardless of how fast the machine is.

const { browser } = require('vibium/sync')
 
const bro = browser.launch()
const page = bro.page()
 
// Freeze the clock at a fixed instant before the app reads the time.
page.clock.install({ time: new Date('2026-06-23T12:00:00Z').getTime() })
 
page.go('https://app.example.com')
 
// "2 minutes ago" style labels are now stable — the clock will not tick.
page.find({ text: '12:00 PM' })
bro.close()

For animations specifically, the cleanest fix is often to disable them so nothing is mid-transition when Vibium captures or asserts. Injecting a stylesheet that zeroes transition and animation durations removes motion entirely, which pairs well with the stable actionability check by ensuring elements settle instantly.

// Kill all CSS animation/transition so elements never render mid-motion.
page.addStyle('*,*::before,*::after{transition:none!important;animation:none!important}')

Between controlling the clock and disabling animation, you remove the two most common time-driven flakes without a single sleep(). Use these when a test's failures correlate with timing rather than data or selectors.
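To see why a pinned clock makes relative timestamps deterministic, consider this small Python model. It is a sketch of the principle only: `FrozenClock` and `relative_label` are hypothetical and have nothing to do with how the page clock is implemented.

```python
from datetime import datetime, timedelta, timezone


def relative_label(then, now):
    """Render an 'N minutes ago' label from two timezone-aware datetimes."""
    minutes = int((now - then).total_seconds() // 60)
    if minutes < 1:
        return "just now"
    if minutes == 1:
        return "1 minute ago"
    return f"{minutes} minutes ago"


class FrozenClock:
    """A clock that only moves when the test advances it."""

    def __init__(self, start):
        self._now = start

    def now(self):
        return self._now

    def advance(self, **delta):
        self._now += timedelta(**delta)


clock = FrozenClock(datetime(2026, 6, 23, 12, 0, tzinfo=timezone.utc))
posted = clock.now() - timedelta(minutes=2)

print(relative_label(posted, clock.now()))  # 2 minutes ago
clock.advance(minutes=3)
print(relative_label(posted, clock.now()))  # 5 minutes ago
```

With a real clock, the first label flips to "3 minutes ago" whenever the test happens to cross a minute boundary. With a frozen one, time changes only when the test says so.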

How do I isolate state so tests stop interfering?

Order-dependent flakiness — where a test passes alone but fails in the suite — is almost always shared state: cookies, localStorage, or a logged-in session leaking from one test into the next. The fix is to give every test its own browser context, an isolated cookie jar and storage that starts empty.

from vibium import browser_sync as browser
 
# One browser process, a fresh context (clean state) per test.
bro = browser.launch()
 
def test_adds_to_cart():
    ctx = bro.new_context()
    vibe = ctx.new_page()
    vibe.go("https://store.example.com")
    # ... assert on a guaranteed-empty cart ...
    ctx.close()
 
def test_checkout_flow():
    ctx = bro.new_context()   # isolated from the previous test
    vibe = ctx.new_page()
    # ... starts with no leftover cart or session ...
    ctx.close()

A fresh context per test is cheap — it reuses the same browser process — and it removes an entire category of flakes in one move. If you use pytest, encode this as a fixture so isolation is automatic and impossible to forget; the pytest integration guide shows the exact fixture, including one that screenshots on failure.
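One detail worth guarding against is a context that never gets closed because an assertion failed before `ctx.close()` ran. A small context manager handles that. This is a sketch, assuming only the `new_context()` and `close()` calls used in the example above. `fresh_context`, `FakeBrowser`, and `FakeContext` are hypothetical names, with the fakes standing in for a real browser so the snippet runs anywhere.

```python
from contextlib import contextmanager


@contextmanager
def fresh_context(bro):
    """Yield a new context and always close it, even when the test body raises."""
    ctx = bro.new_context()
    try:
        yield ctx
    finally:
        ctx.close()


class FakeContext:
    """Stand-in for an isolated context: its own cookie jar, closable."""

    def __init__(self):
        self.cookies = {}
        self.closed = False

    def close(self):
        self.closed = True


class FakeBrowser:
    """Stand-in exposing new_context(), as in the example above."""

    def __init__(self):
        self.contexts = []

    def new_context(self):
        ctx = FakeContext()
        self.contexts.append(ctx)
        return ctx


bro = FakeBrowser()
with fresh_context(bro) as first:
    first.cookies["session"] = "abc"
with fresh_context(bro) as second:
    print(second.cookies)  # {}

print([c.closed for c in bro.contexts])  # [True, True]
```

The same try/finally shape is what a pytest fixture with `yield` gives you, so the helper translates directly if you move to fixtures.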

When should I actually retry a flaky test?

Retry at the test level, never at the action level, and only after you have ruled out the fixable causes above. Retrying an individual click() masks a real problem — if the button was not ready, the right fix is a correct wait or selector, not a second attempt that hides the timing bug. But a small number of tests are legitimately non-deterministic (they depend on a third-party service you cannot mock, say), and for those a whole-test retry in your runner is a reasonable pressure valve.

| Situation | Do this | Not this |
| --- | --- | --- |
| Element not ready yet | Let auto-waiting poll | Retry the click in a loop |
| Wrong or duplicated element | Fix the selector | Retry until it "sticks" |
| Variable API data/latency | Mock with route()/fulfill() | Sleep longer |
| Truly external non-determinism | Retry the whole test 1–2× in CI | Retry each action |

The rule of thumb: fix the root cause for the many, and reserve runner-level retries for the genuine few. Blanket retries turn a fast, trustworthy suite into a slow one that hides regressions.
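For the genuine few, a whole-test retry can be as small as a decorator. The sketch below is one possible shape, not a Vibium feature. `retry_test` is a hypothetical name, and runner plugins such as pytest-rerunfailures cover the same ground if you would rather not maintain your own.

```python
import functools


def retry_test(times=2):
    """Re-run a whole test function up to `times` extra attempts on failure."""
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            for attempt in range(times + 1):
                try:
                    return test_fn(*args, **kwargs)
                except Exception:
                    if attempt == times:
                        raise  # out of attempts: surface the real failure
        return wrapper
    return decorator


calls = {"n": 0}


@retry_test(times=2)
def test_third_party_widget():
    calls["n"] += 1
    # Simulate an external service that is down on the first attempt only.
    assert calls["n"] >= 2, "simulated outage on the first attempt"


test_third_party_widget()
print(calls["n"])  # 2
```

Keep the attempt count low and apply the decorator test by test. A retry that is opt-in stays visible in code review, while a global one quietly hides regressions.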

How do I make Vibium tests reliable in CI?

CI is where flakiness surfaces most, because runners are slower and headless, so the durable fixes are to remove every source of non-determinism the environment can perturb. Run headless, pin the viewport so layout-dependent assertions do not drift, mock external calls, isolate each test in its own context, and save evidence on failure.

import os
from vibium import browser_sync as browser

# Headless + pinned viewport = deterministic layout on any runner.
vibe = browser.launch(headless=True, window_size=(1280, 800))
try:
    vibe.go("https://example.com")
    vibe.find(testid="submit").click()
    assert vibe.find("h1").text() == "Thanks"
except Exception:
    # Create the artifacts directory first; open() fails if it is missing.
    os.makedirs("artifacts", exist_ok=True)
    with open("artifacts/flake.png", "wb") as f:
        f.write(vibe.screenshot(full_page=True))
    raise
finally:
    vibe.quit()

Upload artifacts/flake.png (and a trace, if you record one) as CI artifacts so you can diagnose a failure straight from the build log without re-running locally — the single biggest time-saver when a flake only reproduces on the runner. Because Vibium ships as one Go binary that auto-downloads Chrome for Testing, there is no driver-version drift between your machine and CI, which removes a notorious source of "works on my machine" flakiness on its own. The GitHub Actions guide has a ready-to-copy workflow, and flake-free tests collects the broader discipline.

Checklist: is your Vibium test flake-resistant?

  • No sleep() calls — auto-waiting is smarter and faster than any fixed delay.
  • Semantic selectors (role, text, testid) instead of positional CSS.
  • Overlays dismissed first so clicks land on the real target.
  • External network mocked with route() + fulfill(), or waited on with waitForResponse.
  • Fresh context per test so state never leaks between tests.
  • Viewport pinned and runs headless to match CI.
  • Screenshot + trace on failure saved as artifacts for diagnosis.

Work down the list top-first — the earliest items fix the most flakes for the least effort. If you are ever unsure of an option's exact behavior, check the actionability explanation and the official docs at vibium.com rather than guessing.

Frequently asked questions

Why are my Vibium tests flaky?

Most flaky Vibium tests come from timing and environment, not from Vibium itself. The usual causes are manual sleep() calls that race the page, selectors that match a moving or duplicated element, un-mocked network calls with variable latency, and shared state leaking between tests. Vibium auto-waits for actionability, so removing sleeps and isolating state fixes the majority of flakes.

Does Vibium have built-in retries for flaky tests?

Vibium's built-in answer to flakiness is auto-waiting rather than a retry loop: before a click, fill, or type it polls the element until it is visible, stable, enabled, and actually receiving events, up to a 30-second default. That wait removes most timing flakiness on its own. For the rare genuinely non-deterministic test, wrap the whole test in a retry at your test-runner level rather than retrying individual clicks.

How do I stop using sleep() in Vibium tests?

Delete the sleep and let find() and its actions auto-wait. Vibium polls actionability until the element is ready, so find('#submit').click() waits on its own. When you need to wait for a condition rather than an element, use waitForResponse for a network call or waitForFunction for arbitrary page state instead of a fixed delay.

How do I debug a Vibium test that only fails sometimes?

Capture evidence at the moment of failure: wrap the test body in try/except and call screenshot(full_page=True) plus content() before re-raising. Enable tracing to record a step-by-step timeline with screenshots. Then run the test in a loop 20 to 50 times headless to reproduce the flake reliably before you try to fix it.

Can un-mocked API calls make Vibium tests flaky?

Yes. A live backend adds variable latency, rate limits, and changing data, any of which can flip a test between pass and fail. Use page.route() to intercept the request and call route.fulfill() inside the handler to return a fixed JSON response, so the UI renders identical data every run. Mocking the network is one of the highest-impact fixes for flakiness.

How do I make Vibium tests reliable in CI?

Run headless with browser.launch(headless=True), pin the viewport so layout is deterministic, mock external network calls, and isolate each test in a fresh browser context so state never leaks. Save a screenshot and trace as CI artifacts on failure so you can diagnose a flake from the logs without re-running locally.

Vibium was created by Jason Huggins. This is an independent tutorial; see the official Vibium site and GitHub repo for canonical docs.
