How to Scrape Behind a Login with Vibium

Scrape data behind a login with Vibium in Python — log in once, then navigate authenticated pages and extract content while the session cookie carries you through.

By Pramod Dutta · 3 min read · Verified with Vibium 26.2

To scrape behind a login with Vibium, log in once by typing your credentials and submitting, then navigate the protected pages and extract content with find() and findAll() while the session cookie carries your authentication. Because Vibium drives a real Chrome instance, the session cookie set at login is stored automatically, so every subsequent vibe.go() to a protected URL in the same session is already authenticated. That is the whole trick: authenticate once, then scrape freely.

Vibium is AI-native browser automation built on WebDriver BiDi, shipped as a single Go binary that auto-downloads Chrome for Testing. Run pip install vibium and you are set. Created by Jason Huggins, co-creator of Selenium and Appium, Vibium auto-waits for each element to be actionable, so login forms and lazy-loaded protected content both work without manual sleeps.

Always check the site's terms of service and respect its rate limits before scraping.

What is the scrape-behind-login script?

import os
from vibium import browser_sync as browser
 
vibe = browser.launch()
 
# 1. Log in once.
vibe.go("https://app.example.com/login")
vibe.find('input[name="email"]').type(os.environ["APP_USER"])
vibe.find('input[name="password"]').type(os.environ["APP_PASSWORD"])
vibe.find('button[type="submit"]').click()
 
# Wait for an element that only appears once authenticated.
vibe.find('[data-testid="dashboard"]')
 
# 2. The session cookie carries us — navigate to a protected page.
vibe.go("https://app.example.com/reports")
 
# 3. Scrape the protected content.
rows = vibe.findAll('[data-testid="report-row"]')
for row in rows:
    print(row.text())
 
vibe.quit()

The script logs in, waits for a dashboard element to confirm success, then navigates to a protected reports page and extracts every row. No re-login is needed between pages because the browser keeps the session.
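The script above reads APP_USER and APP_PASSWORD straight from os.environ, which raises a bare KeyError if either is unset. As a minimal sketch, a small helper can fail with a clearer message before the browser even launches. The name read_credentials and its env parameter are hypothetical, introduced here for illustration; they are not part of Vibium.

```python
import os


def read_credentials(user_var="APP_USER", password_var="APP_PASSWORD", env=None):
    """Return (user, password) from the environment, or raise with a clear message.

    `env` is a hypothetical hook for passing a plain dict in tests;
    by default the real process environment is used.
    """
    env = os.environ if env is None else env
    # Treat unset and empty values the same way: both are unusable.
    missing = [name for name in (user_var, password_var) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[user_var], env[password_var]
```

With this in place, the login step could call user, password = read_credentials() once and pass the two values to type().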

How does each step work?

  1. vibe.go(login_url) — opens the login page and waits for it to load.
  2. find('input[...]').type(...) — enters credentials read from environment variables, never hard-coded.
  3. find('button[type="submit"]').click() — submits the form; Chrome stores the session cookie on success.
  4. vibe.find('[data-testid="dashboard"]') — waits for an element that only renders when logged in, confirming auth before you continue.
  5. vibe.go(protected_url) — navigates to a gated page; the stored cookie authenticates the request.
  6. vibe.findAll(...) — extracts the protected content into a list you can save or process.

Because find() auto-waits for the logged-in-only element in step 4, the script does not move on to the protected page until login has completed. See How to automate a login flow for the login step in depth.
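Step 6 leaves you with a list of elements to save or process. One way to save them, sketched here with only the standard library, is to write each row's text to a CSV file. The helper name save_rows and the single "text" column are assumptions for illustration; a real report would likely be split into several columns.

```python
import csv


def save_rows(texts, path):
    """Write one scraped string per CSV row and return the number of rows written."""
    # Drop empty strings and trim surrounding whitespace from each value.
    cleaned = [t.strip() for t in texts if t and t.strip()]
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["text"])  # header row
        writer.writerows([t] for t in cleaned)
    return len(cleaned)
```

In the script above this might be called as save_rows([row.text() for row in rows], "reports.csv") just before vibe.quit().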

How do I keep my session alive across pages?

Once you submit the login form, the browser holds the session cookie for the rest of the run, so every vibe.go() to a same-origin protected URL stays authenticated:

# All three pages load authenticated, no re-login needed.
for path in ["/reports", "/invoices", "/settings"]:
    vibe.go(f"https://app.example.com{path}")
    title = vibe.find("h1")
    print(title.text())

This is the core advantage of driving a real browser: the cookie jar behaves exactly as it would for a logged-in human.
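If you want the results of that loop as data rather than printed output, the same pattern can be wrapped in a function. This is a sketch under the assumption that vibe is an already logged-in Vibium session exposing the go(), find(), and text() calls used throughout this guide; the name collect_headings is hypothetical.

```python
from urllib.parse import urljoin


def collect_headings(vibe, base_url, paths, selector="h1"):
    """Visit each path in the same session and return {path: heading text}."""
    headings = {}
    for path in paths:
        # urljoin keeps the scheme and host of base_url and swaps in the path.
        vibe.go(urljoin(base_url, path))
        headings[path] = vibe.find(selector).text().strip()
    return headings
```

Because the function only relies on go() and find(), it can be exercised with a small stand-in object in tests, without launching a browser.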

How do I verify the login succeeded before scraping?

Always confirm authentication before extracting data; otherwise you may scrape a login redirect instead of real content. Read an element that only logged-in users see, or check the URL:

vibe.find('button[type="submit"]').click()
 
# Wait for a logged-in-only element; find() raises if it never appears.
dashboard = vibe.find('[data-testid="dashboard"]')
assert "Welcome" in dashboard.text()
 
print(vibe.url())  # https://app.example.com/dashboard

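If the credentials are wrong, the dashboard element never renders, so find() raises once its wait times out and the script stops before any scraping begins. The assertion is a second check that the element holds the expected text. Either way you get a clear signal instead of silently scraping an error page.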
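The URL check mentioned above can also be made explicit. The following is a rough sketch that treats any URL whose path is the login page as "not logged in". The function name looks_logged_in and the default "/login" path are assumptions based on the example URLs in this guide; adjust them to the site you are working with.

```python
from urllib.parse import urlparse


def looks_logged_in(current_url, login_path="/login"):
    """Return False when the current URL is still (or again) the login page."""
    # Ignore the query string and any trailing slash when comparing paths.
    path = urlparse(current_url).path.rstrip("/") or "/"
    return path != login_path and not path.startswith(login_path + "/")
```

After navigating to a protected page, a call such as looks_logged_in(vibe.url()) returning False suggests the site bounced you back to the login form, for example because the session expired.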

Tips for reliable authenticated scraping

  • Read credentials from environment variables (os.environ) and never commit them to code.
  • Confirm login by waiting on a logged-in-only element before you scrape.
  • Respect rate limits — add pacing between requests so you do not hammer the server.
  • Review the site's terms of service; only scrape data you are authorized to access.
  • Persisting a session across separate script runs is possible via Vibium's cookie and storage APIs; see the official docs at vibium.com for the exact methods in your version.
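The pacing tip can be sketched with the standard library alone. The generator below pauses for a random interval between items, so wrapping a list of paths in it spaces out the page loads. The name paced, the default delays, and the injectable sleep parameter are all illustrative assumptions; pick delays that suit the site's published limits.

```python
import random
import time


def paced(items, min_delay=1.0, max_delay=3.0, sleep=time.sleep):
    """Yield items one at a time, pausing a random interval between them."""
    for index, item in enumerate(items):
        if index > 0:
            # No pause before the first item, only between consecutive ones.
            sleep(random.uniform(min_delay, max_delay))
        yield item
```

In the multi-page loop shown earlier, this would read for path in paced(["/reports", "/invoices", "/settings"]): followed by the same vibe.go() call.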

Frequently asked questions

How do I scrape behind a login with Vibium?

Log in first by finding the username and password fields, typing your credentials, and clicking submit. The browser keeps the session cookie, so any page you navigate to afterward in the same session is authenticated. Then use find() and findAll() to extract the protected content.

Does Vibium keep me logged in across pages?

Yes. Vibium drives a real Chrome browser, so once you submit the login form the session cookie is stored automatically. Every vibe.go() to a protected URL within the same session sends that cookie, which means you stay authenticated without re-entering credentials on each page.

Is it legal to scrape pages behind a login?

It depends on the site's terms of service and your jurisdiction. Scraping content you are authorized to access for your own account is generally lower risk, but always review the site's terms, respect rate limits, and never share or republish data you are not licensed to use.

Vibium is created by Jason Huggins. This is an independent tutorial — see the official Vibium site and GitHub repo for canonical docs.
