
How to Scrape an Infinite-Scroll Page with Vibium

Scrape an infinite-scroll page with Vibium in Python — scroll with evaluate(), wait for new items to render, and loop until no more content loads.

By Pramod Dutta · 4 min read · Verified with Vibium 26.2

To scrape an infinite-scroll page with Vibium, scroll to the bottom with vibe.evaluate(), wait for the new items to render, and repeat until the item count stops growing. Infinite-scroll feeds load more content as you near the bottom, so a single page read only captures the first batch. The reliable pattern is a loop: count the items with findAll(), scroll down, wait for the count to grow, and read again, stopping when two consecutive counts match, which means no more content loaded. Vibium's auto-waiting find() ensures the first batch is present before the loop starts, and evaluate() runs the JavaScript scroll that triggers each fetch. With a max-scroll safety cap, this handles both finite feeds (a news archive) and effectively endless ones (a social timeline) without hanging.

What is the infinite-scroll scraping script?

from vibium import browser_sync as browser
 
ITEM = ".feed-item"
 
vibe = browser.launch()
vibe.go("https://example.com/feed")
 
# Make sure the first batch is present before counting.
vibe.find(ITEM).wait_until("visible")
 
seen = 0
for _ in range(50):  # safety cap on scroll rounds
    count = len(vibe.findAll(ITEM))
    if count == seen:
        break  # no new items loaded — we've reached the end
    seen = count
    vibe.evaluate("window.scrollTo(0, document.body.scrollHeight)")
    vibe.wait(800)  # give the next batch time to fetch and render
 
items = [el.text() for el in vibe.findAll(ITEM)]
print(f"Scraped {len(items)} items")
 
vibe.quit()

The loop counts the current items, scrolls to the bottom to trigger the next fetch, then waits before the next iteration re-counts. When a round adds nothing — count == seen — the feed is exhausted and the loop breaks. The range(50) cap guarantees the script terminates even on an endless timeline.

How does each step work?

  1. vibe.find(ITEM).wait_until("visible") — waits for the first batch to render before any counting, so you don't start the loop against an empty DOM.
  2. len(vibe.findAll(ITEM)) — counts every currently rendered item. findAll() returns immediately with whatever matches right now.
  3. vibe.evaluate("window.scrollTo(0, document.body.scrollHeight)") — jumps to the bottom, which is what triggers the page's "load more" fetch.
  4. vibe.wait(800) — pauses briefly so the async request can complete and append the next batch.
  5. count == seen — the stop condition: if a full round added zero items, you've reached the end.
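
The five steps can be folded into one reusable function. The sketch below is a minimal illustration, not part of Vibium: scroll_until_stable and the FakeFeed stand-in are hypothetical names, and the function assumes only the findAll(), evaluate(), and wait() calls used in the script above. FakeFeed exists so the loop logic can be run without a browser.

```python
SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight)"


def scroll_until_stable(page, selector, max_rounds=50, pause_ms=800):
    """Scroll until a round adds no items.

    Returns (last observed item count, number of scrolls performed).
    """
    seen = 0
    scrolls = 0
    for _ in range(max_rounds):  # safety cap on scroll rounds
        count = len(page.findAll(selector))
        if count == seen:
            break  # the previous scroll added nothing
        seen = count
        page.evaluate(SCROLL_JS)
        page.wait(pause_ms)
        scrolls += 1
    return seen, scrolls


class FakeFeed:
    """Hypothetical stand-in for a page: each scroll reveals one more batch."""

    def __init__(self, total, batch):
        self.total = total
        self.batch = batch
        self.loaded = min(batch, total)

    def findAll(self, selector):
        return list(range(self.loaded))

    def evaluate(self, script):
        self.loaded = min(self.loaded + self.batch, self.total)

    def wait(self, ms):
        pass  # no real delay needed in the stand-in


final_count, scrolls = scroll_until_stable(FakeFeed(total=25, batch=10), ".feed-item")
print(final_count, scrolls)  # prints: 25 3
```

With a real page you would pass the vibe object in place of FakeFeed. The third scroll in the demo is the one that loads nothing, which is how the loop learns the feed has ended.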

How do I wait for new items instead of a fixed delay?

A fixed wait() is simple but either wastes time or races a slow network. A more robust approach waits for the item count to actually increase, so the loop runs as fast as the page allows and never reads too early.

from vibium import browser_sync as browser
 
ITEM = ".feed-item"
 
vibe = browser.launch()
vibe.go("https://example.com/feed")
vibe.find(ITEM).wait_until("visible")
 
seen = 0
for _ in range(50):
    seen = len(vibe.findAll(ITEM))
    vibe.evaluate("window.scrollTo(0, document.body.scrollHeight)")
 
    # Poll until the count grows or we conclude the feed has ended.
    grew = vibe.wait_for_function(
        f"document.querySelectorAll('{ITEM}').length > {seen}"
    )
    if not grew:
        break  # count never increased — end of feed
 
items = vibe.findAll(ITEM)
print(f"Scraped {len(items)} items")
 
vibe.quit()

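wait_for_function() polls a JavaScript condition until it's true or the timeout is hit, so the loop advances the instant the next batch appears rather than after a guessed delay. When the count can no longer grow, the wait times out and you break out of the loop. The script assumes a timed-out wait returns a falsy value; if your Vibium version raises an error instead, catch it and break there. Either way, the loop adapts to fast and slow connections alike.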
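
If wait_for_function() is unavailable in your Vibium version, or you are unsure how it signals a timeout, the same idea can be sketched in plain Python. wait_for_growth below is a hypothetical helper, not a Vibium API: it polls any zero-argument callable that returns the current item count and reports whether the count grew before a deadline.

```python
import time


def wait_for_growth(get_count, previous, timeout_s=5.0, interval_s=0.1):
    """Poll get_count() until it exceeds `previous`.

    Returns True as soon as the count grows, False once the deadline passes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_count() > previous:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Demo with a fake counter that grows on the third poll.
polls = iter([10, 10, 20])
print(wait_for_growth(lambda: next(polls), previous=10, interval_s=0.01))  # prints: True
```

In the scroll loop, get_count could be lambda: len(vibe.findAll(ITEM)), and a False result is the signal to break.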

How do I scrape data while scrolling, not just at the end?

For very long feeds, collect each item's data as you go and dedupe. Items that a virtualized list later removes from the DOM are still captured, and the partial results are already in hand if a later scroll fails.

results = []
seen_ids = set()
 
for _ in range(50):  # safety cap on scroll rounds
    before = len(seen_ids)
    for card in vibe.findAll(".feed-item"):
        item_id = card.attr("data-id")
        if item_id and item_id not in seen_ids:
            seen_ids.add(item_id)
            results.append({"id": item_id, "text": card.text()})
 
    if len(seen_ids) == before:
        break  # this round found no new ids: end of feed
    vibe.evaluate("window.scrollTo(0, document.body.scrollHeight)")
    vibe.wait(800)  # give the next batch time to fetch and render

Reading card.attr("data-id") gives each item a stable key, so the seen_ids set skips duplicates that re-render as you scroll. The loop stops when a round adds no new ids, which stays correct even when a virtualized list keeps the number of DOM items constant. This incremental collection is the safest pattern for endless feeds; the "scrape a table" guide covers the structured-data techniques that pair with it.
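
To make the collected data survive a crash, each round can also be written to disk. The following is a minimal sketch, assuming records shaped like the {"id", "text"} dicts above. append_new and the feed.jsonl path are hypothetical names, and only the standard library is used.

```python
import json
import os
import tempfile


def append_new(records, seen_ids, path):
    """Append records with an unseen id to a JSON Lines file.

    Updates seen_ids in place and returns how many records were written.
    """
    written = 0
    with open(path, "a", encoding="utf-8") as fh:
        for record in records:
            item_id = record.get("id")
            if item_id and item_id not in seen_ids:
                seen_ids.add(item_id)
                fh.write(json.dumps(record, ensure_ascii=False) + "\n")
                written += 1
    return written


# Demo: two overlapping rounds, as a re-rendering feed would produce.
path = os.path.join(tempfile.mkdtemp(), "feed.jsonl")
seen_ids = set()
round_one = [{"id": "a", "text": "first"}, {"id": "b", "text": "second"}]
round_two = [{"id": "b", "text": "second"}, {"id": "c", "text": "third"}]
print(append_new(round_one, seen_ids, path))  # prints: 2
print(append_new(round_two, seen_ids, path))  # prints: 1
```

Appending one JSON object per line means an interrupted run leaves a readable file containing every completed round.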

Tips for reliable infinite-scroll scraping

  • Always cap your scroll rounds so a truly endless feed can't hang the script.
  • Prefer waiting on a condition (wait_for_function or a new element) over a fixed sleep when speed matters.
  • Dedupe by a stable id because virtualized lists re-render the same items as you scroll.
  • Scroll the right container — if the feed scrolls inside a div, scroll that element via evaluate() rather than the window.
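
For the last tip, the script passed to evaluate() has to target the scrolling element instead of window. Here is a minimal sketch of building that script: container_scroll_js is a hypothetical helper and #feed-scroller is an assumed selector, while the JavaScript relies only on standard DOM members (querySelector, scrollTop, scrollHeight). json.dumps quotes the selector so that selectors containing quote characters do not break the script.

```python
import json


def container_scroll_js(selector):
    """Return JavaScript that scrolls the first element matching `selector` to its bottom."""
    quoted = json.dumps(selector)  # produces a valid JavaScript string literal
    return (
        f"(() => {{ const el = document.querySelector({quoted}); "
        "if (el) { el.scrollTop = el.scrollHeight; } })()"
    )


script = container_scroll_js("#feed-scroller")
print(script)
```

The resulting string would replace the window.scrollTo(...) argument in the loops above. Whether evaluate() accepts an immediately invoked function expression like this one is an assumption worth checking against the official docs.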

Frequently asked questions

How do I scrape an infinite-scroll page with Vibium?

Scroll to the bottom with vibe.evaluate(), wait for new items to render, then count the items with findAll(). Loop this scroll-wait-count cycle, stopping when the count stops growing, which signals you have reached the end of the feed and loaded every item.

How do I know when an infinite-scroll page is finished loading?

Track the item count between scrolls with findAll(). After each scroll, wait briefly and re-count; when two consecutive counts are equal, no new content loaded and you have hit the end. A max-scroll cap is a good safety net for very long or endless feeds.

Why does my infinite-scroll scraper miss items?

Usually because it reads the DOM before the next batch renders. Scrolling triggers an async fetch, so count the items, scroll, then wait for the count to increase before reading again. Waiting on the count itself, for example with wait_for_function(), paces the loop more reliably than a fixed delay.

Vibium is created by Jason Huggins. This is an independent tutorial — see the official Vibium site and GitHub repo for canonical docs.
