Vibium

Browser automation for AI agents and humans, built on WebDriver BiDi.
Start here
Read the Introduction for the elevator pitch, or jump straight to the Quickstart to run three commands.
Build a mental model
Getting Started walks through the moving parts. Core Concepts explains the design.
Worked example
The Tutorial fills out a real form end-to-end.
Reference
Browse the Command Reference and the Client Libraries overview.
What Vibium does

Vibium gives an agent or script a real browser session with a compact command surface. The usual loop is to open a page, map the interactive elements, choose the right reference, act on that element, then read back the visible state. The same session can produce screenshots, PDFs, extracted text, and recordings, so the browser state is useful both to humans reviewing a run and to agents planning the next step.
The documentation is organized around that loop. Start with installation when you need the binary and browser runtime, use the quickstart for a short live run, then move to the tutorial when you want a complete form-filling workflow. The command reference is intentionally direct: each page explains one command and its arguments, with a small example that can be copied into an agent or shell session.
Terminology and concepts

The Core Concepts page is the glossary-style reference for the terms used throughout these docs. It explains browser sessions, element references, maps, diffs, waits, assertions, captures, and recordings. Those terms appear in the CLI reference, the MCP integration guide, and the client library overview, so keeping the vocabulary explicit helps agents navigate the site without guessing from surrounding prose.
Agent-readable docs

The site publishes multiple machine-readable entrypoints. /llms.txt is the curated index of clean Markdown sources, /llms-full.txt is the expanded single-file context document, and each documentation route has a corresponding Markdown mirror for tools that prefer the content without navigation chrome. The Markdown mirrors keep frontmatter and sitemap links so crawlers can connect HTML pages, source Markdown, and the public route index.
Expanded Markdown context
# Vibium Documentation Context

Browser automation for AI agents and humans, built on WebDriver BiDi.

This file is a single-document concatenation of the Vibium documentation for agent consumption. It is generated from the Markdown sources by `scripts/build_llms_txt.py`. Edit the sources, not this file.

The spec-compliant llms.txt index is available at `/llms.txt`.

Generated: 2026-05-19 15:19:24 UTC


--- file: README.md ---

# Vibium Documentation

Browser automation for AI agents and humans, built on WebDriver BiDi.

This repository contains the user-facing documentation for [Vibium](https://github.com/VibiumDev/vibium).
The docs are plain Markdown so they render well on GitHub, on a static site generator,
and inside an agent context window.

## Contents

- [Introduction](docs/introduction.md)
- [Installation](docs/installation.md)
- [Quickstart](docs/quickstart.md) — a few lines, copy-paste.
- [Getting Started](docs/getting-started.md) — the mental model.
- [Tutorial: Filling a Form End-to-End](docs/tutorial.md) — a worked example.
- [Core Concepts](docs/concepts.md)
- [Command Reference](docs/commands/index.md)
- [MCP Server Integration](docs/mcp-integration.md)
- [Client Libraries](docs/client-libraries.md)
- [Troubleshooting](docs/troubleshooting.md)
- [FAQ](docs/faq.md)
- [Contributing](docs/contributing.md)

## Site generation

The top-level `Makefile` is the entrypoint for generated docs output:

- `make build` regenerates public site assets and LLM docs, syncs Markdown into
  Starlight, and builds the static site in `site/dist/`.
- `make rebuild` removes generated site output first, then runs the full build.
- `make serve` regenerates content and starts the local Astro dev server.
- `make clean` removes generated public assets, generated LLM assets, generated
  Starlight content, `site/dist/`, and the Astro cache.

Edit canonical brand imagery in `site/src/assets/brand/`. The favicon files in
`site/public/` are generated from the logomark and should not be edited by hand.

The generated `site/public/llms.txt` is a spec-compliant index served at
`/llms.txt`; it links to Markdown copies of the docs under `/llms/`. The
generated `site/public/llms-full.txt` keeps the single-file context form for
agents that prefer one large document. The same generator also writes
`robots.txt`, XML and Markdown sitemaps, and route-level Markdown mirrors used
by agent-readability crawlers.


--- file: docs/introduction.md ---

---
title: Introduction
---

Vibium is a browser automation tool designed for AI agents and humans.
It gives an agent (or a script) a real browser it can drive: navigate to pages,
fill forms, click buttons, extract text, capture screenshots, and record sessions.

## Why Vibium

- **AI-native**. Install Vibium as a skill and an agent immediately gains the
  full browser-automation toolkit, with command names and semantics designed to
  be intuitive for an LLM.
- **Zero configuration**. A single install pulls down Google Chrome for
  Testing. No driver binaries, no profile setup, no protocol shims to glue
  together.
- **Standards-based**. Built on the [WebDriver BiDi](https://w3c.github.io/webdriver-bidi/)
  protocol rather than a vendor-specific debugging protocol.
- **Lightweight**. A single ~10 MB binary with no runtime dependencies.
- **Multi-interface**. Use Vibium from the [CLI](commands/index.md), as an
  [MCP server](mcp-integration.md), or as a
  [client library](client-libraries.md) in JavaScript/TypeScript, Python, or
  Java.

## Who it is for

- Agents (Codex, Claude Code, Cline, Antigravity, Cursor, OpenCode, Pi, Amp)
  that need to act on real web pages.
- Test engineers writing AI-native end-to-end tests.
- Developers and humans who want a friendly CLI for ad-hoc browser tasks.

## Platform support

| Platform                       | Support target |
| ------------------------------ | -------------- |
| Linux (x64)                    | Yes            |
| macOS (x64, Intel)             | Yes            |
| macOS (arm64, Apple Silicon)   | Yes            |
| Windows (x64)                  | Yes            |

## Where next

- [Installation](installation.md) — install the binary and the browser.
- [Quickstart](quickstart.md) — open a page and take a screenshot in 30 seconds.
- [Getting Started](getting-started.md) — the mental model and the core command loop.
- [Tutorial](tutorial.md) — a worked end-to-end example.
- [Command Reference](commands/index.md) — every command, with examples.


--- file: docs/installation.md ---

---
title: Installation
---

Vibium ships as a single self-contained binary. It downloads a managed copy
of Google Chrome for Testing the first time a command needs a browser, so a
fresh install is a one-liner.

## Prerequisites

- Node.js 18+ (only required for the npm-based installer and the JS client)
- A supported platform: Linux x64, macOS x64/arm64, or Windows x64

You do **not** need a pre-installed browser; Vibium downloads Google Chrome
for Testing.

## Install the CLI

```sh
npm install -g vibium
```

This installs the `vibium` binary globally. The first time you run any command
that requires a browser, Vibium downloads its managed Google Chrome for
Testing build. On macOS, the browser appears as "Google Chrome for Testing".

### Zero-install with `npx`

If you don't want to install anything, every command works through `npx`:

```sh
npx -y vibium go https://example.com
npx -y vibium screenshot -o example.png
npx -y vibium text
```

`npx` fetches the package on demand and runs the binary. The first invocation
is a little slower while npm caches the package; subsequent calls are fast.
This is the most ergonomic way to try Vibium, run a one-off in CI, or
script a quick task on a machine where you can't (or don't want to) install
software globally.

For convenience in a shell, alias it:

```sh
alias vibium='npx -y vibium'
```

After that, every example in these docs that says `vibium ...` works as-is.

## Install as an agent skill

If you are setting up Vibium for an AI coding agent (for example Claude Code),
install it as a skill so the agent learns the full command set:

```sh
npx skills add https://github.com/VibiumDev/vibium --skill vibe-check
```

## Install a client library

Pick the language you want to drive Vibium from:

```sh
# JavaScript / TypeScript
npm install vibium

# Python
uv add vibium
```

Java (Gradle):

```gradle
implementation 'com.vibium:vibium:26.3.18'
```

Each client library bundles or locates the same `vibium` binary, so a single
install gives you both the CLI and the programmatic API.

## Verify the installation

```sh
vibium go https://example.com
vibium text
```

If `vibium text` prints the page text, the install succeeded.

## Custom binary path

The Python and Java clients respect the `VIBIUM_BIN_PATH` environment variable,
which lets you point at a custom build of the binary instead of the bundled
copy. This is mostly useful for contributors and CI.

```sh
export VIBIUM_BIN_PATH=/path/to/your/vibium
```
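
A typo in `VIBIUM_BIN_PATH` is easy to miss, so it can help to check the value before starting a client. The sketch below is a standalone helper, not part of the Vibium client API; the name `resolve_custom_binary` is hypothetical, and it only assumes the variable holds a filesystem path as described above.

```python
import os


def resolve_custom_binary(env=None):
    """Return the path from VIBIUM_BIN_PATH if it is usable, else None.

    Raises ValueError when the variable is set but does not point at an
    executable file, which is usually a typo worth catching early.
    """
    env = os.environ if env is None else env
    path = env.get("VIBIUM_BIN_PATH")
    if not path:
        return None  # variable unset: nothing to validate
    if not (os.path.isfile(path) and os.access(path, os.X_OK)):
        raise ValueError(f"VIBIUM_BIN_PATH is not an executable file: {path}")
    return path
```

Calling `resolve_custom_binary()` with no arguments reads the real environment; passing a dictionary makes the check easy to exercise in tests.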

## Updating

Update via the same package manager you used to install:

```sh
npm update -g vibium
# or
uv add --upgrade-package vibium vibium
```

## Uninstalling

```sh
npm uninstall -g vibium
```

The bundled browser lives in Vibium's data directory; remove that directory
to fully reclaim disk space.


--- file: docs/quickstart.md ---

---
title: Quickstart
---

The shortest possible end-to-end session.

## CLI (installed)

```sh
npm install -g vibium
vibium go https://example.com
vibium screenshot -o example.png
vibium text
```

Open a page, save a screenshot, print the page text. That is the entire
quickstart.

> **Why does `vibium text` not need a URL or selector?** Vibium keeps a
> background browser running across commands; later commands act on the
> page opened by the most recent `vibium go`. See
> [Core Concepts](concepts.md) for the full mental model.

What you should see:

- `vibium go https://example.com` — exits with status 0; the browser
  navigates to the page. No output on stdout is normal.
- `vibium screenshot -o example.png` — writes the PNG to
  `~/Pictures/Vibium/example.png` and prints the saved path to stdout. The
  CLI manages screenshot storage for you; `-o` controls the *filename*, not
  the directory. `ls -lh ~/Pictures/Vibium/example.png` should show a
  non-empty PNG.
- `vibium text` — prints the visible page text to stdout. For
  `https://example.com` you'll see the heading "Example Domain" followed by
  the standard placeholder paragraph.

If any of those don't match, see [Troubleshooting](troubleshooting.md).
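
For scripted checks, the `ls -lh` inspection above can be replaced with a small test of the file's first bytes. This is a minimal sketch: `looks_like_png` is a hypothetical helper name, and the only fact it relies on is the standard eight-byte PNG signature.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def looks_like_png(path):
    """Return True if `path` exists and begins with the PNG signature."""
    file = Path(path).expanduser()  # allows "~/Pictures/..." style paths
    if not file.is_file():
        return False
    with file.open("rb") as handle:
        return handle.read(8) == PNG_SIGNATURE
```

With the quickstart defaults, `looks_like_png("~/Pictures/Vibium/example.png")` would be the call to make after `vibium screenshot -o example.png`.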

## CLI (zero-install with `npx`)

If you'd rather not install anything, the same flow works through `npx`:

```sh
npx -y vibium go https://example.com
npx -y vibium screenshot -o example.png
npx -y vibium text
```

This is great for CI jobs, throwaway scripts, and demos. To make the rest of
this guide copy-pasteable, alias it for the current shell:

```sh
alias vibium='npx -y vibium'
vibium go https://example.com
vibium text
```

## JavaScript / TypeScript

```js
import { writeFileSync } from 'node:fs'
import { browser } from 'vibium'

const browserSession = await browser.start()
const vibe = await browserSession.page()

await vibe.go('https://example.com')
const png = await vibe.screenshot()
writeFileSync('example.png', png)

await browserSession.stop()
```

`screenshot()` returns the PNG as bytes; you have to write them to disk
yourself (unlike the CLI's `-o` flag).

## Python

```python
from vibium import browser

browser_session = browser.start()
vibe = browser_session.page()

vibe.go("https://example.com")
text = vibe.text()
print(text)

browser_session.stop()
```

## Java

```java
var browserSession = Vibium.start();
var vibe = browserSession.page();
vibe.go("https://example.com");
var png = vibe.screenshot();
java.nio.file.Files.write(java.nio.file.Path.of("example.png"), png);
browserSession.stop();
```

## Agent skill for Codex

```sh
npm install -g vibium
npx skills add https://github.com/VibiumDev/vibium --skill vibe-check
```

After this, your agent can drive the browser by emitting `vibium ...` commands.

See the [Tutorial](tutorial.md) for a longer worked example.


--- file: docs/getting-started.md ---

---
title: Getting Started
---

This page walks through your first real session with Vibium. It assumes you
have either [installed](installation.md) the `vibium` binary globally or
are running it via `npx -y vibium ...`. The two are interchangeable; pick
whichever you prefer.

> **Tip.** If you don't want to install anything, alias `npx` for the
> session and the examples below work unchanged:
>
> ```sh
> alias vibium='npx -y vibium'
> ```

## The mental model

A Vibium session is a sequence of CLI commands that share a single browser.
Vibium runs a small daemon in the background to keep the browser alive between
commands, so each invocation is fast and you can interleave commands with
your own scripting.

A typical loop looks like:

1. `vibium go <url>` — open a page.
2. `vibium map` — list interactive elements with stable references like `@e1`,
   `@e2`, `@e3`.
3. `vibium click @e2` or `vibium fill @e3 "..."` — act on those references.
4. `vibium text` or `vibium screenshot` — read the result.

This is the same loop an agent uses; the references are designed to be easy
for an LLM to reason about and stable across commands.
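
The four steps above can also be sketched as a script that shells out to the CLI. This is an illustration under assumptions, not a Vibium API: the helper names `vibium` and `run_loop` are hypothetical, the `vibium` binary is assumed to be on `PATH` (a shell alias is not visible to `subprocess`), and `choose_ref` stands in for whatever logic, human or agent, picks a reference from the `map` output.

```python
import subprocess


def vibium(*args, runner=subprocess.run):
    """Run one `vibium` CLI command and return its stdout as text."""
    result = runner(["vibium", *args], capture_output=True, text=True, check=True)
    return result.stdout


def run_loop(url, choose_ref, value, runner=subprocess.run):
    """Open a page, map it, fill one element, and read the result."""
    vibium("go", url, runner=runner)             # 1. open a page
    listing = vibium("map", runner=runner)       # 2. list interactive elements
    ref = choose_ref(listing)                    # pick an @eN from the listing
    vibium("fill", ref, value, runner=runner)    # 3. act on that reference
    return vibium("text", runner=runner)         # 4. read the result
```

The `runner` parameter exists only so the sequence can be tested without a browser; in normal use the default `subprocess.run` is what executes each command.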

## A first session

Open `https://example.com` and grab a screenshot:

```sh
vibium go https://example.com
vibium screenshot -o example.png
```

List the interactive elements on the page:

```sh
vibium map
```

You should see lines like:

```
@e1  link   "More information..."  (https://www.iana.org/...)
```

Click the first link by reference:

```sh
vibium click @e1
```

Wait for the new page to settle, then read its text:

```sh
vibium wait text "IANA"
vibium text
```

## Finding things semantically

CSS selectors are brittle. Vibium prefers semantic locators that match what a
human would describe:

```sh
vibium find text "Sign in"
vibium find label "Email"
vibium find placeholder "Search..."
vibium find role button
```

Each `find` returns one or more `@eN` references you can then `click`, `fill`,
or otherwise act on.

## What to read next

- [Tutorial](tutorial.md) — a complete form-filling walkthrough.
- [Core Concepts](concepts.md) — references, mapping, daemon mode.
- [Command Reference](commands/index.md) — every command in detail.

(If you skipped the [Quickstart](quickstart.md), it's a condensed
copy-paste version of this page.)


--- file: docs/tutorial.md ---

---
title: "Tutorial: Filling a Form End-to-End"
---

This tutorial walks through a realistic Vibium session: open a search engine,
type a query, submit the form, and capture the result. It exercises navigation,
mapping, semantic finding, form filling, waiting, and screenshots.

We'll use [https://duckduckgo.com](https://duckduckgo.com) as the target. Any
search engine will work; just adjust the labels.

If you have Vibium installed globally, the examples run as written. If you'd
rather stay zero-install, run the same commands through `npx`:

```sh
npx -y vibium go https://duckduckgo.com
npx -y vibium find placeholder "Search privately"
# ...etc.
```

Or alias for the session:

```sh
alias vibium='npx -y vibium'
```

## 1. Open the page

```sh
vibium go https://duckduckgo.com
```

Vibium starts the browser if it isn't already running and navigates to the URL.

## 2. Find the search input

There are two equally good ways to locate the search box:

```sh
# By placeholder text
vibium find placeholder "Search privately"

# Or by ARIA role
vibium find role combobox
```

Either returns a reference like `@e1`.

You could also call `vibium map` to list every interactive element on the
page and pick a reference manually.

## 3. Fill the input and submit

```sh
vibium fill @e1 "vibium browser automation"
vibium press Enter
```

`press` sends a literal key event, which is the simplest way to submit a form
that reacts to Enter.

The `@eN` reference comes from the most recent `find` or `map` output. If the
page navigates or re-renders, run `find` or `map` again before reusing it.
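
One way to follow that advice in a script is to look the element up immediately before every use rather than caching the reference. The sketch below assumes that `vibium find` prints its matches to stdout with references in the `@eN` form shown in this tutorial; the helper names are hypothetical and the exact output layout should be checked against a real run.

```python
import re
import subprocess

REF_PATTERN = re.compile(r"@e\d+")


def first_ref(find_output):
    """Return the first @eN token in `find` output, or None if there is none."""
    match = REF_PATTERN.search(find_output)
    return match.group(0) if match else None


def find_and_fill(placeholder, value, runner=subprocess.run):
    """Find an input by placeholder and fill it using the fresh reference."""
    found = runner(["vibium", "find", "placeholder", placeholder],
                   capture_output=True, text=True, check=True)
    ref = first_ref(found.stdout)
    if ref is None:
        raise LookupError(f"no element with placeholder {placeholder!r}")
    runner(["vibium", "fill", ref, value], check=True)
    return ref
```

Because the lookup and the action happen back to back, the reference always comes from the most recent `find` result.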

## 4. Wait for the results

Page transitions are asynchronous, so wait for something result-shaped to
appear before continuing. Either of these works:

```sh
vibium wait text "vibium"
vibium wait "h2"
```

## 5. Read and capture

```sh
vibium text > results.txt
vibium screenshot -o results.png
```

## 6. Record the whole session (optional)

If you want a replay of the whole interaction, wrap it in a recording. Run
the steps individually so each `find` result is visible before you act on
its reference:

```sh
vibium record start
vibium go https://duckduckgo.com
vibium find placeholder "Search privately"   # note the @eN it returns, e.g. @e1
vibium fill @e1 "vibium"
vibium press Enter
vibium wait text "vibium"
vibium record stop   # writes record.zip
```

`record.zip` contains the captured screenshots and is convenient for sharing
failures, debugging tests, attaching to a bug report, or playing back in the
[Vibium Record Player](https://player.vibium.dev/).

## What you just learned

- Drive a real browser with one command per step.
- Locate elements semantically (`find`), not with CSS selectors.
- Reference elements by stable `@eN` IDs.
- Fill, press, wait, and capture without juggling drivers.

Next up: read the [Core Concepts](concepts.md) page to understand how
mapping and references work under the hood, then dive into the
[Command Reference](commands/index.md) for every flag.


--- file: docs/concepts.md ---

---
title: Core Concepts
---

A short tour of the ideas that make Vibium feel different from older browser
automation tools.

## The browser daemon

Vibium runs a long-lived daemon that owns the browser process. Each `vibium`
command is a small client that talks to that daemon over a local socket. Two
practical consequences:

- Commands are **fast** — there is no per-command startup cost.
- State **persists** between commands — cookies, the current page, the active
  tab, scroll position, and element references all carry over.

The daemon shuts down when you explicitly stop it or when the session ends.

From a script using a client library, always pair `browser.start()` with a
matching `browserSession.stop()` (or the language's equivalent) so the daemon
doesn't outlive the script. Otherwise, running the same script twice in a row
leaves orphaned browser processes behind.
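
In Python, one way to guarantee the pairing is a context manager that calls `stop()` even when the body raises. This is a minimal sketch: `managed_session` is a hypothetical helper, and it assumes only what the examples in these docs show, namely that `start()` returns an object with a `stop()` method.

```python
from contextlib import contextmanager


@contextmanager
def managed_session(start):
    """Start a browser session and guarantee `stop()` runs on exit."""
    session = start()
    try:
        yield session
    finally:
        session.stop()  # runs even if the body raised


# Usage, assuming the `vibium` package is installed:
#     from vibium import browser
#     with managed_session(browser.start) as browser_session:
#         vibe = browser_session.page()
#         vibe.go("https://example.com")
```

Passing the `start` callable in, instead of importing it inside the helper, keeps the wrapper independent of the client library.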

## Element references (`@eN`)

Most UI automation tools want a CSS selector for every interaction. Vibium
takes a different approach: it numbers the interactive elements on the current
page and lets you refer to them by short, stable IDs.

```
@e1  link    "Sign in"
@e2  input   placeholder="Email"
@e3  button  "Continue"
```

You get these IDs by running [`vibium map`](commands/map.md), or by calling
[`vibium find ...`](commands/find.md) which returns a reference for each match.

References are stable across commands as long as the page does not change
substantially. Each `map` or `find` refreshes the current reference set, so
an `@eN` reference only means what it meant in the last result you saw. When
the DOM shifts, run `map` again (or `diff map` to see what moved) to refresh
them.
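
A script that consumes a listing like the one above might parse it into a lookup table keyed by reference. The sketch assumes each line has the shape `@eN`, a kind, then free-form detail, as in the sample; the real output format may differ, and the names `ElementRef`, `parse_listing`, and `by_ref` are hypothetical.

```python
import re
from typing import NamedTuple

LINE_PATTERN = re.compile(r"^(@e\d+)\s+(\S+)\s+(.*)$")


class ElementRef(NamedTuple):
    ref: str          # e.g. "@e2"
    kind: str         # e.g. "input"
    description: str  # remainder of the line, left unparsed


def parse_listing(text):
    """Parse listing lines into ElementRef tuples, skipping non-matching lines."""
    refs = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            refs.append(ElementRef(*match.groups()))
    return refs


def by_ref(refs):
    """Index parsed entries by their @eN reference."""
    return {item.ref: item for item in refs}
```

Because each `map` or `find` redefines the reference set, a table built this way should be discarded and rebuilt after every new listing rather than kept across page changes.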

## Semantic finding

Vibium's `find` subcommands match elements the way a human would describe
them: visible text, form labels, placeholders, ARIA roles. CSS selectors are
intentionally not the primary interface — they are brittle and they don't
match how an agent reads a page.

| Subcommand                          | Matches                                |
| ----------------------------------- | -------------------------------------- |
| `vibium find text "Sign in"`        | Visible text content                   |
| `vibium find label "Email"`         | Inputs whose label is "Email"          |
| `vibium find placeholder "Search"`  | Inputs with that placeholder           |
| `vibium find role button`           | Elements with that ARIA role           |

## Verbs and subverbs

A few Vibium commands are actually small command groups:

- `vibium find` has subcommands `text`, `label`, `placeholder`, `role`.
- `vibium wait` is overloaded — `vibium wait "<selector>"` waits for a CSS
  selector, while `vibium wait text "<text>"` and `vibium wait url "<path>"`
  use named subcommands.
- `vibium record` has `start` and `stop`.

That means `vibium wait "h2"` and `vibium wait text "h2"` do different
things: the first waits for any element matching the CSS selector `h2`, the
second waits for the literal string `h2` to appear in the visible page.
When in doubt, the [Command Reference](commands/index.md) shows the
exact synopsis for each command.
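
When commands are generated by code, making the wait kind explicit avoids mixing up the two forms. The helper below is a hypothetical sketch that only builds the argument list for the three `wait` forms named above; it does not run anything.

```python
def wait_args(target, kind="selector"):
    """Build the argument list for one `vibium wait` call.

    kind="selector" -> vibium wait "<selector>"
    kind="text"     -> vibium wait text "<text>"
    kind="url"      -> vibium wait url "<path>"
    """
    if kind == "selector":
        return ["vibium", "wait", target]
    if kind in ("text", "url"):
        return ["vibium", "wait", kind, target]
    raise ValueError(f"unknown wait kind: {kind!r}")
```

The returned list can be handed to `subprocess.run` as-is, which also sidesteps shell quoting for targets that contain spaces.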

## Standards-based protocol

Under the hood, Vibium speaks [WebDriver BiDi](https://w3c.github.io/webdriver-bidi/),
the W3C bidirectional WebDriver protocol. That means:

- It is a **standard**, not a vendor-specific debugging protocol.
- Future browser support comes "for free" as more browsers ship BiDi.
- You can mix Vibium with other BiDi-aware tools if you ever need to.

## Capture vs. interaction

Vibium splits into two clean halves:

- **Interaction** — `go`, `click`, `fill`, `select`, `check`, `press`, `wait`.
- **Capture** — `text`, `screenshot`, `pdf`, `eval`, `record`.

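This makes it easier to reason about side effects: capture commands read the
page without changing it, while interaction commands may change it. The one
caveat is `eval`, which runs arbitrary JavaScript and can therefore modify the
page.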

## MCP server mode

`vibium mcp` starts an MCP (Model Context Protocol) server that exposes the
same commands as MCP tools. Plug it into Codex, Claude Code, Cline, Cursor, or
another MCP-aware client and the browser becomes part of the agent's tool
inventory.
See [MCP Server Integration](mcp-integration.md).


--- file: docs/mcp-integration.md ---

---
title: MCP Server Integration
---

Vibium ships an MCP (Model Context Protocol) server so AI agents can drive
the browser as a first-class tool, alongside their other tools.

## What you get

When Vibium is registered as an MCP server, the agent gains tools that map
1:1 to the CLI commands: navigation, mapping, finding, clicking, filling,
capture, and so on. The agent can use them directly without spawning shell
subprocesses.

## Registering Vibium

### Claude Code

```sh
claude mcp add vibium -- npx -y vibium mcp
```

### Gemini CLI

```sh
gemini mcp add vibium npx -y vibium mcp
```

### Other MCP-aware clients

Any client that can spawn an MCP server over stdio can use Vibium. The
command to spawn is:

```sh
npx -y vibium mcp
```

or, if you have already installed Vibium globally:

```sh
vibium mcp
```
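
Many MCP clients read server definitions from a JSON file instead of a CLI subcommand. The sketch below generates such a fragment from the spawn commands above. The `mcpServers` layout is an assumption based on the convention several clients use, not something Vibium defines, and `mcp_server_entry` is a hypothetical helper.

```python
import json


def mcp_server_entry(use_npx=True):
    """Return a config fragment describing how to spawn the Vibium MCP server."""
    if use_npx:
        command, args = "npx", ["-y", "vibium", "mcp"]
    else:
        command, args = "vibium", ["mcp"]  # requires a global install
    # "mcpServers" is the key several MCP clients use; yours may differ.
    return {"mcpServers": {"vibium": {"command": command, "args": args}}}


print(json.dumps(mcp_server_entry(), indent=2))
```

Check your client's documentation for the exact file location and key names before pasting the output.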

## What runs where

The MCP server runs locally as a subprocess of your client. It manages the
same browser daemon the CLI uses, so:

- CLI commands and MCP-driven commands share state.
- A screenshot or recording started from one interface is visible to the other.
- Stopping the MCP server ends only that client's connection; the browser
  daemon keeps running.

## Using it from an agent

Inside an agent, the tools generally appear with names matching the CLI
verbs (`go`, `map`, `find`, `click`, `fill`, `screenshot`, `text`, …). The
agent's tool-use loop is:

1. Call `go` with a URL.
2. Call `map` (or `find`) to discover references.
3. Call `click` / `fill` / `select` to interact.
4. Call `text` / `screenshot` to read the result.

This is the same loop documented in [Getting Started](getting-started.md);
MCP just removes the shell from the middle.

## Troubleshooting

If the agent reports that the tool isn't registered, confirm with your
client's `mcp list` (or equivalent) command, then try re-registering. If the
server fails to start, run `vibium mcp` directly to surface any error
messages, and check that the bundled browser has been downloaded by running
a quick `vibium go https://example.com` first.


--- file: docs/client-libraries.md ---

---
title: Client Libraries
---

Vibium provides first-class libraries for JavaScript/TypeScript, Python, and
Java. Each one wraps the same underlying binary, so the behavior matches the
CLI exactly.

## Installation

```sh
# JavaScript / TypeScript
npm install vibium

# Python
uv add vibium
```

Java (Gradle):

```gradle
implementation 'com.vibium:vibium:26.3.18'
```

## JavaScript / TypeScript (async)

```js
import { browser } from 'vibium'

const browserSession = await browser.start()
const vibe = await browserSession.page()

await vibe.go('https://example.com')
const png = await vibe.screenshot()

await browserSession.stop()
```

The JavaScript client also exposes a synchronous flavor that works well in a
Node REPL — see the project README for details.

## Python (sync)

```python
from vibium import browser

browser_session = browser.start()
vibe = browser_session.page()

vibe.go("https://example.com")
text = vibe.text()
print(text)

browser_session.stop()
```

The Python client also has an async flavor; the API is the same with `await`
in front of every call.

The Python client locates the bundled Vibium binary automatically. To use a
custom build, set:

```sh
export VIBIUM_BIN_PATH=/path/to/your/vibium
```

## Java

```java
var browserSession = Vibium.start();
var vibe = browserSession.page();

vibe.go("https://example.com");
var png = vibe.screenshot();

browserSession.stop();
```

The published Maven Central artifact bundles native binaries for every
supported platform.

## Mapping CLI commands to library calls

The libraries mirror the CLI:

| CLI                                | Library (Python sync, illustrative)                     |
| ---------------------------------- | ------------------------------------------------------- |
| `vibium go <url>`                  | `vibe.go(url)`                                          |
| `vibium map`                       | `vibe.map()`                                            |
| `vibium find text "<text>"`        | `vibe.find_text(text)`                                  |
| `vibium click @e2`                 | `vibe.click("@e2")`                                     |
| `vibium fill @e3 "<value>"`        | `vibe.fill("@e3", value)`                               |
| `vibium text`                      | `text = vibe.text()`                                    |
| `vibium eval "<js>"`               | `vibe.eval(js)`                                         |

Refer to each language's package documentation for exact method names — the
shape of the API is the same across all three.

## Lifecycle

- `browser.start()` boots a browser process (or attaches to a running one).
- `browserSession.page()` opens a new tab and returns a handle.
- `browserSession.stop()` shuts the browser down cleanly.

You generally want one `browser.start()` per process and one `page()` per
logical session.


--- file: docs/troubleshooting.md ---

---
title: Troubleshooting
---

Quick fixes for the most common issues.

## "command not found: vibium"

The npm global `bin` directory isn't on your `PATH`. Either add it
(`npm config get prefix` will show you where), or skip the global install
entirely and use `npx`:

```sh
npx -y vibium go https://example.com
```

If you plan to use Vibium repeatedly in a session, alias it once:

```sh
alias vibium='npx -y vibium'
```
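
Scripts that call the CLI through `subprocess` cannot see a shell alias, so they need the same fallback spelled out. A minimal sketch, with the hypothetical helper name `vibium_command`, that prefers a global install and falls back to `npx`:

```python
import shutil


def vibium_command(which=shutil.which):
    """Return the argv prefix for running Vibium on this machine."""
    if which("vibium"):
        return ["vibium"]                # global install is on PATH
    if which("npx"):
        return ["npx", "-y", "vibium"]   # zero-install fallback
    raise FileNotFoundError("neither `vibium` nor `npx` was found on PATH")
```

A caller would then run, for example, `subprocess.run([*vibium_command(), "go", "https://example.com"], check=True)`.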

## The browser doesn't appear

By default Vibium runs a visible browser. If you don't see a window:

- You may be on a headless host (e.g. a CI runner or a remote server with no
  display). That's expected; capture commands like `screenshot` and `text`
  still work.
- Google Chrome for Testing may still be downloading on first use. Re-run the
  command after it finishes.

## "no element matches" from `find`

Vibium matches semantically: visible text, label, placeholder, role. If
nothing matches:

- The page may not be ready yet — try [`vibium wait`](commands/wait.md) before
  finding.
- The text may differ from what you expect — `vibium text` will show you the
  actual rendered content.
- The element may be inside a closed `<details>`, a hidden tab, or a shadow
  root that requires scrolling or expanding first.

## A reference like `@e3` doesn't work anymore

References are stable while the page is unchanged, and each `find` or `map`
output defines the current `@eN` set. If the page navigated or re-rendered,
run [`vibium map`](commands/map.md) (or [`vibium diff map`](commands/diff.md))
to refresh. Get into the habit of running `wait` after any action that
triggers navigation.

## Custom binary path

The Python and Java clients honor `VIBIUM_BIN_PATH` if you need to point at a
locally built binary:

```sh
export VIBIUM_BIN_PATH=/path/to/your/vibium
```

## MCP server won't start

Run the server directly to see the error message:

```sh
vibium mcp
```

If the bundled browser hasn't been downloaded yet, run any normal command
first (for example `vibium go https://example.com`) so the download
completes, then restart your MCP client.

## Recordings are huge

`vibium record` captures a screenshot per step. Long sessions produce big
zips. If you only need a final snapshot, use [`vibium screenshot`](commands/screenshot.md)
instead of `record`.
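
To see what is taking up space in a recording, the archive can be inspected with Python's standard `zipfile` module. This sketch makes no assumption about the names or layout of the entries inside `record.zip`; it only reports counts and sizes, and `summarize_recording` is a hypothetical helper.

```python
import zipfile


def summarize_recording(path, top=5):
    """Return (entry_count, total_uncompressed_bytes, largest_entries)."""
    with zipfile.ZipFile(path) as archive:
        infos = [info for info in archive.infolist() if not info.is_dir()]
    total = sum(info.file_size for info in infos)
    largest = sorted(infos, key=lambda info: info.file_size, reverse=True)[:top]
    return len(infos), total, [(info.filename, info.file_size) for info in largest]
```

Calling `summarize_recording("record.zip")` after `vibium record stop` shows how many entries the session produced and which ones dominate the total.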

## Where to ask for help

- **Bugs in the `vibium` binary, library, or MCP server** — file an issue at
  <https://github.com/VibiumDev/vibium/issues>.
- **Mistakes, gaps, or unclear writing in these docs** — file an issue
  against this docs repository.

When reporting a runtime bug, please include:

- Your platform and architecture (`uname -a` or equivalent).
- The exact command you ran and the full output.
- The Vibium version (`vibium --version`).
- A `record.zip` from a `vibium record` session reproducing the problem,
  if you can.
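
The first and third items in that list can be gathered automatically. The sketch below uses Python's standard `platform` module and the `vibium --version` command mentioned above; `bug_report_header` is a hypothetical helper, and the output layout is only a suggestion.

```python
import platform
import subprocess


def bug_report_header(runner=subprocess.run):
    """Collect platform, architecture, and Vibium version as report lines."""
    try:
        result = runner(["vibium", "--version"],
                        capture_output=True, text=True, check=True)
        version = result.stdout.strip()
    except (OSError, subprocess.CalledProcessError) as error:
        version = f"unavailable ({error})"  # e.g. binary not on PATH
    return "\n".join([
        f"Platform: {platform.platform()}",
        f"Architecture: {platform.machine()}",
        f"Vibium version: {version}",
    ])
```

Paste the returned text at the top of the issue, followed by the exact command and its full output.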


--- file: docs/faq.md ---

---
title: FAQ
---

## How is Vibium different from Selenium / Playwright / Puppeteer?

Vibium targets **AI agents first** and humans second. Its surface area is
small and verb-shaped (`go`, `map`, `click`, `fill`), its element references
are short and human-readable (`@e1`), and it favors semantic locators
(text, label, placeholder, role) over CSS selectors. The result is a tool an
LLM can use correctly with very little prompting.

Under the hood it speaks W3C [WebDriver BiDi](https://w3c.github.io/webdriver-bidi/),
the same standards-track protocol the wider browser-automation ecosystem is
moving toward.

## Which browsers does it support?

Vibium ships with its own managed Google Chrome for Testing build, downloaded
automatically on first use. Future browser support will follow as more
browsers ship BiDi.

## Do I need to install a separate driver?

No. Vibium is one ~10 MB binary. There is no `chromedriver`, no `geckodriver`,
no profile directory you have to maintain.

## Do I need to install Vibium at all?

No. Every command works through `npx`:

```sh
npx -y vibium go https://example.com
npx -y vibium screenshot -o page.png
npx -y vibium text
```

This is convenient for CI jobs, throwaway scripts, sandboxes, or any host
where you'd rather not install software globally. See
[Installation](installation.md) for details.

## Can I run Vibium headlessly on CI?

Yes. The browser will run without a visible window on hosts without a display.
Capture commands (`screenshot`, `text`, `pdf`) work the same way as on a
desktop.

## Is there a Python / TypeScript / Java SDK?

Yes — see [Client Libraries](client-libraries.md). All three wrap the same
binary, so behavior is identical across languages.

## How do I plug it into Codex or another agent?

Register Vibium as an MCP server. See [MCP Server Integration](mcp-integration.md).

## Why semantic locators instead of CSS selectors?

CSS selectors are brittle: a refactor of class names can break every test that depends on them.
They are also hard for an LLM to produce reliably. Semantic locators
(`find text "Sign in"`, `find label "Email"`, `find role button`) describe
what a human would describe, which is also what an LLM tends to produce
naturally.

`vibium eval` is still available when you genuinely need a CSS or XPath
selector for something the semantic API can't express.

## What's the license?

Apache 2.0.

## Where does Vibium store data?

The bundled browser and any cached state live in Vibium's data directory.
Removing that directory wipes Vibium's local state without affecting your
system Chrome.


--- file: docs/commands/index.md ---

---
title: Overview
sidebar:
  order: 0
---

The Vibium CLI commands you'll reach for in day-to-day use, grouped by
purpose. Each entry has its own page with syntax, flags, and examples.

> **Scope.** This reference covers the commands needed to drive the
> quickstart, the tutorial, and the typical agent loop. The `vibium`
> binary exposes additional lower-level commands (run `vibium --help`
> to list them) — they're useful escape hatches but are not part of the
> documented surface yet.

## Navigation

- [`vibium go`](go.md) — navigate to a URL.

## Mapping & references

- [`vibium map`](map.md) — list interactive elements as `@eN` references.
- [`vibium diff map`](diff.md) — show how the page has changed since the last map.

## Finding elements

- [`vibium find`](find.md) — locate elements semantically by text, label,
  placeholder, or ARIA role.

## Interacting

- [`vibium click`](click.md) — click an element by reference.
- [`vibium fill`](fill.md) — type text into an input.
- [`vibium select`](select.md) — choose a dropdown option.
- [`vibium check`](check.md) — set a checkbox to checked.
- [`vibium press`](press.md) — send a keystroke.

## Waiting

- [`vibium wait`](wait.md) — block until an element, URL, or text appears.

## Capture

- [`vibium text`](text.md) — extract page text.
- [`vibium screenshot`](screenshot.md) — save a PNG screenshot.
- [`vibium pdf`](pdf.md) — save the page as a PDF.
- [`vibium eval`](eval.md) — run JavaScript in the page.

## Recording

- [`vibium record`](record.md) — start/stop a session recording.

## Agent integration

- [`vibium mcp`](mcp.md) — start the MCP server.

## Conventions

- Arguments shown as `<x>` are required; `[x]` are optional.
- `@eN` always refers to a numeric element reference returned by `map` or `find`.
- All commands share a single browser/daemon, so state persists across calls.
- Output goes to stdout unless a `-o` flag specifies a file. Most commands print plain text; `screenshot` and `pdf` write binary data.

## Running without installing

Every command on every page below also works through `npx` — no global
install required:

```sh
npx -y vibium go https://example.com
npx -y vibium map
npx -y vibium screenshot -o page.png
```

For a session-wide alias:

```sh
alias vibium='npx -y vibium'
```


--- file: docs/commands/go.md ---

---
title: vibium go
---

Navigate the active tab to a URL.

## Synopsis

```
vibium go <url>
```

## Description

Loads `<url>` in the current page. If no browser is running, Vibium starts one
first; otherwise it reuses the existing session.

The command returns once the navigation has committed. To wait for specific
content to be ready, follow up with [`vibium wait`](wait.md).
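
The go-then-wait pattern described above can be wrapped in a small function. This is a minimal sketch: `open_and_wait` is a hypothetical name, and it uses only the `go` and `wait text` forms documented in this reference.

```sh
# Hypothetical helper: navigate to $1, then block until the text $2 is visible.
# Returns non-zero if either step fails (for example, if the wait times out).
open_and_wait() {
  vibium go "$1" || return 1
  vibium wait text "$2"
}
```

For example, `open_and_wait https://example.com "Example Domain"` returns only after that heading text is visible.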

## Examples

```sh
vibium go https://example.com
vibium go https://github.com/VibiumDev/vibium
```

## See also

- [`vibium wait`](wait.md) — block on a URL change or element appearance.
- [`vibium map`](map.md) — list elements on the new page.


--- file: docs/commands/map.md ---

---
title: vibium map
---

List the interactive elements on the current page as numbered references.

## Synopsis

```
vibium map
```

## Description

Scans the current page for interactive elements (links, buttons, inputs,
selects, checkboxes, role-based widgets) and prints each one with a stable
reference of the form `@eN`. You then pass those references to interaction
commands like [`click`](click.md), [`fill`](fill.md), or [`select`](select.md).

References stay valid as long as the page they were mapped from is unchanged.
When the DOM changes (navigation, dynamic re-render, etc.), call `map` again,
or use [`diff map`](diff.md) to see only what changed.

## Output

Each line contains:

- The reference (`@e1`, `@e2`, …)
- The element role (`link`, `button`, `input`, `select`, …)
- A short human-readable description (visible text, label, placeholder, etc.)

Example:

```
@e1  link    "Sign in"
@e2  input   placeholder="Email"
@e3  input   placeholder="Password"
@e4  button  "Continue"
```
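
Scripts often need to pull a single reference out of this output. The sketch below assumes the whitespace-separated column layout shown in the example above. `ref_for` is a hypothetical helper, not a Vibium command, and the sample data is copied from that example rather than from a live page.

```sh
# Hypothetical helper: print the @eN reference of the first line whose
# role equals $1 and whose text contains $2.
# Reads map-formatted lines on stdin.
ref_for() {
  awk -v role="$1" -v needle="$2" '
    $1 ~ /^@e[0-9]+$/ && $2 == role && index($0, needle) { print $1; exit }
  '
}

# Sample input copied from the example output above.
sample_map='@e1  link    "Sign in"
@e2  input   placeholder="Email"
@e3  input   placeholder="Password"
@e4  button  "Continue"'

printf '%s\n' "$sample_map" | ref_for input Password   # prints @e3
```

Against a live page the same filter would read from `vibium map` instead of the sample string.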

## Examples

```sh
vibium go https://example.com
vibium map
```

## See also

- [`vibium diff map`](diff.md)
- [`vibium find`](find.md)
- [`vibium screenshot`](screenshot.md)


--- file: docs/commands/diff.md ---

---
title: vibium diff map
---

Show how the page's interactive elements have changed since the last
[`map`](map.md).

## Synopsis

```
vibium diff map
```

## Description

After you run `vibium map`, Vibium remembers the snapshot of `@eN` references.
`vibium diff map` compares the current page against that snapshot and prints
only the differences:

- elements that appeared
- elements that disappeared
- elements whose description changed

This is the fast way to confirm that an interaction actually changed the page,
or to find a newly revealed widget (a modal, an autocomplete, an expanded
section) without re-reading the full map.
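
One way to use this as a guard is sketched below. `click_and_diff` is a hypothetical helper. It assumes that `vibium map` has already been run so a snapshot exists, and that `vibium diff map` prints nothing when the page is unchanged; this page does not confirm that second point.

```sh
# Hypothetical helper: click $1, then print what changed.
# ASSUMPTION: `vibium diff map` prints nothing when the page is unchanged.
click_and_diff() {
  vibium click "$1" || return 1
  changes=$(vibium diff map) || return 1
  if [ -z "$changes" ]; then
    echo "click on $1 did not change the map" >&2
    return 1
  fi
  printf '%s\n' "$changes"
}
```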

## Examples

```sh
vibium go https://example.com
vibium map
vibium click @e1
vibium diff map
```

## See also

- [`vibium map`](map.md)
- [`vibium wait`](wait.md)


--- file: docs/commands/find.md ---

---
title: vibium find
---

Locate elements by a semantic attribute and return their references.

## Synopsis

```
vibium find text "<text>"
vibium find label "<label>"
vibium find placeholder "<text>"
vibium find role <role>
```

## Description

`find` matches elements the way a human would describe them, returning one or
more `@eN` references you can pass to interaction commands.

| Variant                              | Matches                                                                 |
| ------------------------------------ | ----------------------------------------------------------------------- |
| `vibium find text "<text>"`          | Elements whose visible text contains `<text>`.                          |
| `vibium find label "<label>"`        | Form fields whose associated label text is `<label>`.                   |
| `vibium find placeholder "<text>"`   | Inputs with that placeholder.                                           |
| `vibium find role <role>`            | Elements with the given ARIA role (`button`, `searchbox`, `link`, …).   |

Each match is printed in the same form as [`vibium map`](map.md): a line per
element with its `@eN` reference, role, and a short description.
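
Because `find` can return more than one match, a script may want to refuse to act when the result is ambiguous. The following is a sketch under that assumption. `fill_by_label` is a hypothetical helper, and it relies on each match line starting with its `@eN` reference as described above.

```sh
# Hypothetical helper: fill the single field whose label is $1 with value $2.
# ASSUMPTION: `vibium find` prints one line per match, starting with @eN.
fill_by_label() {
  refs=$(vibium find label "$1" | awk '$1 ~ /^@e[0-9]+$/ { print $1 }')
  count=$(printf '%s\n' "$refs" | awk '/^@e/ { n++ } END { print n + 0 }')
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one match for label \"$1\", got $count" >&2
    return 1
  fi
  vibium fill "$refs" "$2"
}
```

Failing on zero or multiple matches keeps a script from typing into the wrong field when a page has two similarly labeled inputs.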

## Examples

```sh
vibium find text "Sign in"
vibium find label "Email"
vibium find placeholder "Search..."
vibium find role button
```

Typical workflow — find, then act on the reference:

```sh
$ vibium find label "Email"
@e2  input  label="Email"

$ vibium fill @e2 "alice@example.com"
```

## See also

- [`vibium map`](map.md) — list every element instead of finding one.
- [`vibium click`](click.md), [`vibium fill`](fill.md) — act on the result.


--- file: docs/commands/click.md ---

---
title: vibium click
---

Click an element by reference.

## Synopsis

```
vibium click @e<num>
```

## Description

Performs a real mouse click on the element identified by `@e<num>`. The
reference must come from a recent [`map`](map.md) or [`find`](find.md) call.

Use [`wait`](wait.md) afterward if the click triggers navigation or a delayed
DOM update.

## Examples

Click the second mapped element:

```sh
vibium click @e2
```

Click an element you've located by text — find first, then click the
reference it returns:

```sh
$ vibium find text "Sign in"
@e4  link  "Sign in"

$ vibium click @e4
```

## See also

- [`vibium map`](map.md), [`vibium find`](find.md)
- [`vibium wait`](wait.md)


--- file: docs/commands/fill.md ---

---
title: vibium fill
---

Type text into an input field.

## Synopsis

```
vibium fill @e<num> "<value>"
```

## Description

Focuses the element referenced by `@e<num>` and types `<value>` into it. Works
for any input that accepts text: `<input type="text">`, `<input type="email">`,
`<input type="password">`, `<textarea>`, `contenteditable` elements, and so on.

## Examples

```sh
vibium fill @e2 "alice@example.com"
vibium fill @e3 "correct horse battery staple"
```

When you don't know the reference yet, find it first:

```sh
$ vibium find label "Email"
@e2  input  label="Email"

$ vibium fill @e2 "alice@example.com"
```

## See also

- [`vibium press`](press.md) — send a single keystroke (e.g. `Enter`).
- [`vibium select`](select.md), [`vibium check`](check.md).


--- file: docs/commands/select.md ---

---
title: vibium select
---

Choose an option from a `<select>` dropdown.

## Synopsis

```
vibium select @e<num> "<option>"
```

## Description

Picks the option whose visible label matches `<option>` from the dropdown
referenced by `@e<num>`.

## Examples

```sh
vibium select @e5 "United States"
```

Find first, then select:

```sh
$ vibium find label "Country"
@e5  select  label="Country"

$ vibium select @e5 "Canada"
```

## See also

- [`vibium fill`](fill.md), [`vibium check`](check.md), [`vibium click`](click.md).


--- file: docs/commands/check.md ---

---
title: vibium check
---

Set a checkbox to checked.

## Synopsis

```
vibium check @e<num>
```

## Description

Sets the checkbox referenced by `@e<num>` to its checked state. If you need to
explicitly uncheck a checked box, click it directly with [`click`](click.md)
instead.

## Examples

```sh
vibium check @e7
```

Find first, then check:

```sh
$ vibium find label "I agree to the terms"
@e7  input  label="I agree to the terms"

$ vibium check @e7
```

## See also

- [`vibium click`](click.md), [`vibium select`](select.md), [`vibium fill`](fill.md).


--- file: docs/commands/press.md ---

---
title: vibium press
---

Send a keystroke to the focused element.

## Synopsis

```
vibium press <key>
```

## Description

Dispatches a real key event for `<key>`. Common values:

- Letters and digits: `a`, `B`, `7`
- Named keys: `Enter`, `Tab`, `Escape`, `Backspace`, `Delete`,
  `ArrowUp`, `ArrowDown`, `ArrowLeft`, `ArrowRight`, `Home`, `End`,
  `PageUp`, `PageDown`, `Space`

This is the simplest way to submit a form that responds to Enter, or to
navigate a custom keyboard-driven widget.
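
For keyboard-driven widgets, the same key is often pressed several times in a row. A minimal sketch, with `press_times` as a hypothetical helper built only on the `press` command above:

```sh
# Hypothetical helper: press key $1 a total of $2 times (default 1).
# Stops early and returns non-zero if any press fails.
press_times() {
  key=$1
  n=${2:-1}
  i=0
  while [ "$i" -lt "$n" ]; do
    vibium press "$key" || return 1
    i=$((i + 1))
  done
}
```

For instance, `press_times ArrowDown 3 && vibium press Enter` would move three items down a list and confirm the selection.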

## Examples

Submit a search:

```sh
vibium fill @e3 "vibium"
vibium press Enter
```

Tab between fields:

```sh
vibium press Tab
```

## See also

- [`vibium fill`](fill.md) — type a whole string into an input.


--- file: docs/commands/wait.md ---

---
title: vibium wait
---

Block until an element, URL, or text condition is met.

## Synopsis

```
vibium wait "<selector>"
vibium wait url "<path>"
vibium wait text "<text>"
```

## Description

`wait` is how you synchronize with asynchronous browser behavior — navigation,
network responses, animation, dynamic re-renders. It blocks until its
condition becomes true, then exits successfully. If the condition never
becomes true within Vibium's wait timeout, the command exits with an error.

| Variant                          | Becomes true when…                                |
| -------------------------------- | ------------------------------------------------- |
| `vibium wait "<selector>"`       | An element matching `<selector>` is on the page.  |
| `vibium wait url "<path>"`       | The current URL contains `<path>`.                |
| `vibium wait text "<text>"`      | `<text>` appears anywhere in the visible page.    |
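
When a wait fails, a screenshot of the page at that moment is often the quickest way to see why. The sketch below assumes a timed-out `wait` exits with a non-zero status, which is how "exits with an error" is read here. `wait_or_capture` and the default file name are hypothetical.

```sh
# Hypothetical helper: wait for text $1; on failure, save a screenshot to
# $2 (default wait-failure.png) for debugging and return non-zero.
# ASSUMPTION: a timed-out `vibium wait` exits with a non-zero status.
wait_or_capture() {
  if vibium wait text "$1"; then
    return 0
  fi
  vibium screenshot -o "${2:-wait-failure.png}"
  return 1
}
```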

## Examples

Wait for a button to appear:

```sh
vibium wait "button.continue"
```

Wait for a path change:

```sh
vibium wait url "/results"
```

Wait for a result to be rendered:

```sh
vibium wait text "Results for"
```

## See also

- [`vibium go`](go.md), [`vibium click`](click.md) — actions that often
  warrant a `wait` afterward.
- [`vibium diff map`](diff.md) — alternative way to confirm a change.


--- file: docs/commands/text.md ---

---
title: vibium text
---

Extract the visible text of the current page.

## Synopsis

```
vibium text
```

## Description

Prints the rendered, human-visible text content of the page to stdout. This is
the cleanest way to give an agent the "what does the user see" view of the
page without HTML noise.

The output is plain text suitable for `grep`, redirection, or piping into
another tool.

## Examples

```sh
vibium text
vibium text > page.txt
vibium text | grep -i "error"
```

## See also

- [`vibium screenshot`](screenshot.md) — visual capture instead of textual.
- [`vibium eval`](eval.md) — run JavaScript to extract a more specific value.


--- file: docs/commands/screenshot.md ---

---
title: vibium screenshot
---

Capture a PNG screenshot of the current page.

## Synopsis

```
vibium screenshot [-o <file>]
```

## Description

Saves a PNG of the visible page. With `-o`, the file is written to the given
path; without `-o`, the image is written to stdout (so you can pipe it into
`base64`, `feh -`, or another tool).
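
As an illustration of the stdout form, the sketch below turns a screenshot into a `data:` URI that can be embedded in a report or passed to a model. `screenshot_data_uri` is a hypothetical helper. It relies on the `base64` utility, which is widely available but not part of POSIX.

```sh
# Hypothetical helper: print the current page's screenshot as a data: URI.
# `tr` strips the line wrapping that some base64 implementations add.
screenshot_data_uri() {
  printf 'data:image/png;base64,'
  vibium screenshot | base64 | tr -d '\n'
  printf '\n'
}
```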

## Examples

```sh
vibium screenshot -o homepage.png
vibium screenshot > /tmp/page.png
```

## See also

- [`vibium pdf`](pdf.md) — save the whole document as PDF instead.
- [`vibium map`](map.md) — list the page's interactive elements as text.


--- file: docs/commands/pdf.md ---

---
title: vibium pdf
---

Save the current page as a PDF.

## Synopsis

```
vibium pdf [-o <file>]
```

## Description

Renders the current page as a PDF document. With `-o`, writes to the given
file; without `-o`, writes to stdout.

Useful for archiving, sharing, or feeding the rendered document to a
PDF-aware downstream tool.

## Examples

```sh
vibium pdf -o page.pdf
vibium pdf > /tmp/page.pdf
```

## See also

- [`vibium screenshot`](screenshot.md) — image capture.
- [`vibium text`](text.md) — text-only capture.


--- file: docs/commands/eval.md ---

---
title: vibium eval
---

Run JavaScript in the current page.

## Synopsis

```
vibium eval "<javascript>"
```

## Description

Executes `<javascript>` in the page context and prints the result to stdout.
The script has full access to `window`, `document`, and any globals the page
exposes.

This is the escape hatch for anything Vibium's higher-level commands don't
cover — extracting computed values, dispatching custom events, querying the
DOM with a custom selector, and so on. Reach for `eval` when the semantic
commands aren't enough; reach for the semantic commands first.

## Examples

```sh
vibium eval "document.title"
vibium eval "document.querySelectorAll('article').length"
vibium eval "window.scrollTo(0, document.body.scrollHeight)"
```

## See also

- [`vibium text`](text.md), [`vibium find`](find.md) — prefer these when they fit.


--- file: docs/commands/record.md ---

---
title: vibium record
---

Record a session as a sequence of screenshots and commands.

## Synopsis

```
vibium record start
vibium record stop
```

## Description

`record start` begins a recording. From that point on, Vibium captures a
screenshot at every step. `record stop` ends the recording and writes the
captured screenshots to `record.zip` in the current directory.

The resulting archive is useful for:

- Sharing a reproducible bug report.
- Auditing what an agent did during an autonomous run.
- Spot-checking the visual state of a session after the fact.
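
A recording is most useful when the run fails, so it helps to make sure `record stop` still runs in that case. A minimal sketch, with `with_recording` as a hypothetical wrapper around the two documented subcommands:

```sh
# Hypothetical wrapper: run the given command between `record start` and
# `record stop`, stopping the recording even if the command fails.
# Returns the wrapped command's exit status.
with_recording() {
  vibium record start || return 1
  status=0
  "$@" || status=$?
  vibium record stop || return 1
  return "$status"
}
```

The wrapped command can be a shell function holding the steps of a run, for example `with_recording run_checkout_flow`, where `run_checkout_flow` is a function you define yourself.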

## Examples

```sh
vibium record start
vibium go https://example.com
vibium click @e1
vibium record stop
ls record.zip
```

## See also

- [`vibium screenshot`](screenshot.md) — one-off capture instead of a session.


--- file: docs/commands/mcp.md ---

---
title: vibium mcp
---

Start the MCP (Model Context Protocol) server.

## Synopsis

```
vibium mcp
```

## Description

Runs Vibium as an MCP server over stdio so that any MCP-aware client (Codex,
Claude Code, Cline, Cursor, Gemini CLI, etc.) can drive the browser as a tool.
The server exposes the same operations as the CLI: navigation, mapping,
finding, clicking, filling, capture, and so on.

You normally do not invoke `vibium mcp` directly. Instead, register it with
your client and let the client manage its lifecycle. See
[MCP Server Integration](../mcp-integration.md) for the registration
recipes.

## Examples

Register with Claude Code:

```sh
claude mcp add vibium -- npx -y vibium mcp
```

Register with Gemini CLI:

```sh
gemini mcp add vibium npx -y vibium mcp
```

## See also

- [MCP Server Integration](../mcp-integration.md)


--- file: docs/contributing.md ---

---
title: Contributing
---

Thanks for considering a contribution to Vibium. The canonical contributing
guide lives at <https://github.com/VibiumDev/vibium/blob/main/CONTRIBUTING.md>;
this page summarizes the most important pieces.

## Develop in a VM

Because Vibium is exercised heavily by AI-assisted tools during development,
contributors are encouraged to work inside a VM to contain potential side
effects. Platform-specific setup guides for macOS, Linux x86, and Windows
x86 live under `docs/how-to-guides/` in the main repository.

## Toolchain prerequisites

To build the full project you'll want:

- Go 1.21+
- Node.js 18+
- Python 3.9+
- Java 21+ with Gradle 8+
- (Optional) GitHub CLI

The Go toolchain builds the core binary; Node, Python, and Java are needed
for their respective client libraries.

## Build and test

After cloning the repository:

```sh
make
make test
```

`make` installs dependencies, compiles the binary, downloads the managed
browser, and builds each client library. `make test` runs the test suites.
There are also more granular targets for building or testing individual
components — see the `Makefile` for the full list.

## Working on the clients

Each client library can be built and tested in isolation:

- **JavaScript** — under `clients/javascript`. Both sync and async APIs ship,
  and the sync API is convenient in a Node REPL.
- **Python** — run `uv sync` from the Python client directory. The client
  locates the bundled binary automatically, or honors `VIBIUM_BIN_PATH`.
- **Java** — published to Maven Central with native binaries bundled. Compile
  examples against the built JAR and dependency directory.

## Workflow

- **Team members** can push directly to the main repository.
- **External contributors** should fork, push to their fork, and open a pull
  request against `main`.

## Reporting bugs

When reporting an issue, please include:

- Your platform and architecture.
- The exact command(s) and full output.
- The Vibium version (`vibium --version`).
- A `record.zip` from a `vibium record` session if the bug is interactive.
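
The first and third items can be gathered with a short function. This is only a sketch: `bug_report_header` is a hypothetical name, `uname` supplies the platform and architecture, and `vibium --version` is the command named in the list above.

```sh
# Hypothetical helper: print the environment details a bug report needs.
bug_report_header() {
  echo "Platform: $(uname -s) $(uname -m)"
  echo "Vibium version: $(vibium --version 2>&1)"
}
```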

## Documentation contributions

This `vibium-docs` site is a separate Markdown collection. To update a doc:

1. Edit the relevant Markdown file.
2. Run `make build` so generated public assets, LLM docs, and the Starlight
   site stay in sync.
3. Open a pull request.

Canonical brand imagery lives in `site/src/assets/brand/`. Files in
`site/public/` are generated or served build inputs; update the source image or
generation script instead of editing generated favicons directly.