Our Approach

We operate below the interface.

Instead of hooking into the browser's internals, we interact with software the same way a human does: through the operating system. Real pixels. Real input devices. Real timing. Real residential IPs matched to each session's geo.

01

Real Chrome, not patched Chromium

We run unmodified Chrome with full rendering pipelines, GPU compositing, and all standard browser behaviors intact. No automation hooks. No DevTools injection. No detectable substrate.

02

OS-level input, not dispatched events

Inputs flow through the operating system’s native input stack — the same path a physical keyboard and mouse use. The browser cannot distinguish our input from human input because the delivery mechanism is identical.
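The difference is concrete at the event level. Below is a hedged sketch of the record stream a kernel-level injector emits; the `keystroke_events` helper is hypothetical, and in production the records would be written to an OS input device (e.g. `/dev/uinput` on Linux) rather than collected in a list. The point is the shape: press/release pairs framed by sync reports, the same stream a physical keyboard produces.

```python
# Hypothetical sketch of the event stream an OS-level injector emits.
# A physical keyboard and this injector produce the same record shape:
# press/release pairs framed by EV_SYN sync reports. On Linux these
# records would be written to /dev/uinput; here we only build them.

KEY_DOWN, KEY_UP = 1, 0

def keystroke_events(text):
    """Expand text into the kernel-level records for typing it."""
    events = []
    for ch in text:
        key = f"KEY_{ch.upper()}"
        events.append(("EV_KEY", key, KEY_DOWN))    # key press
        events.append(("EV_KEY", key, KEY_UP))      # key release
        events.append(("EV_SYN", "SYN_REPORT", 0))  # frame boundary
    return events
```

Because delivery happens below the browser, events arriving this way are indistinguishable from a human at the keyboard; a `dispatchEvent()` call never touches this path.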

03

GPU-backed rendering, not headless mode

Full pixel rendering through the graphics pipeline. WebGL, Canvas, and every visual fingerprinting surface behave exactly as they would on a real machine because they are running on a real graphics stack.

04

Isolation without virtual machines

We achieve per-session isolation without the overhead of full VMs. This keeps cost per session low enough to be viable at scale — the economics that killed every previous attempt at this approach.
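One way to read "isolation without virtual machines" is OS-level sandboxing: per-session kernel namespaces and a dedicated browser profile instead of a hypervisor. The launcher below is an illustrative sketch, not our actual configuration; the flag set and paths are assumptions, but they show why the per-session cost is a process launch rather than a VM boot.

```python
# Illustrative sketch: per-session isolation via Linux namespaces plus a
# dedicated Chrome profile directory, instead of a full virtual machine.
# Flag names and paths are examples, not a production configuration.

def session_command(session_id: str) -> list[str]:
    """Build the launch command for one isolated browser session."""
    profile_dir = f"/tmp/sessions/{session_id}/profile"
    return [
        "unshare", "--net", "--mount", "--pid", "--fork",  # kernel-level isolation
        "google-chrome",
        f"--user-data-dir={profile_dir}",  # no shared cookies, cache, or state
    ]
```

Spawning a namespaced process costs milliseconds and megabytes; booting a VM costs seconds and gigabytes. That gap is the economics the paragraph above refers to.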

05

Recovery-first, not script-first

Real-world software is nondeterministic. Modals appear. Pages stall. Layouts shift. We build for recovery and adaptation, not brittle step-by-step scripts that break on the first unexpected state.
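A minimal sketch of what recovery-first means in code, under the assumption that the runtime can label the state it observes (the `step`, `observe`, and `handlers` callables here are placeholders, not our API): try the step, check the resulting state, and run a matching recovery handler before retrying, instead of failing on the first unexpected state.

```python
# Recovery-first execution loop: perform the step, observe the resulting
# state, and if it is unexpected, run a matching recovery handler
# (dismiss the modal, reload the stalled page) before retrying --
# rather than aborting like a brittle step-by-step script.

def run_with_recovery(step, observe, handlers, max_attempts=3):
    """step: performs the action. observe: returns a state label.
    handlers: maps unexpected state labels to recovery callables."""
    for _ in range(max_attempts):
        step()
        state = observe()
        if state == "ok":
            return True
        recover = handlers.get(state)
        if recover is None:
            break      # unknown state: stop rather than guess
        recover()      # e.g. close the modal, then retry the step
    return False
```

A script says "click the button"; a recovery loop says "click the button, and if a modal appeared instead, dismiss it and click again."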

06

Clean network identity, not rotating proxies

Every session gets a residential IP matched to its fingerprint’s geo. No datacenter ranges that get blocklisted on sight. No geo mismatches that trip Cloudflare in milliseconds. The network layer is as real as the browser layer — because a perfect browser on a flagged IP is still a blocked session.
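The admission check implied above can be sketched in a few lines. The field names (`type`, `country`) are illustrative assumptions about what an IP-intelligence lookup and a fingerprint profile expose, not a real schema:

```python
# Sketch of a session-admission gate: a session only launches if its
# egress IP is residential and its geolocation matches the browser
# fingerprint's geo. Field names are illustrative, not a real schema.

def network_identity_ok(ip_info: dict, fingerprint: dict) -> bool:
    if ip_info.get("type") != "residential":
        return False  # datacenter ranges are blocklisted on sight
    # a US fingerprint on a German IP trips CDN checks immediately
    return ip_info.get("country") == fingerprint.get("country")
```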

How It Works

The execution pipeline

Every task flows through five stages: from the AI agent's instruction to a completed action inside human-only software. No DOM hooks. No synthetic events. Just real execution.

01 Receive — Task Received — AI agent sends structured instruction
02 Spawn — Environment Ready — Isolated session with real Chrome, GPU & residential IP
03 Execute — Action Performed — OS-level input drives the interaction
04 Observe — State Verified — Visual + DOM state reconciliation
05 Return — Result Delivered — Data returned or recovery triggered
DOM Automation — CDP hooks, synthetic events, breaks on legacy
Execution Runtime — OS input, real rendering, task complete
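The five stages above can be sketched as one function. The `spawn`, `execute`, and `observe` callables stand in for the real subsystems and are placeholders, not our API; the shape is the point — every task flows Receive → Spawn → Execute → Observe → Return, with the observe stage deciding between delivering data and triggering recovery.

```python
# Function-level sketch of the execution pipeline. spawn/execute/observe
# are placeholder callables standing in for the real subsystems.

def run_task(instruction, spawn, execute, observe):
    # 01 Receive: structured instruction arrives from the AI agent
    session = spawn()                       # 02 Spawn: isolated session (real Chrome, GPU, residential IP)
    result = execute(session, instruction)  # 03 Execute: OS-level input drives the interaction
    if observe(session):                    # 04 Observe: visual + DOM state reconciliation
        return {"status": "delivered", "data": result}  # 05 Return: data back
    return {"status": "recovery"}           # 05 Return: recovery triggered instead
```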

The Architecture Problem

Existing tools assume the browser cooperates.

Every browser automation tool today (Playwright, Puppeteer, Selenium, RPA frameworks) operates above the interface. They hook into DevTools, inject scripts, and dispatch synthetic events. This works when software cooperates. Most real-world software does not.

DOM Automation
  • Controls browser via DevTools Protocol
  • Uses patched Chromium with automation hooks
  • Dispatches synthetic DOM events
  • Datacenter IPs that get blocklisted on sight
  • Fingerprint-to-geo mismatches trip CDN detection
  • Assumes stable, cooperative interfaces
  • Detectable by fingerprinting and behavioral analysis
Execution Runtime (Our Approach)
  • Operates below the interface at the OS level
  • Uses real Chrome with real rendering paths
  • Drives native input devices through the kernel
  • Residential IPs matched to session fingerprint geo
  • Consistent identity per session — no random rotation
  • Embraces nondeterminism with recovery-first design
  • Interaction patterns indistinguishable from human usage

The Distinction

Interfaces vs. Interactions

Browser automation tools automate interfaces: DOM nodes, selectors, events. But real work happens at the interaction level. Focus, timing, context, recovery. That's where we operate.

Interface Automation
  • Selectors — querySelector, XPath, CSS targets
  • DOM Nodes — Direct tree manipulation
  • Synthetic Events — Programmatic event dispatch
  • Scripts — Injected JavaScript execution
  • Determinism — Expects predictable outcomes

Interaction Execution
  • Focus & Context — Real window focus, z-order awareness
  • Timing & Cadence — Human-natural input timing
  • State Awareness — Visual + DOM state reconciliation
  • Error Recovery — Retry, adapt, recover from failures
  • Side Effects — Real-world consequences, not assertions
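"Timing & cadence" is the most mechanical of these, so here is a hedged sketch of it: inter-keystroke delays sampled from a log-normal distribution, which is always positive and right-skewed (occasional long pauses), rather than the fixed zero-delay bursts a dispatched-event script produces. The mean and spread parameters are illustrative, not measured human data.

```python
# Sketch of human-natural input cadence: delays between keystrokes drawn
# from a log-normal distribution instead of fired back-to-back.
# mean_ms and spread are illustrative parameters, not measured values.
import math
import random

def typing_delays(n_keys, mean_ms=120.0, spread=0.4, rng=None):
    """Return n_keys inter-keystroke delays in milliseconds."""
    rng = rng or random.Random()
    mu = math.log(mean_ms)  # median of the log-normal is mean_ms
    return [rng.lognormvariate(mu, spread) for _ in range(n_keys)]
```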
“Playwright automates interfaces.
We automate interactions.”

Where We Fit

The execution stack

Layer 3 — AI / Business Logic
LLM planners, workflow engines, agents
Layer 2 — Our Execution Runtime
Real browsers, real inputs, clean IPs, recovery
Layer 1 — Human-Designed Software
No APIs; hostile, legacy, or regulated

DOM automation tools live above the interface. We live below it. That's why they can't evolve into us. Playwright stays for tests and friendly software. We exist for everything else.

Why is this the right time to build this?

Why Now →