VeriEnv
VeriEnv automatically clones real-world websites into fully executable synthetic environments, exposing internal state through a Python SDK so agents can learn from deterministic, verifiable rewards instead of brittle LLM-as-a-judge feedback.
Learn in controlled website replicas without touching live production environments.
Every trajectory can be checked programmatically through the synthetic environment.
The core idea is simple: treat language models as environment creators, not just action policies. By reconstructing websites into instrumented training worlds, VeriEnv makes web-agent self-evolution safe, repeatable, and scalable.
Motivation
Direct self-evolution on the open web is unsafe, hard to reset, and often judged by ambiguous instructions or non-verifiable reward signals. VeriEnv replaces that loop with controllable, executable replicas.
The paper’s motivation figure contrasts fragile real-world exploration with VeriEnv’s synthetic websites, validated tasks, and deterministic reward signals.
Method Overview
VeriEnv uses a coding agent to reconstruct the full stack of a website, then generates tasks and judges that interact with both the UI and the database through an SDK for end-to-end verification.
Rebuild frontend, backend, database, and local tooling.
Create tasks at multiple difficulty levels automatically.
Judge outcomes deterministically with executable checks over environment state.
This overview figure shows the full VeriEnv loop: clone a website, expose code/database interfaces, generate task-judge pairs, and train agents using verified reward signals.
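As a rough illustration of what a task-judge pair with a deterministic reward might look like, the sketch below uses an in-memory SQLite database as a stand-in for a cloned site's backend. It is not the VeriEnv SDK: the `Task` dataclass, the `bookings` schema, and the `make_env`, `judge_booking`, and `reward` helpers are all hypothetical names chosen for this example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    """A hypothetical task-judge pair; field names are illustrative only."""
    instruction: str
    difficulty: str  # e.g. "easy", "medium", "hard"
    judge: Callable[[sqlite3.Connection], bool]


def make_env() -> sqlite3.Connection:
    """Stand-in for a cloned site's database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bookings (id INTEGER PRIMARY KEY, guest TEXT, nights INTEGER)"
    )
    return conn


def judge_booking(conn: sqlite3.Connection) -> bool:
    """Deterministic check: exactly one 3-night booking for 'Ada' exists."""
    row = conn.execute(
        "SELECT COUNT(*) FROM bookings WHERE guest = ? AND nights = ?",
        ("Ada", 3),
    ).fetchone()
    return row[0] == 1


def reward(task: Task, conn: sqlite3.Connection) -> float:
    """Binary reward computed from environment state, not from an LLM judge."""
    return 1.0 if task.judge(conn) else 0.0


task = Task("Book a 3-night stay for Ada.", "easy", judge_booking)

env = make_env()
before = reward(task, env)  # nothing has been booked yet

# Simulate the backend effect of a successful agent trajectory.
env.execute("INSERT INTO bookings (guest, nights) VALUES (?, ?)", ("Ada", 3))
after = reward(task, env)

print(before, after)  # 0.0 1.0
```

Because the judge reads database state directly, the same trajectory always yields the same reward, and resetting the environment amounts to rebuilding the database.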
Website Showcase
These snapshots come from synthetic sites built for agent training. The hero animation cycles across multiple reconstructed websites to emphasize scale and diversity.
Recipe discovery and structured cooking workflows.
Task-heavy booking flows with bold transactional UI.
Highly visual commercial hero sections and product landing flows.
Editorial card layouts and article-heavy browsing patterns.
Public-sector portals with utility-first navigation.
Event discovery pages with strong promotional hero design.
Why It Matters
VeriEnv shifts the bottleneck for web-agent training from unverifiable language supervision to scalable environment construction. More websites mean more tasks, more coverage, and more stable reinforcement signals.
Takeaway
VeriEnv makes it possible to scale self-evolving web agents with synthetic websites that are faithful enough to be useful and instrumented enough to be verifiable.