web eval agent

This repository provides web-eval-agent, an MCP server that launches a browser-driven agent to test and debug web applications from within a developer's code editor. It integrates with the operative.sh MCP system and with editors such as Cursor, Cline, and Windsurf, so the agent can be invoked from an IDE chat to run natural-language test tasks against a running web app. The agent drives a browser (interactive or headless), captures screenshots, console logs, and network traffic, and returns a structured UX evaluation report with a chronological timeline. The package includes tools to set up persistent browser state for single-sign-on reuse and to run autonomous end-to-end UX checks without manual step-by-step testing. Installation and update instructions target developers on macOS, Linux, and Windows who want automated in-editor web app QA.

Stars

1138

App URL

https://github.com/Operative-Sh/web-eval-agent

GitHub Repository

https://github.com/Operative-Sh/web-eval-agent/blob/main/README.md

Features

The README documents several concrete features: navigation of a web app using BrowserUse (reported to perform better with the operative backend), automated capture of network requests with filtering, and collection of console logs and errors. It provides two MCP tools: web_eval_agent, which drives the browser and returns a rich UX report, and setup_browser_state, which opens an interactive browser to persist cookies and local storage for subsequent runs. The tool arguments are explicit. web_eval_agent requires url and a natural-language task, and supports an optional headless_browser flag that defaults to false. setup_browser_state accepts an optional url. Additional UX-oriented features include screencasting, an in-browser agent overlay with pause/play/stop controls, and chronological timelines and network/console summaries in the output reports.
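The tool arguments described above can be sketched as a small validator that builds the JSON-RPC `tools/call` request an MCP client would send. This is a minimal illustration, not the project's own code: the helper names `build_tool_call` and `to_wire`, the `TOOL_ARGS` table, and the example URL and task are assumptions introduced here. In normal use the editor's MCP client constructs this request for you.

```python
import json
from itertools import count

# Hypothetical request-id generator; any unique id works for JSON-RPC.
_request_ids = count(1)

# Argument schema as described in the listing: name -> (type, required, default).
TOOL_ARGS = {
    "web_eval_agent": {
        "url": (str, True, None),
        "task": (str, True, None),
        "headless_browser": (bool, False, False),
    },
    "setup_browser_state": {
        "url": (str, False, None),
    },
}


def build_tool_call(tool, **arguments):
    """Validate arguments and return a JSON-RPC 2.0 'tools/call' request dict."""
    if tool not in TOOL_ARGS:
        raise ValueError(f"unknown tool: {tool!r}")
    schema = TOOL_ARGS[tool]

    unknown = set(arguments) - set(schema)
    if unknown:
        raise ValueError(f"unexpected arguments for {tool}: {sorted(unknown)}")

    resolved = {}
    for name, (expected_type, required, default) in schema.items():
        if name in arguments:
            value = arguments[name]
            if not isinstance(value, expected_type):
                raise TypeError(f"{name} must be {expected_type.__name__}")
            resolved[name] = value
        elif required:
            raise ValueError(f"{tool} requires the argument {name!r}")
        elif default is not None:
            # Fill in documented defaults (headless_browser=False).
            resolved[name] = default

    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": resolved},
    }


def to_wire(request):
    """Serialize a request dict to the JSON text sent over the MCP transport."""
    return json.dumps(request)
```

For example, `build_tool_call("web_eval_agent", url="http://localhost:3000", task="Sign up and confirm the dashboard loads")` yields a request whose arguments include `headless_browser` set to `False`, while `build_tool_call("setup_browser_state")` yields an empty arguments object because its only argument is optional.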

Use Cases

This project helps developers and QA engineers automate end-to-end testing and debugging of web applications directly from their editor, saving time by letting an agent reproduce flows and report issues. It produces detailed UX evaluation reports showing step-by-step actions, console logs, network requests, screenshots, and a timeline that make it easier to locate regressions and reproduce bugs. The reusable browser state tool reduces friction for flows that require authentication by preserving cookies and local storage. Integration with operative.sh MCP and editor chat means tests can be triggered conversationally, and installer and update commands enable quick setup with Playwright and the uv runtime. The result is faster iterative debugging, reproducible QA runs, and less manual exploratory testing.

Similar Listings

virtual lab

mnemo

EdgeChains

RAG Agents Accelerator

LLM Zero to Hundred
