TheAgenticBrowser
Basic Information
Agentic Browser is an agent-based system that automates web browser interactions through a natural language interface. It is built on the PydanticAI Python agent framework and orchestrates specialized agents to perform web tasks such as form filling, product searches on e-commerce sites, content retrieval, media interaction, and project management workflows. The project provides a runnable Python entry point and an optional API server that exposes a POST execute_task endpoint. Configuration lives in a .env file. Browser control runs on Playwright, with optional support for screenshot analysis models, Google Custom Search, and a remote browser connected through a Steel Dev API key.
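As a rough illustration of how a client might call the API server mode, the sketch below builds a POST request for the execute_task endpoint using only the Python standard library. The host, port, URL path layout, and the "command" JSON field are assumptions made for illustration; the listing does not specify the request schema, so check the repository before relying on any of them.

```python
import json
import urllib.request


def build_execute_task_request(base_url: str, command: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the execute_task endpoint."""
    # Assumption: the body is a JSON object with a single "command" field.
    body = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        # Assumption: the endpoint is served at <base_url>/execute_task.
        url=base_url.rstrip("/") + "/execute_task",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumption: the API server is running locally on port 8000.
    request = build_execute_task_request(
        "http://localhost:8000",
        "Search for wireless keyboards and list the first three prices",
    )
    # Browser tasks can take a while, so the timeout is generous.
    with urllib.request.urlopen(request, timeout=300) as response:
        print(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect and adjust once the real schema is known.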
Stars: 241
App Details
Features
The system uses a modular three-agent architecture: a Planner Agent decomposes tasks, a Browser Agent executes navigation and DOM interactions, and a Critique Agent verifies results and drives iterative improvement. Browser automation features include web research across domains, data extraction and compilation, e-commerce price and availability scraping, context-aware cross-domain traversal, DOM inspection, and screenshot analysis. Setup and runtime features include uv virtual environment management, Playwright driver installation, environment variable configuration for models and API keys, a local CLI run mode, a FastAPI-compatible server mode, and Docker run instructions for API deployment.
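The planner, browser, and critique roles described above can be pictured as a simple control loop. The following is a minimal sketch of that pattern in plain Python, not the project's actual implementation: the names run_task, Critique, and RunResult are hypothetical, the three agents are passed in as ordinary callables, and no PydanticAI or Playwright calls are shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Critique:
    """Verdict from the critique step (hypothetical shape)."""
    done: bool
    feedback: str = ""


@dataclass
class RunResult:
    completed: bool
    history: List[str] = field(default_factory=list)


def run_task(
    task: str,
    plan: Callable[[str, List[str], str], str],
    act: Callable[[str], str],
    critique: Callable[[str, List[str]], Critique],
    max_rounds: int = 5,
) -> RunResult:
    """Alternate plan -> act -> critique until the critic accepts or rounds run out."""
    history: List[str] = []
    feedback = ""
    for _ in range(max_rounds):
        step = plan(task, history, feedback)   # planner picks the next step
        outcome = act(step)                    # browser agent carries it out
        history.append(f"{step} -> {outcome}")
        verdict = critique(task, history)      # critic checks progress
        if verdict.done:
            return RunResult(True, history)
        feedback = verdict.feedback            # fed back into the next plan
    return RunResult(False, history)


if __name__ == "__main__":
    # Stub agents standing in for real LLM-backed ones.
    steps = iter(["open shop page", "read listed price"])
    result = run_task(
        "find the price",
        plan=lambda task, history, feedback: next(steps),
        act=lambda step: "ok",
        critique=lambda task, history: Critique(done=len(history) == 2),
    )
    print(result.completed, result.history)
```

The max_rounds cap is one way to keep a critic that never accepts from looping forever; whether the real system bounds its iterations this way is not stated in the listing.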
Use Cases
Users can automate repetitive and complex browser workflows with plain language commands, which reduces manual web navigation and data collection. The system helps extract structured information from web pages, compare product data across sites, gather research materials, and verify actions through automated critique and screenshot or DOM checks. Developers can run it locally or deploy it as an API to integrate browser automation into other tools. Configuration options for text and screenshot models, Google Custom Search, logging, and browser storage make it adaptable to varied use cases and environments.
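Since configuration is driven by a .env file, a quick pre-flight check for missing settings can save a failed run. The sketch below parses simple KEY=VALUE text and reports which required keys are absent or empty. The key names in the example (EXAMPLE_TEXT_MODEL, EXAMPLE_API_KEY) are placeholders invented for illustration, not the project's real variable names, and the parser handles only the basic KEY=VALUE form.

```python
from typing import Dict, Iterable, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    # Placeholder key names; substitute the ones listed in the repository.
    sample = 'EXAMPLE_TEXT_MODEL="some-model"\nEXAMPLE_API_KEY=\n'
    config = parse_env_text(sample)
    print(missing_keys(config, ["EXAMPLE_TEXT_MODEL", "EXAMPLE_API_KEY"]))
```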