Basic Information

WebAgent is a research and engineering repository from Tongyi Lab at Alibaba for building, evaluating and deploying information-seeking web agents. It aggregates multiple agent projects and artifacts including WebWalker, WebDancer, WebSailor, WebShaper and WebWatcher, plus benchmarks and datasets for web traversal and long-horizon browsing tasks. The repo provides model checkpoints and references to released models on public model hubs, training and post-training methodologies, benchmark data, and example demos. It targets researchers and engineers who want to reproduce experiments, train agentic models, study agentic data synthesis, or deploy interactive demos. The README also includes quick-start steps, environment requirements, deployment and demo scripts, and citation information for the associated papers and preprints.

App Details

Features

The repository bundles models, datasets, benchmarks and training pipelines specific to web information seeking. Key dataset and benchmark artifacts include WebShaper data, WebWalkerQA and the SailorFog-QA benchmark. Training and method features described include a formalization-driven data synthesis pipeline, an agentic Expander for iterative question generation, a four-stage training paradigm for WebDancer, a post-training pipeline and a two-stage RFT-plus-DUPO agentic RL approach for WebSailor, and trajectory-level supervision with RL fine-tuning (DAPO). The repo contains deployment and demo scripts, example Gradio demos, instructions for model deployment using sglang, and references to released checkpoints such as WebDancer-32B and WebSailor variants.

Use Cases

WebAgent helps researchers and developers reproduce state-of-the-art work in autonomous web browsing and information seeking by providing code, data and models together with deployment and demo examples. It supplies benchmark tasks and curated datasets to evaluate long-horizon, high-uncertainty QA and web traversal abilities, and documents training recipes and RL algorithms aimed at improving agent generalization and reasoning. The included demos and scripts lower the barrier to interactive evaluation and deployment, while the papers and citations enable proper scholarly use. The repo is useful for building, extending or comparing agentic systems focused on web search, multi-step reasoning, dataset synthesis and evaluation on browsing benchmarks.