Basic Information

This repository collects research code, models, datasets, benchmarks, training recipes and demos for building and evaluating web-based information-seeking agents. Maintained by Tongyi Lab at Alibaba, it brings together multiple projects (WebWalker, WebDancer, WebSailor, WebShaper, WebWatcher) that target autonomous web traversal, long-horizon information acquisition and agentic reasoning. The repo hosts benchmark suites, synthesized datasets and example deployments for reproducing research results and running interactive demos. It documents quick-start steps for environment setup, model deployment and running demos, and highlights released model checkpoints and datasets that support training and evaluation of agentic search systems. The primary goal is to let researchers and practitioners study, reproduce and extend methods for agentic information seeking on the web.
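The documented quick-start flow (environment setup, model deployment, demo launch) typically looks like the sketch below. The repository location and the deployment/demo script names here are assumptions for illustration; consult each sub-project's README for the actual files.

```shell
# Clone the repository (URL assumed) and create an isolated environment
git clone https://github.com/Alibaba-NLP/WebAgent.git
cd WebAgent
conda create -n webagent python=3.10 -y
conda activate webagent
pip install -r requirements.txt   # hypothetical filename; check the repo

# Deploy a released checkpoint, then launch the interactive Gradio demo
# (both script names are illustrative placeholders)
bash scripts/deploy_model.sh
python demo.py
```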

App Details

Features

The repository bundles several complementary artifacts and methods for web agents. It provides a formalization-driven data synthesis pipeline (WebShaper) and an agentic Expander for generating information-seeking instances. For WebSailor, it contains post-training and RL-oriented methodology for extended chain-of-thought reasoning and policy optimization, including the SailorFog-QA dataset, an RFT cold start and the DUPO algorithm. WebDancer supplies a four-stage training paradigm: browsing data construction, trajectory sampling, supervised fine-tuning and reinforcement learning. The repo also includes pre-trained model releases and deployment scripts, Gradio demo launch scripts, evaluation benchmarks such as WebWalkerQA, GAIA and BrowseComp, and example configuration and API integration points for deployment.
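The agentic behavior these projects train can be reduced to a think-act-observe loop: the policy proposes a tool call, the environment returns an observation, and the trajectory grows until the policy emits an answer. The sketch below is not code from the repo; it is a minimal stdlib-only illustration with a mock search tool and a scripted policy standing in for a trained model.

```python
from typing import Callable

def agent_loop(question: str,
               tools: dict[str, Callable[[str], str]],
               policy: Callable[[list[str]], tuple[str, str]],
               max_steps: int = 5) -> str:
    """Minimal think-act-observe loop: the policy proposes an action
    (tool name + argument) given the trajectory so far; the loop
    executes the tool and appends the observation."""
    trajectory = [f"question: {question}"]
    for _ in range(max_steps):
        action, arg = policy(trajectory)
        if action == "answer":               # policy decides it is done
            return arg
        observation = tools[action](arg)
        trajectory.append(f"{action}({arg}) -> {observation}")
    return "no answer within budget"

# Mock tool and scripted policy for demonstration only
corpus = {"capital of France": "Paris"}

def search(query: str) -> str:
    return corpus.get(query, "no results")

def policy(trajectory: list[str]) -> tuple[str, str]:
    last = trajectory[-1]
    if "-> " in last:                        # an observation arrived: answer
        return "answer", last.split("-> ")[1]
    return "search", trajectory[0].removeprefix("question: ")

print(agent_loop("capital of France", {"search": search}, policy))
# prints "Paris"
```

Real agents replace the scripted policy with an LLM and the mock tool with live web search and browsing, but the control flow is the same.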
Use Cases

This project helps researchers develop and benchmark, in a reproducible way, agents that traverse and search the web. It provides ready-made datasets and synthesized QA tasks to train and stress-test agentic behavior, benchmarks and evaluation suites to measure progress on complex browsing tasks, and pre-trained model checkpoints and demo scripts to try models locally or on cloud services. The included training recipes and RL routines illustrate practical pipelines for scaling from supervised fine-tuning to reinforcement learning. Deployment scripts and demo instructions make it easier to run interactive studies and compare models. Overall the repo lowers the barrier to experimenting with and extending state-of-the-art web traversal and information-seeking agents.
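A common bridge from supervised fine-tuning to reinforcement learning, and the idea behind an RFT-style cold start, is rejection sampling: roll out several trajectories per question, keep only those that reach the reference answer, and fine-tune on the survivors. The sketch below is an illustration, not the repo's implementation; `sample_fn` is a hypothetical stand-in for sampling from a base model.

```python
import random

def collect_rft_data(questions, answers, sample_fn, n_samples=8):
    """Rejection-sampling cold start: sample up to n_samples
    trajectories per question and keep the first one whose final
    answer matches the reference, yielding a clean SFT set."""
    sft_data = []
    for question, gold in zip(questions, answers):
        for _ in range(n_samples):
            trajectory, answer = sample_fn(question)
            if answer == gold:               # reject incorrect rollouts
                sft_data.append({"question": question,
                                 "trajectory": trajectory})
                break                        # one good trajectory is enough
    return sft_data

# Mock sampler that succeeds stochastically, standing in for a base model
rng = random.Random(0)

def mock_sample(question):
    success = rng.random() < 0.5
    return ["search", "read", "answer"], ("42" if success else "?")

data = collect_rft_data(["q1", "q2", "q3"], ["42", "42", "42"], mock_sample)
```

RL stages such as DUPO then start from the policy fine-tuned on this filtered data, rather than from the raw base model.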