AutoWebGLM

Basic Information

AutoWebGLM is the official implementation of a web navigation agent driven by a large language model, built to make LLMs better at browsing and interacting with real web pages. The project extends the ChatGLM3-6B model to perform automated web navigation tasks and addresses practical browsing challenges by combining model architecture work, data, training pipelines, environment integrations, and evaluation tooling. The repository provides code for training and inference, publicly released evaluation datasets and environments, and example modifications to web interaction platforms. It is intended for researchers and engineers who want to reproduce the paper's results, experiment with web-directed LLM behaviors, or evaluate and improve web navigation capabilities using the supplied benchmarks and environments.

App Details

Features
The repository documents several technical innovations and supporting artifacts. It includes an HTML simplification algorithm designed to reduce noisy page structure while preserving salient information for LLM consumption. It describes a hybrid human-AI training approach and curriculum construction for web browsing data. The project uses reinforcement learning and rejection sampling to improve webpage comprehension, browser operations, and task decomposition. It supplies a bilingual benchmark, AutoWebBench, covering Chinese and English web tasks, and shares the evaluation code and datasets. The repository also contains modifications and execution instructions for the WebArena and MiniWob++ environments to enable realistic web interaction testing.
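To make the idea of HTML simplification concrete, here is a minimal sketch using only Python's standard library. It is not the algorithm from the AutoWebGLM repository: the tag sets, the kept attributes, and the `[id]` labeling scheme are all assumptions chosen for illustration. It drops script and style content, keeps visible text, and lists interactive elements with a numeric label that an agent could refer to.

```python
from html.parser import HTMLParser

# Hypothetical choices for this sketch, not taken from the AutoWebGLM repository.
SKIP_TAGS = {"script", "style", "noscript", "svg"}
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
KEEP_ATTRS = ("type", "placeholder", "aria-label", "value")


class SimplifyingParser(HTMLParser):
    """Collects visible text and labeled interactive elements, one per line."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []
        self.skip_depth = 0  # > 0 while inside a skipped subtree
        self.next_id = 0     # label assigned to the next interactive element

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in INTERACTIVE_TAGS:
            return
        attr_map = dict(attrs)
        # Keep only a few attributes that describe what the element does.
        kept = [f'{name}="{attr_map[name]}"' for name in KEEP_ATTRS if attr_map.get(name)]
        suffix = " " + " ".join(kept) if kept else ""
        self.lines.append(f"[{self.next_id}] <{tag}{suffix}>")
        self.next_id += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and not self.skip_depth:
            self.lines.append(text)


def simplify_html(html: str) -> str:
    """Return a compact, line-oriented view of the page."""
    parser = SimplifyingParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


if __name__ == "__main__":
    page = (
        "<html><head><style>p{color:red}</style></head><body>"
        '<p>Search the  site</p><input type="text" placeholder="Query">'
        "<button>Go</button></body></html>"
    )
    print(simplify_html(page))
```

A real simplifier would also need to consider element visibility, layout position, and a token budget; this sketch only shows the basic trade of structure for brevity.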
Use Cases
AutoWebGLM helps researchers and developers build, evaluate, and iterate on web-capable language-model agents. By providing training recipes, evaluation scripts, and curated bilingual benchmarks, it enables reproducible comparison of web navigation strategies and model improvements. The HTML simplification and curriculum training methods reduce input complexity for LLMs, potentially improving performance on page understanding and action selection. Environment adaptations for WebArena and MiniWob++ let users run controlled interaction experiments. The public evaluation code, including an eval script, lets users score model outputs consistently. The repository is distributed under the Apache-2.0 license for research use and includes citation details for scholarly use.
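As an illustration of what consistent scoring can look like, the sketch below computes a step-level success rate by exact match between predicted and reference actions. It does not reproduce the repository's eval script: the action string format, the function names, and the whitespace normalization rule are hypothetical.

```python
def normalize(action: str) -> str:
    """Collapse whitespace so formatting differences do not count as errors."""
    return " ".join(action.split())


def step_success_rate(predictions, references):
    """Fraction of steps whose predicted action exactly matches the reference.

    Both arguments are equal-length lists of action strings in a
    hypothetical format such as 'click(id="3")'.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return matches / len(references)
```

Exact match is a strict criterion; a benchmark may instead score the operation type and the target element separately, so treat this only as a starting point.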
