GPT4V-AD-Exploration
Basic Information
This repository is the official code and asset companion to the technical report "On the Road with GPT-4V(ision): Explorations of Utilizing Visual-Language Model as Autonomous Driving Agent." It collects the original test images, case examples, and documented interactions in which the GPT-4V visual-language model is evaluated on tasks relevant to autonomous driving. According to the README, the project covers three areas: scenario understanding, reasoning about driving scenes, and cases where the model acts as a driving agent. Cases are grouped by task type into categorized directories. Each directory holds JSON files that record the prompts and GPT-4V responses, alongside the PNG images the model analyzed. The repository is released under the MIT license and is intended as a reproducible reference and dataset for researchers studying visual-language model behavior in driving contexts.
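For readers who want to browse the cases programmatically, the layout described above (category directories containing JSON records and PNG images) can be indexed with a short script. The following is a minimal sketch, not part of the repository: the function names `index_cases` and `load_record` are hypothetical, the top-level directory is assumed to be the task category, and nothing is assumed about the keys inside each JSON file.

```python
import json
from collections import defaultdict
from pathlib import Path


def index_cases(root):
    """Group JSON records and PNG images by top-level directory.

    Assumption: the first path component under `root` is the task
    category. The actual directory names are not assumed here.
    """
    root = Path(root)
    index = defaultdict(lambda: {"records": [], "images": []})
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix not in (".json", ".png"):
            continue
        rel = path.relative_to(root)
        # Skip hidden directories such as .git.
        if any(part.startswith(".") for part in rel.parts):
            continue
        # Files directly under root have no category directory.
        category = rel.parts[0] if len(rel.parts) > 1 else "."
        key = "records" if suffix == ".json" else "images"
        index[category][key].append(rel.as_posix())
    return dict(index)


def load_record(root, rel_path):
    """Parse one JSON file; its internal schema is left to the caller."""
    with open(Path(root) / rel_path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    import sys

    repo_root = sys.argv[1] if len(sys.argv) > 1 else "."
    for category, files in index_cases(repo_root).items():
        print(category, len(files["records"]), "JSON,", len(files["images"]), "PNG")
```

Run from a local clone, the script prints one line per top-level directory with its JSON and PNG counts. Because the JSON schema is not documented in this summary, `load_record` returns the parsed object unchanged, so the structure should be inspected before relying on specific field names.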