
Basic Information

ClickClickClick is a developer-oriented framework for enabling autonomous control of Android devices and macOS computers using large language models, either local or cloud-hosted. It is designed to let developers describe high-level UI tasks in natural language and have the system plan, locate UI elements via screenshots, and execute interactions automatically. The repo includes executors for Android (ADB) and macOS, integrated planners and visual finders backed by multiple LLM providers, and configurable settings via a models.yaml configuration file. It exposes a command line tool, a Python API, a REST API server, and a Gradio web interface so tasks can be run interactively, scripted, or called programmatically. The README documents installation, prerequisites such as Python 3.11 and ADB or accessibility permissions, examples for email, navigation, and system tasks, and guidance for model selection and troubleshooting.
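The plan, locate, and execute flow described above can be illustrated with a minimal sketch. This is not ClickClickClick's actual API: `Step`, `run_task`, `stub_planner`, and `stub_finder` are hypothetical names invented for illustration, and the planner and finder are hard-coded stand-ins for the LLM calls. The only real interface assumed is the standard `adb shell input tap` and `adb shell input text` commands, which are built here as argument lists and passed to a caller-supplied executor rather than run.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One planned UI action (hypothetical structure, not the project's own)."""
    action: str       # "tap" or "type"
    target: str = ""  # description of the element to locate, used for taps
    text: str = ""    # text to enter, used for typing


def run_task(
    task: str,
    planner: Callable[[str], List[Step]],
    finder: Callable[[str], Tuple[int, int]],
    executor: Callable[[List[str]], None],
) -> List[List[str]]:
    """Plan the task, locate each tap target, and pass ADB commands to the executor."""
    issued = []
    for step in planner(task):
        if step.action == "tap":
            # In a real system the finder would inspect a screenshot.
            x, y = finder(step.target)
            cmd = ["adb", "shell", "input", "tap", str(x), str(y)]
        elif step.action == "type":
            cmd = ["adb", "shell", "input", "text", step.text]
        else:
            raise ValueError(f"unsupported action: {step.action}")
        executor(cmd)
        issued.append(cmd)
    return issued


def stub_planner(task: str) -> List[Step]:
    # Stand-in for an LLM planner: always returns the same two steps.
    return [Step("tap", target="Compose button"), Step("type", text="hello")]


def stub_finder(target: str) -> Tuple[int, int]:
    # Stand-in for a screenshot-based finder: returns fixed coordinates.
    return (540, 1800)


# Dry run: the executor discards each command instead of invoking adb.
commands = run_task("Draft an email saying hello", stub_planner, stub_finder, lambda cmd: None)
```

Swapping the no-op executor for one that calls `subprocess.run(cmd, check=True)` would send the commands to a connected device, assuming ADB is installed and the device is authorized.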

App Details

Features
Multi-platform support for Android and macOS executors with ADB and accessibility integration.
Pluggable LLM support including OpenAI, Anthropic Claude, Google Gemini, and local Ollama models, with recommendations for planner and finder roles.
Three user interfaces: CLI commands, a Python API for embedding in applications, and a Gradio web UI for visual task input and monitoring.
Screenshot-based visual automation that detects and interacts with UI elements.
Configurable execution parameters and model settings through config/models.yaml, including timeouts, delays, screen coordinates, and image sizes.
A lightweight REST API server for remote task execution.
Debug and troubleshooting guidance, performance tuning tips, and example task prompts for common workflows.
Use Cases
The project reduces the need to write brittle, element-specific automation scripts by combining LLM-based planning with screenshot-driven element detection, enabling natural-language task descriptions to drive device actions. It supports both cloud and local models so teams can balance performance, cost, and privacy, and it provides multiple integration points (CLI, Python, REST, web UI) to fit into development workflows, CI tests, or demo environments. Developers can quickly prototype automations such as composing emails, navigating apps, or performing system checks across devices. Built-in configuration and model recommendations speed setup, while debug mode, troubleshooting steps, and performance optimizations help stabilize runs on slower devices. The roadmap and plugin goals indicate extensibility for multi-device orchestration and platform expansion.
