Basic Information

airda (Air Data Agent) is a multi-agent data analysis project that helps translate analysis requests into executable artifacts. It is designed to understand data development and analysis needs, interpret data schemas and business metrics, and generate SQL and Python code for queries, data processing, and machine learning tasks. The system orchestrates multiple specialized agents for tasks such as data search, SQL generation, code generation, and visualization analysis. It can connect to data sources, build a knowledge base, and produce application outputs such as dashboards, data APIs, and data applications. The repository provides a CLI, environment configuration, and integration points for embedding models and MongoDB, which support retrieval and storage of metadata and embeddings.

App Details

Features
The README highlights precise data retrieval across hundreds or thousands of tables, business-knowledge-aware understanding of metrics and calculation formulas, and multi-agent collaboration with multi-turn dialogues. Implemented capabilities include SQL generation, data ingestion and a knowledge base. The project supports custom configuration via environment templates and log configuration, uses a default Chinese embedding model (stella-large-zh-v2), and requires Python 3.10 or newer. Operational dependencies include MongoDB, with Docker instructions provided. CLI commands cover environment loading, logging config, adding and syncing MySQL data sources, listing data sources, and running an interactive CLI for question answering. The README also documents current completion status and planned features.
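The table-retrieval idea mentioned above can be illustrated with a minimal sketch. This is not airda's implementation: the `embed` function here is a toy bag-of-words stand-in for a real embedding model such as stella-large-zh-v2, the `TABLES` catalog is invented for illustration, and an in-memory dictionary stands in for the MongoDB store. All function and variable names are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (token -> count).
    Hypothetical stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Invented table catalog: table name -> short description.
# In a real deployment this metadata would live in a database.
TABLES = {
    "orders": "customer orders with order date amount and status",
    "users": "registered users with signup date and country",
    "page_views": "web page views per session and url",
}

def search_tables(question, tables, top_k=2):
    """Return up to top_k table names whose descriptions best match the question."""
    q = embed(question)
    scored = [(cosine(q, embed(desc)), name) for name, desc in tables.items()]
    scored.sort(reverse=True)
    # Drop tables with no overlap at all.
    return [name for score, name in scored[:top_k] if score > 0]
```

Calling `search_tables("total order amount by status", TABLES)` ranks the `orders` table first, since only its description shares tokens with the question. A production system would replace the toy embedding with dense vectors from a trained model and the dictionary scan with an indexed vector search.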
Use Cases
airda helps teams and analysts by automating routine parts of data analysis and discovery, reducing time spent finding relevant tables and writing SQL or Python code. Its business-aware understanding can surface correct metrics and calculation logic, improving the relevance of generated queries and analyses. Multi-agent collaboration and self-debugging of generated code aim to lower error rates and speed iteration. The ability to turn analysis results into dashboards, data APIs or applications makes it easier to operationalize insights. Provided CLI tools and configuration templates simplify integration, while MongoDB-backed metadata and embeddings enable scalable retrieval. The README notes some features like chart generation, corpora and task planning are still in progress.
