Basic Information

Airflow AI SDK is a Python software development kit designed to integrate large language models and agent-style AI calls into Apache Airflow pipelines. It provides decorator-based task primitives that let Airflow DAGs call LLMs, run multi-step agent logic, produce embeddings, and branch DAG control flow based on model outputs. The project aims to let teams use mature orchestration tooling to run LLM-driven workflows alongside traditional ETL, operational processes, and ML pipelines. The README documents a quick install option with optional dependencies, points to an examples repository containing a full local Airflow environment and sample pipelines, and links to a docs directory with getting started and usage guides. The code targets developers and platform engineers who want to embed model inference, reasoning agents, and output parsing into Airflow-managed workflows.

App Details

Features
The README highlights decorator-based task types for Airflow: @task.llm for calling language models, @task.agent for orchestrating multi-step agent reasoning with custom tools, and @task.embed for creating vector embeddings from text. Output parsing is automatic: the SDK uses type hints to validate and parse LLM responses. A branching decorator, @task.llm_branch, alters DAG control flow based on model output. The SDK advertises model support through the underlying Pydantic AI library, including providers such as OpenAI, Anthropic, and Gemini. The package is installable via pip with optional extras, and the repository points to an examples repo and a docs directory for guided usage.
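
To make the type-based parsing and branching ideas concrete, here is a minimal standard-library sketch. It does not use the SDK or Airflow: `llm_task`, `parse_into`, `choose_branch`, `fake_model`, and the `Sentiment` type are all hypothetical names invented for illustration, and the real decorators may take different parameters and behave differently.

```python
import json
from dataclasses import dataclass
from functools import wraps
from typing import Callable, get_type_hints


@dataclass
class Sentiment:
    label: str
    score: float


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; always returns the same JSON."""
    return '{"label": "positive", "score": 0.9}'


def parse_into(result_type, raw: str):
    """Check raw JSON against the dataclass field hints and build an instance."""
    data = json.loads(raw)
    hints = get_type_hints(result_type)
    missing = sorted(set(hints) - set(data))
    if missing:
        raise ValueError(f"model output is missing fields: {missing}")
    # Coerce each value with its annotated type (works for simple types only).
    return result_type(**{name: typ(data[name]) for name, typ in hints.items()})


def llm_task(model: Callable[[str], str], result_type):
    """Hypothetical decorator: the wrapped function builds the prompt,
    and the model's reply is parsed into result_type."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            prompt = func(*args, **kwargs)
            return parse_into(result_type, model(prompt))
        return wrapper
    return decorator


def choose_branch(model_reply: str, downstream: list[str]) -> str:
    """Accept the model's reply only if it names a known downstream task."""
    choice = model_reply.strip()
    if choice not in downstream:
        raise ValueError(f"unexpected branch: {choice!r}")
    return choice


@llm_task(fake_model, result_type=Sentiment)
def classify(review: str) -> str:
    return f"Classify the sentiment of this review: {review}"
```

Calling `classify("Great product")` returns a `Sentiment` instance rather than a raw string, and `choose_branch` rejects any reply that does not match a declared downstream task. Failing early on malformed output is the property that type-based parsing is meant to provide.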
Use Cases
The SDK helps teams put LLM calls and agent workflows under Airflow orchestration so AI steps can be scheduled, monitored, and integrated with existing data and ML pipelines. By exposing LLM and agent behavior as Airflow tasks, it enables reuse of Airflow features like DAG scheduling, parameters, and branching while providing type-based parsing to reduce runtime errors from model outputs. Embedding tasks let pipelines generate vectors for downstream search or retrieval. The examples repository and documentation provide a ready local environment and usage patterns to speed onboarding. The project is intended for developers and platform operators who need to operationalize model-driven steps alongside traditional pipeline tasks.
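
As an illustration of how embedding output might feed downstream search or retrieval, the sketch below ranks documents by cosine similarity. The `toy_embed` function is a hypothetical word-count stand-in for a real embedding model, and the vocabulary and documents are made up for the example; none of this is the SDK's API.

```python
import math
from collections import Counter


def toy_embed(text: str, vocab: list[str]) -> list[float]:
    """Hypothetical embedding: word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query_vec: list[float], index: dict[str, list[float]]) -> str:
    """Return the id of the indexed vector most similar to the query."""
    return max(index, key=lambda doc_id: cosine(query_vec, index[doc_id]))


vocab = ["airflow", "dag", "invoice", "payment"]
docs = {
    "ops": "airflow dag retries dag",
    "billing": "invoice payment overdue",
}
# An upstream embedding task would produce these vectors in a real pipeline.
index = {doc_id: toy_embed(text, vocab) for doc_id, text in docs.items()}
best = nearest(toy_embed("late payment invoice", vocab), index)
```

Here `best` is the id of the document whose vector points in the direction closest to the query. In a real pipeline the vectors would come from an embedding model and would usually be stored in a vector database instead of an in-memory dict.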
