phoenix
Basic Information
Phoenix is an open-source AI observability platform for experimenting with, evaluating, and troubleshooting LLM applications. It captures traces of model calls, benchmarks responses and retrievals, and organizes versioned datasets and experiments for evaluating prompts, models, and retrieval components. The platform provides a playground for prompt engineering, prompt management with versioning and tagging, and the ability to replay traced LLM calls. Phoenix is vendor- and language-agnostic and integrates with popular frameworks and providers through OpenTelemetry-based instrumentation. It can run locally, in notebooks, in containers, or in cloud deployments. It is distributed as a full Python package, with lighter subpackages and JavaScript/TypeScript clients for use with an already-deployed Phoenix platform.
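
To make the idea of capturing and replaying traced model calls concrete, here is a minimal conceptual sketch that uses only the Python standard library. It is not Phoenix's API and does not use OpenTelemetry. The names Span, TRACE, traced, fake_llm, and replay are all hypothetical, chosen for illustration, and the in-memory list is an assumed stand-in for a real trace backend.

```python
import time
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable


@dataclass
class Span:
    """One recorded call: a simplified stand-in for a trace span."""
    name: str
    inputs: dict[str, Any]
    output: Any = None
    duration_s: float = 0.0


# In-memory list standing in for a trace backend (hypothetical, not Phoenix storage).
TRACE: list[Span] = []


def traced(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Record the keyword inputs, output, and latency of every call to fn."""
    @wraps(fn)
    def wrapper(**kwargs: Any) -> Any:
        span = Span(name=fn.__name__, inputs=dict(kwargs))
        start = time.perf_counter()
        try:
            span.output = fn(**kwargs)
            return span.output
        finally:
            # Runs on success and on error, so failed calls are recorded too.
            span.duration_s = time.perf_counter() - start
            TRACE.append(span)
    return wrapper


@traced
def fake_llm(prompt: str, temperature: float = 0.0) -> str:
    """Hypothetical model call; a real application would call a provider SDK here."""
    return prompt.upper()


def replay(span: Span, fn: Callable[..., Any], **overrides: Any) -> Any:
    """Re-run a recorded call, optionally overriding some of its inputs."""
    return fn(**{**span.inputs, **overrides})


answer = fake_llm(prompt="hello", temperature=0.2)
# Replay the first recorded call with a different prompt but the same temperature.
replayed = replay(TRACE[0], fake_llm, prompt="hello again")
```

Because replay invokes the decorated function, the replayed call is itself recorded as a second span. That lets the original and the modified call be compared side by side, which is the kind of workflow the prompt playground and experiment features are described as supporting.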