Report Abuse

Basic Information

Paramount is a developer-focused tool for recording and evaluating AI chat interactions so expert reviewers can perform quality assurance, capture ground truth, and run automated regression tests on LLM-driven systems. The package instruments AI functions via a provided decorator to capture inputs and outputs, stores recordings to a configurable backend, and exposes a local UI and API for human subject matter experts to inspect and score chat transcripts. It is intended to run entirely offline in a private environment. Configuration is done through a paramount.toml file which is autogenerated on first run and maps which function parameters correspond to chat lists, roles, and content. The repo includes an example script, CLI entry to launch the UI, and a Dockerfile for containerized deployments.

Links

App Details

Features
Captures AI function calls with a simple decorator to record message history and new inputs. Provides a local web UI and API for reviewing recorded chats and tracking accuracy over time. Supports configurable storage backends with CSV and Postgres options and an autogenerated paramount.toml for mapping inputs, outputs, and UI columns. Offers replay capability via a function_url pointing to an LLM API for regression testing. Configuration options include API endpoint, port, chat list role and content keys, meta and input/output columns, and optional splitting by bot identifier. Includes example usage, a minimal example script, Dockerfile.server for container deployment, and separate developer documentation for client and server configuration. Licensed under GPL.
Use Cases
Paramount helps teams validate and improve LLM behavior by producing reproducible recordings of chat interactions that subject matter experts can evaluate. It enables capturing ground truth and running replay-based regression tests to detect regressions after model or code changes. The local UI and configurable storage let teams review results privately and track accuracy improvements over time. Docker support and configurable backends make it easier to integrate into existing CI or developer workflows. The tool simplifies mapping complex function inputs and outputs into a chat display format so reviewers see conversations as delivered to end users. Overall it reduces manual effort in QA and provides structured artifacts for analysis and auditing.

Please fill the required fields*