promptfoo

Basic Information

promptfoo is a developer-focused tool for testing, evaluating, and hardening LLM applications. It provides a local-first CLI and Node.js package to run automated prompt and model evaluations, compare model outputs side-by-side across providers, and perform red teaming and vulnerability scanning to generate security reports. The project is designed to reduce trial-and-error during LLM development and to help teams ship more secure, reliable AI apps. It supports integration into CI/CD pipelines, runs entirely on the developer's machine so prompts and data remain private, and is documented with getting-started guides and red-team guidance. The README highlights support for multiple model providers and emphasizes a data-driven approach to choosing prompts and models.
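
A minimal sketch of what a side-by-side evaluation could look like, assuming the YAML config format described in the getting-started guide; the provider IDs, model names, and assertion value here are illustrative placeholders, not a definitive setup:

```yaml
# promptfooconfig.yaml
description: Compare one prompt across two providers
prompts:
  - "Summarize in one sentence: {{article}}"
providers:
  - openai:gpt-4o-mini                             # provider:model IDs are assumptions
  - anthropic:messages:claude-3-5-haiku-20241022
tests:
  - vars:
      article: "promptfoo is a local-first tool for testing LLM apps."
    assert:
      - type: icontains        # case-insensitive substring check
        value: promptfoo
```

Running the suite and inspecting results happens locally:

```sh
npx promptfoo@latest init   # scaffold a starter promptfooconfig.yaml
npx promptfoo@latest eval   # run every prompt x provider x test combination
npx promptfoo@latest view   # open the local web viewer to compare outputs
```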

App Details

Features
- Automated evaluation workflows for prompts and models, run from the command line or via the Node.js package.
- Red teaming and vulnerability scanning that produce security reports and help identify model failures and risky behaviors (a command-line sketch follows this list).
- Model comparison across multiple providers, including OpenAI, Anthropic, Azure, Bedrock, and Ollama.
- Local-first operation with live reload and caching to speed iteration and preserve privacy.
- CI/CD integrations to run checks automatically.
- Documentation, llms.txt support for discoverability, and an open-source MIT license with an active community and a Discord for collaboration.
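
A sketch of the red-teaming workflow, assuming the redteam subcommands covered in the project's red-team guidance; the target, attack plugins, and strategies are chosen interactively during init, so none are hard-coded here:

```sh
npx promptfoo@latest redteam init     # scaffold a red-team config (target, plugins, strategies)
npx promptfoo@latest redteam run      # generate adversarial probes and run them against the target
npx promptfoo@latest redteam report   # open the resulting vulnerability report locally
```
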
Use Cases
Promptfoo helps developer teams move from ad-hoc experimentation to reproducible, measurable testing of LLM behavior. By providing automated evaluations and red-team scans, it surfaces correctness, safety, and prompt regressions before deployment. Local execution keeps sensitive prompts and examples on developer machines, while caching and live reload speed iteration. CI/CD integration enables continuous checks that catch regressions when models or prompts change (a workflow sketch follows below). Model-comparison features make it easier to select providers and configurations based on metrics rather than guesswork. Docs and community channels help teams adopt best practices for secure, production-grade LLM apps.
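
A workflow sketch for running evaluations in CI, assuming GitHub Actions; the file path, trigger, and secret name are assumptions, and `promptfoo eval` is expected to exit non-zero when assertions fail, which fails the build:

```yaml
# .github/workflows/llm-eval.yml (hypothetical path)
name: LLM regression checks
on: [pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # -c points at the eval config checked into the repo
      - run: npx promptfoo@latest eval -c promptfooconfig.yaml
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}  # assumed secret name
```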
