Basic Information

Vocode Core is an open source Python library for building voice-based applications that connect large language models to real-time audio input and output. It is designed for developers who want to create conversational voice experiences such as live streaming conversations with an LLM, voice-based personal assistants, phone-call agents, and integrations into conferencing platforms like Zoom. The project provides a single library with abstractions for audio input and output, conversational flow, agent configuration, and telephony hosting so developers can prototype and deploy voice-enabled LLM apps quickly. The README includes a pip install quickstart and example code that shows how to wire a microphone, transcriber, LLM agent, and synthesizer into a streaming conversation. The repo emphasizes extensibility, community contribution, and documentation to help developers extend integrations and deploy production voice agents.
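The condensed sketch below follows the streaming-conversation pattern described in that quickstart. The class and config names (StreamingConversation, DeepgramTranscriber, ChatGPTAgent, AzureSynthesizer and their configs) are taken from the documented example, but exact signatures, required API keys, and helper arguments vary between releases, so the installed version's README should be treated as authoritative.

# Minimal sketch of the streaming quickstart (assumed API surface; verify
# class names and signatures against the installed vocode release).
import asyncio

from vocode.helpers import create_streaming_microphone_input_and_speaker_output
from vocode.streaming.agent.chat_gpt_agent import ChatGPTAgent
from vocode.streaming.models.agent import ChatGPTAgentConfig
from vocode.streaming.models.message import BaseMessage
from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
from vocode.streaming.models.transcriber import (
    DeepgramTranscriberConfig,
    PunctuationEndpointingConfig,
)
from vocode.streaming.streaming_conversation import StreamingConversation
from vocode.streaming.synthesizer.azure_synthesizer import AzureSynthesizer
from vocode.streaming.transcriber.deepgram_transcriber import DeepgramTranscriber


async def main():
    # Open the system microphone and speaker as streaming audio devices.
    microphone_input, speaker_output = create_streaming_microphone_input_and_speaker_output(
        use_default_devices=True,
    )

    # Wire microphone -> transcriber -> agent -> synthesizer into one conversation.
    conversation = StreamingConversation(
        output_device=speaker_output,
        transcriber=DeepgramTranscriber(
            DeepgramTranscriberConfig.from_input_device(
                microphone_input,
                endpointing_config=PunctuationEndpointingConfig(),
            )
        ),
        agent=ChatGPTAgent(
            ChatGPTAgentConfig(
                initial_message=BaseMessage(text="Hello!"),
                prompt_preamble="Have a friendly conversation with the caller.",
            )
        ),
        synthesizer=AzureSynthesizer(
            AzureSynthesizerConfig.from_output_device(speaker_output)
        ),
    )

    await conversation.start()
    # Pump raw audio chunks from the microphone into the running conversation.
    while conversation.is_active():
        chunk = await microphone_input.get_audio()
        conversation.receive_audio(chunk)


if __name__ == "__main__":
    asyncio.run(main())

Running the script requires the relevant provider credentials (for example Deepgram, OpenAI, and Azure Speech keys) to be available to the library, typically via environment variables.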

App Details

Features
Vocode Core provides primitives and ready-made integrations for assembling voice-LLM applications. Core features include streaming conversations from system audio, telephony support for inbound and outbound phone calls, Zoom dial-in for meetings, and the ability to place outbound calls from within agent flows. The library has built-in connectors for transcription providers such as AssemblyAI, Deepgram, Gladia, Google Cloud, Microsoft Azure, RevAI, Whisper, and Whisper.cpp, and supports LLM backends such as OpenAI and Anthropic. For synthesis it integrates with services including Rime.ai, Microsoft Azure, Google Cloud, Play.ht, Eleven Labs, Cartesia, Coqui, gTTS, StreamElements, Bark, and AWS Polly. The project also includes a React SDK, example applications, and a quickstart snippet showing the microphone -> transcriber -> agent -> synthesizer wiring.
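Because each provider is exposed through its own transcriber or synthesizer class plus a matching config, swapping vendors is mostly a matter of changing which components are constructed. The sketch below illustrates that pattern with AssemblyAI and Eleven Labs; the specific class and module names here are assumptions based on the library's naming convention and should be confirmed against the installed version.

# Provider-swap sketch: same conversation wiring, different vendors.
# Class and module names are assumed from the library's one-module-per-provider
# pattern; verify them against the installed release before use.
from vocode.streaming.models.synthesizer import ElevenLabsSynthesizerConfig
from vocode.streaming.models.transcriber import (
    AssemblyAITranscriberConfig,
    PunctuationEndpointingConfig,
)
from vocode.streaming.synthesizer.eleven_labs_synthesizer import ElevenLabsSynthesizer
from vocode.streaming.transcriber.assembly_ai_transcriber import AssemblyAITranscriber


def build_alternate_pipeline(microphone_input, speaker_output):
    """Return (transcriber, synthesizer) that replace the Deepgram/Azure pair
    from the quickstart with AssemblyAI and Eleven Labs."""
    transcriber = AssemblyAITranscriber(
        AssemblyAITranscriberConfig.from_input_device(
            microphone_input,
            endpointing_config=PunctuationEndpointingConfig(),
        )
    )
    synthesizer = ElevenLabsSynthesizer(
        ElevenLabsSynthesizerConfig.from_output_device(speaker_output)
    )
    return transcriber, synthesizer

The returned objects can be passed to StreamingConversation in place of the quickstart's transcriber and synthesizer, with the corresponding provider API keys configured.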
Use Cases
Vocode Core reduces the engineering effort required to add real-time voice interaction to applications by providing reusable building blocks, provider integrations, and example workflows. Developers can prototype a working voice conversation in minutes using the quickstart and then swap transcription, LLM, or TTS providers as needed. The telephony features let teams connect LLM agents to phone numbers for inbound and outbound calls, enabling practical use cases like voice assistants, automated callers, and conversational agents that dial into meetings. Documentation, a community Discord, and contribution guidance make it straightforward to extend the library or add new integrations. The modular design supports deploying voice agents across desktop audio, telephony, and conferencing environments while keeping provider-specific implementation details encapsulated.
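For the telephony use case, the sketch below outlines an outbound call in the style of the project's telephony examples. The OutboundCall and InMemoryConfigManager names and the constructor arguments are assumptions drawn from those examples, and the host, phone numbers, and prompt are placeholders; the current telephony documentation should be treated as authoritative.

# Outbound-call sketch (assumed API surface; constructor arguments, host,
# and phone numbers below are illustrative placeholders).
import asyncio

from vocode.streaming.models.agent import ChatGPTAgentConfig
from vocode.streaming.models.message import BaseMessage
from vocode.streaming.telephony.config_manager.in_memory_config_manager import (
    InMemoryConfigManager,
)
from vocode.streaming.telephony.conversation.outbound_call import OutboundCall


async def place_call():
    call = OutboundCall(
        base_url="example.ngrok.app",   # hypothetical public host for telephony callbacks
        to_phone="+15550000000",        # placeholder destination number
        from_phone="+15551111111",      # placeholder provisioned number
        config_manager=InMemoryConfigManager(),
        agent_config=ChatGPTAgentConfig(
            initial_message=BaseMessage(text="Hi, this is an automated assistant."),
            prompt_preamble="Politely confirm the customer's appointment.",
        ),
        # Telephony provider credentials (e.g. Twilio) are typically supplied
        # via environment variables or settings; see the telephony docs.
    )
    await call.start()


if __name__ == "__main__":
    asyncio.run(place_call())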
