
Basic Information

joinly is an open-source connector middleware and MCP server that lets developers equip AI agents to join and interact in live video meetings. It ships an out-of-the-box Docker image and a client package that run either as a self-hosted server or as a standalone client connecting to meeting URLs on Zoom, Google Meet, Microsoft Teams, or any browser-based meeting. The project exposes meeting tools and real-time resources such as a live transcript feed, participant and chat access, and programmatic actions for speaking, sending chat messages, muting, and joining or leaving meetings. It is designed to be modular, so teams can bring their own LLM provider and choose STT and TTS services. The README contains quickstart instructions, example configs, and demos to help developers run and extend agents that act in meetings while keeping deployments privacy-first.
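
As a rough illustration of how a client might drive such a server, the sketch below uses the Python MCP SDK to connect and ask the agent to join a meeting. The endpoint URL, transport, the join_meeting tool name, and its meeting_url argument are assumptions made for this example; the project README documents the actual values.

```python
# Minimal sketch: connect to a self-hosted joinly MCP server and join a meeting.
# Assumes an SSE endpoint at http://localhost:8000/sse and a tool named
# "join_meeting" taking a "meeting_url" argument -- check the README for the
# actual transport, port, tool names, and parameters.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main() -> None:
    async with sse_client("http://localhost:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "join_meeting",
                arguments={"meeting_url": "https://meet.google.com/abc-defg-hij"},
            )
            print(result.content)


if __name__ == "__main__":
    asyncio.run(main())
```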

App Details

Features
joinly provides live interaction capabilities that let agents respond by voice or chat and execute tasks during a meeting. It includes conversational flow handling for interruptions and multi-speaker scenarios. Cross-platform browser support lets agents join Google Meet, Zoom, and Teams. The architecture is modular and supports bring-your-own-LLM providers, including OpenAI, Anthropic, and local Ollama. Multiple speech providers are supported for transcription and synthesis, such as Whisper, Deepgram, Kokoro, and ElevenLabs. The MCP server exposes tools including join meeting, leave meeting, speak text, send chat message, mute and unmute, get chat history, get participants, and get transcript. It also offers a subscribable live transcript resource and supports Docker-based deployment, a CUDA image for GPU acceleration, a joinly client package, and example configurations for integrating additional MCP servers.
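
A hedged sketch of exercising that tool surface from an already-connected MCP session is shown below. The snake_case tool names, argument keys, and the transcript resource URI are guesses based on the descriptions above, not the project's confirmed identifiers.

```python
# Sketch: explore the exposed tools and resources from an existing ClientSession.
# Tool names ("speak_text", "send_chat_message"), argument keys, and the
# transcript resource URI are illustrative assumptions, not confirmed identifiers.
from mcp import ClientSession


async def explore(session: ClientSession) -> None:
    # Discover whatever tools the server actually exposes.
    tools = await session.list_tools()
    print("Available tools:", [t.name for t in tools.tools])

    # Speak a sentence and drop a message into the meeting chat.
    await session.call_tool("speak_text", arguments={"text": "Hello, I'm the meeting agent."})
    await session.call_tool("send_chat_message", arguments={"message": "Agenda link coming up."})

    # Read the live transcript resource (URI is a placeholder).
    transcript = await session.read_resource("transcript://live")
    for item in transcript.contents:
        print(getattr(item, "text", ""))
```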
Use Cases
joinly simplifies building meeting-aware AI agents by providing the infrastructure and toolset required for real-time participation. Developers can quickly deploy a self-hosted MCP server or run a client in Docker and configure LLM, STT, and TTS providers via environment variables or command-line options. The exposed tools and live transcript resource enable common meeting automation tasks such as real-time note taking, spoken responses, chat interactions, participant inspection, and the integrations shown in the demos, like editing a Notion page or creating a GitHub issue. GPU-enabled images improve transcription and TTS performance for production use. The project includes example clients, MCP configuration syntax for adding external tool servers, developer container support, and debugging options to accelerate extension, testing, and safe, privacy-conscious deployments.
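
As one concrete use case, the sketch below polls a live transcript resource on an interval and appends any new text to a notes file. The polling approach, resource URI, and file handling are assumptions standing in for whatever subscription mechanism the server actually provides.

```python
# Sketch of a naive real-time note taker: poll the transcript resource and
# append anything new to a local file. The resource URI and the idea of polling
# (rather than the server's own subscription mechanism) are assumptions.
import asyncio
from pathlib import Path

from mcp import ClientSession


async def take_notes(session: ClientSession, notes_path: Path, interval_s: float = 10.0) -> None:
    seen = ""
    while True:
        result = await session.read_resource("transcript://live")
        text = "".join(getattr(item, "text", "") for item in result.contents)
        if len(text) > len(seen):
            # Append only the part of the transcript we have not written yet.
            with notes_path.open("a", encoding="utf-8") as f:
                f.write(text[len(seen):])
            seen = text
        await asyncio.sleep(interval_s)
```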
