LitServe
Basic Information
LitServe is a developer-focused serving framework for building and deploying complete AI systems, including agents, multi-component pipelines, RAG servers, MCP servers, and single- or multi-model inference endpoints. It exposes simple Python primitives (for example, LitAPI and LitServer): users implement setup and predict methods to wire together models, databases, and custom logic without writing YAML or bespoke MLOps glue code. The project targets a wide range of model types, from LLMs to vision, audio, and classical ML models, and supports both self-hosting and one-click deployment to a managed cloud. LitServe is designed to give fine-grained control over batching, streaming, multi-GPU execution, autoscaling, and worker behavior while remaining compatible with common ML stacks such as PyTorch, JAX, and TensorFlow.
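The LitAPI and LitServer primitives described above can be sketched roughly as follows. This is a minimal illustration, not an official example: the class name SquareAPI, the lambda standing in for a real model, the "input"/"output" JSON keys, the serve helper, and port 8000 are all assumptions made for the sketch. The import guard exists only so the class logic can be exercised without the litserve package installed.

```python
try:
    import litserve as ls
    _Base = ls.LitAPI
except ImportError:  # fallback so the class below can be tried without litserve
    ls = None
    _Base = object


class SquareAPI(_Base):
    """Hypothetical API that squares a number; stands in for a real model."""

    def setup(self, device):
        # Called once per worker at startup: load models, open connections, etc.
        # A trivial lambda is used here in place of a real model (assumption).
        self.model = lambda x: x ** 2

    def decode_request(self, request):
        # Pull the model input out of the incoming JSON payload.
        return request["input"]

    def predict(self, x):
        # Run inference on the decoded input.
        return self.model(x)

    def encode_response(self, output):
        # Wrap the prediction in a JSON-serializable response.
        return {"output": output}


def serve(port=8000):
    """Start the HTTP server (blocks until stopped). Requires litserve."""
    if ls is None:
        raise RuntimeError("litserve is not installed")
    server = ls.LitServer(SquareAPI(), accelerator="auto")
    server.run(port=port)
```

Calling serve() would start the server. Assuming the default route, a POST to /predict with a JSON body such as {"input": 4} should then pass through decode_request, predict, and encode_response in that order. Options for batching, streaming, and worker counts are set on the server or API object and are left out of this sketch.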