Basic Information

This repository provides a collection of hands-on labs and reference material that explore the AI Gateway pattern using Azure API Management. It documents experiments that demonstrate how to manage, secure and govern AI service APIs, focusing on integrations with Azure OpenAI, Azure AI Foundry and compatible model backends. The content is organized as Jupyter notebooks accompanied by deployment artifacts such as Bicep templates and API Management policy examples. Labs cover agent-oriented scenarios including Model Context Protocol (MCP) client authorization, OpenAI Agents, Azure AI Agent Service and realtime audio/text experiments. The repo also demonstrates inference API patterns, self-hosted small language models, function calling and operational controls like FinOps. The materials are intended as practical blueprints and learning playgrounds rather than production software.

App Details

Features
A broad set of labs covering API management patterns for AI, each with a Jupyter notebook, Bicep deployment templates and APIM policy samples. Notable labs include Model Context Protocol (MCP) client authorization, OpenAI Agents integration, Azure AI Agent Service orchestration, realtime audio/text examples, function calling with Azure Functions, and SLM self-hosting. Operational and governance features include token rate limiting, token metrics emission, semantic caching, model routing, backend pool load balancing, access control and content safety enforcement. Supporting tools include tracing and streaming notebooks and a customizable mock server that simulates the OpenAI API locally. The repo maps labs to the Azure Well-Architected Framework and provides FinOps guidance for token and cost controls.
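From a client's perspective, the main operational effect of gateway-side token rate limiting is receiving HTTP 429 responses when a limit is hit. The repo itself implements this with APIM policies; as a hypothetical client-side sketch (the `send` callable and the simulated responses are illustrative, not from the labs), a retry loop with exponential backoff might look like this:

```python
import time

def call_with_retry(send, max_retries=3, base_delay=1.0):
    """Invoke `send()` (a zero-arg callable returning (status, body),
    standing in for any HTTP call to the APIM gateway) and retry on
    HTTP 429 token-rate-limit responses with exponential backoff."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            time.sleep(base_delay * (2 ** attempt))
    return status, body

# Simulated gateway: the first two calls are throttled, the third succeeds.
responses = iter([(429, "token limit exceeded"),
                  (429, "token limit exceeded"),
                  (200, "completion text")])

status, body = call_with_retry(lambda: next(responses), base_delay=0.01)
# After two throttled attempts, status is 200.
```

In practice a client would also honor the `Retry-After` header that APIM can return, rather than relying on a fixed backoff schedule.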
Use Cases
The repository helps architects and developers learn how to build an API-led gateway for AI workloads and to apply APIM policies to secure, monitor and control model usage. Step-by-step notebooks and Bicep templates enable rapid experimentation and reproducible deployments in an Azure subscription. Teams can prototype agentic flows, integrate tools with Model Context Protocol, test realtime and function-calling scenarios, and evaluate operational controls such as rate limiting, token metrics, logging and semantic caching. The labs also show patterns for cost governance with FinOps, LLM log storage, content safety checks and backend load balancing. Supporting utilities such as the mock server and the tracing and streaming examples simplify development and testing before production integration.
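The idea behind the mock server is that an OpenAI-compatible endpoint can be simulated locally so notebooks can be developed and tested without a live Azure deployment. The following is a minimal stdlib sketch of that idea, not the repo's actual mock server; the path and canned reply are illustrative:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class MockOpenAIHandler(BaseHTTPRequestHandler):
    """Answers every POST (e.g. to a chat/completions path) with a
    canned OpenAI-style chat completion response."""
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        reply = {
            "object": "chat.completion",
            "model": request.get("model", "mock-gpt"),
            "choices": [{"index": 0,
                         "message": {"role": "assistant",
                                     "content": "Hello from the mock server"},
                         "finish_reason": "stop"}],
        }
        body = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep test runs quiet
        pass

# Bind an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), MockOpenAIHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Point a plain HTTP client at the mock, as an SDK base_url override would.
url = (f"http://127.0.0.1:{server.server_port}"
       "/openai/deployments/demo/chat/completions")
payload = json.dumps({"messages": [{"role": "user", "content": "ping"}]}).encode()
req = urllib.request.Request(url, data=payload,
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    answer = json.loads(resp.read())

server.shutdown()
```

A real setup would instead point the OpenAI SDK's base URL at the mock (or at the APIM gateway), keeping the rest of the notebook code unchanged.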
