agents mcp usage

Basic Information

This repository is a demonstration and benchmarking platform for using the Model Context Protocol (MCP) with LLM agents across multiple agent frameworks. It contains worked examples showing how to connect agents to a single MCP server and how to coordinate multiple specialised MCP servers. Framework-specific examples are included for the Google Agent Development Kit (ADK), LangGraph, OpenAI Agents, and Pydantic-AI, alongside Python MCP servers provided as demos. The project also bundles an evaluation suite for mermaid diagram correction tasks, a Streamlit dashboard for interactive exploration of results, and tracing integration via Logfire. The examples cover environment setup, server connection, and agent configuration and execution, and aim to teach developers how to standardise contexts, tools and resources when building agentic systems with interchangeable LLM providers.
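For orientation, here is a minimal sketch of how an example client might connect to one of the demo servers over stdio using the official `mcp` Python SDK; the server path and the `add(a, b)` call mirror the demo server described under Features, but the exact wiring used in the repository's examples may differ.

```python
# Minimal sketch (not the repo's exact code): connect to a local MCP server
# over stdio with the official `mcp` Python SDK, list its tools, and call one.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumption: the demo server can be launched as a plain Python script.
server_params = StdioServerParameters(command="python", args=["example_server.py"])


async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover the tools the server exposes.
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            # Call the demo add(a, b) tool.
            result = await session.call_tool("add", {"a": 2, "b": 3})
            print(result.content)


asyncio.run(main())
```

The same ClientSession also exposes resource and prompt operations, which agent frameworks typically wrap when integrating MCP.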

App Details

Features
The repo provides multiple hands-on examples and utilities:

- Basic single-MCP examples for ADK, LangGraph, OpenAI Agents and Pydantic-AI, plus multi-MCP examples demonstrating coordination between specialised servers.
- Demo Python MCP servers: example_server.py with tools such as add(a, b) and get_current_time() and a greeting resource (a sketch of a comparable server follows this list), and mermaid_validator.py for validating Mermaid diagrams.
- A comprehensive evaluation suite for mermaid correction tasks with multi-model benchmarking, easy/medium/hard difficulty tiers, parallel or sequential runs, robust retry and failure handling, and CSV exports.
- A Streamlit dashboard (merbench_ui.py) showing leaderboards, cost analysis, failure categorisation and performance trends.
- Tracing and observability through Logfire integration.
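As a concrete illustration of the demo-server pattern listed above, the following is a hedged sketch of what a server like example_server.py could look like, written with the FastMCP helper from the official `mcp` Python SDK; the tool names come from the feature list, while the greeting URI scheme and implementation details are assumptions.

```python
# Hypothetical sketch of a demo MCP server in the spirit of example_server.py.
from datetime import datetime, timezone

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Demo")


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


@mcp.tool()
def get_current_time() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


# Assumption: the greeting resource is parameterised by a name in the URI.
@mcp.resource("greeting://{name}")
def get_greeting(name: str) -> str:
    """Return a personalised greeting."""
    return f"Hello, {name}!"


if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```

Keeping arithmetic, time and greeting capabilities in one small server, and Mermaid validation in another, is what lets the multi-MCP examples demonstrate coordination between specialised servers.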
Use Cases
This repository is useful for developers, researchers and teams who want to learn or evaluate how to integrate MCP servers with diverse agent frameworks and multiple LLM providers. It supplies runnable examples and scripts to reproduce single-server and multi-server agent workflows, enabling experimentation with modular separation of tools and resources. The evaluation suite and dashboard let users benchmark and compare models on structured tasks, measure token usage and cost, and inspect failures to guide model selection or system tuning. Logfire tracing provides visibility into agent traces and runtime behaviour, which helps with debugging and analysis. Overall, the project accelerates prototyping, interoperability testing and repeatable benchmarking for MCP-based agent systems.
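As an illustration of the kind of exploration the dashboard supports, the hypothetical snippet below loads an exported results CSV with pandas and renders a simple leaderboard and difficulty breakdown in Streamlit; the file name and column names ("model", "difficulty", "passed", "cost_usd") are assumptions rather than the repository's actual schema.

```python
# Hypothetical sketch of exploring evaluation CSV exports, in the spirit of
# merbench_ui.py. Column names are illustrative assumptions.
import pandas as pd
import streamlit as st

st.title("Mermaid correction benchmark")

df = pd.read_csv("results.csv")  # assumed export path

# Leaderboard: pass rate and total cost per model.
leaderboard = (
    df.groupby("model")
    .agg(pass_rate=("passed", "mean"), total_cost_usd=("cost_usd", "sum"))
    .sort_values("pass_rate", ascending=False)
)
st.dataframe(leaderboard)

# Pass rate broken down by easy/medium/hard difficulty tiers.
by_difficulty = df.groupby(["model", "difficulty"])["passed"].mean().unstack()
st.bar_chart(by_difficulty)
```

A sketch like this would be launched with `streamlit run app.py`; the real dashboard layers cost analysis, failure categorisation and performance trends on top of the same pattern.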
