Basic Information

LLMCompiler is a developer-focused framework for orchestrating parallel function calling with large language models. It automatically decomposes a user-specified problem into tasks, identifies which tasks can run in parallel and which depend on one another, and computes an optimized orchestration for invoking user-provided functions (tools) during LLM reasoning. The project supports both open-source and closed-source models and is intended to reproduce and extend the experiments from the associated paper, delivering lower latency, reduced API cost, and improved accuracy when multiple function calls are required. The repository includes installation instructions, runnable scripts for running experiments and storing results, and example configs for the HotpotQA, Movie Recommendation, and ParallelQA benchmarks.
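
To make the orchestration idea concrete, the sketch below is a minimal, conceptual illustration of the execution pattern described above, not the repository's actual API: tool calls are modeled as tasks with declared dependencies, independent tasks are dispatched concurrently, and dependent tasks wait only for their own inputs. All names here (Task, run_dag, the stub search and summarize tools) are illustrative.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Task:
    """One planned tool call: a name, an async callable, and its dependencies."""
    name: str
    fn: object                      # async callable
    deps: list = field(default_factory=list)


async def run_dag(tasks):
    """Start every task as a future; each one waits only on its own dependencies."""
    futures = {}

    async def run(task):
        dep_results = [await futures[d] for d in task.deps]  # block on dependencies only
        return await task.fn(*dep_results)

    # All futures are created before any task body runs, so dependency lookups succeed.
    for t in tasks:
        futures[t.name] = asyncio.ensure_future(run(t))
    return {name: await fut for name, fut in futures.items()}


async def search(query):            # stand-in tool; a real tool would call an API
    await asyncio.sleep(1)
    return f"results for {query!r}"


async def summarize(a, b):          # stand-in tool that consumes two earlier results
    return f"summary of ({a}, {b})"


# The two searches are independent and run concurrently (about 1 second total,
# not 2); the summarize task waits for both results before running.
plan = [
    Task("s1", lambda: search("population of Tokyo")),
    Task("s2", lambda: search("population of Osaka")),
    Task("sum", summarize, deps=["s1", "s2"]),
]
print(asyncio.run(run_dag(plan)))
```

In the framework itself, the planner LLM produces this dependency structure automatically from the user's question; the sketch only shows why independent calls can overlap.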

App Details

Features
- Automatic task decomposition and optimized orchestration of multiple function calls, exploiting parallelism during LLM reasoning.
- Support for a range of model backends, including OpenAI GPT models, custom models served with vLLM, Azure endpoints, and Friendli endpoints.
- Command-line runners such as run_llm_compiler.py and an evaluation script, evaluate_results.py, for reproducing the paper's benchmarks.
- Config-driven benchmark examples under configs for hotpotqa, movie, and parallelqa.
- Streaming task dispatch to reduce latency, logging and detailed benchmarking flags, and support for adding custom benchmarks by supplying tool definitions and in-context prompts.
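
The last feature above, adding a custom benchmark, amounts to giving the planner a set of callable tools and in-context examples that show how to call them; the repository keeps these under configs per benchmark. The sketch below is a generic, hypothetical illustration of that shape (wiki_search, calculator, TOOLS, and EXAMPLE_PROMPT are not names from the repository), using the $1/$2 placeholder notation the paper uses for inter-task dependencies.

```python
# Generic illustration only: the repository's actual Tool abstraction and the
# exact layout under its configs directory are not reproduced here; every name
# below is hypothetical.

def wiki_search(query: str) -> str:
    """Search tool (stubbed): return a short snippet for the query."""
    return f"snippet about {query!r}"


def calculator(expression: str) -> str:
    """Math tool (toy): evaluate a simple arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}, {}))   # not for untrusted input


# A tool registry pairs each callable with the name and description that the
# planner LLM sees when it decomposes a question into tool calls.
TOOLS = [
    {"name": "search", "description": "Search Wikipedia and return a snippet.", "func": wiki_search},
    {"name": "math", "description": "Evaluate an arithmetic expression.", "func": calculator},
]

# An in-context example shows the planner how to emit dependency-annotated calls;
# $1 and $2 refer to the outputs of earlier tasks, following the paper's notation.
EXAMPLE_PROMPT = (
    "Question: Which city has the larger population, Tokyo or Osaka?\n"
    "1. search('population of Tokyo')\n"
    "2. search('population of Osaka')\n"
    "3. math('$1 > $2')\n"
)
```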
Use Cases
LLMCompiler helps developers and researchers reduce latency and API cost, and improve accuracy, for complex tasks that require multiple function calls by enabling parallel execution where possible. It provides a reproducible evaluation setup and example benchmarks so users can compare orchestration strategies and reproduce the paper's results. The framework makes it straightforward to integrate custom tool functions and example prompts, run on different model backends including local vLLM, and export per-example predictions and latency statistics. The project's news notes mention integrations with LangGraph/LangChain and LlamaIndex, which can ease adoption in larger pipelines.
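
As a concrete (hedged) reproduction path, the two scripts mentioned above can be driven as shown below; the flag names mirror the usage pattern in the repository's README but should be verified against your checkout, and the API key and store path are placeholders.

```python
# Hedged sketch: the flag names (--api_key, --benchmark, --store, --logging,
# --stream) follow the README's usage pattern and may differ in your checkout;
# check `python run_llm_compiler.py --help`. The key and paths are placeholders.
import subprocess

store_path = "results/hotpotqa_llmcompiler.json"

# Run the HotpotQA benchmark with streamed task dispatch and logging enabled,
# storing per-example results at store_path.
subprocess.run(
    [
        "python", "run_llm_compiler.py",
        "--api_key", "YOUR_OPENAI_KEY",
        "--benchmark", "hotpotqa",
        "--store", store_path,
        "--logging",
        "--stream",
    ],
    check=True,
)

# Score the stored predictions.
subprocess.run(["python", "evaluate_results.py", "--file", store_path], check=True)
```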
