nndeploy

Basic Information

nndeploy is a workflow-based, multi-platform AI deployment tool that helps AI engineers, product managers, and developers rapidly deploy algorithm ideas to cloud, desktop, mobile, and edge devices. It provides a visual drag-and-drop workflow editor plus Python and C++ APIs, so workflows can be exported as JSON and executed across targets. The project ships a collection of ready-to-run AI algorithms and example workflows covering image classification, detection, tracking, segmentation, generative models, and a lightweight large language model (Qwen). The README and docs emphasize quick prototyping, multi-backend portability, and end-to-end deployment from visual design to runtime execution. CLI utilities are provided to launch the visual app, run JSON workflows, and clean backend resources.
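The description above centers on workflows being exported as JSON and re-executed on other targets. As a rough illustration of that idea, the sketch below serializes a graph of nodes and edges with Python's standard `json` module; the field names and node types here are invented for the example and are not nndeploy's actual schema.

```python
import json

# Illustrative workflow description. The real nndeploy JSON format
# (field names, node types, parameter layout) may differ -- this only
# sketches the idea of a pipeline serialized as nodes plus edges.
workflow = {
    "name": "detect_demo",
    "nodes": [
        {"id": "decode", "type": "ImageDecode", "params": {"path": "input.jpg"}},
        {"id": "detect", "type": "Detector", "params": {"model": "model.onnx"}},
        {"id": "draw", "type": "DrawBox", "params": {}},
    ],
    "edges": [
        {"from": "decode", "to": "detect"},
        {"from": "detect", "to": "draw"},
    ],
}

# Export to JSON so the same pipeline can be checked into version
# control and re-run on another device or backend.
text = json.dumps(workflow, indent=2)
restored = json.loads(text)
assert restored == workflow  # serialization round-trips losslessly
```

Because the workflow is plain data rather than code, the same file can be produced by the visual editor, edited by hand, or generated programmatically.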

App Details

Features
- Visual workflow editor with drag-and-drop composition and front-end parameter tuning.
- Export and import of workflows as JSON for reproducible execution.
- Python and C++ APIs for programmatic control.
- Out-of-the-box models and templates, including classification, detection, tracking, segmentation, Stable Diffusion, and a Qwen model.
- Multi-end inference support with native integrations for many runtimes, including PyTorch, TensorRT, OpenVINO, ONNX Runtime, MNN, TNN, ncnn, CoreML, AscendCL, RKNN, TVM, SNPE, and a self-developed inference backend.
- Performance features: parallel execution modes, memory pooling, zero-copy strategies, and native C++/CUDA/SIMD-optimized nodes.
- CLI tools for launching the app, running workflows, and cleaning resources.
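The features above mention parallel execution modes. The toy executor below shows the general idea behind running a workflow graph with task parallelism: nodes whose dependencies are satisfied form a "wave" and run concurrently. This is a minimal stdlib sketch, not nndeploy's actual scheduler, which is a C++ runtime with several execution modes.

```python
from concurrent.futures import ThreadPoolExecutor

def run_graph(nodes, edges, funcs):
    """Toy topological executor: repeatedly finds all nodes whose
    dependencies are done and runs that wave in a thread pool.
    Illustrates pipeline/task parallelism only."""
    deps = {n: set() for n in nodes}
    for src, dst in edges:
        deps[dst].add(src)
    done, waves = set(), []
    with ThreadPoolExecutor() as pool:
        while len(done) < len(nodes):
            ready = [n for n in nodes if n not in done and deps[n] <= done]
            list(pool.map(lambda n: funcs[n](), ready))  # run wave concurrently
            done.update(ready)
            waves.append(ready)
    return waves

# Example: detection and segmentation branches share a preprocess node,
# so they can execute in the same wave (i.e., concurrently).
names = ["pre", "det", "seg", "post"]
waves = run_graph(
    nodes=names,
    edges=[("pre", "det"), ("pre", "seg"), ("det", "post"), ("seg", "post")],
    funcs={n: (lambda: None) for n in names},
)
assert waves == [["pre"], ["det", "seg"], ["post"]]
```

Memory pooling and zero-copy strategies then reduce allocation and copying between such nodes, which this sketch does not model.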
Use Cases
nndeploy reduces time-to-deployment by letting teams design pipelines visually and then run the same workflow on multiple platforms without heavy porting. The JSON workflow format and language bindings enable automation, CI integration, and reuse of the same deployment across devices. Built-in performance optimizations and backend integrations help achieve higher inference throughput and lower memory usage when moving models to production. Support for custom Python and C++ nodes lets developers extend pipelines and integrate proprietary logic. Prebuilt model examples and workflow templates accelerate prototyping and demos, and the documentation and sample courses help teams learn deployment and inference-optimization practices.
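Since workflows are plain JSON, a CI pipeline can lint them before deployment. The check below is a minimal sketch using only the standard library; the keys it validates (`id`, `type`, `nodes`, `edges`, `from`, `to`) are assumptions made for illustration and should be adapted to the real schema.

```python
import json

REQUIRED_NODE_KEYS = {"id", "type"}  # assumed schema, for illustration only

def validate_workflow(text):
    """Return a list of problems found in a workflow JSON string.
    Empty list means the file passed this (hypothetical) check."""
    try:
        wf = json.loads(text)
    except json.JSONDecodeError as e:
        return [f"invalid JSON: {e}"]
    problems, ids = [], set()
    for node in wf.get("nodes", []):
        missing = REQUIRED_NODE_KEYS - node.keys()
        if missing:
            problems.append(f"node missing keys: {sorted(missing)}")
        if node.get("id") in ids:
            problems.append(f"duplicate node id: {node['id']}")
        ids.add(node.get("id"))
    for edge in wf.get("edges", []):
        for end in (edge.get("from"), edge.get("to")):
            if end not in ids:
                problems.append(f"edge references unknown node: {end}")
    return problems
```

Running such a validator as a pre-merge step catches malformed pipelines before they reach a device, which is where the reuse-across-devices benefit described above pays off.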