WavCraft

Basic Information

WavCraft is an LLM-driven system for audio content creation and editing that connects large language models with expert audio models and digital signal processing (DSP) functions. The repository provides tools for text-guided editing of existing clips, generation of new audio from textual prompts, and audio-aware scriptwriting, in which the model writes scenes and generates the corresponding sounds. It includes command-line entry points and an interactive chat mode for iterative editing, and it watermarks outputs so they can be identified as produced or modified by WavCraft. The project supplies environment-setup and service-launch scripts for running the deep learning components locally and accepts OpenAI and Hugging Face credentials for model access. The codebase is provided for research purposes and includes a mandatory watermarking disclaimer.

App Details

Features
WavCraft implements several core capabilities: text-guided audio editing that modifies input WAV files via a one-line CLI command or an interactive chat flow, text-guided audio generation from prompts, and an audio scriptwriting mode that produces scene descriptions and matching sound designs. Utility scripts include scripts/setup_envs.sh for environment setup and scripts/start_services.sh for launching local model services. The entry points shown in the README are WavCraft.py for batch use, WavCraft-chat.py for interactive sessions, and check_watermark.py for verifying whether a piece of audio was generated or altered by WavCraft. The project has added watermarking functionality and supports open LLMs such as the Mistral family for generation. The architecture is an LLM orchestrator that glues together expert audio models and DSP tools.
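The orchestrator architecture described above can be sketched in a few lines: a router stands in for the LLM's decision and dispatches a natural-language request to one of several registered "expert" tools. The tool names, request fields, and routing rule below are illustrative placeholders, not WavCraft's actual interfaces.

```python
# Minimal sketch of an LLM-orchestrator pattern: a router inspects a
# request and dispatches it to a registered expert tool. Tool names and
# the routing rule are hypothetical, not WavCraft's real API.

def generate_audio(prompt: str) -> str:
    """Placeholder for a text-to-audio expert model."""
    return f"generated({prompt})"

def edit_audio(wav: str, instruction: str) -> str:
    """Placeholder for a text-guided audio-editing expert model."""
    return f"edited({wav}, {instruction})"

TOOLS = {
    "generate": lambda req: generate_audio(req["text"]),
    "edit": lambda req: edit_audio(req["wav"], req["text"]),
}

def route(request: dict) -> str:
    """Stand-in for the LLM's tool choice: edit if an input WAV is
    supplied, otherwise generate from scratch."""
    tool = "edit" if request.get("wav") else "generate"
    return TOOLS[tool](request)

result = route({"wav": "input.wav", "text": "add rain in the background"})
```

In the real system the routing step is performed by the LLM itself, which also composes multiple tool calls; this sketch shows only the single-dispatch core of that pattern.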
Use Cases
WavCraft helps researchers and creators prototype LLM-driven audio workflows by putting audio editing and generation under natural-language control. Users can apply high-level textual instructions to an existing recording to add or modify sounds, run interactive sessions to refine edits, or ask the system to draft audio-oriented scripts and generate the corresponding soundtracks. Built-in watermarking and a checker script aid provenance tracking and detection of synthesized content. The provided setup and service scripts make it easier to run the required deep learning components locally and to experiment with OpenAI or Hugging Face models. The README emphasizes research-only usage and advises against disabling the watermarking, so the project serves as a practical sandbox for experimenting with audio LLM orchestration.
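The embed-then-verify provenance workflow (watermark on generation, detect with a checker such as check_watermark.py) can be illustrated with a toy least-significant-bit scheme on 16-bit PCM samples. WavCraft's actual watermarking method is not documented here, so this is a conceptual sketch only; the bit pattern and functions are invented for illustration.

```python
# Toy watermarking sketch: embed a known bit pattern in the LSBs of the
# first few PCM samples, then check for it. This is NOT WavCraft's real
# scheme; it only illustrates the embed-then-verify workflow.

MARK = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical watermark pattern

def embed(samples: list[int]) -> list[int]:
    """Return a copy of the samples with the mark written into the LSBs."""
    out = list(samples)
    for i, bit in enumerate(MARK):
        out[i] = (out[i] & ~1) | bit  # overwrite the LSB with the mark bit
    return out

def is_watermarked(samples: list[int]) -> bool:
    """Check whether the leading samples carry the expected bit pattern."""
    return [s & 1 for s in samples[: len(MARK)]] == MARK

clean = [100, 200, 300, 400, 500, 600, 700, 800]
marked = embed(clean)
```

A production watermark would instead be spread imperceptibly across the signal and made robust to re-encoding; the point here is only the round trip from embedding to detection that the checker script performs.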