Amphion is an efficient, real-time audio generation toolkit designed for high-quality text-to-audio synthesis. Developed by OpenMMLab, it provides a modular framework for building and deploying audio models, including support for text-to-speech (TTS) and other generative audio tasks. Amphion aims to bridge the gap between research and production through its performance-optimized architecture and extensibility.
Key Features
Modular and extensible design for audio generation
Real-time inference with low-latency audio output
Pretrained models for quick deployment
Support for multi-speaker and multilingual TTS
Built-in benchmarking and evaluation tools
Compatible with PyTorch and ONNX
Optimized for both training and inference pipelines