Android DevHub

Amphion: Real-Time Audio Generation Toolkit by OpenMMLab

Category: Deep Learning

License: Apache-2.0

Model Type: Speech Synthesis

Amphion is an efficient, real-time audio generation toolkit designed for high-quality text-to-audio synthesis. Developed by OpenMMLab, it provides a modular framework for building and deploying audio models, including support for text-to-speech (TTS) and other generative audio tasks. Amphion aims to bridge the gap between research and production through its performance-optimized architecture and extensibility.

Key Features

Modular and extensible design for audio generation
Real-time inference with low-latency audio output
Pretrained models for quick deployment
Support for multi-speaker and multilingual TTS
Built-in benchmarking and evaluation tools
Compatible with PyTorch and ONNX
Optimized for both training and inference pipelines
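
One common way to check a "real-time" claim like the one in the list above is the real-time factor (RTF): wall-clock synthesis time divided by the duration of the audio produced, where a value below 1.0 means audio is generated faster than it plays back. The sketch below is only an illustration of that measurement. It does not use Amphion's API: dummy_synthesize is a hypothetical stand-in for a model call, and the 24 kHz sample rate is an assumption.

```python
import math
import time
from typing import Callable, List

SAMPLE_RATE = 24_000  # assumed output sample rate in Hz (not taken from Amphion)


def dummy_synthesize(text: str) -> List[float]:
    """Hypothetical stand-in for a TTS model: 0.1 s of a 220 Hz tone per character."""
    n_samples = (SAMPLE_RATE // 10) * len(text)
    return [math.sin(2 * math.pi * 220 * i / SAMPLE_RATE) for i in range(n_samples)]


def real_time_factor(
    synthesize: Callable[[str], List[float]],
    text: str,
    sample_rate: int = SAMPLE_RATE,
) -> float:
    """Return wall-clock synthesis time divided by the duration of the audio produced."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    if not samples:
        raise ValueError("synthesizer returned no audio")
    audio_seconds = len(samples) / sample_rate
    return elapsed / audio_seconds


if __name__ == "__main__":
    rtf = real_time_factor(dummy_synthesize, "hello world")
    print(f"RTF = {rtf:.4f} (below 1.0 means faster than real time)")
```

To measure an actual model, one would pass its synthesis function in place of dummy_synthesize, provided it returns a sequence of samples at a known sample rate.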

Similar Projects

Word2Wave: Text-Controlled GAN Audio Generation

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

Fooocus AI Image Generator - A GUI-Powered Stable Diffusion Image Generator

Tango: Latent Diffusion Models for Text‑to‑Audio Generation

WaveGrad2: Iterative Refinement for Text-to-Speech Synthesis

Abogen – Audiobook Generator for EPUB, PDF, and Text
