ElevenLabs Clone

Category: Other
License: MIT
Model Type: Speech Synthesis
A self‑hosted full‑stack AI audio platform offering capabilities similar to ElevenLabs. It supports text‑to‑speech, voice conversion, and creative text‑to‑audio generation. The system combines Dockerized model containers, a FastAPI backend, and a Next.js frontend for a complete, interactive experience.
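
As a rough illustration of how the backend might hand each kind of request to its model container, the sketch below maps a task name to a service. The task keys, hostnames, and ports are hypothetical placeholders, not values taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelService:
    """One Dockerized model container reachable over HTTP."""
    name: str
    base_url: str


# Hypothetical task names, hostnames, and ports (assumptions for illustration only).
SERVICES = {
    "text-to-speech": ModelService("StyleTTS2", "http://styletts2:8000"),
    "voice-conversion": ModelService("Seed-VC", "http://seed-vc:8000"),
    "text-to-audio": ModelService("Make-An-Audio", "http://make-an-audio:8000"),
}


def resolve_service(task: str) -> ModelService:
    """Return the model service for a task, or raise ValueError if unknown."""
    try:
        return SERVICES[task]
    except KeyError:
        raise ValueError(f"unsupported task: {task!r}") from None
```

Keeping the mapping in one table means a new model container could be added without touching the request-handling code.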

Key Features

  • Text‑to‑speech synthesis using StyleTTS2 models
  • Voice cloning/conversion with Seed‑VC
  • Text‑to‑audio creative synthesis using the Make‑An‑Audio module
  • Fine‑tuning support for custom voice identities
  • Containerized deployment via Docker and Docker Compose
  • FastAPI backend providing scalable inference endpoints
  • Modern Next.js frontend featuring voice selection, playback UI, and user history
  • Authentication via Auth.js and credit-based usage tracking
  • Job queuing through Inngest to manage model workloads
  • Integration with AWS S3 for storing generated audio files
  • Multiple pre-trained voices included
  • Fully responsive UI built with Tailwind CSS
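
The credit-based usage tracking and S3 storage listed above could fit together roughly as follows. This is a minimal sketch under stated assumptions: the per-character pricing, the `Account` type, the bucket name, and the object key layout are all hypothetical and not taken from the project.

```python
import uuid
from dataclasses import dataclass


class InsufficientCredits(Exception):
    """Raised when an account cannot cover the cost of a job."""


@dataclass
class Account:
    user_id: str
    credits: int


def estimate_cost(text: str, credits_per_char: int = 1) -> int:
    # Assumed pricing rule: one credit per character, minimum of one credit.
    return max(1, len(text) * credits_per_char)


def reserve_job(account: Account, text: str,
                bucket: str = "example-audio-bucket") -> dict:
    """Deduct credits and describe a job that a queue worker could pick up."""
    cost = estimate_cost(text)
    if account.credits < cost:
        raise InsufficientCredits(
            f"need {cost} credits, have {account.credits}"
        )
    account.credits -= cost
    job_id = uuid.uuid4().hex
    return {
        "job_id": job_id,
        "user_id": account.user_id,
        "cost": cost,
        "bucket": bucket,
        # Hypothetical key layout for the generated audio file.
        "s3_key": f"{account.user_id}/{job_id}.wav",
    }
```

Checking the balance before deducting means a rejected request leaves the account unchanged, so the user is only charged for jobs that are actually queued.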