SubToAudio: Subtitle-to-Audio Conversion with Coqui TTS

Category: Deep Learning
License: MIT
Model Type: Speech Synthesis

SubToAudio is a Python tool that converts subtitle files (e.g., .srt, .ass) into audio files synchronized with the subtitles. It uses Coqui TTS, an open-source text-to-speech engine, to generate speech, and aligns the audio with the subtitle timestamps. This is useful for producing spoken audio tracks for videos and for accessibility.
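
To make the timestamp side of this concrete, here is a minimal sketch of reading cue timings and text from .srt content. It is not SubToAudio's own parser: the names `Cue`, `parse_srt`, and `SAMPLE` are hypothetical, and it covers only the basic SubRip layout (index line, timing line, text lines, blank-line separator).

```python
import re
from dataclasses import dataclass
from typing import List

# Matches an SRT timing line such as "00:00:01,500 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert hour/minute/second/millisecond fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> List[Cue]:
    """Parse SRT text into cues (hypothetical helper, basic layout only)."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                fields = match.groups()
                # Everything after the timing line is the spoken text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_seconds(*fields[:4]), _seconds(*fields[4:]), text))
                break
    return cues


SAMPLE = """1
00:00:01,000 --> 00:00:02,500
Hello there.

2
00:00:03,000 --> 00:00:05,250
This is the
second cue.
"""

cues = parse_srt(SAMPLE)
for cue in cues:
    print(cue.start, cue.end, cue.text)
```

Each cue's `start` and `end` give the window that the generated speech has to fit into. The .ass format stores its timings differently and would need its own parser.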

Key Features

  • Subtitle Synchronization: Aligns audio output with subtitle timestamps for accurate timing.
  • Multiple Subtitle Formats: Handles .srt, .ass, and other common subtitle formats.
  • Coqui TTS Integration: Leverages Coqui TTS for high-quality, open-source text-to-speech synthesis.
  • Language Support: Supports various languages through Coqui TTS models.
  • Pretrained Models: Offers pretrained models for quick setup and usage.
  • Command-Line Interface: Provides a simple CLI for easy integration into workflows.
  • Colab Notebook: Includes a Colab notebook for interactive usage and experimentation.
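
The subtitle synchronization feature can be pictured as placing each synthesized clip on a silent timeline at its cue's start time. The sketch below shows one simple way to do that. It is an assumption about the general technique, not SubToAudio's actual implementation: `fake_synthesize` is a hypothetical stand-in for a real TTS call, `build_timeline` and `SAMPLE_RATE` are illustrative names, and speech that overruns its cue is simply truncated here, whereas a real tool might instead speed it up.

```python
SAMPLE_RATE = 22050  # assumed output sample rate, for illustration only


def fake_synthesize(text, sample_rate=SAMPLE_RATE):
    """Hypothetical stand-in for a TTS engine.

    Returns a constant signal lasting 1/20 of a second per character,
    so the example runs without any speech model installed.
    """
    return [0.1] * (len(text) * (sample_rate // 20))


def build_timeline(cues, synthesize=fake_synthesize, sample_rate=SAMPLE_RATE):
    """Place one clip per cue on a silent timeline.

    cues: list of (start_seconds, end_seconds, text) tuples.
    Returns a list of float samples covering 0 .. the last cue end.
    """
    if not cues:
        return []
    total = round(max(end for _, end, _ in cues) * sample_rate)
    timeline = [0.0] * total  # start from silence
    for start, end, text in cues:
        offset = round(start * sample_rate)      # first sample of the cue
        slot = round(end * sample_rate) - offset  # samples available to the cue
        clip = synthesize(text, sample_rate)[:slot]  # truncate any overrun
        timeline[offset:offset + len(clip)] = clip
    return timeline


demo_cues = [(1.0, 2.0, "Hi"), (3.0, 3.2, "A long sentence here")]
demo = build_timeline(demo_cues, sample_rate=1000)
print(len(demo), "samples;", sum(1 for s in demo if s != 0.0), "non-silent")
```

In a real pipeline the `synthesize` argument would wrap the speech engine (for example a Coqui TTS model) and the resulting samples would be written to an audio file. Those steps are left out so the timing logic stays visible.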