Voice-Pro

Key Features

Voice Changer (RVC): Transform your voice with real-time conversion.
Zero-shot Voice Cloning (E2, F5-TTS): Clone any voice with just a short sample.
YouTube Downloading: Extract audio from videos in multiple formats (mp3, wav, flac).
Vocal Isolation (UVR5): Separate vocals from background using MDX-Net and Demucs.
Text-to-Speech (Edge-TTS): Generate natural-sounding speech in multiple languages.
Multi-language Translation: Instantly translate audio and subtitles into various languages.
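The transcription and subtitle translation features above imply timed text output. As a minimal sketch, assuming segments shaped like the dicts that Whisper's transcribe() returns (with start, end, and text keys), the following shows how such segments could be rendered as an SRT subtitle file. The function names and sample data are hypothetical and are not taken from the Voice-Pro codebase.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render an iterable of {'start', 'end', 'text'} dicts as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each SRT cue: running number, time range, then the subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Hypothetical sample data, not real Voice-Pro output.
sample = [
    {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
    {"start": 2.5, "end": 5.04, "text": " This is a subtitle test."},
]

print(segments_to_srt(sample))
```

A translated subtitle track would reuse the same timestamps and replace only the text field of each segment.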

Speech recognition is powered by three Whisper engines: Whisper, Faster-Whisper, and Whisper-Timestamped.
Supports NVIDIA GPUs with CUDA 12.1 for enhanced performance.
Minimum 4GB VRAM required; 8GB recommended for optimal processing.
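The VRAM figures above can be checked before installing. The sketch below is one possible approach, not part of Voice-Pro: it asks nvidia-smi for each GPU's total memory and compares it with the stated 4GB minimum and 8GB recommendation. The function names and the three status labels are assumptions made for illustration.

```python
import shutil
import subprocess

MIN_VRAM_MIB = 4 * 1024  # 4GB minimum stated in the requirements
REC_VRAM_MIB = 8 * 1024  # 8GB recommended


def classify_vram(total_mib: int) -> str:
    """Compare a GPU's total memory (MiB) with the documented thresholds."""
    if total_mib < MIN_VRAM_MIB:
        return "below minimum"
    if total_mib < REC_VRAM_MIB:
        return "meets minimum"
    return "meets recommended"


def query_vram_mib() -> list[int]:
    """Return total memory per GPU in MiB via nvidia-smi, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    # One line per GPU, each holding a plain integer number of MiB.
    return [int(line.strip()) for line in result.stdout.splitlines()
            if line.strip().isdigit()]


if __name__ == "__main__":
    for index, mib in enumerate(query_vram_mib()):
        print(f"GPU {index}: {mib} MiB, {classify_vram(mib)}")
```

Some cards report slightly less than their nominal size, so a result just under a threshold is worth reading as approximate rather than as a hard failure.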

Explore how Voice-Pro can clone celebrity voices with zero-shot technology.

Clone or download the repository from GitHub: git clone https://github.com/abus-aikorea/voice-pro.git
Run configure.bat to set up the environment.
Launch the WebUI by running start.bat.
Requires Windows 10/11 (64-bit), NVIDIA GPU with CUDA 12.1, and 4GB+ VRAM (8GB recommended).
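The three setup steps above can be scripted. The following is a minimal sketch, assuming git is on the PATH and that configure.bat and start.bat sit in the root of the cloned repository; build_steps and run_steps are hypothetical helper names. Run as-is, it only prints the commands (a dry run) and executes nothing.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/abus-aikorea/voice-pro.git"


def build_steps(repo_url: str = REPO_URL, parent: Path = Path(".")):
    """Return (command, working_directory) pairs for the documented steps."""
    # git clone creates a folder named after the repository: "voice-pro".
    repo_name = repo_url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")
    repo_dir = parent / repo_name
    return [
        (["git", "clone", repo_url], parent),
        # Batch files are run through cmd; assumed to live in the repo root.
        (["cmd", "/c", "configure.bat"], repo_dir),
        (["cmd", "/c", "start.bat"], repo_dir),
    ]


def run_steps(steps) -> None:
    """Execute each step in order, stopping at the first failure."""
    for command, cwd in steps:
        subprocess.run(command, cwd=cwd, check=True)


# Dry run: show what would be executed, and where.
for command, cwd in build_steps():
    print(f"[{cwd}] {' '.join(command)}")
```

On a Windows machine that meets the requirements, calling run_steps(build_steps()) would perform the steps for real. Whether configure.bat prompts for input is not stated in the source, so an unattended run may still need supervision.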

Product Information

Demo videos: Home Karaoke (Pop), Studio Tab Workflow Demo.

Voice-Pro has been discussed on several platforms. Notable mentions:

Hacker News: Featured in "Show HN: Voice-Pro – AI Voice Cloning" for its open-source audio manipulation capabilities.
SourceForge: Hosted as a mirror with user reviews, highlighting its transcription and translation features.
AIShareNet: Introduced as a versatile audio processing tool with real-time translation and YouTube downloads.