Madhuvod / VoxLingua
A model (maybe an app) that translates the audio of a video from one language to another, cloning the voice from the original video for the translated audio
☆15Updated 5 months ago
Alternatives and similar repositories for VoxLingua
Users who are interested in VoxLingua are comparing it to the libraries listed below
- This repository includes training, inference, evaluation, and utility scripts developed for fine-tuning the Whisper medium.en model on Ai…☆18Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Updated 4 months ago
- Supervoice diffusion enhance☆27Updated last year
- specifications and documentation for the Open Voice Interoperability Initiative Project☆19Updated last week
- ☆23Updated last week
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆15Updated 5 months ago
- Transfer learning approach to pronunciation scoring☆11Updated last year
- Forced alignment decoder for Whisper.☆14Updated last year
- Text-to-Speech Latency Benchmark☆18Updated 4 months ago
- Open TTS models, built for streaming on the edge☆43Updated 7 months ago
- A composition of offline tools to achieve high quality multilingual speech to text transcription☆19Updated last month
- a simple system for 2-way interruptible voice interactions between human and LLM☆30Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax☆15Updated last year
- Getting confidences from any end-to-end systems☆11Updated 2 years ago
- Code and Resources for "LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study", introducing methods to leverage LLMs for G…☆13Updated 5 months ago
- proof of concept conversation orchestrator with a speech-language model☆20Updated last year
- Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class conditioning built on F5-TTS☆19Updated 2 months ago
- Sing any popular song with your voice☆11Updated 3 years ago
- C++ version of pyannote audio overlapped speech detection pipeline☆13Updated last year
- Voice activity detection and speaker gender segmentation audiovisual corpus☆16Updated 9 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Updated 2 years ago
- Text To Speech Multilingual Support (+20 Language)☆50Updated 2 years ago
- Official PyTorch implementation of "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech Synthesis…☆16Updated 7 months ago
- ☆11Updated 11 months ago
- The Vokan Architecture (Tsukasa speech based)☆10Updated 8 months ago
- Sophia AI Assistant is a Python-based desktop AI that performs a variety of tasks, including answering questions, opening applications, b…☆22Updated last year
- Soniox Compare. Compare real-time voice AI side by side. No glossy charts, just results.☆13Updated 3 months ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper.☆29Updated last week
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆12Updated 10 months ago
- [INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…☆16Updated last year