Indic TTS for Indian Languages: a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving synthesis quality, and building small-footprint TTS integrated with disability aids and various other applications.
☆17Feb 9, 2024Updated 2 years ago
Alternatives and similar repositories for Fastspeech2_MFA
Users that are interested in Fastspeech2_MFA are comparing it to the libraries listed below.
- ☆32Aug 22, 2024Updated last year
- Scripts to help with using Montreal Forced Aligner, mainly for transforming Hanzi characters to pinyin and extracting pause times from .textg…☆14Feb 9, 2024Updated 2 years ago
- OCR as a service☆15Dec 11, 2016Updated 9 years ago
- [ASRU 2023] Code of paper SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation☆21Aug 13, 2024Updated last year
- Carnatic singing voice separation trained with in-domain data with leakage☆11Nov 5, 2023Updated 2 years ago
- My run-through of Karpathy's lectures (with notes), building NNs from scratch, simple autoregressive language models, GPT models and lear…☆10Sep 11, 2023Updated 2 years ago
- Language Identification for Indian languages☆31Dec 2, 2025Updated 3 months ago
- Text-To-Speech for NotebookLM☆39Jul 20, 2025Updated 8 months ago
- Generative Adaptive MIDI Extractor☆55Mar 14, 2026Updated last week
- Repository having the code and models from the paper: data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student traini…☆13Mar 18, 2024Updated 2 years ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs☆47Sep 19, 2025Updated 6 months ago
- Triangles in practice! Triton☆16Feb 15, 2024Updated 2 years ago
- Demo for DART, Audio Imagination workshop submission in NeurIPS 2024☆13Apr 15, 2025Updated 11 months ago
- [ICLR 2025 Spotlight] Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model☆14Apr 23, 2025Updated 11 months ago
- Diffusers++: State-of-the-art diffusion models for image and audio generation in PyTorch☆14Sep 18, 2024Updated last year
- An attempt to recognise raga of a Carnatic song.☆13Dec 24, 2022Updated 3 years ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆13Mar 11, 2025Updated last year
- ModelQ is a lightweight, battle-tested Python library for scheduling and queuing machine learning inference tasks. It's designed as a fas…☆18Jan 30, 2026Updated last month
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆69Nov 1, 2024Updated last year
- Malayalam Corpus by Swathanthra Malayalam Computing☆19Apr 2, 2023Updated 2 years ago
- A vocoder based on PC-DDSP and nsf-HiFiGAN☆18Jul 17, 2023Updated 2 years ago
- Generate and morph between checkfaces☆22May 4, 2022Updated 3 years ago
- image retrieval/tagging with CLIP☆13Jul 13, 2024Updated last year
- VTrick template resource☆19Nov 8, 2022Updated 3 years ago
- PyTorch implementation of Retriever: Learning Content-Style Representation☆12Jan 27, 2023Updated 3 years ago
- A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based …☆65Aug 24, 2025Updated 6 months ago
- Lightning-YOLOs provides clean, modular YOLO object detection models built on PyTorch Lightning, making it easier to train, extend, and e…☆34Jan 19, 2026Updated 2 months ago
- ☆14Aug 19, 2024Updated last year
- Repository containing various Malayalam ASR-based resources curated from multiple sources☆18Oct 1, 2021Updated 4 years ago
- [InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter☆93Jul 4, 2024Updated last year
- Tools to isolate speaker and transcribe unstructured audio clips☆11Dec 4, 2022Updated 3 years ago
- ☆11Feb 20, 2025Updated last year
- Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.☆29Oct 18, 2024Updated last year
- Implementation of "Face detection in untrained deep neural networks" (Baek et al., Nature Communications, 2021)☆10Nov 2, 2021Updated 4 years ago
- Elixir bindings to Kokoro-82M text-to-speech model☆20Mar 4, 2025Updated last year
- ☆10Apr 8, 2024Updated last year
- Official PyTorch Implementation for the "What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-mod…☆20Sep 26, 2024Updated last year
- Real-time MelGAN on CPU!☆13Dec 3, 2019Updated 6 years ago
- ☆42Nov 19, 2025Updated 4 months ago