This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147.
☆20 · Nov 19, 2024 · Updated last year
Alternatives and similar repositories for fairseq
Users that are interested in fairseq are comparing it to the libraries listed below
- Collection of scripts from mHuBERT-147. ☆32 · Nov 19, 2024 · Updated last year
- ☆25 · Nov 3, 2025 · Updated 3 months ago
- Official code for Dense-TSNet ☆12 · Sep 17, 2024 · Updated last year
- [ICASSP 2023] Tempo vs. Pitch: understanding self-supervised tempo estimation ☆13 · Aug 2, 2023 · Updated 2 years ago
- radiomixer ☆14 · Feb 16, 2022 · Updated 4 years ago
- My version of the RVC V2 Disconnected Colab notebook, which allows you to use RVC without using WebUI/Gradio ☆15 · Jun 11, 2024 · Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆16 · Jun 16, 2024 · Updated last year
- ☆16 · Apr 24, 2025 · Updated 10 months ago
- ESLTTS dataset ☆16 · Feb 6, 2025 · Updated last year
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆23 · Oct 30, 2024 · Updated last year
- ☆20 · Jul 22, 2022 · Updated 3 years ago
- This is the code of the ICASSP 2020 paper "Joint phoneme alignment and text-informed speech separation on highly corrupted speech" ☆15 · Apr 8, 2024 · Updated last year
- ☆17 · Jul 22, 2024 · Updated last year
- Open source crawler for Persian websites. ☆20 · Aug 27, 2023 · Updated 2 years ago
- An AR+AR TTS attempt. ☆18 · Jan 13, 2025 · Updated last year
- ☆20 · Sep 2, 2024 · Updated last year
- a Frontier Japanese Speech Generation net ☆60 · May 15, 2025 · Updated 9 months ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆24 · Oct 8, 2025 · Updated 4 months ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation ☆26 · Dec 12, 2024 · Updated last year
- ☆22 · Jun 24, 2024 · Updated last year
- ☆29 · Feb 4, 2025 · Updated last year
- ☆68 · Dec 30, 2025 · Updated 2 months ago
- ☆26 · Mar 20, 2024 · Updated last year
- ☆25 · Mar 6, 2024 · Updated last year
- Official page of "DeFTAN-II: Efficient multichannel speech enhancement with subgroup processing", IEEE/ACM Transactions on Audio, Speech,… ☆31 · Nov 21, 2024 · Updated last year
- The official implementation of the DIFFA series for dLLM-based large audio language model ☆59 · Feb 2, 2026 · Updated 3 weeks ago
- Official PyTorch implementation for "Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech … ☆33 · May 11, 2025 · Updated 9 months ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors ☆69 · Aug 13, 2024 · Updated last year
- An open-source Kazakh Emotional Text-to-Speech Dataset ☆35 · Aug 1, 2025 · Updated 7 months ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors ☆36 · Feb 11, 2025 · Updated last year
- [Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers a… ☆69 · Mar 31, 2024 · Updated last year
- My vocoder experiments ☆31 · Jul 26, 2025 · Updated 7 months ago
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch. ☆132 · Sep 25, 2023 · Updated 2 years ago
- [ICASSP 2024] Official code for FreGrad ☆34 · May 13, 2024 · Updated last year
- Cantonese Text to Speech with VITS implementation ☆37 · Apr 8, 2023 · Updated 2 years ago
- [INTERSPEECH 2025 Oral] Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment" ☆64 · Jun 16, 2025 · Updated 8 months ago
- Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995 ☆78 · Dec 3, 2024 · Updated last year
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" ☆37 · Aug 29, 2023 · Updated 2 years ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems. ☆41 · Jan 4, 2026 · Updated last month