PyTorch Implementations of Various Vector Quantization Methods
☆33Sep 14, 2021Updated 4 years ago
Alternatives and similar repositories for pytorch-vector-quantization
Users who are interested in pytorch-vector-quantization are comparing it to the libraries listed below.
- Torch implementation of Whisper-guided DDPM based Voice Conversion☆49Mar 7, 2023Updated 2 years ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- ☆10Apr 8, 2024Updated last year
- A Web Application for Baroque-style Human/Computer Musical Jamming.☆15May 31, 2023Updated 2 years ago
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆48Nov 8, 2023Updated 2 years ago
- Song Describer is a data collection platform for annotating music with textual descriptions.☆60Dec 3, 2024Updated last year
- ☆16Feb 10, 2026Updated 2 weeks ago
- ☆31Jul 13, 2023Updated 2 years ago
- A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/ab…☆36Feb 10, 2024Updated 2 years ago
- Implementation of SoundStorm built upon SpeechTokenizer.☆116Nov 2, 2023Updated 2 years ago
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆77Jun 9, 2023Updated 2 years ago
- A benchmark corpus for ASR hypothesis revising task☆21Sep 26, 2023Updated 2 years ago
- The official repo of our research work "Interactive Editing for Text Summarization".☆23Jun 3, 2023Updated 2 years ago
- Sylber: Syllabic Embedding Representation of Speech from Raw Audio☆73Mar 17, 2025Updated 11 months ago
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5☆19Nov 29, 2022Updated 3 years ago
- TPSE-GST Tacotron2☆14May 1, 2019Updated 6 years ago
- Parallel waveform generation with DiffusionGAN☆17Mar 26, 2022Updated 3 years ago
- ASR text preprocessing utility☆21Aug 5, 2024Updated last year
- ☆18Feb 9, 2020Updated 6 years ago
- Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://a…☆45Jul 30, 2022Updated 3 years ago
- Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…☆27May 17, 2023Updated 2 years ago
- Interface for using TTS and vocoder models in the form of a text editor☆19Nov 25, 2025Updated 3 months ago
- [ICLR2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph"☆54Oct 19, 2022Updated 3 years ago
- **ARCHIVED** Filesystem interface to 🤗 Hub☆59Apr 6, 2023Updated 2 years ago
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆26Dec 4, 2023Updated 2 years ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆59Jan 12, 2023Updated 3 years ago
- The source code and pre-trained model of the paper "On the Preparation and Validation of a Large-scale Dataset"☆62Jan 7, 2026Updated last month
- This is the code for the EMNLP2020 Finding paper "BERT for Monolingual and Cross-Lingual Reverse Dictionary"☆19Sep 27, 2020Updated 5 years ago
- ☆25Jan 24, 2023Updated 3 years ago
- Code for the method proposed in the paper ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…☆23Mar 18, 2024Updated last year
- Speech synthesis using LPC☆23Jun 5, 2021Updated 4 years ago
- A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos.☆66Oct 28, 2024Updated last year
- Prosodic Speech Segmentation with Transformers☆26Feb 25, 2024Updated 2 years ago
- Hugging Face Download (Cache) Manager☆21Aug 7, 2022Updated 3 years ago
- This repo includes beat and bar annotations for the ballroom dataset.☆24Sep 6, 2023Updated 2 years ago
- Synthesized singing voice demos of WeSinger 2 paper.☆26Feb 20, 2023Updated 3 years ago
- Execute arbitrary SQL queries on 🤗 Datasets☆32Jan 24, 2024Updated 2 years ago
- An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".☆26Nov 4, 2023Updated 2 years ago