choijeongsoo / utut
[TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation
☆27Updated 4 months ago
Alternatives and similar repositories for utut:
Users who are interested in utut are comparing it to the repositories listed below
- A deepfake audio dataset for detecting fake speech from codec-based speech synthesis systems, Interspeech 2024☆13Updated 6 months ago
- This repository collects papers related to Speech Tokenizer.☆15Updated 3 months ago
- Multi-Task Speech classification of accent and gender of an English speaker on Mozilla's Common Voice dataset☆25Updated 4 months ago
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning☆47Updated last year
- This is the official train-dev-test release of the Interspeech 2024 Discrete Speech Representation Challenge.☆32Updated last year
- The open source code for LLM-Codec☆123Updated 5 months ago
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words☆49Updated 7 months ago
- EMO-SUPERB submission☆42Updated 4 months ago
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm☆81Updated last year
- Unofficial PyTorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆60Updated 9 months ago
- Survey on speech generation work.☆17Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆52Updated 2 months ago
- A toolkit dedicated to speech evaluation.☆19Updated 4 months ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset☆12Updated last week
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)☆39Updated last year
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…☆73Updated 7 months ago
- [AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS☆64Updated 2 months ago
- Implementation of "A conformer-based classifier for variable-length utterance processing in anti-spoofing" published in Interspeech 2023.☆22Updated last year
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…☆11Updated 11 months ago
- A CSRankings-like index for speech researchers☆33Updated 3 months ago
- Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in PyTorch.☆68Updated last year
- [ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels☆32Updated 10 months ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023☆11Updated 8 months ago