facebookresearch / emphassess
This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAssess paper (de Seyssel et al., 2023).
☆12 Updated 8 months ago
Related projects:
- ☆39 Updated last year
- Script to perform statistical significance tests between ASR hypotheses. ☆19 Updated 7 years ago
- Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in PyTorch. ☆56 Updated last year
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin… ☆41 Updated last year
- Phoneme segmentation using pre-trained speech models ☆49 Updated last year
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning ☆46 Updated 10 months ago
- ☆49 Updated this week
- Multilingual speech aligner ☆70 Updated 10 months ago
- wav2vec2 audio classification for prosodic boundary detection and other tasks ☆31 Updated last year
- ☆30 Updated last year
- SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words ☆34 Updated 2 months ago
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" ☆32 Updated last year
- Official implementation for the paper "Fine-grained style control in transformer-based text-to-speech synthesis". ☆86 Updated 2 years ago
- Official release of StyleTalk dataset. ☆53 Updated 2 months ago
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE) ☆32 Updated 11 months ago
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024) ☆21 Updated 6 months ago
- ☆18 Updated 3 months ago
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning ☆47 Updated 8 months ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT ☆68 Updated last year
- ☆69 Updated this week
- "SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks": speech processing with the prompting paradigm ☆80 Updated 11 months ago
- ☆23 Updated this week
- ConMamba for Automatic Speech Recognition ☆38 Updated last month
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models ☆21 Updated 11 months ago
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024) ☆48 Updated 3 months ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc… ☆70 Updated last year
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆13 Updated last year
- ☆62 Updated 8 months ago
- Training code and trained checkpoints for ASGAN. ☆60 Updated 8 months ago
- Self-Supervised Speech Pre-training and Representation Learning Toolkit. ☆8 Updated 2 years ago