RBenita/DIFFAR

Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation

☆32

Alternatives and similar repositories for DIFFAR

Users who are interested in DIFFAR are comparing it to the libraries listed below.

yuhanghe01 / RiTTA
Event Relation in Text-to-Audio (TTA) Generation
☆21 · Feb 26, 2025 · Updated last year
primepake / dac_vae
Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder
☆38 · Aug 30, 2025 · Updated 10 months ago
ChanganVR / action2sound
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
☆26 · Oct 1, 2024 · Updated last year
FantSun / Speechflow
Speechflow for emotion recognition related information decomposition
☆10 · Jul 27, 2021 · Updated 4 years ago
deepvk / muse
🎵 muse: Music Separation
☆11 · Feb 14, 2024 · Updated 2 years ago
audiolabs / anechoic-noise
Generator for anechoic, non-stationary noise signals
☆12 · Aug 12, 2022 · Updated 3 years ago
AmphionTeam / TaDiCodec
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆77 · Jan 25, 2026 · Updated 5 months ago
Audio-Foundation-Models / ConversationTTS
☆101 · Jan 19, 2026 · Updated 6 months ago
ajd12342 / paraspeechclap
Codebase for 'ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining'
☆23 · Jun 20, 2026 · Updated last month
koudounasalkis / voc2vec
This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025.
☆57 · Apr 14, 2025 · Updated last year
OpenMOSS / MOSS-Speech
MOSS-Speech is a true speech-to-speech large language model without text guidance.
☆137 · Feb 13, 2026 · Updated 5 months ago
light1726 / BetaVAE_VC
Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE"
☆43 · Apr 10, 2023 · Updated 3 years ago
apple-yinhan / Noise-robust-SED
☆14 · Jan 2, 2025 · Updated last year
egruttadauria98 / SSpaVAlDo
☆37 · Jan 6, 2026 · Updated 6 months ago
MLSpeech / DeepPhoneticToolsTutorial
Tutorial on {Deep} Phonetic Tools given in BigPhon @ LabPhon15
☆12 · Apr 17, 2017 · Updated 9 years ago
jnwnlee / video-foley
Official implementation of "Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound". IEEE TASLP 20…
☆18 · Feb 27, 2026 · Updated 4 months ago
thu-ml / Bridge-TTS
Official codebase for "Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis" (https://arxiv.org/abs/2312.03491).
☆132 · Jul 12, 2024 · Updated 2 years ago
Taltt / FNSE-SBGAN
FNSE-SBGAN: Far-field Speech Enhancement with Schrödinger Bridge and Generative Adversarial Networks
☆20 · May 12, 2025 · Updated last year
michaelneri / audio-distance-estimation
Official repository of the work "Speaker Distance Estimation in Enclosures from Single-Channel Audio" published to IEEE/ACM Transactions …
☆40 · Jun 29, 2026 · Updated 3 weeks ago
MaxMax2016 / Grad-TTS-Chinese
Huawei Grad-TTS for Chinese
☆50 · Sep 26, 2023 · Updated 2 years ago
titu1994 / warprnnt_numba
WarpRNNT loss ported in Numba CPU/CUDA for PyTorch
☆17 · Mar 11, 2022 · Updated 4 years ago
adasegroup / OSM-one-shot-multispeaker
Framework for one-shot multispeaker system based on Deep Learning
☆19 · May 30, 2021 · Updated 5 years ago
YoavRamon / Speech-Recognition-Israel
The repository for the Speech Recognition Israel meetup group. It is used for collecting and sharing material.
☆13 · Jul 12, 2020 · Updated 6 years ago
chaufanglin / Normal2Whisper
Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"
☆14 · Oct 31, 2024 · Updated last year
xinshengwang / robpitch
A pitch detection model trained to be robust against noise and reverberation environments.
☆27 · Jan 21, 2025 · Updated last year
hs-oh-prml / DurFlexEVC
☆81 · Jan 22, 2025 · Updated last year
kyutai-labs / moshi-rag
MoshiRAG is a compact full-duplex speech language model augmented with asynchronous knowledge retrieval to improve factuality without sac…
☆130 · Apr 28, 2026 · Updated 2 months ago
yusunnny / CST-former
CST-former: Transformer with Channel-Spectro-Temporal Attention for Sound Event Localization and Detection (ICASSP 2024)
☆38 · May 20, 2025 · Updated last year
sarulab-speech / multi-speaker-dgp
Official implementation of DGP-based multi-speaker speech synthesis with PyTorch
☆24 · Mar 23, 2021 · Updated 5 years ago
ShawnPi233 / SynParaSpeech
Official Repository of Paper: "SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding" (IC…
☆72 · Apr 27, 2026 · Updated 2 months ago
NiniAndy / Paraformer-V2
From the paper "Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition"
☆29 · Nov 20, 2024 · Updated last year
slp-rl / SC-PhASE
This repo contains the official PyTorch implementation of "A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement" (…
☆28 · Aug 8, 2022 · Updated 3 years ago
zeyuxie29 / PicoAudio
☆45 · Jan 13, 2025 · Updated last year
dmlguq456 / NeXt_TDNN_ASV
Official repository of NeXt-TDNN for speaker verification
☆84 · Oct 10, 2024 · Updated last year
line / open-universe
Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models.
☆118 · Aug 29, 2024 · Updated last year
el-iot / vim-wikipedia-browser
A Vim plugin for navigating between Wikipedia articles
☆14 · Jul 13, 2020 · Updated 6 years ago
felixperfler / Stable-Hybrid-Auditory-Filterbanks
[Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement
☆43 · Jul 25, 2025 · Updated 11 months ago
xkx-hub / KALL-E
[AAAI 2026 oral] KALL-E: Autoregressive Speech Synthesis with Next-Distribution Prediction
☆41 · Sep 25, 2025 · Updated 9 months ago
google-deepmind / librispeech-long
LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation …
☆98 · Dec 28, 2024 · Updated last year