Hannieliao/Emilia-NV

Official Repository of Paper: "Emilia-NV: A Non-Verbal Speech Dataset with Word-Level Annotation for Human-Like Speech Modeling"

☆90

Alternatives and similar repositories for Emilia-NV

Users who are interested in Emilia-NV are comparing it to the repositories listed below.

AmphionTeam / AnyAccomp
AnyAccomp: Generalizable accompaniment generation for vocals and solo instruments, powered by a quantized melodic bottleneck.
☆37 · Dec 22, 2025 · Updated 4 months ago
yukara-ikemiya / Open-Miipher-2
PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind
☆65 · Sep 22, 2025 · Updated 7 months ago
NKU-HLT / DIFFA
[AAAI 2026 & ACL 2026] The official implementation of the DIFFA series for dLLM-based large audio language model
☆79 · Apr 7, 2026 · Updated last month
ASLP-lab / SenSE
Official code of SenSE.
☆84 · Oct 30, 2025 · Updated 6 months ago
Shy-98 / MELLE
Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"
☆41 · Jun 28, 2025 · Updated 10 months ago
yfyeung / CLSP
[ACL 2026 Main] Open-Ended Speaking Style Modeling via Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training
☆72 · Apr 6, 2026 · Updated last month
Ereboas / MagiCodec
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆115 · Jun 4, 2025 · Updated 11 months ago
yzGuu830 / efficient-speech-codec
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆125 · Mar 20, 2025 · Updated last year
HeCheng0625 / Diffusion-Speech-Tokenizer
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198 · Jan 25, 2026 · Updated 3 months ago
naver-ai / usdm
Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024)
☆95 · Dec 3, 2024 · Updated last year
light1726 / Speech-Tokenization-Papers
This repository follows papers and reports on discrete speech representation learning and speech tokenization methods for speech language…
☆15 · Dec 1, 2023 · Updated 2 years ago
bfs18 / e2_tts
☆70 · Sep 3, 2024 · Updated last year
walker-hyf / NCSSD
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆62 · Nov 1, 2024 · Updated last year
omine-me / LaughterSegmentation
2024 Latest laughter detection & segmentation model. Paper: "Robust Laughter Segmentation with Automatic Diverse Data Synthesis", Interspe…
☆65 · Sep 1, 2024 · Updated last year
Soul-AILab / SAC
[ACL 2026 Main] Training, inference, and testing of the SAC speech codec model.
☆102 · Nov 1, 2025 · Updated 6 months ago
ajd12342 / paraspeechcaps
Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'
☆161 · Mar 26, 2026 · Updated last month
thuhcsi / VoxInstruct
VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
☆99 · Nov 9, 2024 · Updated last year
cantabile-kwok / vec2wav2.0
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
☆79 · Dec 3, 2024 · Updated last year
SparkAudio / VoxBox
A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems.
☆112 · May 5, 2025 · Updated last year
haoheliu / SemantiCodec-inference
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space.
☆250 · Mar 7, 2025 · Updated last year
AlanBaade / SyllableLM
Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models
☆63 · Jul 1, 2025 · Updated 10 months ago
francislata / unicats
An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".
☆26 · Nov 4, 2023 · Updated 2 years ago
xiaomi-research / dasheng-denoiser
Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A…
☆79 · Jun 16, 2025 · Updated 11 months ago
Aria-K-Alethia / BigCodec
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
☆217 · Sep 19, 2024 · Updated last year
lifeiteng / naturalspeech3_facodec
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
☆244 · Apr 20, 2024 · Updated 2 years ago
k2-fsa / libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
☆217 · Sep 10, 2024 · Updated last year
Aria-K-Alethia / laughter-synthesis
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77 · Jul 16, 2023 · Updated 2 years ago
AmphionTeam / TaDiCodec
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆77 · Jan 25, 2026 · Updated 3 months ago
bfs18 / armel
poorman's ar-dit tts
☆45 · Dec 31, 2025 · Updated 4 months ago
ryuclc / CosyVoice2-GRPO
A simple implementation for improving CosyVoice2 by GRPO method
☆38 · May 5, 2026 · Updated 2 weeks ago
qiuqiangkong / audio_flow
☆120 · May 5, 2026 · Updated 2 weeks ago
nonverbalspeech38k / nonverspeech38k
The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…
☆66 · Dec 26, 2025 · Updated 4 months ago
speechnovateur / languagecodec_tmp
Temporary anonymous version
☆22 · Mar 20, 2024 · Updated 2 years ago
LAION-AI / emotion-annotations
☆109 · Feb 28, 2026 · Updated 2 months ago
nene1212 / MaskGCT-Training
Training code for MaskGCT-T2S model.
☆24 · Dec 14, 2024 · Updated last year
gibbona1 / neal
NEAL (Nature+Energy Audio Labeller) is an open-source interactive audio data annotation tool.
☆18 · Apr 7, 2025 · Updated last year
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆52 · May 1, 2025 · Updated last year
jimbozhang / xares
A benchmark for evaluating audio encoders on various audio tasks.
☆53 · Apr 27, 2026 · Updated 3 weeks ago
jimbozhang / xares-llm-template
Template for creating audio encoders compatible with X-ARES
☆19 · Feb 11, 2026 · Updated 3 months ago