semanticVAD/testsets

Testing sets for semanticVAD

☆20

Alternatives and similar repositories for testsets

Users who are interested in testsets are comparing it to the repositories listed below.

ASLP-lab / M7-TTS
M7-TTS: A Mini-Scale Multilingual and Multi-Dialect Text-to-Speech Language Model with Mimi codec and Multi Token Prediction
☆20 • Mar 19, 2026 • Updated 4 months ago
yfyeung / DS-WED
[ICASSP 2026] Official code for "Measuring Prosody Diversity in Zero-Shot TTS: A New Metric, Benchmark, and Exploration"
☆17 • Apr 16, 2026 • Updated 3 months ago
nii-yamagishilab / VCC2020-database
☆53 • Dec 18, 2020 • Updated 5 years ago
teamtee / LLM-ASR-Error-Correction
This is a framework for using large language models to improve ASR recognition accuracy. You need to provide the recognized text and tag …
☆17 • Jun 5, 2025 • Updated last year
JusperLee / Gull-Codec-Training
☆12 • Mar 11, 2025 • Updated last year
Mrunal-G / Casual-turn-taking-and-backchannel-prediction
☆16 • Jun 25, 2024 • Updated 2 years ago
rikishimizu / MeanFlow-TSE
☆26 • Jun 10, 2026 • Updated last month
smart-audio / audio_diarization_annotation
Audio Diarization Annotation tool
☆30 • Nov 8, 2019 • Updated 6 years ago
lucadellalib / ts-asr
Target speaker automatic speech recognition (TS-ASR)
☆14 • Oct 14, 2023 • Updated 2 years ago
uthree / fastersvc
☆26 • Mar 20, 2024 • Updated 2 years ago
tzyll / ChineseHP
Dataset for Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models in Interspeech 2024.
☆16 • Jul 4, 2024 • Updated 2 years ago
fgnt / speaker_reassignment
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14 • Feb 5, 2025 • Updated last year
NKU-HLT / KNN-CTC
[ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels
☆42 • Mar 20, 2024 • Updated 2 years ago
MuyangDu / T5Voice
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28 • Nov 7, 2025 • Updated 8 months ago
itsnotacie / AAAI-26_SepPrune
SepPrune: Structured Pruning for Efficient Deep Speech Separation (AAAI'26)
☆15 • May 31, 2025 • Updated last year
kaistmm / AdaptVC
☆17 • Jun 2, 2025 • Updated last year
xkx-hub / KALL-E
[AAAI 2026 oral] KALL-E: Autoregressive Speech Synthesis with Next-Distribution Prediction
☆41 • Sep 25, 2025 • Updated 9 months ago
ryuclc / CosyVoice2-GRPO
A simple implementation for improving CosyVoice2 with the GRPO method
☆39 • May 5, 2026 • Updated 2 months ago
leto19 / WhiSQA
Whisper Speech Quality Assessment (WhiSQA)
☆16 • Apr 14, 2026 • Updated 3 months ago
uthree / ddsp-vocoder
☆12 • Nov 7, 2024 • Updated last year
LianjiaTech / athena
An open-source implementation of a sequence-to-sequence based speech processing engine
☆39 • Jan 11, 2023 • Updated 3 years ago
BUTSpeechFIT / TS_SUPERB
☆16 • Apr 2, 2025 • Updated last year
ttslr / MonTTS
☆16 • Dec 23, 2021 • Updated 4 years ago
ftshijt / Interspeech2024_DiscreteSpeechChallenge
This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.
☆32 • Jan 26, 2024 • Updated 2 years ago
exercise-book-yq / FreeCodec
FreeCodec: A Disentangled Neural Speech Codec with Fewer Tokens
☆24 • Sep 9, 2024 • Updated last year
ASLP-lab / OmniCodec
OmniCodec: Low Frame Rate Universal Audio Codec with Semantic–Acoustic Disentanglement
☆46 • Apr 17, 2026 • Updated 3 months ago
zjzser / WMCodec
PyTorch Implementation of [WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification](https://arxiv.or…
☆18 • Jul 31, 2025 • Updated 11 months ago
wenet-e2e / west
We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction
☆206 • Updated this week
echocatzh / py-aec-unified2021
☆47 • Jun 6, 2021 • Updated 5 years ago
IU-SAIGE / pse
Efficient Personalized Speech Enhancement through Self-Supervised Learning
☆23 • Mar 12, 2023 • Updated 3 years ago
aasish / userIntentDataset
☆14 • Dec 27, 2016 • Updated 9 years ago
yu-haoyuan / fd-badcat
fd-sds
☆20 • Apr 8, 2026 • Updated 3 months ago
umbertocappellazzo / Omni-AVSR
Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202…
☆38 • Mar 10, 2026 • Updated 4 months ago
ZhangXinWhut / SimWhisper-Codec
Official code for the paper "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding"
☆37 • Jan 28, 2026 • Updated 5 months ago
FunAudioLLM / CV3-Eval
☆187 • Aug 25, 2025 • Updated 10 months ago
msplabresearch / MSP-Podcast_Challenge
MSP-Podcast Challenge Baseline Code
☆31 • Jun 12, 2024 • Updated 2 years ago
JaesungHuh / av-diarization
Audio-visual diarization pipeline used for creating the VoxConverse dataset
☆22 • Jun 6, 2025 • Updated last year
ASLP-lab / Easy-Turn
Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems
☆121 • Jan 25, 2026 • Updated 5 months ago
ishine / Mutiband-HIFIGAN
Multiband version of HiFi-GAN
☆19 • Nov 6, 2020 • Updated 5 years ago