Hypotheses-Paradise/UADF

☆17

Alternatives and similar repositories for UADF

Users who are interested in UADF are comparing it to the repositories listed below.

rithiksachdev / PostASR-Correction-SLT2024
☆18 · Jul 22, 2024 · Updated 2 years ago
tzyll / ChineseHP
Dataset for Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models in Interspeech 2024.
☆16 · Jul 4, 2024 · Updated 2 years ago
YUCHEN005 / GILA
Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition"
☆18 · Jun 21, 2023 · Updated 3 years ago
Sreyan88 / LipGER
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆19 · Jul 16, 2024 · Updated 2 years ago
YUCHEN005 / RATS-Channel-A-Speech-Data
This is a public repository for RATS Channel-A Speech Data, which is a chargeable noisy speech dataset under LDC. Here we release its Log…
☆16 · Oct 22, 2022 · Updated 3 years ago
Hypotheses-Paradise / Hypo2Trans
Single-blind supplementary materials for NeurIPS 2023 submission
☆94 · Oct 30, 2024 · Updated last year
YUCHEN005 / MIR-GAN
Code for paper "MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recogni…
☆16 · Jun 21, 2023 · Updated 3 years ago
Hertin / WavPrompt
☆37 · Jun 30, 2022 · Updated 4 years ago
YUCHEN005 / RobustGER
Code for paper "Large Language Models are Efficient Learners of Noise-Robust Speech Recognition"
☆143 · May 8, 2024 · Updated 2 years ago
swagshaw / Rainbow-Keywords
Rainbow Keywords - Official PyTorch Implementation
☆14 · Jun 27, 2024 · Updated 2 years ago
thuhcsi / Contextual-Biasing-Dataset
Open-source Mandarin biased word dataset
☆14 · Sep 21, 2023 · Updated 2 years ago
Srijith-rkr / KAUST-Whisper-Adapter
INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!
☆41 · Sep 11, 2023 · Updated 2 years ago
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated 2 years ago
yichen14 / FastAdaSP
Code for the paper "FastAdaSP: An Efficient Multitask Inference Framework for Large Speech Language Models" (EMNLP'24 Oral)
☆17 · Nov 14, 2024 · Updated last year
wonjune-kang / expressive-speech-retrieval
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
☆15 · Aug 18, 2025 · Updated 11 months ago
pashanitw / W2V2-BERT-ASR-Training
☆15 · Mar 25, 2024 · Updated 2 years ago
Srijith-rkr / Whispering-LLaMA
EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction
☆271 · May 19, 2024 · Updated 2 years ago
tango4j / llm_speaker_tagging
SLT 2024 Challenge: Post-ASR-Speaker-Tagging
☆16 · Jun 16, 2024 · Updated 2 years ago
Lhx94As / Awesome-Spoken-Language-Identification
An awesome spoken LID repository. (Work in progress)
☆109 · Apr 22, 2024 · Updated 2 years ago
stevenhillis / awesome-asr-contextualization
A curated list of awesome papers on contextualizing E2E ASR outputs
☆81 · May 10, 2023 · Updated 3 years ago
YUCHEN005 / DPSL-ASR
Code for paper "Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition"
☆44 · May 23, 2023 · Updated 3 years ago
hongfeixue / StutteringSpeechChallenge
SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
☆12 · Jun 11, 2024 · Updated 2 years ago
facebookresearch / fbai-speech
Repo for the FB AI Speech team.
☆27 · Aug 24, 2021 · Updated 4 years ago
teamtee / LLM-ASR-Error-Correction
This is a framework for using large language models to improve ASR recognition accuracy. You need to provide the recognized text and tag …
☆18 · Jun 5, 2025 · Updated last year
shirley-wu / daco
[NeurIPS 2024 D&B Track] DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation
☆14 · Mar 5, 2025 · Updated last year
YUCHEN005 / UniVPM
Code for paper "Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition"
☆28 · Jun 21, 2023 · Updated 3 years ago
YuanGongND / llm_speech_emotion_challenge
☆23 · Jun 24, 2024 · Updated 2 years ago
GeorgeEfstathiadis / LLM-Diarize-ASR-Agnostic
Repository for "LLM-based speaker diarization correction: A generalizable approach" paper
☆23 · Jul 31, 2024 · Updated last year
emonosuke / emoASR
End-to-end MOdeling of ASR (Automatic Speech Recognition)
☆33 · Feb 16, 2023 · Updated 3 years ago
raotnameh / End-to-end-E2E-Named-Entity-Recognition-from-English-Speech
☆32 · Dec 2, 2020 · Updated 5 years ago
YUCHEN005 / UNA-GAN
Code for paper "Unsupervised Noise adaptation using Data Simulation"
☆14 · May 16, 2024 · Updated 2 years ago
ckyang1124 / SAKURA
Official GitHub repository for paper "SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Informa…
☆25 · Aug 14, 2025 · Updated 11 months ago
KrishnaDN / Keyword-Transformer
Implementation of the paper "Keyword Transformer: A Self-Attention Model for Keyword Spotting"
☆23 · May 19, 2021 · Updated 5 years ago
kyegomez / USM
Implementation of Google's USM speech model in PyTorch
☆36 · Jul 20, 2026 · Updated last week
the-bird-F / GLM-Voice-RAG
[EMNLP 2025 Findings] A complete cross-modal RAG system for end-to-end speech-to-speech large models, including ASR-based Retrieval and E…
☆31 · Jul 11, 2025 · Updated last year
songweige / Dmoz-Dataset
content.rdf.u8.gz
☆11 · Dec 15, 2020 · Updated 5 years ago
Alibaba-NLP / AISHELL-NER
[ICASSP 2022] AISHELL-NER: Named Entity Recognition from Chinese Speech
☆26 · Apr 20, 2022 · Updated 4 years ago
tira-io / tira
The source code for the TIRA Shared Task Platform
☆19 · Jul 14, 2026 · Updated 2 weeks ago
swagshaw / TorchKWS
Collection of PyTorch implementations of Spoken Keyword Spotting presented in research papers.
☆41 · Apr 5, 2024 · Updated 2 years ago