nickjw0205/Improving-ASR-with-LLM-Description

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/nickjw0205/Improving-ASR-with-LLM-Description)

nickjw0205 / Improving-ASR-with-LLM-Description

☆20

Alternatives and similar repositories for Improving-ASR-with-LLM-Description

Users that are interested in Improving-ASR-with-LLM-Description are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

rithiksachdev / PostASR-Correction-SLT2024
View on GitHub
☆18Jul 22, 2024Updated last year
kjw11 / Speaker-Aware-CTC
View on GitHub
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22May 26, 2025Updated last year
introlab / uimvdr
View on GitHub
☆13Oct 11, 2024Updated last year
aalto-speech / interspeech2019_karhila_et_al
View on GitHub
Compendium for the paper "Transparent pronunciation scoring using articulatorily weighted phoneme edit distance" by Karhila, Smolander, Y…
☆25May 6, 2019Updated 7 years ago
hhhaaahhhaa / ASR-TTA
View on GitHub
☆16Nov 4, 2025Updated 8 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
Observeai-Research / Phoneme-BERT
View on GitHub
☆34Jun 15, 2021Updated 5 years ago
yxduir / LLM-SRT
View on GitHub
☆28Mar 11, 2026Updated 4 months ago
BUTSpeechFIT / diacorrect
View on GitHub
Error correction back-end for speaker diarization
☆18Sep 26, 2023Updated 2 years ago
bigcash / spleeter-pytorch-mnn
View on GitHub
convert spleeter pretrained model to pytorch and onnx, then convert to mnn
☆21Dec 17, 2020Updated 5 years ago
leto19 / WhiSQA
View on GitHub
Whisper Speech Quality Assessment (WhiSQA)
☆16Apr 14, 2026Updated 3 months ago
p1an-lin-jung / wv_tts
View on GitHub
☆19Mar 22, 2024Updated 2 years ago
soumimaiti / speechlmscore_tool
View on GitHub
☆34Nov 24, 2024Updated last year
talhanai / wer-sigtest
View on GitHub
Script to perform statistical significance test between ASR hypotheses.
☆23Aug 13, 2017Updated 8 years ago
mtkresearch / clairaudience
View on GitHub
Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)
☆26Oct 10, 2023Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
uthree / ddsp-vocoder
View on GitHub
☆12Nov 7, 2024Updated last year
Yip-Jia-Qi / codecformer
View on GitHub
☆21Jul 15, 2024Updated 2 years ago
cuhealthybrains / MT-LLM
View on GitHub
The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"
☆50Apr 7, 2025Updated last year
NKU-HLT / AudioEditor
View on GitHub
☆47Apr 2, 2025Updated last year
kuan2jiu99 / audio-hallucination
View on GitHub
Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024
☆34Mar 14, 2025Updated last year
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
View on GitHub
☆13Mar 23, 2026Updated 3 months ago
mubtasimahasan / DM-Codec
View on GitHub
Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”
☆57Jun 1, 2025Updated last year
kjw11 / CSEnet-ASR
View on GitHub
Cross-Speaker Encoding Network for Multi-talker Speech Recognition
☆12Mar 14, 2025Updated last year
HuangZiliAndy / SSL_for_multitalker
View on GitHub
ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS
☆33Mar 16, 2023Updated 3 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
Aisaka0v0 / TS-Whisper
View on GitHub
☆33Jun 12, 2025Updated last year
fchest / Speech-Transformer-multi-GPUs
View on GitHub
A PyTorch implementation of Speech Transformer with multi-GPUs, an End-to-End ASR with Transformer network on Mandarin Chinese. This code…
☆10Dec 25, 2019Updated 6 years ago
lucadellalib / discrete-wavlm-codec
View on GitHub
A neural speech codec based on discrete WavLM representations
☆26Aug 28, 2024Updated last year
tuanio / nextformer
View on GitHub
PyTorch implementation of "Nextformer: A ConvNeXt Augmented Conformer For End-To-End Speech Recognition"
☆10Dec 15, 2022Updated 3 years ago
sony / bigvsan_eval
View on GitHub
Evaluation tool used in the BigVSAN paper
☆14Mar 22, 2024Updated 2 years ago
yoongi43 / VRVQ
View on GitHub
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11Apr 10, 2025Updated last year
deepvk / muse
View on GitHub
🎵 muse: Music Separation
☆11Feb 14, 2024Updated 2 years ago
deep-diver / neurips2024
View on GitHub
Read and Listen to NeurIPS 2024 Papers
☆17Feb 20, 2025Updated last year
WangHelin1997 / Automatic_Speech_Annotator
View on GitHub
Automatic speech annotator processing speech with voice activaty detection, overlapping speech detection, speaker diarization and automat…
☆33Jun 14, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
PecholaL / MAIN-VC
View on GitHub
Lightweight Speech Representation Learning for One-Shot Voice Conversion
☆23Dec 12, 2024Updated last year
dodohow1011 / SpeechAdvReprogram
View on GitHub
A Study of Low-Resource Speech Commands Recognition Based on Adversarial Reprogramming
☆19Oct 12, 2023Updated 2 years ago
skit-ai / N-Best-ASR-Transformer
View on GitHub
Code for ACL-IJCNLP 2021 paper "N-Best-ASR-Transformer: Enhancing SLU Performance using Multiple ASR Hypotheses."
☆17Nov 30, 2021Updated 4 years ago
lucadellalib / ts-asr
View on GitHub
Target speaker automatic speech recognition (TS-ASR)
☆14Oct 14, 2023Updated 2 years ago
utter-project / mHuBERT-147-scripts
View on GitHub
Collection of scripts from mHuBERT-147.
☆35Nov 19, 2024Updated last year
shuheikatoinfo / UtterTune
View on GitHub
LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for edit and control the target language's phoneme…
☆26Jul 8, 2026Updated last week
facebookresearch / emphassess
View on GitHub
This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses …
☆25Jan 9, 2024Updated 2 years ago