huangcanan/Awesome-Large-Speech-Model

A repository that organizes content related to large speech (audio) models, including papers, data, applications, and tools.

☆28

Alternatives and similar repositories for Awesome-Large-Speech-Model

Users that are interested in Awesome-Large-Speech-Model are comparing it to the libraries listed below.

bagustris / s3prl-ser
S3PRL for Speech Emotion Recognition (see s3prl > downstream)
☆15 · Feb 28, 2026 · Updated 4 months ago
Jolieresearch / ICPF
☆14 · Nov 26, 2025 · Updated 7 months ago
NiuTrans / LMT
Building an inclusive, scalable, and high-performance multilingual translation model
☆126 · May 7, 2026 · Updated 2 months ago
NiuTrans / MTVenues
A list of conferences and journals relevant to machine translation
☆33 · Mar 17, 2022 · Updated 4 years ago
ICDM-UESTC / TrustworthyExplanation
Redundancy Undermines the Trustworthiness of Self-Interpretable GNNs, International Conference on Machine Learning (ICML), 2025
☆15 · Jun 23, 2025 · Updated last year
shaokai1209 / MDSA
[IEEE, TASLP, 2023] The code of the paper "Multi-Source Discriminant Subspace Alignment for Cross-Domain Speech Emotion Recognition".
☆19 · Sep 27, 2024 · Updated last year
Flawless1202 / Transformer
A PyTorch Lightning implementation of the Transformer network
☆11 · Oct 22, 2020 · Updated 5 years ago
NKU-HLT / DIFFA
[AAAI 2026 & ACL 2026] The official implementation of the DIFFA series for dLLM-based large audio language model
☆83 · Apr 7, 2026 · Updated 3 months ago
Pradeepiit / hf0
Hybrid f0 estimation using Convolutional Neural Network
☆12 · Apr 29, 2019 · Updated 7 years ago
xuchennlp / S2T
The project for speech translation
☆12 · Sep 28, 2023 · Updated 2 years ago
ICDM-UESTC / COSE
Implementation of the paper "Compose Yourself: Average-Velocity Flow Matching for One-Step Speech Enhancement".
☆16 · Sep 23, 2025 · Updated 9 months ago
NiuTrans / Introduction-to-Transformers
An introduction to basic concepts of Transformers and key techniques of their recent advances.
☆53 · Dec 21, 2023 · Updated 2 years ago
ictnlp / LSG
The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”
☆15 · Jan 3, 2025 · Updated last year
Labbeti / conette-audio-captioning
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
☆23 · Dec 17, 2025 · Updated 7 months ago
NKU-HLT / KNN-CTC
[ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels
☆42 · Mar 20, 2024 · Updated 2 years ago
NKU-HLT / Fusion-Insider-threat-detection
[ICANN 2023] Anomaly-Based Insider Threat Detection via Hierarchical Information Fusion
☆18 · Nov 20, 2023 · Updated 2 years ago
hwang-cs-ime / ATSS
This is the official code for "ATSS: Detecting AI-Generated Videos via Anomalous Temporal Self-Similarity"
☆17 · Apr 7, 2026 · Updated 3 months ago
yusun-nlp / CasRel_fastNLP
fastNLP reimplementation of the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction"
☆11 · Dec 11, 2020 · Updated 5 years ago
LuLuLuyi / TDAR
Advancing Block Diffusion Language Models for Test-Time Scaling
☆16 · Feb 14, 2026 · Updated 5 months ago
CLUEbenchmark / SuperCLUE-Industry
Chinese-native industry evaluation benchmark
☆17 · Mar 21, 2024 · Updated 2 years ago
facebookresearch / fbai-speech
Repo for the FB AI Speech team.
☆26 · Aug 24, 2021 · Updated 4 years ago
vsingh-group / FrameQuant
☆11 · Nov 16, 2024 · Updated last year
NKU-HLT / Emotion-Recognition
Paper List
☆18 · Jul 2, 2025 · Updated last year
ASLP-lab / Easy-Turn
Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems
☆121 · Jan 25, 2026 · Updated 5 months ago
NKU-HLT / Role-Play-Prompting
[NAACL 2024] Better Zero-Shot Reasoning with Role-Play Prompting
☆36 · Nov 14, 2023 · Updated 2 years ago
minguinho26 / Prefix_AAC_ICASSP2023
Official Implementation of "Prefix tuning for Automated Audio Captioning" (ICASSP 2023)
☆30 · Dec 6, 2023 · Updated 2 years ago
k2-fsa / fast_rnnt
A torch implementation of a recursion which turns out to be useful for RNN-T.
☆149 · Aug 25, 2023 · Updated 2 years ago
lovit / sejong_corpus
Repository of processed Sejong Corpus data
☆14 · Sep 11, 2018 · Updated 7 years ago
xiaoxue1117 / speech-mamba-public
☆15 · Nov 26, 2024 · Updated last year
godmoves / TensorFlow_to_TensorRT
A demo to show how to convert a TensorFlow model to TensorRT uff or PLAN
☆11 · Jul 22, 2018 · Updated 7 years ago
mairaksi / PiENet
Pitch estimation network (PiENet) for noise-robust neural F0 estimation of speech signals
☆50 · Jul 24, 2019 · Updated 6 years ago
sjlee7 / speech-dereverberation
speech-dereverberation-using-GANs
☆13 · Jan 28, 2019 · Updated 7 years ago
mike-qz-wang / AUDETER
Testing github connection on vscode update
☆16 · Jun 1, 2026 · Updated last month
NiuTrans / GRAM
Code for ICML 2025 paper "GRAM: A Generative Foundation Reward Model for Reward Generalization"
☆21 · Sep 4, 2025 · Updated 10 months ago
leichaocn / LSTM
A language model implemented on the basis of Word2Vec and LSTM
☆11 · Nov 7, 2017 · Updated 8 years ago
hongsunjang / pipe-bd
[DATE 2023] Pipe-BD: Pipelined Parallel Blockwise Distillation
☆12 · Jul 13, 2023 · Updated 3 years ago
Y-Research-SBU / SlideGen
Official Repository for SlideGen
☆15 · Jun 1, 2026 · Updated last month
CLAD23 / CLAD
☆21 · Apr 23, 2024 · Updated 2 years ago
Dream-High / DJCM
☆30 · Apr 22, 2024 · Updated 2 years ago