huangcanan / Awesome-Large-Speech-Model
A repository that organizes content related to Large Speech (Audio) Models, including papers, data, applications, tools, and so on.
☆28Nov 8, 2025Updated 3 months ago
Alternatives and similar repositories for Awesome-Large-Speech-Model
Users that are interested in Awesome-Large-Speech-Model are comparing it to the libraries listed below
- Repo for the FB AI Speech team.☆25Aug 24, 2021Updated 4 years ago
- Official implementation of the paper titled "Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Mu…☆27Mar 5, 2024Updated last year
- Speech Emotion Recognition using Deep Learning☆12May 24, 2021Updated 4 years ago
- [Lab] lab website☆11Updated this week
- "Artificial Intelligence Programming" course project: Pac-Man (finished version)☆11Jul 12, 2022Updated 3 years ago
- [ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels☆42Mar 20, 2024Updated last year
- Software defect management system - SpringBoot+Vue☆10Jan 6, 2021Updated 5 years ago
- A library of speech gadgets.☆14Oct 15, 2022Updated 3 years ago
- Automatically setup the AISHELL-4 and MSDWild dataset for usage with pyannote-database (and pyannote-audio)☆15Oct 22, 2025Updated 3 months ago
- The project for speech translation☆12Sep 28, 2023Updated 2 years ago
- Open-source Mandarin biased-word dataset☆14Sep 21, 2023Updated 2 years ago
- Semantic Map Learning of Traffic Light to Lane Assignment based on Motion Data☆11Mar 30, 2024Updated last year
- Onset-and-Offset-Aware Sound Event Detection☆20Feb 10, 2025Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Jun 27, 2025Updated 7 months ago
- A summary of some C++ fundamentals☆10Oct 28, 2020Updated 5 years ago
- NAR-BERT-ASR☆10Sep 27, 2021Updated 4 years ago
- ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation☆25Aug 24, 2025Updated 5 months ago
- [ICTC'24] - "Voice-Based Age and Gender Recognition: A Comparative Study of LSTM, RezoNet and Hybrid CNNs-BiLSTM Architecture" by Nhut Mi…☆10Jan 16, 2025Updated last year
- ☆42Dec 20, 2025Updated last month
- DOA estimation source code☆10May 13, 2019Updated 6 years ago
- Second Jittor Artificial Intelligence Challenge: Jittor-based competition on generating landscape images from sketches☆10Jan 28, 2023Updated 3 years ago
- Unofficial reimplementation of CFNet: Cascade Fusion Network for Dense Prediction☆10Mar 23, 2023Updated 2 years ago
- Offline Speaker Diarization with SenseVoice by Sherpa ONNX.☆15Dec 23, 2024Updated last year
- ☆11Aug 10, 2022Updated 3 years ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream)☆15Feb 5, 2025Updated last year
- A simple command line tool to calculate WER for ASR.☆14Oct 14, 2024Updated last year
- a simple command line tool / package that prints the dependencies of a python project☆28Apr 6, 2018Updated 7 years ago
- ☆16Nov 5, 2018Updated 7 years ago
- homework of coursera nlp course. https://www.coursera.org/learn/language-processing/home/welcome☆15Dec 7, 2022Updated 3 years ago
- ICASSP 2024: Robust DOA estimation from deep acoustic imaging☆22Apr 14, 2024Updated last year
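- 2021 MathorCup University Mathematical Modeling Challenge, Big Data Competition, Problem B: remote sensing land parcel segmentation (national first prize)☆12May 1, 2021Updated 4 years ago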
- TACTICS (The ACademic TeachIng Purpose Car Simulator)☆15Apr 11, 2024Updated last year
- ☆15Sep 13, 2022Updated 3 years ago
- Forced alignment decoder for Whisper.☆14Mar 13, 2024Updated last year
- ☆16Feb 6, 2020Updated 6 years ago
- ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models (TTS)☆10Mar 9, 2024Updated last year
- The source code for target sound detection☆15Feb 26, 2022Updated 3 years ago
- ☆13Mar 30, 2023Updated 2 years ago
- ✒️ ChatGPT as a writing partner.☆14Mar 6, 2023Updated 2 years ago