zruiii / QwenAudioSFTView external linksLinks
The repoduction codes for Qwen-Audio Fine-tuning
☆53Aug 15, 2024Updated last year
Alternatives and similar repositories for QwenAudioSFT
Users that are interested in QwenAudioSFT are comparing it to the libraries listed below
Sorting:
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- Colab notebook for fine-tuning Qwen2-Audio with trl's SFT and PPO trainers.☆24Nov 23, 2024Updated last year
- Python runtime for WeTextProcessing (does not depend on Pynini)☆48Nov 28, 2025Updated 2 months ago
- Github repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models☆24Jun 16, 2025Updated 7 months ago
- ☆36Jan 6, 2026Updated last month
- Acoustic echo cancelation(AEC) is a main algorithm in the pipe line of acoustic devices with KWS or ASR. FNLMS is used.☆19Apr 22, 2019Updated 6 years ago
- ☆23Oct 17, 2024Updated last year
- CTC decoder with hotwords for ASR.☆34Apr 13, 2025Updated 10 months ago
- Reimplementation of Miipher☆29Aug 16, 2023Updated 2 years ago
- This is a repository for fine-tuning Qwen2-Audio, currently supporting Distributed Data Parallel (DDP) and DeepSpeed.☆49Jul 28, 2025Updated 6 months ago
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"☆37Dec 6, 2023Updated 2 years ago
- Repository for "Training Audio Captioning Models without Audio"☆10Sep 26, 2023Updated 2 years ago
- Onset-and-Offset-Aware Sound Event Detection☆20Feb 10, 2025Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Jun 27, 2025Updated 7 months ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- ☆68Dec 30, 2025Updated last month
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆152Dec 5, 2024Updated last year
- Collection of scripts from mHuBERT-147.☆32Nov 19, 2024Updated last year
- Arxiv automatically obtains the latest article service.☆11Apr 29, 2020Updated 5 years ago
- ☆15Nov 11, 2024Updated last year
- ☆23Dec 6, 2025Updated 2 months ago
- LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement☆16Jul 11, 2025Updated 7 months ago
- Offline Speaker Diarization with SenseVoice by Sherpa ONNX.☆15Dec 23, 2024Updated last year
- ☆11May 7, 2022Updated 3 years ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment☆12Feb 5, 2025Updated last year
- ☆14Aug 16, 2023Updated 2 years ago
- DPDFNet: causal single-channel speech enhancement that boosts DeepFilterNet2 with dual-path RNN blocks for stronger long-range temporal a…☆30Updated this week
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach☆20Aug 2, 2021Updated 4 years ago
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".☆30Aug 2, 2025Updated 6 months ago
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm☆82Oct 19, 2023Updated 2 years ago
- Official Implementation of "Prefix tuning for Automated Audio Captioning(ICASSP 2023)"☆31Dec 6, 2023Updated 2 years ago
- Official implementation of the paper titled "Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Mu…☆27Mar 5, 2024Updated last year
- A simple command line tool to calculate WER for ASR.☆14Oct 14, 2024Updated last year
- [ICASSP2023] Source code, model links and open test sets for paper SeACo-Paraformer.☆39Mar 15, 2024Updated last year
- ☆13Mar 11, 2025Updated 11 months ago
- ☆26Sep 22, 2022Updated 3 years ago
- Vocoder NSF-HiFiGAN (Moved into deepaudio)☆56Dec 11, 2022Updated 3 years ago
- ☆36Sep 6, 2025Updated 5 months ago