The repoduction codes for Qwen-Audio Fine-tuning
☆53Feb 28, 2026Updated last month
Alternatives and similar repositories for QwenAudioSFT
Users that are interested in QwenAudioSFT are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Colab notebook for fine-tuning Qwen2-Audio with trl's SFT and PPO trainers.☆24Nov 23, 2024Updated last year
- Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"☆37Dec 6, 2023Updated 2 years ago
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols☆18Nov 19, 2025Updated 4 months ago
- Acoustic echo cancelation(AEC) is a main algorithm in the pipe line of acoustic devices with KWS or ASR. FNLMS is used.☆19Apr 22, 2019Updated 6 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- ☆23Oct 17, 2024Updated last year
- ☆36Jan 6, 2026Updated 3 months ago
- The world's fastest Python package for calculating integrated loudness (LUFS) from audio data as NumPy arrays☆25Dec 26, 2025Updated 3 months ago
- Python runtime for WeTextProcessing (does not depend on Pynini)☆49Nov 28, 2025Updated 4 months ago
- Vocoder NSF-HiFiGAN (Moved into deepaudio)☆56Dec 11, 2022Updated 3 years ago
- CTC decoder with hotwords for ASR.☆35Apr 13, 2025Updated 11 months ago
- This is a repository for fine-tuning Qwen2-Audio, currently supporting Distributed Data Parallel (DDP) and DeepSpeed.☆51Jul 28, 2025Updated 8 months ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆154Dec 5, 2024Updated last year
- Collection of scripts from mHuBERT-147.☆34Nov 19, 2024Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Your faithful, impartial partner for audio evaluation — know yourself, know your rivals. 真实评测,知己知彼。☆285Mar 19, 2026Updated 2 weeks ago
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm☆81Oct 19, 2023Updated 2 years ago
- Unified Audio-Visual Perception for Multi-Task Video Localization☆31Apr 19, 2024Updated last year
- Reimplementation of Miipher☆30Aug 16, 2023Updated 2 years ago
- Offline Speaker Diarization with SenseVoice by Sherpa ONNX.☆15Dec 23, 2024Updated last year
- Github repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models☆24Jun 16, 2025Updated 9 months ago
- ☆68Dec 30, 2025Updated 3 months ago
- ☆15Nov 11, 2024Updated last year
- Repository for "Training Audio Captioning Models without Audio"☆10Sep 26, 2023Updated 2 years ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…☆650Jun 9, 2024Updated last year
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆77Jun 9, 2023Updated 2 years ago
- ☆132Jul 21, 2021Updated 4 years ago
- SpeechFake: A Large-Scale Multilingual Speech Deepfake Dataset Incorporating Cutting-Edge Generation Methods☆27Aug 13, 2025Updated 7 months ago
- ☆21Apr 24, 2025Updated 11 months ago
- [INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark☆314Mar 18, 2026Updated 2 weeks ago
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models☆25Sep 26, 2023Updated 2 years ago
- The open source code for LLM-Codec☆146Aug 18, 2024Updated last year
- Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.☆61Sep 5, 2025Updated 7 months ago
- NordVPN Threat Protection Pro™ • AdTake your cybersecurity to the next level. Block phishing, malware, trackers, and ads. Lightweight app that works with all browsers.
- Audio-FLAN☆159Sep 23, 2025Updated 6 months ago
- ☆62May 31, 2024Updated last year
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models☆52Sep 2, 2025Updated 7 months ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- ☆26Sep 22, 2022Updated 3 years ago
- ☆203Updated this week
- WavBench: Benchmarking Reasoning, Colloquialism, and Paralinguistics for End-to-End Spoken Dialogue Models☆29Feb 13, 2026Updated last month