Lollipop/Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

☆40

Alternatives and similar repositories for Qwen2-Audio

Users that are interested in Qwen2-Audio are comparing it to the libraries listed below.

tango4j / llm_speaker_tagging
SLT 2024 Challenge: Post-ASR-Speaker-Tagging
☆16 · Jun 16, 2024 · Updated 2 years ago
xieh97 / dcase2023-audio-retrieval
Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2023 Challenge
☆10 · Aug 8, 2023 · Updated 2 years ago
PigeonDan1 / paper_claw
Paper Claw sends personalized daily research digests from arXiv and beyond straight to your inbox, featuring customizable categories, int…
☆32 · Updated this week
cahya-wirawan / indonesian-whisperer
Experiment with OpenAI Whisper on Indonesian Languages
☆16 · Feb 24, 2023 · Updated 3 years ago
teamtee / Qwen2-Audio-finetune
This is a repository for fine-tuning Qwen2-Audio, currently supporting Distributed Data Parallel (DDP) and DeepSpeed.
☆50 · Jul 28, 2025 · Updated last year
yzspku / TPPNet
Temporal Pyramid Pooling Convolutional Neural Network for Cover Song Identification
☆36 · Feb 8, 2020 · Updated 6 years ago
ian-k-1217 / Fully-Generalized-Non-Local-Network
☆10 · Jun 2, 2021 · Updated 5 years ago
aspose-email / Aspose.Email-Python-Dotnet
Aspose.Email for Python via .NET Examples: https://products.aspose.com/email/python-net
☆10 · Oct 9, 2025 · Updated 9 months ago
zitadel / react-user-authentication
This is the React sample used in the ZITADEL quick start guide.
☆11 · Apr 13, 2026 · Updated 3 months ago
malgamves / GameOfCharts
A Realtime App to visualize votes on who folks think will die in Episode 3 of Game of Thrones Season 8. Built using Vue.js, Hasura and C…
☆14 · Dec 9, 2022 · Updated 3 years ago
gudgud96 / noisy-student-emotion-training
Submission to MediaEval 2021 Emotions and Themes in Music challenge. Noisy-student training for music emotion tagging
☆11 · Dec 2, 2021 · Updated 4 years ago
iamhankai / voiceMusicSeparation
Voice Music Separation competing for 6th Huawei Cup in ZJU
☆11 · Jun 2, 2015 · Updated 11 years ago
sadhusamik / fdlp_spectrogram
☆14 · Nov 28, 2022 · Updated 3 years ago
hmohebbi / disentangling_representations
☆14 · Oct 3, 2025 · Updated 9 months ago
HarryHsing / EchoInk
EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning (🔥The Exploration of R1 for General Audio-Vis…
☆78 · Jun 3, 2026 · Updated last month
Anindyadeep / YogaPoseGNN
When real time Yoga Position classification meets GNN
☆11 · Sep 17, 2023 · Updated 2 years ago
Gaiejj / align-anything
☆16 · Nov 11, 2025 · Updated 8 months ago
OFA-Sys / AIR-Bench
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension
☆133 · Dec 9, 2024 · Updated last year
PigeonDan1 / ps-slm
TASU: A New Style of Alignment of Speech LLM with only Text Training Data, zero-shot on ASR and Other SU tasks
☆27 · Jul 20, 2026 · Updated last week
HumanMLLM / Omni-Emotion
☆22 · Jan 17, 2025 · Updated last year
YuanX9 / UATR-CMoE
The PyTorch code for "Unraveling Complex Data Diversity in Underwater Acoustic Target Recognition through Convolution-based Mixture of Ex…
☆33 · Mar 5, 2024 · Updated 2 years ago
heye0507 / dl_related
☆13 · Jan 17, 2020 · Updated 6 years ago
ucas-hao / qwen_audio_for_add
[ACMMM2025] Official released code for ALLM4ADD
☆44 · Oct 30, 2025 · Updated 8 months ago
cuhksz-nlp / McASP
☆12 · Dec 23, 2022 · Updated 3 years ago
JusperLee / speechbrain-docs-zh-cn
SpeechBrain documentation in Chinese
☆12 · Mar 20, 2021 · Updated 5 years ago
felixgontier / dcase-2023-baseline
☆14 · Mar 25, 2023 · Updated 3 years ago
pfalcon / canterbury-corpus
The Canterbury compression corpus as a git repository
☆12 · Sep 20, 2020 · Updated 5 years ago
popcornell / OSDC
☆18 · Jan 26, 2021 · Updated 5 years ago
Sreyan88 / GAMA
Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities
☆153 · Dec 5, 2024 · Updated last year
Hypotheses-Paradise / Hypo2Trans
Single-blind supplementary materials for NeurIPS 2023 submission
☆94 · Oct 30, 2024 · Updated last year
justinsalamon / musicseg_deepemb
Code for paper: "Deep Embeddings and Section Fusion Improve Music Segmentation"
☆54 · Oct 10, 2022 · Updated 3 years ago
jimmy-dq / SimVOS
☆14 · May 25, 2024 · Updated 2 years ago
bill317996 / Singer-identification-in-artist20
Addressing the confounds of accompaniments in singer identification
☆18 · Mar 24, 2020 · Updated 6 years ago
ASLP-lab / FastTurn
☆35 · May 19, 2026 · Updated 2 months ago
azmat21 / UyghurTextResource
uyghur text resource crawled from website
☆12 · Dec 25, 2015 · Updated 10 years ago
backspacetg / distilXLSR
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
SiavashShams / ssamba
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
☆140 · Nov 5, 2025 · Updated 8 months ago
zengchang233 / MTGAN
MTGAN: Speaker Verification through Multitasking Triplet Generative Adversarial Networks
☆19 · Feb 29, 2020 · Updated 6 years ago
QwenLM / Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
☆2,097 · Apr 21, 2025 · Updated last year