The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
☆39Sep 8, 2024Updated last year
Alternatives and similar repositories for Qwen2-Audio
Users that are interested in Qwen2-Audio are comparing it to the libraries listed below.
- Official GitHub repository for paper "SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Informa…☆24Aug 14, 2025Updated 8 months ago
- Temporal Pyramid Pooling Convolutional Neural Network for Cover Song Identification☆34Feb 8, 2020Updated 6 years ago
- Voice Music Separation competing for 6th Huawei Cup in ZJU☆11Jun 2, 2015Updated 10 years ago
- ☆14Oct 3, 2025Updated 6 months ago
- MLX binary vectors and associated algorithms.☆14Mar 13, 2025Updated last year
- This is the React sample used in the ZITADEL quick start guide.☆11Mar 15, 2023Updated 3 years ago
- A Realtime App to visualize votes on who folks think will die in Episode 3 of Game of Thrones Season 8. Built using Vue.js, Hasura and C…☆14Dec 9, 2022Updated 3 years ago
- ☆10Jun 2, 2021Updated 4 years ago
- When real time Yoga Position classification meets GNN☆11Sep 17, 2023Updated 2 years ago
- The PyTorch code for "Unraveling Complex Data Diversity in Underwater Acoustic Target Recognition through Convolution-based Mixture of Ex…☆31Mar 5, 2024Updated 2 years ago
- Aspose.Email for Python via .NET Examples: https://products.aspose.com/email/python-net☆10Oct 9, 2025Updated 6 months ago
- A toolkit for generating datasets of midi files which have been degraded to be 'un-musical'.☆41Feb 27, 2025Updated last year
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.☆26Mar 17, 2025Updated last year
- The official implementation of Freeze-Omni.☆15Jul 10, 2025Updated 9 months ago
- Code for paper: "Deep Embeddings and Section Fusion Improve Music Segmentation"☆54Oct 10, 2022Updated 3 years ago
- The Canterbury compression corpus as a git repository☆12Sep 20, 2020Updated 5 years ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆154Dec 5, 2024Updated last year
- Using Kaldi x-vector method to train speaker recognition model on aishell database.☆17Aug 19, 2021Updated 4 years ago
- ☆14May 25, 2024Updated last year
- Attentive Periodic Temporal Network☆13Dec 9, 2019Updated 6 years ago
- ☆11Oct 20, 2022Updated 3 years ago
- uyghur text resource crawled from website☆12Dec 25, 2015Updated 10 years ago
- [ICMR 2025] Official Repository for The Paper, Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale …☆18Aug 17, 2025Updated 7 months ago
- Addressing the confounds of accompaniments in singer identification☆18Mar 24, 2020Updated 6 years ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…☆11Feb 23, 2024Updated 2 years ago
- Target speaker automatic speech recognition (TS-ASR)☆13Oct 14, 2023Updated 2 years ago
- kaldi cnn-tdnnf baseline☆13Aug 31, 2021Updated 4 years ago
- Collect the awesome works evolved around reasoning models like O1/R1 in visual domain☆54Jul 21, 2025Updated 8 months ago
- Python Implementation of Singing Voice Separation using RPCA☆13Dec 17, 2016Updated 9 years ago
- Python bindings for minimp3☆17Sep 11, 2023Updated 2 years ago
- Raptor, the random arpeggiator (real-time algorithmic composition program implemented as a Pd patch)☆12Jan 12, 2018Updated 8 years ago
- Code for the paper "Unsupervised Contrastive Learning of Sound Event Representations", ICASSP 2021.☆93Dec 22, 2022Updated 3 years ago
- Getting confidences from any end-to-end systems☆11May 24, 2023Updated 2 years ago
- Official repository for U-SAM (Interspeech 2025)☆26Jun 3, 2025Updated 10 months ago
- ☆10Mar 16, 2026Updated last month
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆19Jul 16, 2024Updated last year
- Human Pose Classification☆16Feb 19, 2023Updated 3 years ago
- Seeing Wake Words: Audio-visual Keyword Spotting☆66Sep 16, 2020Updated 5 years ago