☆10 · Sep 25, 2024 · Updated last year
Alternatives and similar repositories for LOAE
Users who are interested in LOAE are comparing it to the libraries listed below.
- A speech signal processing library in Python with emphasis on deep learning. ☆31 · Jul 16, 2022 · Updated 3 years ago
- The official repository TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module… ☆26 · Nov 18, 2025 · Updated 4 months ago
- Repository for "Training Audio Captioning Models without Audio" ☆10 · Sep 26, 2023 · Updated 2 years ago
- ☆37 · Jun 9, 2025 · Updated 10 months ago
- Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval… ☆21 · Feb 1, 2023 · Updated 3 years ago
- Audio Entailment: Deductive Reasoning for Audio Understanding ☆17 · Dec 10, 2024 · Updated last year
- Source code for Consistent ensemble distillation for audio tagging ☆63 · Mar 20, 2026 · Updated 3 weeks ago
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation ☆16 · Apr 29, 2025 · Updated 11 months ago
- Official Implementation of EnCLAP (ICASSP 2024) ☆94 · Jun 2, 2024 · Updated last year
- ☆23 · Mar 19, 2025 · Updated last year
- A list of resources that can help in research for automated audio captioning ☆34 · Feb 17, 2021 · Updated 5 years ago
- Official implementation of MGA-CLAP (ACM MM 2024) ☆30 · Oct 25, 2024 · Updated last year
- Colab notebook for fine-tuning Qwen2-Audio with trl's SFT and PPO trainers. ☆24 · Nov 23, 2024 · Updated last year
- ☆25 · Sep 10, 2025 · Updated 7 months ago
- ☆50 · Aug 27, 2024 · Updated last year
- A list of papers about audio captioning ☆79 · Jul 1, 2022 · Updated 3 years ago
- ☆20 · Apr 18, 2024 · Updated last year
- Script to demonstrate how to use a Language Model for Semantic Turn Detection. Refer to blog post for full details. ☆17 · May 9, 2025 · Updated 11 months ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition ☆19 · Jul 16, 2024 · Updated last year
- Music semantic understanding evaluation benchmark ☆25 · Aug 12, 2023 · Updated 2 years ago
- ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation ☆39 · Nov 20, 2024 · Updated last year
- PyTorch implementation of MelNet ☆10 · Aug 24, 2019 · Updated 6 years ago
- Audio captioning recipe ☆52 · Oct 23, 2025 · Updated 5 months ago
- [ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM? ☆44 · Nov 21, 2025 · Updated 4 months ago
- LUCY: Linguistic Understanding and Control Yielding Early Stage of Her ☆60 · Apr 14, 2025 · Updated last year
- Official Implementation of GLAP - General Language Audio Pretraining ☆68 · Mar 25, 2026 · Updated 3 weeks ago
- ☆38 · Jul 4, 2024 · Updated last year
- Official Implementation of "Prefix tuning for Automated Audio Captioning (ICASSP 2023)" ☆31 · Dec 6, 2023 · Updated 2 years ago
- A repository for code used to produce the results the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV… ☆21 · Nov 25, 2024 · Updated last year
- ☆15 · Jul 4, 2024 · Updated last year
- This project is created for our Confidential Laboratory, which is supported by HEU ☆13 · Jul 2, 2018 · Updated 7 years ago
- [ACL 2026 Main] Open-Ended Speaking Style Modeling via Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training ☆71 · Apr 6, 2026 · Updated last week
- Music Language Model Generation, Optimization, and Practice ☆51 · Updated this week
- In Divisive we have all points in one cluster initially and we break the cluster into required number of clusters. ☆10 · May 19, 2018 · Updated 7 years ago
- Script to generate VAD dataset used in Asteroid recipe ☆21 · Sep 30, 2021 · Updated 4 years ago
- Official implementation for FlowSep ☆74 · Jan 2, 2025 · Updated last year
- Python and C/C++ library for fast, accurate PCA on the GPU ☆12 · Jun 4, 2018 · Updated 7 years ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆97 · Nov 9, 2024 · Updated last year
- DRFI For Region Dissection ☆13 · Jan 11, 2019 · Updated 7 years ago