☆10Sep 25, 2024Updated last year
Alternatives and similar repositories for LOAE
Users that are interested in LOAE are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- A speech signal processing library in Python with emphasis on deep learning.☆31Jul 16, 2022Updated 3 years ago
- Repository for "Training Audio Captioning Models without Audio"☆10Sep 26, 2023Updated 2 years ago
- ☆35Jun 9, 2025Updated 9 months ago
- Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval…☆21Feb 1, 2023Updated 3 years ago
- Audio Entailment: Deductive Reasoning for Audio Understanding☆17Dec 10, 2024Updated last year
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Source code for Consistent ensemble distillation for audio tagging☆60Mar 20, 2026Updated last week
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation☆15Apr 29, 2025Updated 10 months ago
- Official Implementation of EnCLAP (ICASSP 2024)☆94Jun 2, 2024Updated last year
- ☆22Mar 19, 2025Updated last year
- A list of resources that can help in research for automated audio captioning☆34Feb 17, 2021Updated 5 years ago
- official implementation of MGA-CLAP (ACM MM 2024)☆30Oct 25, 2024Updated last year
- Colab notebook for fine-tuning Qwen2-Audio with trl's SFT and PPO trainers.☆24Nov 23, 2024Updated last year
- ☆24Sep 10, 2025Updated 6 months ago
- ☆50Aug 27, 2024Updated last year
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- A list of papers about audio captioning☆79Jul 1, 2022Updated 3 years ago
- ☆20Apr 18, 2024Updated last year
- Script to demonstrate how to use a Language Model for Semantic Turn Detection. Refer to blog post for full details.☆17May 9, 2025Updated 10 months ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆19Jul 16, 2024Updated last year
- music semantic understanding evaluation benchmark☆25Aug 12, 2023Updated 2 years ago
- ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation☆39Nov 20, 2024Updated last year
- Audio captioning recipe☆51Oct 23, 2025Updated 5 months ago
- PyTorch implementation of MelNet☆10Aug 24, 2019Updated 6 years ago
- [ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?☆44Nov 21, 2025Updated 4 months ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Open-Ended Speaking Style Modeling via Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training☆70Feb 7, 2026Updated last month
- LUCY: Linguistic Understanding and Control Yielding Early Stage of Her☆60Apr 14, 2025Updated 11 months ago
- Official Implementation of GLAP - General Language Audio Pretraining☆65Jan 5, 2026Updated 2 months ago
- Official Implementation of "Prefix tuning for Automated Audio Captioning(ICASSP 2023)"☆31Dec 6, 2023Updated 2 years ago
- ☆38Jul 4, 2024Updated last year
- A repository for code used to produce the results the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…☆21Nov 25, 2024Updated last year
- ☆15Jul 4, 2024Updated last year
- This project is created for our Confidential Laboratory, which is supported by HEU☆13Jul 2, 2018Updated 7 years ago
- In Divisive we have all points in one cluster initially and we break the cluster into required number of clusters.☆10May 19, 2018Updated 7 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Script to generate VAD dataset used in Asteroid recipe☆21Sep 30, 2021Updated 4 years ago
- Official implementation for FlowSep☆70Jan 2, 2025Updated last year
- Python and C/C++ library for fast, accurate PCA on the GPU☆12Jun 4, 2018Updated 7 years ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling☆97Nov 9, 2024Updated last year
- DRFI For Region Dissection☆13Jan 11, 2019Updated 7 years ago
- [AAAI 2024] V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models☆28Dec 14, 2023Updated 2 years ago
- Explaining audio differences using language☆16Feb 11, 2025Updated last year