☆10Sep 25, 2024Updated last year
Alternatives and similar repositories for LOAE
Users that are interested in LOAE are comparing it to the libraries listed below.
- A speech signal processing library in Python with emphasis on deep learning.☆31Apr 13, 2026Updated 3 weeks ago
- Repository for "Training Audio Captioning Models without Audio"☆10Sep 26, 2023Updated 2 years ago
- The official repository TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module…☆28Nov 18, 2025Updated 5 months ago
- ☆37Jun 9, 2025Updated 10 months ago
- Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval…☆21Feb 1, 2023Updated 3 years ago
- Audio Entailment: Deductive Reasoning for Audio Understanding☆17Dec 10, 2024Updated last year
- Source code for Consistent ensemble distillation for audio tagging☆65Mar 20, 2026Updated last month
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation☆16Apr 29, 2025Updated last year
- Official Implementation of EnCLAP (ICASSP 2024)☆95Jun 2, 2024Updated last year
- ☆23Mar 19, 2025Updated last year
- A list of resources that can help in research for automated audio captioning☆34Feb 17, 2021Updated 5 years ago
- official implementation of MGA-CLAP (ACM MM 2024)☆31Oct 25, 2024Updated last year
- Colab notebook for fine-tuning Qwen2-Audio with trl's SFT and PPO trainers.☆24Nov 23, 2024Updated last year
- ☆26Sep 10, 2025Updated 7 months ago
- ☆50Aug 27, 2024Updated last year
- A list of papers about audio captioning☆79Jul 1, 2022Updated 3 years ago
- Script to demonstrate how to use a Language Model for Semantic Turn Detection. Refer to blog post for full details.☆17May 9, 2025Updated 11 months ago
- ☆21Updated this week
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆19Jul 16, 2024Updated last year
- music semantic understanding evaluation benchmark☆25Aug 12, 2023Updated 2 years ago
- ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation☆39Nov 20, 2024Updated last year
- PyTorch implementation of MelNet☆10Aug 24, 2019Updated 6 years ago
- Audio captioning recipe☆52Oct 23, 2025Updated 6 months ago
- LUCY: Linguistic Understanding and Control Yielding Early Stage of Her☆60Apr 14, 2025Updated last year
- Official Implementation of GLAP - General Language Audio Pretraining☆70Mar 25, 2026Updated last month
- Official Implementation of "Prefix tuning for Automated Audio Captioning(ICASSP 2023)"☆31Dec 6, 2023Updated 2 years ago
- ☆38Jul 4, 2024Updated last year
- [ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?☆46Nov 21, 2025Updated 5 months ago
- A repository for code used to produce the results the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…☆23Nov 25, 2024Updated last year
- ☆15Jul 4, 2024Updated last year
- This project is created for our Confidential Laboratory, which is supported by HEU☆13Jul 2, 2018Updated 7 years ago
- [ACL 2026 Main] Open-Ended Speaking Style Modeling via Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training☆72Apr 6, 2026Updated last month
- Music Language Model Generation, Optimization, and Practice☆55Apr 20, 2026Updated 2 weeks ago
- In Divisive we have all points in one cluster initially and we break the cluster into required number of clusters.☆10May 19, 2018Updated 7 years ago
- Script to generate VAD dataset used in Asteroid recipe☆21Sep 30, 2021Updated 4 years ago
- Official implementation for FlowSep☆75Jan 2, 2025Updated last year
- Python and C/C++ library for fast, accurate PCA on the GPU☆12Jun 4, 2018Updated 7 years ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling☆99Nov 9, 2024Updated last year
- DRFI For Region Dissection☆13Jan 11, 2019Updated 7 years ago