descriptinc / lyrebird-wav2clip
Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP
☆356 · Feb 15, 2022 · Updated 4 years ago
Alternatives and similar repositories for lyrebird-wav2clip
Users who are interested in lyrebird-wav2clip are comparing it to the libraries listed below.
- Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043) ☆858 · Sep 30, 2021 · Updated 4 years ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ☆87 · Dec 20, 2024 · Updated last year
- Code for the paper "Unsupervised Contrastive Learning of Sound Event Representations", ICASSP 2021. ☆93 · Dec 22, 2022 · Updated 3 years ago
- Contrastive Language-Audio Pretraining ☆2,029 · May 15, 2025 · Updated 9 months ago
- Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021) ☆371 · Jul 12, 2024 · Updated last year
- Official PyTorch implementation of the TIP paper "Generating Visually Aligned Sound from Videos" and the corresponding Visually Aligned S… ☆54 · Dec 15, 2020 · Updated 5 years ago
- An audio classification system for learning with out-of-distribution data ☆33 · Dec 8, 2022 · Updated 3 years ago
- Official PyTorch implementation of Contrastive Learning of Musical Representations ☆335 · Jul 25, 2024 · Updated last year
- Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer". ☆414 · Aug 14, 2022 · Updated 3 years ago
- Official implementation of "Learning Music Audio Representations Via Weak Language Supervision" (ICASSP 2022) ☆47 · Dec 3, 2024 · Updated last year
- VGGSound: A Large-scale Audio-Visual Dataset ☆350 · Sep 13, 2021 · Updated 4 years ago
- PyTorch Dataset for Speech and Music audio ☆80 · Jul 12, 2024 · Updated last year
- Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand". ☆469 · Apr 24, 2024 · Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆101 · Jul 24, 2024 · Updated last year
- Audio Dataset for training CLAP and other models ☆729 · Jan 8, 2026 · Updated last month
- A lightweight library for Frechet Audio Distance calculation. ☆308 · Updated this week
- Efficient Training of Audio Transformers with Patchout ☆371 · Jan 12, 2024 · Updated 2 years ago
- ☆58 · Nov 2, 2020 · Updated 5 years ago
- This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training … ☆328 · Nov 20, 2024 · Updated last year
- This repo hosts the code and models of "Masked Autoencoders that Listen". ☆647 · Apr 5, 2024 · Updated last year
- Learning audio concepts from natural language supervision ☆640 · Sep 18, 2024 · Updated last year
- Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer". ☆1,419 · May 21, 2023 · Updated 2 years ago
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024) ☆43 · Jun 13, 2024 · Updated last year
- Evaluation kit for the HEAR Benchmark ☆62 · Feb 5, 2026 · Updated last week
- melodic object transcription framework ☆26 · Nov 15, 2017 · Updated 8 years ago
- Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study". ☆54 · Jul 16, 2025 · Updated 7 months ago
- The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022 ☆210 · Jul 14, 2022 · Updated 3 years ago
- Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations ☆100 · Jun 18, 2024 · Updated last year
- 🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps ☆203 · Oct 6, 2025 · Updated 4 months ago
- ☆16 · Oct 16, 2018 · Updated 7 years ago
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022) ☆54 · Jan 29, 2024 · Updated 2 years ago
- My vocoder experiments ☆31 · Jul 26, 2025 · Updated 6 months ago
- [AAAI 2024] Code for CTX-vec2wav in UniCATS ☆130 · Jun 11, 2024 · Updated last year
- An Audio Language model for Audio Tasks ☆318 · Apr 19, 2024 · Updated last year
- Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation" ☆26 · Mar 27, 2024 · Updated last year
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆191 · Jul 12, 2024 · Updated last year
- Pitch Estimating Neural Networks (PENN) ☆269 · Apr 2, 2025 · Updated 10 months ago
- Making an AI-generated music video from any song with Wav2CLIP and VQGAN-CLIP ☆243 · Jun 10, 2022 · Updated 3 years ago
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio. ☆187 · May 29, 2024 · Updated last year