satvik-dixit / mace
Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems
☆11Updated 2 months ago
Alternatives and similar repositories for mace:
Users who are interested in mace are comparing it to the libraries listed below:
- ☆43Updated 9 months ago
- A fast python library for aligning similar audio snippets passed in as NumPy arrays☆44Updated last week
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.☆47Updated last week
- ☆102Updated last month
- Official implementation for FlowSep☆34Updated 2 months ago
- ☆63Updated 11 months ago
- iSeparate library for the SDX2023 challenge☆13Updated last year
- Landing Page for All Things Source Separation☆23Updated 4 months ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…☆36Updated last year
- AudioSR-Upsampling (any -> 48kHz)☆40Updated last year
- small audio language model for reasoning☆49Updated last week
- A high-resolution implementation of the HiFi-GAN vocoder for voice conversion.☆31Updated last year
- ☆40Updated 11 months ago
- Event Relation in Text-to-Audio (TTA) Generation☆17Updated last month
- The aim of this project is to make voice assistants somewhat more responsive to whispered speech.☆10Updated 5 years ago
- [Batching/MultiGPU/DataLoader Implemented] Code for the paper Hybrid Spectrogram and Waveform Source Separation☆22Updated last year
- Frechet Audio Distance evaluation in PyTorch☆35Updated last year
- ☆17Updated this week
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆38Updated 9 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆13Updated 3 weeks ago
- A collection of audio autoencoders, in PyTorch.☆40Updated 2 years ago
- Simple PyTorch Denoisers for Waveform Audio☆35Updated last month
- Official implementation of "AEROMamba: An efficient architecture for audio super-resolution using generative adversarial networks and sta…☆37Updated 4 months ago
- ☆22Updated 6 months ago
- million song dataset split for extended clean tag & artist-level stratified☆48Updated last year
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024)☆23Updated last year
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens.☆12Updated 6 months ago
- The official implementation of TokenSynth (ICASSP 2025)☆51Updated 3 weeks ago
- PodcastMix: A dataset for separating music and speech in podcasts.☆43Updated 7 months ago
- ☆18Updated 10 months ago