satvik-dixit/mace

satvik-dixit / mace

Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems

☆13

Alternatives and similar repositories for mace

Users that are interested in mace are comparing it to the libraries listed below.

microsoft / NoAudioCaptioning
Repository for "Training Audio Captioning Models without Audio"
☆10 · Sep 26, 2023 · Updated 2 years ago
soham97 / ADIFF
Explaining audio differences using language
☆16 · Feb 11, 2025 · Updated last year
soham97 / mellow
small audio language model for reasoning
☆87 · Dec 4, 2025 · Updated 7 months ago
snap-research / GenAU
☆53 · Mar 24, 2026 · Updated 3 months ago
audio-captioning / caption-evaluation-tools
Tools for the evaluation of audio captioning.
☆19 · May 23, 2020 · Updated 6 years ago
microsoft / AudioEntailment
Audio Entailment: Deductive Reasoning for Audio Understanding
☆17 · Dec 10, 2024 · Updated last year
soham97 / PAM
PAM is a no-reference audio quality metric for audio generation tasks
☆76 · Jul 19, 2024 · Updated last year
raymondxyy / strfnet-IS2020
Official repo for the STRFNet system appeared in INTERSPEECH2020
☆12 · Mar 6, 2021 · Updated 5 years ago
soham97 / sound_ai_progress
Tracking states of the arts and recent results (bibliography) on sound tasks.
☆33 · Jan 10, 2023 · Updated 3 years ago
lukaszliniewicz / breath-removal
Detect and remove or lower the volume of breathing in speech recordings.
☆16 · May 14, 2025 · Updated last year
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
☆13 · Mar 23, 2026 · Updated 3 months ago
qiuk2 / RobusTok
Image Tokenizer Needs Post-Training
☆24 · Oct 4, 2025 · Updated 9 months ago
XinhaoMei / DCASE2021_task6_v2
Code for CVSSP submission to DCASE 2021 Task 6
☆36 · Nov 22, 2022 · Updated 3 years ago
fakerybakery / simpletts
A lightweight Python library for running TTS models with a unified API.
☆20 · Feb 18, 2025 · Updated last year
audiodemo / voice-conversion
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17 · Aug 18, 2023 · Updated 2 years ago
audio-captioning / audio-captioning-papers
A list of papers about audio captioning
☆79 · Jul 1, 2022 · Updated 4 years ago
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆16 · Mar 13, 2024 · Updated 2 years ago
lxa9867 / QSD
[CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"
☆12 · Feb 27, 2024 · Updated 2 years ago
thamquocdung / eCMU
eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23)
☆10 · Oct 30, 2024 · Updated last year
alexisdmacintyre / SpeechBreathingToolbox
Tools for the automatic detection of speech-related inhalation events and characterisation of the speech respiratory cycle.
☆11 · Feb 17, 2024 · Updated 2 years ago
soham97 / MTL_Weakly_labelled_audio_data
Code repo for "Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection"
☆17 · Nov 9, 2022 · Updated 3 years ago
projectlucas / efficient_whisper
Robust Speech Recognition via Large-Scale Weak Supervision
☆19 · Dec 1, 2022 · Updated 3 years ago
ductuantruong / speaker_age_estimation_ssl_study
[APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14 · Oct 19, 2022 · Updated 3 years ago
lxa9867 / PaintSeg
[NeurIPS 2023] Official Implementation of "PaintSeg: Painting Pixels for Training-free Segmentation"
☆14 · Dec 31, 2023 · Updated 2 years ago
NTIA / alignnet
Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.
☆18 · Aug 1, 2025 · Updated 11 months ago
ZhaoF-i / ASTWS-AEC
Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation
☆30 · Nov 12, 2025 · Updated 7 months ago
andreasjansson / cantable-diffuguesion
Bach chorale generation and harmonization
☆26 · Jan 12, 2023 · Updated 3 years ago
JosefAlbers / e2tts-mlx
Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX
☆29 · Oct 15, 2024 · Updated last year
R1ckShi / FrontEnd-AEC
Acoustic echo cancelation(AEC) is a main algorithm in the pipe line of acoustic devices with KWS or ASR. FNLMS is used.
☆19 · Apr 22, 2019 · Updated 7 years ago
audio-captioning / audio-captioning-resources
A list of resources that can help in research for automated audio captioning
☆34 · Feb 17, 2021 · Updated 5 years ago
sony / diffusion-timbre-transfer
☆56 · Nov 5, 2024 · Updated last year
kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22 · May 26, 2025 · Updated last year
tarun360 / SpeakerProfiling
Estimating the Age, Height, and Gender of a speaker with their speech signal.
☆15 · Sep 19, 2022 · Updated 3 years ago
Zhongxu-Wang / ArtSpeech
ArtSpeech: Adaptive Text-to-Speech Synthesis with Articulatory Representations
☆21 · Sep 21, 2025 · Updated 9 months ago
manhph2211 / ViSTT
I'm building an end-to-end Vietnamese Speech Recognition System. I'll deploy it into production with the help of Flask, Uwsgi, Nginx, and…
☆17 · Sep 9, 2022 · Updated 3 years ago
goepfert / noise_reduction
Audio De-Noiser using a Convolutional Neural Network Architecture built with Tensorflow.js
☆22 · Jun 7, 2023 · Updated 3 years ago
Speech-Arena / speech_df_arena
☆39 · Feb 26, 2026 · Updated 4 months ago
Pratyay / mac-monitor-mcp
☆21 · Mar 30, 2026 · Updated 3 months ago
soham97 / Remaining_useful_life_NASA
RUL Nasa Turbofan Dataset (paper)
☆26 · May 14, 2020 · Updated 6 years ago