google-research/mseb

☆63

Alternatives and similar repositories for mseb

Users who are interested in mseb are comparing it to the repositories listed below.

roudimit / Omni-R1
[ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
☆47 · Nov 21, 2025 · Updated 7 months ago

b-sigpro / sed-hsmm
Onset-and-Offset-Aware Sound Event Detection
☆21 · Feb 10, 2025 · Updated last year

FreedomIntelligence / EchoX
EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs
☆47 · Sep 19, 2025 · Updated 9 months ago

lysanderism / TimeAudio
The official repository TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module…
☆30 · Nov 18, 2025 · Updated 7 months ago

ASLP-lab / Hum-Dial
ICASSP2026 HumDial Challenge
☆48 · May 28, 2026 · Updated last month

SonyResearch / dcase2025_stereo_seld_data_generator
Data generator for stereo sound event localization and detection task of DCASE 2025 challenge
☆17 · Jul 17, 2025 · Updated 11 months ago

modal-projects / modal-nvidia-asr
☆42 · Mar 31, 2026 · Updated 3 months ago

FreedomIntelligence / S2S-Arena
☆21 · Jun 4, 2026 · Updated last month

SpeakerGuard / SpeakerGuard
a Pytorch library for security research on speaker recognition, released in "Towards Understanding and Mitigating Audio Adversarial Examp…
☆46 · Nov 20, 2024 · Updated last year

Orlllem / seld_wav2vec2
☆18 · Feb 1, 2026 · Updated 5 months ago

ajin12 / tooldetection
☆14 · Mar 16, 2019 · Updated 7 years ago

Alittleegg / Eureka-Audio
Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasonin…
☆40 · Apr 11, 2026 · Updated 2 months ago

manoskary / weavemuse
An open agentic system built on smolagents, integrating multimodal state-of-the-art music AI models for understanding, generation, and in…
☆31 · Feb 6, 2026 · Updated 5 months ago

Honee-W / U-SAM
Official repository for U-SAM (Interspeech 2025)
☆28 · Jun 3, 2025 · Updated last year

Tele-AI / TELEVAL
☆24 · Jun 10, 2026 · Updated 3 weeks ago

ductuantruong / enskd
[ICASSP'24] Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification
☆16 · Mar 20, 2024 · Updated 2 years ago

ex3ndr / supervoice-flow
SpeechFlow neural network implementation
☆23 · Aug 8, 2024 · Updated last year

ldzhangyx / music-melody-segmentation-using-neural-CRF
☆13 · Nov 2, 2020 · Updated 5 years ago

i-need-sleep / mad
☆16 · Sep 29, 2025 · Updated 9 months ago

HBKUVisCommunity / inshade
☆12 · Oct 7, 2020 · Updated 5 years ago

SenseTime-FVG / InteractiveOmni
☆22 · Dec 3, 2025 · Updated 7 months ago

daewoung / ViolinDiff
[ICASSP 2025] Official implementation of "ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning".
☆17 · Feb 2, 2025 · Updated last year

yongyizang / AreYouReallyListening
Official Repository for ISMIR 2025 paper "Are you really listening? Boosting Perceptual Awareness in Music-QA Benchmarks"
☆20 · Aug 18, 2025 · Updated 10 months ago

ouor / so-vits-svc-5.0
Core Engine of Singing Voice Conversion & Singing Voice Clone
☆17 · Jul 15, 2023 · Updated 2 years ago

Splend1d / T5lephone
Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
☆19 · Nov 29, 2022 · Updated 3 years ago

Yuer867 / EMO_Harmonizer
This is the official repository of Emotion-Driven Melody Harmonization via Melodic Variation and Functional Representation.
☆12 · Sep 25, 2024 · Updated last year

kymatio / ismir23-tutorial
Kymatio: Deep Learning meets Wavelet Theory for Music Signal Processing
☆15 · Oct 27, 2025 · Updated 8 months ago

zzy1hjq / NeuralVC
A real-time voice conversion model based on VITS.
☆16 · Aug 1, 2024 · Updated last year

johndpope / Singing-Voice-Conversion-with-conditional-VAW-GAN
This is the implementation of the paper "VAW-GAN for Singing Voice Conversion with Non-parallel Training Data".
☆17 · Aug 12, 2020 · Updated 5 years ago

ta012 / SSLAM
[ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
☆79 · Oct 8, 2025 · Updated 8 months ago

datasetu / vermillion
A high-performance, scalable middleware for time-series and static-file data exchange.
☆14 · Jul 20, 2023 · Updated 2 years ago

WangHelin1997 / SpeechTasks
This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…
☆83 · Jun 7, 2024 · Updated 2 years ago

tutujingyugang1 / ChatVLA_public
☆15 · Jun 11, 2025 · Updated last year

audiocontentanalysis / conferences
MIR conference deadline countdowns
☆11 · Jun 24, 2026 · Updated last week

NKU-HLT / DIFFA
[AAAI 2026 & ACL 2026] The official implementation of the DIFFA series for dLLM-based large audio language model
☆82 · Apr 7, 2026 · Updated 3 months ago

waybarrios / guidance-based-video-grounding
[ICCV 2023] The official PyTorch implementation of the paper: "Localizing Moments in Long Video Via Multimodal Guidance"
☆22 · Sep 26, 2024 · Updated last year

google-research-datasets / Video-Timeline-Tags-ViTT
A collection of videos annotated with timelines where each video is divided into segments, and each segment is labelled with a short free…
☆29 · Jan 15, 2022 · Updated 4 years ago

Inowlzy / RadarLLM
[Accepted by AAAI-2026!] Official Code Repository of RadarLLM: Empowering Large Language Models to Understand Human Motion from Millimet…
☆35 · Nov 28, 2025 · Updated 7 months ago

microsoft / Distill-MOS
Distillation of Self-Supervised Representation-Based Speech Quality Assessment
☆48 · May 15, 2025 · Updated last year