jimbozhang/xares

A benchmark for evaluating audio encoders on various audio tasks.

☆55

Alternatives and similar repositories for xares

Users who are interested in xares are comparing it to the libraries listed below.

jimbozhang / xares-llm-template
Template for creating audio encoders compatible with X-ARES
☆19 · Feb 11, 2026 · Updated 5 months ago
xiaomi-research / xares-llm
XARES-LLM
☆55 · Mar 26, 2026 · Updated 4 months ago
xiaomi-research / acavcaps
☆31 · Mar 27, 2026 · Updated 4 months ago
xiaomi-research / dasheng-tokenizer
State-of-the-art continuous audio tokenization
☆40 · Mar 9, 2026 · Updated 4 months ago
xiaomi-research / mecat
☆44 · May 12, 2026 · Updated 2 months ago
XiaoMi / dasheng
Official PyTorch code for Deep Audio-Signal Holistic Embeddings
☆200 · Nov 7, 2025 · Updated 8 months ago
RicherMans / Dasheng
Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"
☆86 · Nov 7, 2025 · Updated 8 months ago
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 · Updated this week
RicherMans / CED
Source code for Consistent ensemble distillation for audio tagging
☆75 · Mar 20, 2026 · Updated 4 months ago
xiaomi-research / dasheng-audiogen
End-to-end text-to-audio scene generation model
☆50 · Jun 16, 2026 · Updated last month
nttcslab / eval-audio-repr
EVAR ~ Evaluation package for Audio Representations
☆81 · Feb 19, 2026 · Updated 5 months ago
NieeiM / Dasheng-Audiogen
Generate a complete audio clip with music, intelligible speech, and sound effects from text in one pass.
☆44 · May 27, 2026 · Updated 2 months ago
X-LANCE / public_talks
Materials of public talks given by SJTU X-LANCE members
☆14 · Dec 3, 2022 · Updated 3 years ago
xiaomi-research / dasheng-glap
Official Implementation of GLAP - General Language Audio Pretraining
☆75 · May 14, 2026 · Updated 2 months ago
GLJS / audio-datasets
GitHub Repository for the Survey Paper on Audio-Language Datasets for Scenes and Events
☆17 · Feb 7, 2025 · Updated last year
xiaomi-research / r1-aqa
🤗 R1-AQA Model: mispeech/r1-aqa
☆325 · Mar 28, 2025 · Updated last year
hlt-mt / Speech-MASSIVE
Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…
☆25 · Oct 8, 2025 · Updated 9 months ago
bovod-sjtu / HoliTok
HoliTok: A Continuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding
☆39 · Jun 8, 2026 · Updated last month
nttcslab / m2d
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
☆162 · Feb 23, 2026 · Updated 5 months ago
BUTSpeechFIT / mt-asr-data-prep
☆25 · Feb 26, 2026 · Updated 5 months ago
yucongzh / online_speaker_diarization
☆15 · Jul 11, 2022 · Updated 4 years ago
sarulab-speech / DuplexChat
☆47 · Jul 5, 2026 · Updated 3 weeks ago
Tencent / StableToken
[ICLR 2026] StableToken: A state-of-the-art noise-robust semantic speech tokenizer featuring Voting-LFQ for resilient SpeechLLMs.
☆33 · Feb 27, 2026 · Updated 5 months ago
kehanlu / DeSTA2.5-Audio
Code for DeSTA2.5-Audio, general-purpose LALM
☆141 · Feb 4, 2026 · Updated 5 months ago
nttcslab / dcase2023_task2_evaluator
☆12 · Aug 10, 2023 · Updated 2 years ago
alumae / sv_score_calibration
Score calibration for speaker verification
☆25 · Dec 13, 2019 · Updated 6 years ago
KinWaiCheuk / IJCNN2020_music_transcription
Source code for the paper published in IJCNN 2020 "The Impact of Audio Input Representations on Neural Network based Music Transcription"
☆13 · Apr 9, 2020 · Updated 6 years ago
MorenoLaQuatra / ARCH
ARCH: Audio Representations benCHmark
☆57 · Aug 26, 2024 · Updated last year
urgent-challenge / urgent2026_challenge_track1
Official baseline, dataset and evaluation scripts for the ICASSP 2026 URGENT challenge.
☆37 · Nov 12, 2025 · Updated 8 months ago
KdaiP / DC-Speech-VAE
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
☆57 · Nov 19, 2025 · Updated 8 months ago
JazminVidal / gop-ft
Transfer learning approach to pronunciation scoring
☆12 · Jan 17, 2024 · Updated 2 years ago
SonyCSLParis / audio-representations
JEPAs for audio representation learning
☆26 · Jun 11, 2026 · Updated last month
hearbenchmark / hear2021-submitted-models
Open-source audio embedding models, submitted to the HEAR 2021 challenge
☆11 · Feb 15, 2026 · Updated 5 months ago
tpt-adasp / salt
SALT: STANDARDIZED AUDIO EVENT LABEL TAXONOMY
☆16 · Nov 28, 2024 · Updated last year
bagustris / s3prl-ser
S3PRL for Speech Emotion Recognition (see s3prl > downstream)
☆15 · Feb 28, 2026 · Updated 5 months ago
Ruiqi-Yan / Awesome-Audio-Editing
A curated list of models, benchmarks, tools and guides for audio editing
☆35 · Jul 7, 2026 · Updated 3 weeks ago
labhamlet / wavjepa
This is the official codebase for WavJEPA. Time-domain audio foundation model for holistic downstream tasks. "Self-supervised learning fr…
☆34 · Feb 28, 2026 · Updated 5 months ago
LudovicTuncay / Audio-JEPA
Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Buil…
☆65 · Jul 16, 2026 · Updated last week
xiaomi-research / dasheng-lm
Efficient audio understanding with general audio captions
☆429 · Apr 24, 2026 · Updated 3 months ago