jimbozhang/xares-llm-template

jimbozhang / xares-llm-template

Template for creating audio encoders compatible with X-ARES

☆19

Alternatives and similar repositories for xares-llm-template

Users who are interested in xares-llm-template are comparing it to the libraries listed below.

jimbozhang / xares
A benchmark for evaluating audio encoders on various audio tasks.
☆55 · Apr 27, 2026 · Updated 2 months ago
xiaomi-research / xares-llm
XARES-LLM
☆55 · Mar 26, 2026 · Updated 3 months ago
yuhanghe01 / Sound3DVDet
Code for WACV24 work for multiview acoustic-visual detection
☆13 · Mar 22, 2024 · Updated 2 years ago
hlt-mt / Speech-MASSIVE
Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…
☆25 · Oct 8, 2025 · Updated 9 months ago
ZhangXinWhut / SimWhisper-Codec
Official code for the paper "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding"
☆37 · Jan 28, 2026 · Updated 5 months ago
yongaifadian1 / MNV-17
Qwen2.5-Omni fine-tuned on MNV-17 dataset for nonverbal vocalization recognition
☆31 · Nov 13, 2025 · Updated 8 months ago
danpovey / conditional-flow-matching
☆29 · Aug 8, 2024 · Updated last year
LAION-AI / emotional-speech-annotations
This repository contains prompts and best practices for annotating audio clips in very fine detail using Audio-Language-Models
☆35 · Oct 13, 2024 · Updated last year
KdaiP / DC-Speech-VAE
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
☆57 · Nov 19, 2025 · Updated 8 months ago
ZhikangNiu / Semantic-VAE
[INTERSPEECH 2026 Oral] Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆120 · Jun 21, 2026 · Updated last month
SpeakerRecognizer / VoicePersonification
☆18 · Dec 8, 2025 · Updated 7 months ago
AmphionTeam / SpeechJudge
SpeechJudge: Towards Human-Level Judgment for Speech Naturalness (https://arxiv.org/abs/2511.07931)
☆77 · Dec 23, 2025 · Updated 6 months ago
qiuqiangkong / music_source_separation
☆60 · Jun 15, 2026 · Updated last month
alibaba / vstyle
☆34 · Sep 15, 2025 · Updated 10 months ago
xiaomi-research / dasheng-tokenizer
State-of-the-art continuous audio tokenization
☆40 · Mar 9, 2026 · Updated 4 months ago
X-LANCE / public_talks
Materials of public talks given by SJTU X-LANCE members
☆14 · Dec 3, 2022 · Updated 3 years ago
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 · Oct 14, 2024 · Updated last year
X-LANCE / Xmart
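Repository for the Xmart Youth Forum, hosting video recordings and handouts from past student forums and frontier lectures. For the latest Xmart announcements, follow the WeChat official account [XLANCE Lab].
☆54 · Apr 7, 2026 · Updated 3 months ago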
SpeechColab / PySpeechColab
A library of speech gadgets.
☆15 · Oct 15, 2022 · Updated 3 years ago
Ming-er / Audio-Free-P-Tuning
☆11 · Dec 28, 2023 · Updated 2 years ago
rossellhayes / ipa
🗣️ Convert between phonetic alphabets
☆11 · Feb 7, 2022 · Updated 4 years ago
ASLP-lab / FMSU
Towards Fine-Grained Multi-Dimensional Speech Understanding: Data Pipeline, Benchmark, and Model
☆25 · May 21, 2026 · Updated 2 months ago
NKU-HLT / MusicEval-baseline
☆12 · Apr 18, 2025 · Updated last year
AmphionTeam / Emilia-NV
Official Repository of Paper: "Emilia-NV: A Non-Verbal Speech Dataset with Word-Level Annotation for Human-Like Speech Modeling"
☆91 · Sep 18, 2025 · Updated 10 months ago
averkij / Word-to-Number-Russian
A project for converting numbers written out as text in Russian.
☆11 · Apr 5, 2022 · Updated 4 years ago
XiaoyuBIE1994 / SDCodec
(ICASSP 2025) Learning Source Disentanglement in Neural Audio Codec
☆48 · May 16, 2025 · Updated last year
xiaomi-research / dasheng-denoiser
Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A…
☆81 · Jun 16, 2025 · Updated last year
xiaomi-research / mecat
☆44 · May 12, 2026 · Updated 2 months ago
seorim0 / SE-using-SRL-Model
Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings
☆20 · Jun 6, 2025 · Updated last year
ZhaoZeyu1995 / BenNevis
A Differentiable WFST-based End-to-End Automatic Speech Recognition toolkit with flexible topology support
☆12 · Feb 15, 2026 · Updated 5 months ago
JishengBai / AudioSetCaps
A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline
☆208 · Dec 13, 2024 · Updated last year
Audio-Reasoning-Challenge / Audio-Reasoning-Challenge-Baselines
The baselines of ARC-Challenge-Interspeech2026
☆60 · Dec 1, 2025 · Updated 7 months ago
yqcai888 / DCASE2023
2022 DCASE Challenge
☆14 · Sep 30, 2024 · Updated last year
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54 · May 1, 2025 · Updated last year
xiquan-li / FineLAP
[ACL 2026 Main] FineLAP: Taming Heterogeneous Supervision for Fine-grained Language-Audio Pre-training
☆36 · Apr 20, 2026 · Updated 3 months ago
X-LANCE / UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
☆130 · Jun 11, 2024 · Updated 2 years ago
ftshijt / speech_evaluation
A toolkit dedicated to speech evaluation.
☆23 · Sep 26, 2024 · Updated last year
pzelasko / kaldialign
Python wrappers for Kaldi's Levenshtein distance and alignment code.
☆70 · Jun 15, 2026 · Updated last month
X-LANCE / SLAM-LLM
A Framework for Speech, Language, Audio, Music Processing with Large Language Model
☆1,048 · Jan 15, 2026 · Updated 6 months ago