xiaomi-research / xares-llm
XARES-LLM
☆53 · Updated this week
Alternatives and similar repositories for xares-llm
Users who are interested in xares-llm are comparing it to the libraries listed below.
- A benchmark for evaluating audio encoders on various audio tasks. ☆42 · Dec 11, 2025 · Updated 2 months ago
- Official Implementation of GLAP - General Language Audio Pretraining ☆61 · Jan 5, 2026 · Updated last month
- Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A… ☆75 · Jun 16, 2025 · Updated 8 months ago
- Template for creating audio encoders compatible with X-ARES ☆19 · Updated this week
- ICSD Dataset ☆40 · Jun 11, 2025 · Updated 8 months ago
- This repo related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH2024 ☆35 · Feb 5, 2026 · Updated last week
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows ☆123 · Sep 2, 2025 · Updated 5 months ago
- 🤗 R1-AQA Model: mispeech/r1-aqa ☆314 · Mar 28, 2025 · Updated 10 months ago
- This project is the official implementation of "Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation" in PyTorch, wh… ☆12 · Nov 4, 2022 · Updated 3 years ago
- ☆18 · Aug 16, 2025 · Updated 6 months ago
- Kanade is a single-layer disentangled speech tokenizer that extracts compact tokens suitable for both generative and discriminative model… ☆74 · Feb 3, 2026 · Updated 2 weeks ago
- The dataset and baseline code for Text-to-Audio Grounding (TAG) ☆50 · Oct 23, 2025 · Updated 3 months ago
- ☆17 · Jun 24, 2025 · Updated 7 months ago
- [JMLR] Gradual Domain Adaptation: Theory and Algorithms ☆11 · Jan 14, 2025 · Updated last year
- Audio captioning recipe ☆51 · Oct 23, 2025 · Updated 3 months ago
- These are various scripts to manipulate and/or measure the acoustic properties of speech sounds ☆16 · Oct 18, 2024 · Updated last year
- Speech emotion recognition using LSTM, SVM and MLP | 语音情感识别 ☆10 · Jul 1, 2019 · Updated 6 years ago
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition" ☆10 · Mar 15, 2023 · Updated 2 years ago
- Details of the datasets for Few-shot class-incremental audio classification ☆11 · Dec 6, 2023 · Updated 2 years ago
- ☆12 · Jul 6, 2023 · Updated 2 years ago
- text-only training or language-free training for multimodal tasks (image/audio/video caption, retrieval, text2image) ☆12 · Oct 15, 2024 · Updated last year
- ATTENTION AGGREGATION NETWORK FOR AUDIO-VISUAL EMOTION RECOGNITION ☆13 · Sep 25, 2023 · Updated 2 years ago
- Open-source audio embedding models, submitted to the HEAR 2021 challenge ☆11 · Updated this week
- SongDriver2 achieves a balance between real-time emotion fit and soft transitions, enhancing the coherence of the generated music. ☆11 · Nov 15, 2025 · Updated 3 months ago
- ☆11 · Sep 25, 2024 · Updated last year
- ☆117 · Updated this week
- Testing sets for semanticVAD ☆20 · Feb 18, 2025 · Updated 11 months ago
- Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S… ☆14 · Feb 15, 2023 · Updated 3 years ago
- Official Repository of Paper: "Emilia-NV: A Non-Verbal Speech Dataset with Word-Level Annotation for Human-Like Speech Modeling" ☆84 · Sep 18, 2025 · Updated 4 months ago
- ☆11 · Feb 14, 2025 · Updated last year
- Diffusion Net TensorFlow implementation ☆11 · Nov 10, 2017 · Updated 8 years ago
- ☆16 · Apr 2, 2025 · Updated 10 months ago
- ☆34 · Sep 5, 2025 · Updated 5 months ago
- ☆10 · Mar 21, 2018 · Updated 7 years ago
- Pytorch implementation of "spectro-temporal attention-based voice activity detection" ☆13 · Jun 4, 2024 · Updated last year
- ☆15 · Nov 10, 2025 · Updated 3 months ago
- A simple implementation for improving CosyVoice2 by GRPO method ☆32 · Oct 17, 2025 · Updated 4 months ago
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model" ☆14 · Jun 28, 2024 · Updated last year
- Audio Processing & Visualization Concepts ☆11 · Jun 20, 2023 · Updated 2 years ago