XiaomiMiMo / MiMo-Audio-Eval
☆63 Updated 2 weeks ago
Alternatives and similar repositories for MiMo-Audio-Eval
Users who are interested in MiMo-Audio-Eval are comparing it to the repositories listed below.
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction. ☆99 Updated 3 weeks ago
- Towards Fine-grained Audio Captioning with Multimodal Contextual Cues ☆81 Updated last week
- We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr… ☆21 Updated last year
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs ☆38 Updated 3 weeks ago
- MOSS-Speech is a true speech-to-speech large language model without text guidance. ☆47 Updated last week
- ☆99 Updated last week
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆62 Updated 11 months ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder ☆24 Updated last month
- ☆26 Updated last month
- OpenS2S: Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model ☆84 Updated 2 months ago
- ☆38 Updated last year
- ☆19 Updated 3 months ago
- ☆18 Updated 3 weeks ago
- ☆34 Updated last year
- Official release of StyleTalk dataset. ☆70 Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆76 Updated 11 months ago
- Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback" ☆29 Updated 7 months ago
- BLSP-Emo: Towards Empathetic Large Speech-Language Models ☆49 Updated last year
- ☆23 Updated 3 months ago
- ☆39 Updated 2 months ago
- This repo is text to speech with learnable audio encoder without alignment with transcript reference ☆39 Updated 3 weeks ago
- An official implementation of Style-Talker for Spoken Dialogue Generation ☆23 Updated 9 months ago
- Evaluate your agent memory on real-world dialogues, not LLM-simulated dialogues. ☆21 Updated 3 months ago
- A spoken version of the textual story cloze benchmark ☆19 Updated 2 years ago
- A project for tri-modal LLM benchmarking and instruction tuning. ☆48 Updated 6 months ago
- ☆23 Updated 11 months ago
- ☆28 Updated 3 months ago
- We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction ☆54 Updated this week
- Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation". ☆63 Updated last year
- Repository for "TESS-2: A Large-Scale, Generalist Diffusion Language Model" ☆50 Updated 7 months ago