AudioLLMs / AudioBench
AudioBench: A Universal Benchmark for Audio Large Language Models
☆176Updated this week
Alternatives and similar repositories for AudioBench:
Users who are interested in AudioBench are comparing it to the repositories listed below:
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆119Updated 3 months ago
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension☆85Updated 3 months ago
- VoiceBench: Benchmarking LLM-Based Voice Assistants☆159Updated last week
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization☆171Updated 8 months ago
- AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model☆187Updated this week
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey".☆127Updated this week
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)☆138Updated last year
- ☆54Updated last week
- Audio-FLAN☆140Updated 3 weeks ago
- Audio Captioning datasets for PyTorch.☆115Updated 2 weeks ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"☆150Updated 6 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆192Updated 6 months ago
- Versatile Evaluation of Speech and Audio☆176Updated last week
- Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.☆194Updated 3 weeks ago
- A Survey of Spoken Dialogue Models (60 pages)☆284Updated 4 months ago
- The open source code for LLM-Codec☆132Updated 7 months ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆143Updated last year
- Audio Codec Speech processing Universal PERformance Benchmark☆244Updated 5 months ago
- Official Implementation of EnCLAP (ICASSP 2024)☆91Updated 10 months ago
- This repository contains metadata of the WavCaps dataset and code for downstream tasks.☆220Updated 8 months ago
- This repository surveys papers focusing on Prompting and Adapters for Speech Processing.☆108Updated last year
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'☆112Updated last week
- [AAAI 2024] Code for CTX-vec2wav in UniCATS☆129Updated 9 months ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.☆113Updated 3 months ago
- It's a repository for implementations of neural speech editing algorithms.☆195Updated last year
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words☆48Updated 9 months ago
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3☆195Updated 11 months ago
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline☆123Updated 3 months ago
- Reference-aware automatic speech evaluation toolkit☆145Updated 3 months ago
- A curated list of speech datasets (110+ datasets, 75+ easy to download)☆128Updated 2 years ago