AudioLLMs / AudioBench
AudioBench: A Universal Benchmark for Audio Large Language Models
☆200Updated 3 weeks ago
Alternatives and similar repositories for AudioBench:
Users who are interested in AudioBench are comparing it to the libraries listed below.
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension☆92Updated 4 months ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆123Updated 4 months ago
- ☆60Updated 3 weeks ago
- VoiceBench: Benchmarking LLM-Based Voice Assistants☆177Updated last week
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey".☆136Updated this week
- Audio-FLAN☆142Updated last month
- Audio Codec Speech processing Universal PERformance Benchmark☆251Updated last week
- A Survey of Spoken Dialogue Models (60 pages)☆287Updated 4 months ago
- AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model☆201Updated this week
- Audio Captioning datasets for PyTorch.☆115Updated last month
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization☆175Updated 9 months ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)☆142Updated last year
- The open source code for LLM-Codec☆133Updated 8 months ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"☆154Updated 7 months ago
- Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.☆197Updated last month
- VoiceLDM: Text-to-Speech with Environmental Context☆174Updated 8 months ago
- Versatile Evaluation of Speech and Audio☆184Updated last week
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement☆152Updated last month
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆143Updated last year
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'☆117Updated last month
- Reference-aware automatic speech evaluation toolkit☆153Updated 4 months ago
- Update ASR paper everyday☆196Updated this week
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆194Updated 7 months ago
- It's a repository for implementations of neural speech editing algorithms.☆196Updated last year
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.☆122Updated last week
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline☆126Updated 4 months ago
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3☆196Updated last year
- This repository contains metadata of the WavCaps dataset and code for downstream tasks.☆222Updated 8 months ago
- A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos.☆48Updated 5 months ago
- UTokyo-SaruLab MOS Prediction System☆170Updated 2 weeks ago