AudioLLMs / AudioBench
AudioBench: A Universal Benchmark for Audio Large Language Models
☆131 Updated last week
Alternatives and similar repositories for AudioBench:
Users who are interested in AudioBench compare it to the libraries listed below.
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities ☆111 Updated 2 months ago
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension ☆80 Updated 2 months ago
- VoiceBench: Benchmarking LLM-Based Voice Assistants ☆131 Updated this week
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆138 Updated last year
- Audio Captioning datasets for PyTorch. ☆114 Updated 3 months ago
- ☆43 Updated last month
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆168 Updated 7 months ago
- AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model ☆165 Updated last month
- PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities. ☆228 Updated 5 months ago
- Reference-aware automatic speech evaluation toolkit ☆144 Updated 2 months ago
- [AAAI 2024] Code for CTX-vec2wav in UniCATS ☆127 Updated 8 months ago
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》 Speech processing with prompting paradigm ☆81 Updated last year
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆190 Updated 5 months ago
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey". ☆121 Updated last month
- Official Implementation of EnCLAP (ICASSP 2024) ☆90 Updated 9 months ago
- The open source code for LLM-Codec ☆128 Updated 6 months ago
- Audio Codec Speech processing Universal PERformance Benchmark ☆241 Updated 4 months ago
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆129 Updated 4 months ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions. ☆107 Updated 2 months ago
- Versatile Evaluation of Speech and Audio ☆160 Updated this week
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline ☆116 Updated 2 months ago
- UTokyo-SaruLab MOS Prediction System ☆152 Updated this week
- Implementation of SoundStorm built upon SpeechTokenizer. ☆108 Updated last year
- Official release of StyleTalk dataset. ☆61 Updated 8 months ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆143 Updated last year
- This repository surveys papers focusing on prompting and adapters for speech processing. ☆107 Updated last year
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》 Speec… ☆98 Updated last year
- [Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation ☆140 Updated 2 weeks ago