YuanGongND / ltu
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
☆419 Updated 11 months ago
Alternatives and similar repositories for ltu:
Users who are interested in ltu are comparing it to the repositories listed below.
- Keep track of big models in audio domain, including speech, singing, music etc. ☆474 Updated 6 months ago
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a… ☆537 Updated 9 months ago
- An Audio Language model for Audio Tasks ☆302 Updated 11 months ago
- Learning audio concepts from natural language supervision ☆537 Updated 6 months ago
- The Open Source Code of UniAudio ☆551 Updated 8 months ago
- AudioBench: A Universal Benchmark for Audio Large Language Models ☆171 Updated this week
- This repository contains metadata for the WavCaps dataset and code for downstream tasks. ☆219 Updated 8 months ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities ☆118 Updated 3 months ago
- Audio Dataset for training CLAP and other models ☆673 Updated last year
- This repo hosts the code and models of "Masked Autoencoders that Listen". ☆574 Updated 11 months ago
- PyTorch implementation of Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities.