An Audio Language model for Audio Tasks
☆322 · Apr 19, 2024 · Updated 2 years ago
Alternatives and similar repositories for Pengi
Users that are interested in Pengi are comparing it to the libraries listed below.
- Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand". ☆476 · Apr 24, 2024 · Updated 2 years ago
- Learning audio concepts from natural language supervision ☆665 · Sep 18, 2024 · Updated last year
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆194 · Jul 12, 2024 · Updated last year
- This repository contains metadata of the WavCaps dataset and code for downstream tasks. ☆262 · Jul 25, 2024 · Updated last year
- This repo hosts the code and models of "Masked Autoencoders that Listen". ☆668 · Apr 5, 2024 · Updated 2 years ago
- Audio Codec Speech processing Universal PERformance Benchmark ☆306 · May 5, 2026 · Updated last month
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension ☆131 · Dec 9, 2024 · Updated last year
- The Open Source Code of UniAudio ☆607 · Jul 22, 2024 · Updated last year
- Keep track of big models in audio domain, including speech, singing, music etc. ☆515 · Sep 26, 2024 · Updated last year
- SpeechGPT Series: Speech Large Language Models ☆1,403 · Jul 22, 2024 · Updated last year
- PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models ☆1,146 · Dec 15, 2025 · Updated 6 months ago
- SALMONN family: A suite of advanced multi-modal LLMs ☆1,456 · May 26, 2026 · Updated last month
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline ☆207 · Dec 13, 2024 · Updated last year
- PyTorch implementation of BigVSAN ☆203 · Dec 9, 2025 · Updated 6 months ago
- Collection of resources on the applications of Large Language Models (LLMs) in Audio AI. ☆732 · Jun 3, 2026 · Updated 3 weeks ago
- Audio Dataset for training CLAP and other models ☆744 · Jan 8, 2026 · Updated 5 months ago
- Contrastive Language-Audio Pretraining ☆2,197 · May 15, 2025 · Updated last year
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge. ☆32 · Jan 26, 2024 · Updated 2 years ago
- An Open-source Streaming High-fidelity Neural Audio Codec ☆509 · Mar 4, 2025 · Updated last year
- Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial… ☆39 · Jan 27, 2025 · Updated last year
- Audio Entailment: Deductive Reasoning for Audio Understanding