kyegomez / AudioFlamingo
Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities"
☆40 Updated 4 months ago
Alternatives and similar repositories for AudioFlamingo
Users who are interested in AudioFlamingo are comparing it to the repositories listed below.
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆58 Updated 7 months ago
- An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners" ☆31 Updated 2 years ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators ☆40 Updated last month
- Official release of StyleTalk dataset. ☆66 Updated 11 months ago
- ☆32 Updated 11 months ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ☆37 Updated last year
- Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges. ☆45 Updated 3 weeks ago
- small audio language model for reasoning ☆64 Updated 2 months ago
- Official code for "EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting" ☆46 Updated last month
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆54 Updated last year
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin… ☆52 Updated 2 years ago
- A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024) ☆53 Updated last year
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi… ☆35 Updated 10 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆59 Updated 7 months ago
- ☆84 Updated 3 weeks ago
- ☆39 Updated 9 months ago
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation … ☆67 Updated 5 months ago
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆48 Updated last year
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆55 Updated 2 months ago
- A spoken version of the textual story cloze benchmark ☆17 Updated last year
- A low-bitrate single-codebook 16 kHz speech codec based on focal modulation ☆91 Updated 4 months ago
- Official Implementation of EnCLAP (ICASSP 2024) ☆92 Updated last year
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" ☆35 Updated last year
- ☆35 Updated last year
- PyTorch implementation of "Integrated Parameter-Efficient Tuning for General-Purpose Audio Models" ☆10 Updated last year
- We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr… ☆19 Updated last year
- ☆15 Updated last year
- ☆75 Updated 2 weeks ago
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions" ☆42 Updated 2 months ago
- Web-crawl for "Audio Retrieval with WavText5K and CLAP Training" ☆49 Updated 2 years ago