dynamic-superb/multimodal-llama

The official implementation of ImageBind-LLM and Whisper-LLM from the paper "Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech".

☆21

Alternatives and similar repositories for multimodal-llama

Users who are interested in multimodal-llama are comparing it to the repositories listed below.

AdvSV / AdvSV.github.io
AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. I…
☆11 · Nov 21, 2023 · Updated 2 years ago
cmpute / audio-codec-benchmark
Comprehensive quantitative comparison of lossless and lossy audio codecs
☆41 · Feb 11, 2023 · Updated 3 years ago
pittisl / mPnP-LLM
Code for paper "Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI"
☆13 · Jan 19, 2024 · Updated 2 years ago
roger-tseng / av-superb
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
☆58 · Apr 17, 2024 · Updated 2 years ago
ano-demo / AdvAttacksASVspoof
This is the implementation of the paper "Adversarial Attacks on Spoofing Countermeasures of automatic speaker verification".
☆42 · Mar 9, 2023 · Updated 3 years ago
cyhuang-tw / robust-vc
☆11 · May 7, 2022 · Updated 4 years ago
ga642381 / Spoken-Dialogue-Model-Survey
A survey of spoken dialogue models (SDMs) with speech input and speech output. Focus on their Intermediate Representation and Generation …
☆31 · Mar 24, 2026 · Updated 4 months ago
xjchenGit / SingGraph
Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).
☆24 · Sep 19, 2025 · Updated 10 months ago
atosystem / SSL_Interface
Interface Design for Self-Supervised Speech Models, accepted to Interspeech 2024
☆16 · Nov 19, 2024 · Updated last year
jeremychee4 / AffectSpeech
AffectSpeech: A Large-Scale Emotional Speech Dataset with Fine-Grained Textual Descriptions for Speech Emotion Captioning and Synthesis
☆68 · Jun 12, 2026 · Updated last month
ga642381 / FlappyBird
Super Flappy Bird in p5.js
☆10 · Mar 8, 2021 · Updated 5 years ago
LingweiMeng / MyChatGPT
A casual and simple ChatGPT Python script that can run in a terminal (as long as you have an API). Supports Azure API.
☆20 · May 3, 2025 · Updated last year
Srijith-rkr / Whispering-LLaMA
EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction
☆271 · May 19, 2024 · Updated 2 years ago
jdh-algo / MHAD-Dataset
Multimodal Home Activity Dataset with Multi-Angle Videos and Synchronized Physiological Signals
☆21 · Dec 21, 2024 · Updated last year
cogmhear / avse_challenge
COG-MHEAR Audio-Visual Speech Enhancement Challenge
☆48 · Feb 17, 2026 · Updated 5 months ago
grtzsohalf / buy_vs_rent_and_invest
☆15 · Sep 9, 2021 · Updated 4 years ago
xjchenGit / MTDVocaLiST
Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization (ICASSP 2024).
☆29 · Apr 3, 2024 · Updated 2 years ago
WangHelin1997 / SpecAugment-plus
A PyTorch implementation of the paper "SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification"
☆34 · Jun 25, 2021 · Updated 5 years ago
Yash-Dave / Urban-Yogi
This is an AI-based yoga pose detection system
☆10 · Jul 5, 2022 · Updated 4 years ago
shincling / discreteSeparation
The demo for "Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem".
☆12 · Oct 25, 2021 · Updated 4 years ago
kimsunwiub / BLOOM-Net
Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"
☆14 · Feb 13, 2022 · Updated 4 years ago
Vic0428 / Paper-Reading-Lists
A random collection of research papers and projects I find interesting
☆20 · May 20, 2021 · Updated 5 years ago
xue926 / Shared-bicycle-usage-forecast
This project uses Python to visually analyze the factors that affect shared-bicycle usage, and uses the LightGBM algorithm to predict shared-bicycle usage under known conditions. To select the best model, k-fold cross-validation and grid search are used to choose the optimal parameters.
☆10 · Jul 15, 2020 · Updated 6 years ago
LouisDo2108 / MediaEval2022-TailAwareSpermDetection
"Tail-Aware Sperm Analysis for Transparent Tracking of Spermatozoa" Official Implementation
☆10 · Jan 21, 2026 · Updated 6 months ago
dynamic-superb / dynamic-superb
The official repository of Dynamic-SUPERB.
☆200 · Jun 24, 2025 · Updated last year
hhguo / FastGriffinLim_Pytorch
☆13 · Nov 16, 2020 · Updated 5 years ago
wayne0926 / countdown
A life countdown tool written a long time ago; published separately because it could not run inside the blog.
☆11 · Jun 9, 2022 · Updated 4 years ago
yistLin / beamertheme-yeast
Yeast, a lite and light beamer theme
☆18 · Dec 6, 2020 · Updated 5 years ago
THU-KEG / R-Eval
[KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models
☆11 · Apr 9, 2024 · Updated 2 years ago
d223302 / A-Closer-Look-To-LLM-Evaluation
Code for EMNLP 2023 findings paper "A Closer Look into Using Large Language Models for Automatic Evaluation"
☆19 · Oct 9, 2023 · Updated 2 years ago
TrelisResearch / tgi-chat-ui-function-calling
Add function calling to text-generation-inference
☆13 · Oct 10, 2023 · Updated 2 years ago
awesome-sora / awesome-sora
😎 Awesome list of interesting topics on Sora
☆22 · Apr 7, 2026 · Updated 3 months ago
kaistmm / VoxMM
☆23 · May 11, 2026 · Updated 2 months ago
BlueSkyXN / jd-scripts-docker
JD.com deal-hunting scripts (automatic check-in, task completion, and so on) with one-click Docker startup. For usage questions, join QQ group 644989387. [The above is the original author's description.]
☆10 · Feb 8, 2022 · Updated 4 years ago
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
MagicHub-io / CSASR_Challenge
☆11 · Sep 26, 2022 · Updated 3 years ago
bitswired / semantic-splitting-tutorial
☆15 · May 16, 2024 · Updated 2 years ago
thaitran / WebChat
This is a chatbot built using Gradio that can access Google Search and webpages to answer questions. Supports GPT-3.5, GPT-4, Claude 2, …
☆13 · Aug 31, 2023 · Updated 2 years ago
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15 · Dec 3, 2024 · Updated last year