CASIA-IVA-Lab/ChatBridge

ChatBridge is an approach to learning a unified multimodal model that can interpret, correlate, and reason about various modalities without relying on all combinations of paired data.

☆55

Alternatives and similar repositories for ChatBridge

Users who are interested in ChatBridge are comparing it to the repositories listed below.

CASIA-IVA-Lab / OPT_Questioner
Official PyTorch implementation of the paper "Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner"
☆15 · Aug 9, 2023 · Updated 2 years ago
CASIA-IVA-Lab / COSA
[ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
☆43 · Dec 25, 2024 · Updated last year
SuDIS-ZJU / Data-Quality-for-Vision-Language-Models
☆35 · Nov 18, 2025 · Updated 8 months ago
CASIA-IVA-Lab / SC-Tune
Official code for CVPR 2024 paper, "SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models"
☆16 · Apr 22, 2024 · Updated 2 years ago
CASIA-IVA-Lab / MRES
This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentati…
☆74 · Jun 3, 2024 · Updated 2 years ago
ivattyue / Ada-K
Official code for the ICLR 2025 paper, "Ada-K Routing: Boosting the Efficiency of MoE-based LLMs"
☆12 · Mar 1, 2025 · Updated last year
Rubics-Xuan / Med-DANet
Med-DANet Series (ECCV 2022 & WACV 2024)
☆13 · Jan 2, 2024 · Updated 2 years ago
CASIA-IVA-Lab / MOSO
☆35 · Jun 6, 2023 · Updated 3 years ago
CASIA-IVA-Lab / VRoPE
[EMNLP 2025 Main] Official implementation of VRoPE: Rotary Position Embedding for Video Large Language Models.
☆28 · Nov 18, 2025 · Updated 8 months ago
CASIA-IVA-Lab / VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆57 · Mar 9, 2025 · Updated last year
doubledaibo / 2dcaption_eccv2018
Rethinking the Form of Latent States in Image Captioning
☆20 · Aug 31, 2018 · Updated 7 years ago
PeterGriffinJin / Heterformer
Heterformer: Transformer-based Deep Node Representation Learning on Heterogeneous Text-Rich Networks (KDD 2023)
☆28 · Feb 16, 2024 · Updated 2 years ago
v-manhlt3 / m-LTM-Audio-Text-Retrieval
☆13 · Jan 5, 2025 · Updated last year
LingweiMeng / Whisper-Sidecar
The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".
☆34 · Aug 2, 2025 · Updated 11 months ago
baaivision / DIVA
[ICLR 2025] Diffusion Feedback Helps CLIP See Better
☆301 · Jan 23, 2025 · Updated last year
AtmaHou / PromptSlotTagging
Code for ACL22 findings paper: Inverse is Better! Fast and Accurate Prompt for Slot Tagging
☆27 · Jul 13, 2022 · Updated 4 years ago
Lilidamowang / T2VIndexer-generativeSearch
☆16 · Aug 28, 2024 · Updated last year
irvingzhang0512 / open-images-downloader
☆14 · Aug 13, 2021 · Updated 4 years ago
Lzq5 / Video-Text-Alignment
☆28 · Jul 18, 2025 · Updated last year
CASIA-IVA-Lab / VAST
[NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
☆302 · Mar 14, 2024 · Updated 2 years ago
rikeilong / Bay-CAT
[ECCV’24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenario…
☆59 · Sep 4, 2024 · Updated last year
google-research-datasets / maverics
MAVERICS (Manually-vAlidated Vq^2a Examples fRom Image-Caption datasetS) is a suite of test-only benchmarks for visual question answering…
☆13 · Feb 18, 2023 · Updated 3 years ago
xiangzhang1015 / EEG_Shape_Reconstruction
Multi-task Generative Adversarial Learning on Geometrical Shape Reconstruction from EEG Brain Signals, published in ICONIP 2019.
☆22 · Jan 20, 2022 · Updated 4 years ago
phellonchen / X-LLM
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
☆318 · Jul 14, 2026 · Updated last week
JinhuaLiang / lam4fsl
An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners"
☆31 · May 31, 2023 · Updated 3 years ago
showlab / assistgpt
☆66 · Jun 16, 2023 · Updated 3 years ago
kiaia / GIRAFFE
Extending context length of visual language models
☆12 · Dec 18, 2024 · Updated last year
wlzhang2020 / LLMTreeRec
The implement of LLMTreeRec
☆14 · Dec 9, 2024 · Updated last year
isekai-portal / Link-Context-Learning
☆101 · May 16, 2024 · Updated 2 years ago
XuZhang1211 / PVPUFormer
Implementation of ''VPUFormer: Visual Prompt Unified Transformer for Interactive Image Segmentation''
☆15 · Sep 16, 2025 · Updated 10 months ago
Shark-NLP / EVALM
Official codebase for “In-Context Learning with Many Demonstration Examples”
☆16 · Feb 13, 2023 · Updated 3 years ago
vipulgupta1011 / CALM
☆11 · Oct 2, 2023 · Updated 2 years ago
CJReinforce / JOWA
Official code for the ICLR 2025 paper, "Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining"
☆30 · Dec 1, 2024 · Updated last year
AV-Odyssey / AV-Odyssey
This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"
☆31 · Dec 23, 2024 · Updated last year
GeWu-Lab / MWAFM
Multi-Scale Attention for Audio Question Answering
☆28 · Jul 19, 2023 · Updated 3 years ago
PeterGriffinJin / Edgeformers
Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks (ICLR 2023)
☆71 · Jul 23, 2023 · Updated 2 years ago
LingweiMeng / MyChatGPT
A casual and simple ChatGPT Python script that can run using terminal (as long as you have an API). Support Azure API.
☆20 · May 3, 2025 · Updated last year
zjr2000 / GVL
Official implementation for paper Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
☆28 · Dec 8, 2023 · Updated 2 years ago
Ace-Pegasus / EasyDrag
Official code for EasyDrag (CVPR 2024)
☆17 · Jun 18, 2024 · Updated 2 years ago