fabawi/ImageBind-LoRA

Fine-tuning "ImageBind One Embedding Space to Bind Them All" with LoRA

☆195

Alternatives and similar repositories for ImageBind-LoRA

Users who are interested in ImageBind-LoRA are comparing it to the libraries listed below.

Zeqiang-Lai / Anything2Image
Generate image from anything with ImageBind and Stable Diffusion
☆201 · Aug 3, 2023 · Updated 2 years ago
sail-sg / BindDiffusion
BindDiffusion: One Diffusion Model to Bind Them All
☆165 · May 19, 2023 · Updated 3 years ago
Birch-san / imagebind-guided-diffusion
Guide diffusion on ImageBind embedding similarity
☆29 · May 27, 2023 · Updated 3 years ago
facebookresearch / ImageBind
ImageBind One Embedding Space to Bind Them All
☆9,061 · Nov 21, 2025 · Updated 8 months ago
PKU-YuanGroup / LanguageBind
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
☆884 · Mar 25, 2024 · Updated 2 years ago
weaviate / multi2vec-bind-inference
☆12 · Jun 14, 2024 · Updated 2 years ago
TencentARC / ViT-Lens
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
☆189 · Feb 3, 2025 · Updated last year
pittisl / mPnP-LLM
Code for paper "Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI"
☆13 · Jan 19, 2024 · Updated 2 years ago
Max-Fu / tvl
[ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment
☆102 · Jun 2, 2025 · Updated last year
HS-YN / PanoAVQA
Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)
☆16 · Oct 12, 2021 · Updated 4 years ago
kyegomez / Paper-Implementation-Template
A simple reproducible template to implement AI research papers
☆24 · Sep 9, 2024 · Updated last year
ZiyuGuo99 / Point-Bind_Point-LLM
Align 3D Point Cloud with Multi-modalities for Large Language Models
☆464 · Dec 9, 2023 · Updated 2 years ago
CASIA-IVA-Lab / VALOR
[TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
☆311 · Dec 25, 2024 · Updated last year
SAGNIKMJR / ego-AV-spatial-correspondence
[CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'
☆14 · Jun 16, 2024 · Updated 2 years ago
a43992899 / openl2s
Open, royalty free, lyrics2song / song generation data collection / cleaning pipeline.
☆17 · May 9, 2025 · Updated last year
mengdi-li / internally-rewarded-rl
[ICML 2023] Code for paper "Internally Rewarded Reinforcement Learning"
☆13 · Jul 21, 2023 · Updated 3 years ago
swimmiing / ACL-SSL
Repository of the IJCV'26 & WACV'24 paper
☆34 · Apr 27, 2026 · Updated 2 months ago
Agora-Lab-AI / SRT
An open-source non-official community implementation of the model from the paper: Surgical Robot Transformer (SRT): Imitation Learning fo…
☆13 · Jul 13, 2026 · Updated last week
kyegomez / Pegasus
PegasusX: The Future of Multimodal Embeddings 🦄 🦄
☆14 · Oct 16, 2024 · Updated last year
LaVi-Lab / Visual-Table
[EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"
☆20 · Oct 17, 2024 · Updated last year
zhiqi-li / WechatLogger
An mmcv logger hook that can push model results to WeChat
☆21 · Oct 11, 2022 · Updated 3 years ago
ChanganVR / action2sound
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
☆26 · Oct 1, 2024 · Updated last year
Ellenzzn / PersLLM
☆16 · Jan 16, 2025 · Updated last year
xiangyu-mm / EasyGen
The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"
☆73 · Nov 21, 2024 · Updated last year
OpenGVLab / LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆5,916 · Mar 14, 2024 · Updated 2 years ago
khdlr / SoundingEarth
Self-supervised Audiovisual Representation Learning for Remote Sensing Data
☆34 · May 22, 2023 · Updated 3 years ago
facebookresearch / viewseg
Code for "Recognizing Scenes from Novel Viewpoints"
☆29 · Sep 16, 2022 · Updated 3 years ago
JamesQFreeman / LoRA-ViT
Low rank adaptation for Vision Transformer
☆439 · Apr 14, 2026 · Updated 3 months ago
karchkha / MelSpec_GPT_VQVAE
Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms
☆18 · Oct 8, 2023 · Updated 2 years ago
MCG-NJU / PDPP
[CVPR 2023 Highlight] PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
☆34 · Aug 30, 2023 · Updated 2 years ago
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Updated this week
mlfoundations / open_flamingo
An open-source framework for training large multimodal models.
☆4,114 · Aug 31, 2024 · Updated last year
uark-cviu / Right2Talk
[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20 · Aug 2, 2021 · Updated 4 years ago
kyegomez / TTL
PyTorch implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States"
☆23 · Jul 13, 2026 · Updated last week
kohjingyu / gill
🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models".
☆471 · Jan 19, 2024 · Updated 2 years ago
alvinliu0 / Visual-Sound-Localization-in-the-Wild
Code for Visual Sound Localization in the Wild by Cross-Modal Interference Erasing (AAAI 2022).
☆29 · Feb 15, 2022 · Updated 4 years ago
zjr2000 / REVERIE
[ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
☆20 · Jul 17, 2024 · Updated 2 years ago
jiasenlu / vit-vqgan-jax
Jax implementation of VIT-VQGAN
☆10 · Jan 25, 2024 · Updated 2 years ago
google-research-datasets / PropSegmEnt
PropSegmEnt is an annotated dataset for segmenting English text into propositions, and recognizing proposition-level entailment relations…
☆21 · Dec 21, 2022 · Updated 3 years ago