[NeurIPS2024] Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
☆23Oct 15, 2024Updated last year
Alternatives and similar repositories for ima-lmms
Users that are interested in ima-lmms are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- An exploration of LLM steering☆26Jun 15, 2024Updated last year
- Mitigating Open-Vocabulary Caption Hallucinations (EMNLP 2024)☆18Oct 18, 2024Updated last year
- Audio Entailment: Deductive Reasoning for Audio Understanding☆17Dec 10, 2024Updated last year
- [CVPR 2023] Improving Zero-shot Generalization and Robustness of Multi-modal Models☆35Jul 16, 2023Updated 2 years ago
- Applies ROME and MEMIT on Mamba-S4 models☆14Apr 5, 2024Updated 2 years ago
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)☆34Oct 16, 2024Updated last year
- Fully Open Framework for Democratized Multimodal Reinforcement Learning.☆47Dec 19, 2025Updated 4 months ago
- ☆22Apr 22, 2025Updated 11 months ago
- [ICLR 2026] RefAny3D: 3D Asset-Referenced Diffusion Models for Image Generation☆34Mar 10, 2026Updated last month
- Today I Learned 🐢☆10Aug 17, 2023Updated 2 years ago
- Official Repository of LatentSeek☆80Jun 6, 2025Updated 10 months ago
- Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" Neurips 23 Spotlight☆37May 23, 2023Updated 2 years ago
- ☆77Jul 28, 2025Updated 8 months ago
- Table top manipulation calibration between the robot arm, the fixed cameras and the camera in hand.☆11Apr 12, 2024Updated 2 years ago
- Serverless GPU API endpoints on Runpod - Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- ☆14Feb 24, 2025Updated last year
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆17Apr 2, 2025Updated last year
- Agentic Keyframe Search for Video Question Answering☆18Apr 7, 2025Updated last year
- Class-agnostic Object Detection and Instance Segmentation using Mask R-CNN☆17Nov 5, 2021Updated 4 years ago
- 📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.☆384Nov 5, 2025Updated 5 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆55Mar 9, 2025Updated last year
- ☆16Apr 8, 2026Updated last week
- Repo for our work "Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence"☆20Jun 2, 2025Updated 10 months ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- Deploy open-source AI quickly and easily - Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- Lightweight control environment for Franka robot☆12Mar 16, 2022Updated 4 years ago
- A PyTorch implementation of TVC☆24Dec 18, 2023Updated 2 years ago
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"☆33Jun 23, 2025Updated 9 months ago
- Awesome Jailbreak, red teaming arxiv papers (Automatically Update Every 12th hours)☆107Updated this week
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…☆17Nov 11, 2024Updated last year
- Hand eye calibration for panda and fr3 using apriltag☆17Apr 7, 2025Updated last year
- An benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆46Nov 17, 2023Updated 2 years ago
- THOUGHTSCULPT, a general reasoning and search method for complex tasks☆13Dec 13, 2024Updated last year
- "From ViT Features to Training-free Video Object Segmentation via Streaming-data Mixture Models" [Uziel, Dinari, and Freifeld, NeurIPS 20…☆13Jan 16, 2024Updated 2 years ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Evaluate Multimodal LLMs as Embodied Agents☆56Feb 14, 2025Updated last year
- Multi-dimensional analysis of orthogonal safety directions in LLM alignment☆22Mar 20, 2025Updated last year
- ☆11Dec 16, 2018Updated 7 years ago
- Code associated with the paper: "Few-Shot Self-Rationalization with Natural Language Prompts"☆13Apr 27, 2022Updated 3 years ago
- [ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM☆86Oct 25, 2024Updated last year
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆14Jun 26, 2025Updated 9 months ago
- ☆23Jun 13, 2024Updated last year