[NeurIPS2024] Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
☆23Oct 15, 2024Updated last year
Alternatives and similar repositories for ima-lmms
Users that are interested in ima-lmms are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- An exploration of LLM steering☆26Jun 15, 2024Updated last year
- Mitigating Open-Vocabulary Caption Hallucinations (EMNLP 2024)☆18Oct 18, 2024Updated last year
- ☆84Nov 5, 2024Updated last year
- Audio Entailment: Deductive Reasoning for Audio Understanding☆17Dec 10, 2024Updated last year
- [arXiv 2024] FairVision: Equitable Deep Learning for Eye Disease Screening via Fair Identity Scaling☆16Apr 15, 2026Updated 3 weeks ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Official code of "RoboOmni: Proactive Robot Manipulation in Omni-modal Context"☆105Mar 28, 2026Updated last month
- Applies ROME and MEMIT on Mamba-S4 models☆15Apr 5, 2024Updated 2 years ago
- Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)☆34Oct 16, 2024Updated last year
- Fully Open Framework for Democratized Multimodal Reinforcement Learning.☆49Dec 19, 2025Updated 4 months ago
- Official code repository for Findings of EMNLP 2022 paper: PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Popula…☆11Oct 18, 2022Updated 3 years ago
- ☆22Apr 22, 2025Updated last year
- Audio-Visual Lip Synthesis via Intermediate Landmark Representation☆18May 16, 2023Updated 2 years ago
- Official Repository of LatentSeek☆82Jun 6, 2025Updated 11 months ago
- ☆80Jul 28, 2025Updated 9 months ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" Neurips 23 Spotlight☆37May 23, 2023Updated 2 years ago
- Table top manipulation calibration between the robot arm, the fixed cameras and the camera in hand.☆11Apr 12, 2024Updated 2 years ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆17Apr 2, 2025Updated last year
- Agentic Keyframe Search for Video Question Answering☆18Apr 7, 2025Updated last year
- Class-agnostic Object Detection and Instance Segmentation using Mask R-CNN☆17Nov 5, 2021Updated 4 years ago
- 📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.☆384Nov 5, 2025Updated 6 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆57Mar 9, 2025Updated last year
- ☆16Apr 8, 2026Updated last month
- LLaVA combines with Magvit Image tokenizer, training MLLM without an Vision Encoder. Unifying image understanding and generation.☆39Jun 20, 2024Updated last year
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- A PyTorch implementation of TVC☆24Dec 18, 2023Updated 2 years ago
- [ICLR'25] Official repository for "AVHBench: A Cross-Modal Hallucination Evaluation for Audio-Visual Large Language Models"☆23Mar 8, 2026Updated 2 months ago
- Awesome Jailbreak, red teaming arxiv papers (Automatically Update Every 12th hours)☆108Updated this week
- An benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆46Nov 17, 2023Updated 2 years ago
- ☆81Apr 28, 2026Updated last week
- THOUGHTSCULPT, a general reasoning and search method for complex tasks☆13Dec 13, 2024Updated last year
- "From ViT Features to Training-free Video Object Segmentation via Streaming-data Mixture Models" [Uziel, Dinari, and Freifeld, NeurIPS 20…☆13Jan 16, 2024Updated 2 years ago
- Evaluate Multimodal LLMs as Embodied Agents☆57Feb 14, 2025Updated last year
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- A simple iOS app that records the BlendShapes feature with timestamps provided by ARKit.☆15Nov 9, 2018Updated 7 years ago
- MineRL DDPG Agent to Obtain Diamond in Minecraft☆14Jan 21, 2020Updated 6 years ago
- ☆23Aug 20, 2024Updated last year
- 用Kinova Gen3实机实现Rekep☆11Mar 18, 2025Updated last year
- ☆14Jan 26, 2021Updated 5 years ago
- Adaptive Multimodal Reasoning via Reinforcement Learning☆23Jan 11, 2026Updated 3 months ago
- Synthesize bio-plausible neural networks for cognitive tasks, mimicking brain architecture☆11Apr 14, 2021Updated 5 years ago