[EMNLP 2025 Demo] Extracting internal representations from vision-language models. Beta version.
☆122Mar 10, 2026Updated last month
Alternatives and similar repositories for vlm-lens
Users that are interested in vlm-lens are comparing it to the libraries listed below.
- ☆10Nov 18, 2024Updated last year
- Logical inference system based on event semantics and degree semantics in formal semantics☆10Jan 22, 2023Updated 3 years ago
- ☆13Feb 21, 2024Updated 2 years ago
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch☆28Mar 30, 2026Updated 2 weeks ago
- [IJCAI 2023 workshop] Expanding dataset for 2D medical image segmentation using diffusion models☆15Feb 28, 2023Updated 3 years ago
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o…☆29Jul 9, 2025Updated 9 months ago
- Official repository for "Vid2World: Crafting Video Diffusion Models to Interactive World Models" (ICLR 2026), https://arxiv.org/abs/2505.…☆51Jan 27, 2026Updated 2 months ago
- ☆12Jun 20, 2023Updated 2 years ago
- 【COLING 2025🔥】Code for the paper "Is Parameter Collision Hindering Continual Learning in LLMs?".☆38Dec 5, 2024Updated last year
- Towards Unified and Effective Domain Generalization☆32Nov 27, 2023Updated 2 years ago
- [CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant☆27Dec 2, 2025Updated 4 months ago
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models".☆54Dec 28, 2025Updated 3 months ago
- v1: Learning to Point Visual Tokens for Multimodal Grounded Reasoning☆19Oct 6, 2025Updated 6 months ago
- A collection of lightweight interpretability scripts to understand how LLMs think☆89Mar 18, 2026Updated last month
- 🔥 [ICLR 2025] Official PyTorch Model "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"☆26Feb 9, 2025Updated last year
- [MICCAI 2024] Embracing Massive Medical Data☆20Jul 5, 2024Updated last year
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.☆19Jun 27, 2024Updated last year
- DINO-based perceptual losses and FDD feature extraction☆26Jan 7, 2026Updated 3 months ago
- A comprehensive JAX/NNX library for diffusion and flow matching generative algorithms, featuring DiT (Diffusion Transformer) and its vari…☆145Oct 16, 2025Updated 6 months ago
- ChartSum is a large scale benchmark for automatic chart to text summarization☆11Jul 20, 2023Updated 2 years ago
- ☆38Dec 18, 2025Updated 4 months ago
- ☆15Dec 16, 2023Updated 2 years ago
- [ICCV 2025] Identity Preserving 3D Head Stylization with Multiview Score Distillation☆16Jun 25, 2025Updated 9 months ago
- This is a community implementation for the paper EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularizatio…☆37Aug 4, 2023Updated 2 years ago
- [ICLR 2026] Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing☆27Jan 27, 2026Updated 2 months ago
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show…☆58Feb 4, 2026Updated 2 months ago
- ICLR 2026: Agent-X Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks☆39Apr 5, 2026Updated 2 weeks ago
- ☆15Jan 9, 2026Updated 3 months ago
- Does patch ordering affect context-limited vision transformers?☆17Oct 10, 2025Updated 6 months ago
- UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding. Accepted to ICLR 2026.☆62Aug 19, 2025Updated 8 months ago
- Retargeting of whole-body human motion to humanoid robots for dexterous manipulation of articulated objects.☆28Jan 28, 2026Updated 2 months ago
- ☆13Nov 29, 2023Updated 2 years ago
- The repository for the paper "Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs"☆14Dec 16, 2024Updated last year
- Text-guided 3D texture generation using training-free multi-diffusion in UV space.☆14Apr 7, 2025Updated last year
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆18Nov 4, 2025Updated 5 months ago
- Super Mario Bros. (NES) gameplay dataset for machine learning.☆12Jul 22, 2025Updated 8 months ago
- Phys4DGen: A Physics-Driven Framework for Controllable and Efficient 4D Content Generation from a Single Image☆12May 10, 2025Updated 11 months ago
- Source code of paper "Systematic Assessment of Factual Knowledge in Large Language Models" - EMNLP Findings 2023☆17Mar 17, 2026Updated last month
- ☆31Dec 17, 2025Updated 4 months ago