[EMNLP 2025 Demo] Extracting internal representations from vision-language models. Beta version.
☆123 · Apr 25, 2026 · Updated last month
Alternatives and similar repositories for vlm-lens
Users that are interested in vlm-lens are comparing it to the libraries listed below.
- ☆10 · Nov 18, 2024 · Updated last year
- [CVPR' 26] MajutsuCity: Language-driven Aesthetic-adaptive City Generation with Controllable 3D Assets and Layouts ☆45 · Apr 27, 2026 · Updated last month
- 🔥 [ICLR 2025] Official Benchmark Toolkits for "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark" ☆42 · Nov 21, 2025 · Updated 6 months ago
- ☆12 · Jun 5, 2024 · Updated 2 years ago
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models ☆37 · Nov 13, 2024 · Updated last year
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆28 · Jun 8, 2026 · Updated last week
- ☆14 · Feb 21, 2024 · Updated 2 years ago
- ☆43 · Feb 4, 2026 · Updated 4 months ago
- Image/Instance Retrieval using CLIP, A self supervised Learning Model ☆29 · May 30, 2023 · Updated 3 years ago
- ☆13 · Jun 20, 2023 · Updated 2 years ago
- Towards Unified and Effective Domain Generalization ☆34 · Nov 27, 2023 · Updated 2 years ago
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆32 · Jun 5, 2025 · Updated last year
- ☆71 · Jun 23, 2025 · Updated 11 months ago
- v1: Learning to Point Visual Tokens for Multimodal Grounded Reasoning ☆20 · Oct 6, 2025 · Updated 8 months ago
- ReaSCAN is a synthetic navigation task that requires models to reason about surroundings over syntactically difficult languages. (NeurIPS… ☆19 · Nov 28, 2021 · Updated 4 years ago
- Automatic subordinate clause extractor ☆11 · Jul 7, 2022 · Updated 3 years ago
- [CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant ☆30 · Dec 2, 2025 · Updated 6 months ago
- 🔥 [ICLR 2025] Official PyTorch Model "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark" ☆26 · Feb 9, 2025 · Updated last year
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models". ☆58 · Dec 28, 2025 · Updated 5 months ago
- [MICCAI 2024] Embracing Massive Medical Data ☆20 · Jul 5, 2024 · Updated last year
- [TPAMI 2026] Breaking Barriers, Localizing Saliency: A Large-scale Benchmark and Baseline for Condition-Constrained Salient Object Detect… ☆30 · Dec 12, 2025 · Updated 6 months ago
- This repository is a curated collection of the most exciting and influential CVPR 2026 papers. 🔥 [Paper + Code + Demo] ☆491 · Jun 6, 2026 · Updated last week
- ☆11 · Sep 1, 2024 · Updated last year
- First Latency-Aware Competitive LLM Agent Benchmark ☆29 · Jun 3, 2025 · Updated last year
- Multimodal grounded language dataset ☆11 · Dec 14, 2021 · Updated 4 years ago
- DINO-based perceptual losses and FDD feature extraction ☆31 · Jan 7, 2026 · Updated 5 months ago
- ☆54 · May 9, 2025 · Updated last year
- Used in M4C feature extraction script: https://github.com/facebookresearch/mmf/blob/project/m4c/projects/M4C/scripts/extract_ocr_frcn_fea… ☆13 · Jan 30, 2020 · Updated 6 years ago
- [ICLR 2025] Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron ☆32 · Apr 30, 2025 · Updated last year
- ☆41 · Dec 18, 2025 · Updated 6 months ago
- ☆15 · Dec 16, 2023 · Updated 2 years ago
- Automatically exported from code.google.com/p/incremental-top-down-parser ☆14 · Mar 15, 2015 · Updated 11 years ago
- An ambiguous subtitles dataset for visual scene-aware machine translation ☆14 · Oct 17, 2022 · Updated 3 years ago
- This is a community implementation for the paper EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularizatio… ☆36 · Aug 4, 2023 · Updated 2 years ago
- ☆11 · Oct 2, 2024 · Updated last year
- ☆24 · Oct 30, 2025 · Updated 7 months ago
- Does patch ordering affect context-limited vision transformers? ☆17 · Oct 10, 2025 · Updated 8 months ago
- Application for 3D visualization of map data from OpenStreetMap. ☆13 · Apr 28, 2019 · Updated 7 years ago
- ☆22 · Sep 16, 2025 · Updated 9 months ago