Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders [Technical Report]
☆180Mar 30, 2026Updated 3 weeks ago
Alternatives and similar repositories for Penguin-VL
Users that are interested in Penguin-VL are comparing it to the libraries listed below.
- The official repo for the DanQing dataset.☆35Mar 25, 2026Updated last month
- [ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration☆16Mar 18, 2026Updated last month
- SIBench: a project on visual spatial reasoning tasks☆26Jan 12, 2026Updated 3 months ago
- ☆63Nov 12, 2025Updated 5 months ago
- We propose a novel modular framework that learns to dynamically mix low-rank adapters (LoRAs) to improve visual analogy learning, enablin…☆73Apr 12, 2026Updated 2 weeks ago
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…☆86Jan 27, 2025Updated last year
- Official Implementation of SAGE-GRPO: Manifold-Aware Exploration for Reinforcement Learning in Video Generation☆113Apr 2, 2026Updated 3 weeks ago
- Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA.☆113Jan 14, 2026Updated 3 months ago
- [ACM-MM 2025 Workshop] More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment.☆25Nov 25, 2025Updated 5 months ago
- ☆44Apr 4, 2026Updated 3 weeks ago
- ☆85Mar 16, 2026Updated last month
- SimKO: Simple Pass@K Policy Optimization☆30Oct 24, 2025Updated 6 months ago
- Official code for paper: N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models☆100Jan 14, 2026Updated 3 months ago
- Math-VR Benchmark & CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images☆58Nov 4, 2025Updated 5 months ago
- ZwZ model family: SOTA fine-grained perception performance; ZoomBench: a new challenging perception benchmark☆124Mar 9, 2026Updated last month
- [ACM MM25] Official Pytorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding]☆16Jul 15, 2025Updated 9 months ago
- A replication of Google's VideoPoet model☆12Feb 18, 2024Updated 2 years ago
- ACL24☆11Jun 7, 2024Updated last year
- [NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training☆224Mar 20, 2025Updated last year
- official implementation of Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation☆13Apr 15, 2024Updated 2 years ago
- This is an implementation of the paper "Are We Done with Object-Centric Learning?"☆12Apr 12, 2026Updated 2 weeks ago
- ☆12Aug 10, 2022Updated 3 years ago
- DEYOv1.5☆29Jul 22, 2024Updated last year
- CAD - Memory Efficient Convolutional Adapter for Segment Anything☆12Oct 4, 2024Updated last year
- We propose to tackle the multiview photometric stereo problem using an extension of Neural Radiance Fields (NeRFs), conditioned on light …☆11Jan 11, 2023Updated 3 years ago
- Audio-video joint generation☆56Nov 27, 2025Updated 5 months ago
- ☆83Feb 5, 2026Updated 2 months ago
- Official implementation of "Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model"☆254Dec 8, 2025Updated 4 months ago
- Official repo for paper "EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture."☆61Dec 16, 2025Updated 4 months ago
- A sample that encrypts and decrypts ONNX models using pyca/cryptography☆16Mar 19, 2022Updated 4 years ago
- Official Implementation of Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling☆210Dec 3, 2025Updated 4 months ago
- [NeurIPS 2025 Spotlight] A Generalist Diffusion Model for Vision Perception☆300Sep 21, 2025Updated 7 months ago
- (CVPR 2026 Highlight) Official repository for Scone (Subject-driven COmposition and DistinctioN Enhancement) model, supporting subject co…☆30Apr 9, 2026Updated 2 weeks ago
- [WACV 2025] Official code of "SEED4D: A Synthetic Ego-Exo Dynamic 4D Data Generator, Driving Dataset and Benchmark"☆23Sep 3, 2025Updated 7 months ago
- [CVPR 2026] TR2M: Transferring Monocular Relative Depth to Metric Depth with Language Descriptions and Dual-Level Scale-Oriented Contrast☆22Mar 3, 2026Updated last month
- LaTeX template for graduation theses at UESTC Glasgow College☆19Jan 17, 2025Updated last year
- ☆12Aug 19, 2023Updated 2 years ago
- ☆13Jan 2, 2025Updated last year
- OmniGAIA: Towards Native Omni-Modal AI Agents☆124Apr 2, 2026Updated 3 weeks ago