[ICML 2026] ZwZ model family: SOTA fine-grained perception performance; ZoomBench: a new challenging perception benchmark
☆160May 4, 2026Updated last month
Alternatives and similar repositories for Zooming-without-Zooming
Users that are interested in Zooming-without-Zooming are comparing it to the libraries listed below.
- (ICML 2024) PyTorch implementation of "Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes"☆16Oct 15, 2024Updated last year
- A simple visual test-time scaling method for GUI agent grounding☆26Dec 7, 2025Updated 6 months ago
- ☆24Sep 12, 2024Updated last year
- [ACL 2026 Findings] "Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning"☆62May 26, 2026Updated last month
- EMMA [TMLR 2025]☆14Sep 25, 2025Updated 9 months ago
- Official repository Flash Local Linear Attention☆37May 28, 2026Updated last month
- ☆28Feb 13, 2026Updated 4 months ago
- ☆23Aug 20, 2024Updated last year
- MLX Implementation of Recursive Reasoning with Tiny Networks☆78Oct 11, 2025Updated 8 months ago
- [MICCAI2023] XSurv: Merging-Diverging Hybrid Transformer Networks for Survival Prediction☆12Oct 2, 2023Updated 2 years ago
- Local AI runtime for training & running small LLMs directly on Apple Neural Engine (ANE). No CoreML. No Metal. Offline, on-device fine-tu…☆102Mar 6, 2026Updated 3 months ago
- Image Classification Tutorial: ConvNext--> 98.8% on CIFAR10 + 92.4% on CIFAR100; ResNet18 -- 95.6% on CIFAR10 + 79.1% on CIFAR100☆15Jun 2, 2025Updated last year
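- This project applies LoRA fine-tuning to Deepseek-R1-Distill-Qwen-7B on psychological-counseling CoT data, to further improve the slow-thinking ability of Deepseek-R1-Distill-Qwen-7B in the psychological counseling domain.☆12Mar 11, 2025Updated last year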
- Simple and Ideal Circuit Simulation☆13Dec 4, 2017Updated 8 years ago
- BusterX and BusterX++☆41Jun 16, 2026Updated 2 weeks ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement☆152May 25, 2026Updated last month
- ☆28Mar 17, 2026Updated 3 months ago
- Visual Grounding with Multi-modal Conditional Adaptation (ACMMM 2024 Oral)☆26Jun 11, 2025Updated last year
- [ECCV 2026] PyTorch implementation of "SKEL-CF: Coarse-to-Fine Biomechanical Skeleton and Surface Mesh Recovery"☆63Jun 18, 2026Updated last week
- Modality Gap Theory☆74May 16, 2026Updated last month
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…☆99Mar 9, 2026Updated 3 months ago
- [NeurIPS 2024] OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling.☆32Nov 13, 2025Updated 7 months ago
- MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning☆40May 7, 2026Updated last month
- Code for IJCAI 2023 paper 'SLViT: Scale-Wise Language-Guided Vision Transformer for Referring Image Segmentation'☆11May 28, 2023Updated 3 years ago
- The official repository of the EMNLP 2024 Findings paper: Question-guided Knowledge Graph Re-scoring and Injection for Knowledge Graph Qu…☆18Nov 4, 2024Updated last year
- CHEMSMART: Chemistry Simulation and Modeling Automation Toolkit☆38Jun 23, 2026Updated last week
- Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders [Technical Report]☆197Mar 30, 2026Updated 3 months ago
- ☆55Updated this week
- ☆18May 14, 2025Updated last year
- ☆13Jan 14, 2026Updated 5 months ago
- Multi-modal approach for tumor segmentation and survival prediction using PET/CT imaging with attention mechanisms (MICCAI2021 HECKTOR Ch…☆12Apr 22, 2022Updated 4 years ago
- A curated collection of papers and resources on On-Policy Distillation for Large Language Models.☆357Jun 23, 2026Updated last week
- Use 2 lines to empower absolute time awareness for Qwen2.5VL's MRoPE☆29Sep 20, 2025Updated 9 months ago
- Wind visualization over time☆102Oct 23, 2025Updated 8 months ago
- Official code for paper: N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models☆110Jan 14, 2026Updated 5 months ago
- Simple intermediate representation language for learning and research.☆22Mar 27, 2020Updated 6 years ago
- Image caption and manage tool for AI training☆11Jan 24, 2025Updated last year
- ☆15Sep 23, 2022Updated 3 years ago
- Unofficial implementation of Hippoformer, Integrating Hippocampus-inspired Spatial Memory with Transformers☆53Apr 28, 2026Updated 2 months ago