Scaffold Prompting to promote LMMs
☆46Dec 16, 2024Updated last year
Alternatives and similar repositories for Scaffold
Users who are interested in Scaffold are comparing it to the libraries listed below.
- Dettoolchain: A new prompting paradigm to unleash detection ability of MLLM☆45Oct 12, 2024Updated last year
- Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme…☆19Nov 12, 2024Updated last year
- Extract features and bounding boxes using the original Bottom-up Attention Faster-RCNN in a few lines of Python code☆11Sep 18, 2022Updated 3 years ago
- ☆19Oct 28, 2025Updated 7 months ago
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention☆68Jul 16, 2024Updated last year
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning☆54Jul 23, 2025Updated 10 months ago
- The Code for Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models☆18Oct 4, 2024Updated last year
- Code and data for ACL 2024 paper on 'Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space'☆18Jul 21, 2024Updated last year
- Data and code for the paper: Finding Safety Neurons in Large Language Models☆28Jan 29, 2026Updated 4 months ago
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups☆51Dec 23, 2024Updated last year
- Object detection and keypoint detection. A pure version of CenterNet, convenient for secondary development and easy to understand.☆21Dec 9, 2020Updated 5 years ago
- ☆21Aug 9, 2024Updated last year
- Spatial Aptitude Training for Multimodal Language Models☆33Feb 8, 2026Updated 4 months ago
- [ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data☆13Sep 30, 2023Updated 2 years ago
- ☆20Jan 3, 2025Updated last year
- Chest X-Ray Explainer (ChEX)☆24Jan 30, 2025Updated last year
- ☆64Oct 25, 2025Updated 7 months ago
- Augment robotics demonstration datasets with different robots and viewpoints☆41Feb 27, 2025Updated last year
- The collection of medical VLP papers☆20Jul 24, 2024Updated last year
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024)☆40Jul 13, 2024Updated last year
- ☆10Dec 15, 2024Updated last year
- Code for the paper "If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection"☆27Jul 10, 2023Updated 2 years ago
- [ACM TOMM] Official implementation of "TextCoT: Zoom-In for Enhanced Multimodal Text-Rich Image Understanding"☆45Feb 27, 2026Updated 3 months ago
- [EMNLP-2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration☆81Nov 20, 2025Updated 6 months ago
- [CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks"☆26May 21, 2026Updated 3 weeks ago
- [AAAI 2025] Code for paper: Enhancing Multimodal Large Language Models Complex Reasoning via Similarity Computation☆21Jan 14, 2025Updated last year
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆87Oct 26, 2025Updated 7 months ago
- ☆24Nov 29, 2023Updated 2 years ago
- LLaVa Version of RaDialog☆26May 27, 2025Updated last year
- Implementation of GAMES202 homework in C++ and OpenGL☆13Sep 14, 2022Updated 3 years ago
- ☆18Jun 23, 2022Updated 3 years ago
- AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World | CoRL 2025☆97Mar 26, 2026Updated 2 months ago
- A Vision-Language Model for Spatial Affordance Prediction in Robotics☆223Jul 17, 2025Updated 10 months ago
- Official Implementation of implicit reference attack☆11Oct 16, 2024Updated last year
- Galaxea's first diffusion policy release☆37Aug 18, 2025Updated 9 months ago
- Official repo for EscapeCraft (a 3D environment for room escape) and benchmark MM-Escape. This work is accepted by ICCV 2025.☆39Jul 7, 2025Updated 11 months ago
- ☆13Oct 25, 2024Updated last year
- [ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning☆69Sep 20, 2025Updated 8 months ago
- The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models.☆15Dec 16, 2024Updated last year