Codebase for paper ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
☆29 · Nov 3, 2025 · Updated 4 months ago
Alternatives and similar repositories for ToolVQA-release
Users that are interested in ToolVQA-release are comparing it to the libraries listed below
- BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models ☆40 · Oct 30, 2025 · Updated 4 months ago
- [ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching i… ☆45 · Jul 10, 2025 · Updated 7 months ago
- Code for ACL 2025 Main paper "Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning… ☆47 · Aug 4, 2025 · Updated 7 months ago
- ☆13 · Mar 13, 2023 · Updated 2 years ago
- ☆24 · Feb 2, 2026 · Updated last month
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation ☆30 · Dec 22, 2025 · Updated 2 months ago
- ☆16 · Mar 17, 2025 · Updated 11 months ago
- ☆37 · Dec 18, 2025 · Updated 2 months ago
- Wanna breeze through some papers? ☆95 · Feb 27, 2026 · Updated last week
- EMIT: Enhancing MLLMs for Industrial Anomaly Detection via Difficulty-Aware GRPO ☆21 · Jan 24, 2026 · Updated last month
- A small storytelling LLM running on the PS Vita ☆27 · Jun 12, 2025 · Updated 8 months ago
- Python package for P2 (Path Planning), a masked diffusion model sampling method for sequence generation (protein, text, etc.). ☆23 · Aug 19, 2025 · Updated 6 months ago
- Official code for 'One-Shot Object Localization in Medical Images based on Relative Position Regression'. ☆12 · Sep 10, 2022 · Updated 3 years ago
- This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin… ☆13 · Jul 27, 2025 · Updated 7 months ago
- [NeurIPS 2023] Federated Learning with Bilateral Curation for Partially Class-Disjoint Data ☆14 · Aug 1, 2025 · Updated 7 months ago
- [ACMMM UAVM 2025] VICI: VLM-Instructed Cross-view Image-localisation 📡🗺️ ☆17 · Feb 4, 2026 · Updated last month
- An interactive thinking and deep reasoning model. It provides a cognitive reasoning paradigm for complex multi-hop problems. ☆79 · Nov 14, 2025 · Updated 3 months ago
- ICML 2025 Spotlight, PCEvolve: Private Contrastive Evolution for Synthetic Dataset Generation via Few-Shot Private Data and Generative AP… ☆14 · Jun 27, 2025 · Updated 8 months ago
- ☆14 · Jan 8, 2025 · Updated last year
- Enhanced GPUstat-web ☆10 · Oct 2, 2020 · Updated 5 years ago
- [AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning ☆106 · Dec 3, 2025 · Updated 3 months ago
- The Data Explorer and Machine Learning App ☆14 · Feb 22, 2026 · Updated 2 weeks ago
- ☆11 · Jan 25, 2024 · Updated 2 years ago
- How Much Position Information Do Convolutional Neural Networks Encode? ☆11 · Sep 20, 2021 · Updated 4 years ago
- CNN For Fish Training ☆26 · Jul 9, 2025 · Updated 8 months ago
- [EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information ☆12 · Oct 11, 2024 · Updated last year
- ☆19 · Dec 20, 2025 · Updated 2 months ago
- Official repository for the AAAI2026 paper (Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery … ☆22 · Feb 4, 2026 · Updated last month
- Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation" ☆18 · Jul 7, 2024 · Updated last year
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning ☆113 · Feb 2, 2026 · Updated last month
- Noah -- fixing your computer issues ☆47 · Updated this week
- This repository is the official code for the paper "AUCSeg: AUC-oriented Pixel-level Long-tail Semantic Segmentation" (NeurIPS 2024). ☆14 · Sep 17, 2025 · Updated 5 months ago
- The code of Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective ☆18 · Jun 20, 2025 · Updated 8 months ago
- [NeurIPS 2025] Official Implementation of ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding. ☆47 · Jan 28, 2026 · Updated last month
- PyTorch implementation of "PatchVAE: Learning Local Latent Codes for Recognition" to appear in CVPR 2020 ☆14 · Apr 9, 2020 · Updated 5 years ago
- Official code base for "Long-Tailed Diffusion Models With Oriented Calibration" ICLR2024 ☆15 · Jul 11, 2024 · Updated last year
- Official implementation of the work "6 DoF Localization of Text Descriptions in Large-Scale Scenes with Gaussian Representation" ☆18 · Feb 5, 2025 · Updated last year
- A collection of important papers on Generalizable Diffusion-generated Image Detection ☆17 · Mar 20, 2025 · Updated 11 months ago
- ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation ☆27 · May 27, 2025 · Updated 9 months ago