tianyi-lab / FaSTAR
[ICLR 2026] Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing
☆29Updated last week
Alternatives and similar repositories for FaSTAR
Users who are interested in FaSTAR are comparing it to the repositories listed below
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion☆13Updated 10 months ago
- Cost-Sensitive Toolpath Agent for Multi-turn Image Editing☆25Updated 10 months ago
- A Comprehensive Dataset for Advanced Image Generation and Editing☆31Updated 4 months ago
- ☆63Updated 6 months ago
- [NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient☆64Updated 4 months ago
- [NeurIPS 2025] Official Implementation (PyTorch) of "DeepVideo-R1"☆31Updated 2 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence.☆28Updated last month
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration☆26Updated last year
- The official implementation of our paper "CoRe^2: Collect, Reflect and Refine to Generate Better and Faster".☆30Updated 10 months ago
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling."☆147Updated 6 months ago
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning☆33Updated 5 months ago
- LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding☆34Updated 3 weeks ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆19Updated 11 months ago
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"☆18Updated last year
- ☆24Updated 8 months ago
- More reliable Video Understanding Evaluation☆13Updated 4 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM"☆17Updated 3 months ago
- Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding"☆57Updated 2 weeks ago
- ☆30Updated 3 weeks ago
- ☆68Updated 4 months ago
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows☆19Updated 3 months ago
- ☆39Updated 8 months ago
- The official repo of continuous speculative decoding☆31Updated 10 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning"☆37Updated last year
- ☆13Updated last year
- The official repo for LIFT: Language-Image Alignment with Fixed Text Encoders☆42Updated 8 months ago
- ☆17Updated last year
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning☆52Updated 6 months ago
- ☆37Updated 2 months ago
- Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks☆36Updated 2 months ago