tianyi-lab / CoSTAR
Cost-Sensitive Toolpath Agent for Multi-turn Image Editing
☆22 Updated 3 months ago
Alternatives and similar repositories for CoSTAR
Users who are interested in CoSTAR are comparing it to the repositories listed below
- ☆11 Updated 5 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆27 Updated 2 months ago
- ☆11 Updated 8 months ago
- Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing ☆27 Updated 3 weeks ago
- Official implementation of ECCV24 paper: POA ☆24 Updated 11 months ago
- Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." ☆24 Updated last week
- The official repo of continuous speculative decoding ☆27 Updated 3 months ago
- [Preprint] Efficient Generative Model Training via Embedded Representation Warmup ☆30 Updated 3 months ago
- ☆42 Updated 8 months ago
- Official implementation of "Art-Free Generative Models: Art Creation Without Graphic Art Knowledge" ☆31 Updated 3 months ago
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆25 Updated 6 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆15 Updated 4 months ago
- Multimodal RewardBench ☆42 Updated 4 months ago
- Code and data for paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation". ☆16 Updated 2 months ago
- ☆48 Updated last month
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs. ☆18 Updated last year
- Resa: Transparent Reasoning Models via SAEs ☆39 Updated last month
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla… ☆60 Updated 9 months ago
- Fast-Slow Thinking for Large Vision-Language Model Reasoning ☆16 Updated 2 months ago
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion ☆13 Updated 4 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆43 Updated 8 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM" ☆16 Updated last week
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 Updated last year
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos ☆45 Updated 2 weeks ago
- Code, Data and Red Teaming for ZeroBench ☆46 Updated 2 months ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆34 Updated last year
- Unsupervised GRPO ☆38 Updated last month
- ☆46 Updated 2 months ago
- Official Repository of Personalized Visual Instruct Tuning ☆31 Updated 4 months ago
- Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model. ☆71 Updated last week