☆91 · Mar 9, 2026 · Updated last week
Alternatives and similar repositories for SWE-Vision
Users who are interested in SWE-Vision are comparing it to the libraries listed below.
- ☆27 · Jul 23, 2025 · Updated 7 months ago
- [NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning" ☆30 · Jan 12, 2026 · Updated 2 months ago
- Visual Generation Tuning ☆99 · Jan 27, 2026 · Updated last month
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆101 · May 20, 2025 · Updated 9 months ago
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models ☆36 · Oct 3, 2024 · Updated last year
- [ECCV22] BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-scale Scene Rendering (Jittor) ☆11 · Sep 16, 2022 · Updated 3 years ago
- Fine-tuned LLMs generate accurate 3D human avatars from textual descriptions using the SMPL-X model, enhancing customization and simulati… ☆37 · Feb 5, 2025 · Updated last year
- Official Implementation of MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models ☆12 · Nov 1, 2025 · Updated 4 months ago
- ☆46 · Jan 21, 2026 · Updated last month
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202… ☆40 · May 26, 2025 · Updated 9 months ago
- ResNet-50 for TsinghuaDog classification ☆10 · Feb 2, 2021 · Updated 5 years ago
- Code for PyMTL Tutorial @ ISCA 2019 ☆11 · Jun 22, 2019 · Updated 6 years ago
- ☆13 · Nov 5, 2024 · Updated last year
- ☆34 · Jan 9, 2026 · Updated 2 months ago
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency. ☆12 · Oct 12, 2024 · Updated last year
- Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions ☆26 · Feb 11, 2026 · Updated last month
- [AAAI 2026] Official Code for VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning ☆25 · Nov 28, 2025 · Updated 3 months ago
- ☆15 · Feb 11, 2025 · Updated last year
- Official implementation of "Continual Learning by Modeling Intra-Class Variation" (MOCA). [TMLR 2023] ☆16 · Mar 3, 2023 · Updated 3 years ago
- Simple MIDAS Examples ☆12 · Nov 25, 2018 · Updated 7 years ago
- A timer theme of Wallpaper Engine (13k Subscribers) ☆15 · Oct 26, 2022 · Updated 3 years ago
- Generating Summaries with Controllable Readability Levels (EMNLP 2023) ☆15 · Aug 6, 2025 · Updated 7 months ago
- ⚔️ [ICLR 2026] Official code of "Search Arena: Analyzing Search-Augmented LLMs". ☆53 · Feb 23, 2026 · Updated 3 weeks ago
- [ACM MM 2025] LMM4Edit: Benchmarking and Evaluating Multimodal Image Editing with LMMs ☆15 · Feb 10, 2026 · Updated last month
- ☆43 · Jul 18, 2024 · Updated last year
- Wan 2.5 AI Video Generator - Transform text & images into HD videos with synchronized audio ☆80 · Sep 25, 2025 · Updated 5 months ago
- Deep Interest Network for Click-Through Rate Prediction / Deep Interest Evolution Network for Click-Through Rate Prediction ☆11 · Oct 14, 2020 · Updated 5 years ago
- [ACM MM25] Official Pytorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding] ☆15 · Jul 15, 2025 · Updated 8 months ago
- [ACM MM 2023] The released code of paper "Deconfounded Visual Question Generation with Causal Inference" ☆11 · Sep 3, 2024 · Updated last year
- [ICML 2025 Oral] This is the official repository of the paper "What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensi… ☆21 · Jun 12, 2025 · Updated 9 months ago
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift ☆38 · Jan 25, 2024 · Updated 2 years ago
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di… ☆62 · Nov 7, 2024 · Updated last year
- [CVPR 2025] VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning ☆13 · Jun 7, 2025 · Updated 9 months ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 · Aug 12, 2023 · Updated 2 years ago
- Chisel3 implementation of IEEE-754 compliant floating point data type (logic & representation) ☆11 · Dec 16, 2019 · Updated 6 years ago
- CoMA: Compositional Human Motion Generation with Multi-modal Agents ☆14 · Jul 31, 2025 · Updated 7 months ago
- ACM MM 2022 - PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding ☆11 · Aug 12, 2022 · Updated 3 years ago
- This repository collects awesome representative papers and resources for "From Pre-training to Post-training: A Survey on Time Series Fou… ☆32 · Feb 1, 2026 · Updated last month
- The PyTorch implementation of DSM (EMNLP 2022). ☆10 · Mar 26, 2024 · Updated last year