AutoLab-SAI-SJTU / Paper2Rebuttal
Official implementation of "Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance"
☆229 Updated this week
Alternatives and similar repositories for Paper2Rebuttal
Users who are interested in Paper2Rebuttal are comparing it to the repositories listed below.
- Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure? ☆204 Updated last month
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning ☆103 Updated 6 months ago
- Collection of Highlight papers ☆42 Updated last year
- Visual Spatial Tuning ☆169 Updated 2 weeks ago
- A collection of vision foundation models unifying understanding and generation. ☆59 Updated last year
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ☆205 Updated 6 months ago
- [NeurIPS DB 2025] IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering ☆42 Updated 3 months ago
- Official Code for 'TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction' (ICCV 2025) ☆77 Updated 2 months ago
- A list of works on video generation towards world model ☆330 Updated last week
- The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors' ☆196 Updated 2 months ago
- [NeurIPS 2025] VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models ☆157 Updated 3 weeks ago
- Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs ☆59 Updated 3 weeks ago
- Thinking in 360°: Humanoid Visual Search in the Wild ☆110 Updated last month
- MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence ☆52 Updated 3 weeks ago
- SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding ☆59 Updated 6 months ago
- Cambrian-S: Towards Spatial Supersensing in Video ☆480 Updated last month
- A paper list for spatial reasoning ☆614 Updated last week
- [ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness ☆63 Updated 6 months ago
- [NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding ☆140 Updated last month
- [ECCV2024] Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding ☆125 Updated last year
- [NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration ☆110 Updated last month
- [NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing ☆89 Updated 6 months ago
- [Arxiv'25] DINO-Tok: Adapting DINO for Visual Tokenizers ☆35 Updated 2 months ago
- [CVPR 2025 Highlight🔥] Official code repository for "Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuni… ☆126 Updated 2 months ago
- Official Implementation of "Geometrically-Constrained Agent for Spatial Reasoning" ☆48 Updated last month
- Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer” ☆75 Updated last month
- ☆34 Updated 10 months ago
- From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio… ☆75 Updated 3 weeks ago
- [CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories ☆89 Updated 5 months ago
- [CVPR'2022, TPAMI'2024] LAVT: Language-Aware Vision Transformer for Referring Segmentation ☆24 Updated last year