PKUTAN / SAWT
Official Python implementation for ICML 2024: "Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem"
☆14Updated last year
Alternatives and similar repositories for SAWT
Users that are interested in SAWT are comparing it to the libraries listed below
- ☆11Updated last month
- Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation☆140Updated 4 months ago
- [NeurIPS'24 spotlight] MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning☆40Updated 2 months ago
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"☆243Updated last year
- Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number]☆54Updated 8 months ago
- [ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"☆150Updated last year
- [ICLR'25] Reconstructive Visual Instruction Tuning☆116Updated 5 months ago
- Give us minutes, we give back a faster Mamba. The official implementation of "Faster Vision Mamba is Rebuilt in Minutes via Merged Token …☆40Updated 9 months ago
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models☆92Updated last year
- Official PyTorch implementation Source code for LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation, accepted at …☆110Updated last year
- [ICLR 2024 Poster] SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos☆19Updated last month
- [ICML2025] The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation☆117Updated 11 months ago
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation…☆38Updated 7 months ago
- MR. Video: MapReduce is the Principle for Long Video Understanding☆23Updated 5 months ago
- Accepted by CVPR 2024☆38Updated last year
- Empowering Unified MLLM with Multi-granular Visual Generation☆130Updated 8 months ago
- Unofficial implementation of "SODA: Bottleneck Diffusion Models for Representation Learning"☆95Updated last year
- ☆24Updated 2 years ago
- MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning☆132Updated last year
- Official PyTorch implementation of EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens [ICML2024].☆28Updated last year
- NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation Models☆110Updated last month
- [CVPR 2023 Highlight] PDPP: Projected Diffusion for Procedure Planning in Instructional Videos☆32Updated 2 years ago
- [AAAI2023] Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task (Oral)☆39Updated last year
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆31Updated last year
- Official PyTorch Implementation of Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos☆11Updated 3 months ago
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models☆86Updated last year
- ☆74Updated 9 months ago
- [ECCV 2024 (Oral)] Towards Scene Graph Anticipation☆18Updated 10 months ago
- [CVPR 2024] Data and benchmark code for the EgoExoLearn dataset☆70Updated last month
- R1-like Video-LLM for Temporal Grounding☆115Updated 3 months ago