PKUTAN / SAWT
Official Python implementation for the ICML 2024 paper "Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem"
☆13 Updated 7 months ago
Alternatives and similar repositories for SAWT:
Users who are interested in SAWT are comparing it to the repositories listed below:
- The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation ☆85 Updated 3 months ago
- Unifying Specialized Visual Encoders for Video Language Models ☆15 Updated last month
- Unofficial implementation of "SODA: Bottleneck Diffusion Models for Representation Learning" ☆82 Updated 11 months ago
- Empowering Unified MLLM with Multi-granular Visual Generation ☆117 Updated last month
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆82 Updated last year
- Deformable Graph Convolutional Networks (Author's PyTorch implementation for the AAAI 2022 paper) ☆25 Updated 2 years ago
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" ☆42 Updated 8 months ago
- Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023) ☆55 Updated 10 months ago
- ☆54 Updated last month
- [NeurIPS 2024] Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning ☆68 Updated last week
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆84 Updated 4 months ago
- Latent Motion Token as the Bridging Language for Robot Manipulation ☆72 Updated 2 weeks ago
- [AAAI2023] Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task (Oral) ☆39 Updated 11 months ago
- Code for Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation (ICCV 2023) ☆23 Updated last year
- Liquid: Language Models are Scalable Multi-modal Generators ☆65 Updated this week
- [CVPR 2023 Highlight] PDPP: Projected Diffusion for Procedure Planning in Instructional Videos ☆32 Updated last year
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment ☆26 Updated 4 months ago
- ☆66 Updated 2 months ago
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation … ☆36 Updated this week
- Official Repository of Multi-Object Hallucination in Vision-Language Models (NeurIPS 2024) ☆27 Updated 3 months ago
- Official Pytorch implementation of EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens [ICML2024]. ☆24 Updated 8 months ago
- Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24). ☆35 Updated 9 months ago
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics ☆78 Updated 2 weeks ago
- Give us minutes, we give back a faster Mamba. The official implementation of "Faster Vision Mamba is Rebuilt in Minutes via Merged Token … ☆36 Updated 2 months ago
- CCD: Official PyTorch implementation of the paper "Contextual Debiasing for Visual Recognition with Causal Mechanisms" ☆16 Updated 2 years ago
- Source code for NeurIPS'23 paper "Dream the Impossible: Outlier Imagination with Diffusion Models" ☆65 Updated last month
- TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models ☆27 Updated 3 months ago
- ElasticTok: Adaptive Tokenization for Image and Video ☆54 Updated 3 months ago