PKUTAN / SAWT
Official Python implementation for the ICML 2024 paper "Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem"
☆14Updated last year
Alternatives and similar repositories for SAWT
Users who are interested in SAWT are comparing it to the repositories listed below
- ☆11Updated 2 months ago
- Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation☆142Updated 4 months ago
- multiview and self-supervised learning☆11Updated 3 years ago
- [ICLR 2024 Poster] SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos☆19Updated last month
- [AAAI 2023(Oral)] Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences☆27Updated last year
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"☆244Updated last year
- [CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models☆48Updated 4 months ago
- Official PyTorch implementation Source code for LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation, accepted at …☆110Updated last year
- [NeurIPS'24 spotlight] MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning☆41Updated 3 months ago
- ☆24Updated 2 years ago
- [CVPR 2023 Highlight] PDPP: Projected Diffusion for Procedure Planning in Instructional Videos☆32Updated 2 years ago
- Official PyTorch Implementation of Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos☆11Updated 3 months ago
- Official PyTorch implementation of EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens [ICML 2024].☆29Updated last year
- R1-like Video-LLM for Temporal Grounding☆120Updated 3 months ago
- Unofficial implementation of "SODA: Bottleneck Diffusion Models for Representation Learning"☆96Updated last year
- [ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"☆150Updated last year
- [ACMMM 23] Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization☆25Updated last year
- Accepted by CVPR 2024☆39Updated last year
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning☆41Updated 2 years ago
- [ICLR'25] Reconstructive Visual Instruction Tuning☆120Updated 6 months ago
- [CVPR 2024] Data and benchmark code for the EgoExoLearn dataset☆70Updated last month
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆31Updated last year
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models☆86Updated last year
- Empowering Unified MLLM with Multi-granular Visual Generation☆130Updated 9 months ago
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis"☆46Updated last year
- Give us minutes, we give back a faster Mamba. The official implementation of "Faster Vision Mamba is Rebuilt in Minutes via Merged Token …☆40Updated 10 months ago
- Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number]☆57Updated 8 months ago
- HT-Step is a large-scale article grounding dataset of temporal step annotations on how-to videos☆21Updated last year
- (ECCV 2024) Official repository of paper "EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding"☆30Updated 6 months ago
- [ECCV 2024 (Oral)] Towards Scene Graph Anticipation☆18Updated 10 months ago