zeyuanyin / SATA_fork
This is a fork of the SATA repo (CVPR 2025), which is not publicly visible.
☆22Updated 2 months ago
Alternatives and similar repositories for SATA_fork
Users that are interested in SATA_fork are comparing it to the libraries listed below
- [AAAI 2025] Linear-complexity Visual Sequence Learning with Gated Linear Attention☆113Updated last year
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.☆307Updated 2 weeks ago
- assistant tools for attention visualization in deep learning☆28Updated 3 years ago
- Official repository of MLLA (NeurIPS 2024)☆354Updated 3 months ago
- [ICCV 2025 Highlight] Rectifying Magnitude Neglect in Linear Attention☆37Updated 2 months ago
- [NeurIPS2024 Spotlight] The official implementation of MambaTree: Tree Topology is All You Need in State Space Model☆102Updated last year
- An example reproduction checklist for AAAI-26 submissions.☆106Updated 2 months ago
- [CVPR 2024] This is the official implementation of "ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction"☆51Updated 3 months ago
- The official implementation of [CVPR 2025] "5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks".☆366Updated 3 months ago
- [ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs☆82Updated 2 months ago
- vHeat: Building Vision Models upon Heat Conduction☆254Updated 4 months ago
- [AAAI 2025] Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking☆106Updated 4 months ago
- [CVPR 2025 (Oral)] Open implementation of "RandAR"☆198Updated 2 months ago
- Unified the Anonymous and Camera Ready Version, hope everyone can get an ACCEPT☆253Updated 3 months ago
- [AAAI-2025] The official code for SiTo (Similarity-based Token Pruning for Stable Diffusion Models)☆36Updated 4 months ago
- Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"☆68Updated last month
- Official repository for VisionZip (CVPR 2025)☆352Updated 2 months ago
- [NeurIPS 2025 Oral] Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think☆140Updated last week
- [ICCV 2025] Official PyTorch Code for "Advancing Textual Prompt Learning with Anchored Attributes"☆99Updated last month
- [ICASSP 2024] VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders☆16Updated 8 months ago
- ☆138Updated last year
- [NIPS 2025 DB Oral] Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing☆99Updated 3 weeks ago
- ☆50Updated 3 weeks ago
- [ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference".☆163Updated 4 months ago
- Survey: https://arxiv.org/pdf/2507.20198☆157Updated last month
- A Fine-grained Benchmark for Video Captioning and Retrieval☆21Updated 2 months ago
- ☆251Updated last year
- USTC Digital Image Analysis (Wengang Zhou, Houqiang Li, et al.) review materials for the Fall 2022 semester☆17Updated 2 years ago
- [CVPR 2024] iKUN: Speak to Trackers without Retraining☆134Updated last year
- Official repository of FLatten Transformer (ICCV2023)☆442Updated 11 months ago