Espere-1119-Song / VideoNSA
VideoNSA: Native Sparse Attention Scales Video Understanding
☆78 · Nov 16, 2025 · Updated 2 months ago
Alternatives and similar repositories for VideoNSA
Users who are interested in VideoNSA are comparing it to the repositories listed below.
- DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation ☆39 · Aug 3, 2025 · Updated 6 months ago
- ☆20 · Oct 22, 2025 · Updated 3 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards ☆36 · Oct 3, 2025 · Updated 4 months ago
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi… ☆14 · Jun 6, 2025 · Updated 8 months ago
- CoV: Chain-of-View Prompting for Spatial Reasoning ☆50 · Jan 23, 2026 · Updated 3 weeks ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… ☆79 · Jun 17, 2024 · Updated last year
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆24 · Oct 7, 2025 · Updated 4 months ago
- ☆42 · Jan 24, 2026 · Updated 3 weeks ago
- Learn Model Context Protocol with Python, published by Packt ☆32 · Feb 4, 2026 · Updated last week
- ☆15 · Nov 7, 2024 · Updated last year
- ☆131 · May 29, 2025 · Updated 8 months ago
- LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba (Official Implementation) ☆17 · Oct 24, 2024 · Updated last year
- ☆36 · Dec 25, 2025 · Updated last month
- ☆24 · May 13, 2025 · Updated 9 months ago
- [NeurIPS 2023] Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator ☆98 · Mar 18, 2024 · Updated last year
- Hands-On Image Processing with Python, Second Edition, Published by Packt ☆26 · Updated this week
- ☆20 · Oct 13, 2024 · Updated last year
- The official implementation of "Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding" ☆23 · Jun 26, 2025 · Updated 7 months ago
- Official repo for StyleMe3D ☆28 · Apr 22, 2025 · Updated 9 months ago
- Official Code Release for Container: Context Aggregation Network ☆46 · Oct 17, 2021 · Updated 4 years ago
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models" ☆27 · May 13, 2025 · Updated 9 months ago
- (NeurIPS 2024) BiDM: Pushing the Limit of Quantization for Diffusion Models ☆22 · Nov 20, 2024 · Updated last year
- The official implementation of the paper "Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation" ☆21 · Dec 10, 2024 · Updated last year
- [CVPR 2022] This repository is for the paper "DIFNet: Boosting Visual Information Flow for Image Captioning". ☆21 · Nov 28, 2022 · Updated 3 years ago
- Evaluation codes and data for GenEval2 ☆55 · Jan 8, 2026 · Updated last month
- Official Code Release of NeurIPS 2025 Paper: HoloScene: Simulation‑Ready Interactive 3D Worlds from a Single Video ☆86 · Oct 8, 2025 · Updated 4 months ago
- [ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality… ☆53 · Mar 25, 2025 · Updated 10 months ago
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024) ☆52 · Jul 16, 2024 · Updated last year
- ☆44 · Sep 1, 2025 · Updated 5 months ago
- Official implementation for SSDD Single-Step Diffusion Decoder for Efficient Image Tokenization. ☆55 · Nov 12, 2025 · Updated 3 months ago
- ☆120 · Feb 4, 2026 · Updated last week
- Official code for MotionBench (CVPR 2025) ☆64 · Mar 3, 2025 · Updated 11 months ago
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?" ☆31 · Dec 23, 2024 · Updated last year
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? ☆86 · Jul 13, 2025 · Updated 7 months ago
- Official Implementation of the paper: A Complete Recipe for Diffusion Generative Models ☆31 · Nov 1, 2024 · Updated last year
- Consistent Autoregressive Video Generation with Long Context ☆31 · Feb 6, 2026 · Updated last week
- ☆31 · Sep 1, 2025 · Updated 5 months ago
- SFT+RL boosts multimodal reasoning ☆45 · Jun 27, 2025 · Updated 7 months ago
- [ICLR 2022] RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning ☆63 · Sep 10, 2022 · Updated 3 years ago