ZihaoHuang-notabot / Ultra-Sparse-Memory-Network
☆31Updated 2 months ago
Alternatives and similar repositories for Ultra-Sparse-Memory-Network
Users who are interested in Ultra-Sparse-Memory-Network are comparing it to the libraries listed below.
- The official repo of continuous speculative decoding☆30Updated 8 months ago
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning☆51Updated 4 months ago
- PyTorch implementation of "Sample- and Parameter-Efficient Auto-Regressive Image Models" from CVPR 2025☆14Updated last week
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆20Updated 9 months ago
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning☆46Updated 4 months ago
- ☆21Updated 6 months ago
- Image Tokenizer Needs Post-Training☆24Updated last month
- Official implementation of ECCV24 paper: POA☆24Updated last year
- 🚀 LLM-I: Transform LLMs into natural interleaved multimodal creators! ✨ Tool-use framework supporting image search, generation, code ex…☆32Updated last month
- VideoNSA: Native Sparse Attention Scales Video Understanding☆61Updated 2 weeks ago
- WeGeFT: Weight‑Generative Fine‑Tuning for Multi‑Faceted Efficient Adaptation of Large Models☆22Updated 4 months ago
- ☆34Updated 6 months ago
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod…☆109Updated last month
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision"☆36Updated 2 years ago
- Unifying Specialized Visual Encoders for Video Language Models☆22Updated last week
- VidKV: Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models☆24Updated 8 months ago
- ☆24Updated 3 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆17Updated 8 months ago
- Resa: Transparent Reasoning Models via SAEs☆44Updated 2 months ago
- ☆45Updated last year
- the official repo for "D-AR: Diffusion via Autoregressive Models"☆125Updated 5 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling☆39Updated 9 months ago
- [NeurIPS 2025 Oral] Exploring Diffusion Transformer Designs via Grafting☆62Updated 5 months ago
- Scaling Spatial Intelligence with Multimodal Foundation Models☆117Updated last week
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Updated last year
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆43Updated last year
- Learning to Skip the Middle Layers of Transformers☆15Updated 3 months ago
- Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks☆33Updated last week
- Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing"☆54Updated 4 months ago
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…☆27Updated 4 months ago