NVlabs/AutoGaze

AutoGaze automatically removes redundant patches in a video, reducing the number of tokens in ViT/MLLM by 4x-100x.

☆292

Alternatives and similar repositories for AutoGaze

Users who are interested in AutoGaze are comparing it to the libraries listed below.

zysxmu / DFSQ
super-resolution; post-training quantization; model compression
☆14 · Nov 10, 2023 · Updated 2 years ago
mlomnitz / tcav_pytorch
PyTorch implementation of Google TCAV
☆10 · Jan 11, 2019 · Updated 7 years ago
uyoung-jeong / PoseBH
PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation
☆25 · Jun 20, 2025 · Updated last year
google-deepmind / visual-memory
Code & data for "Towards flexible perception with visual memory" (ICML 2025)
☆19 · Sep 24, 2024 · Updated last year
Kazuhito00 / M-LSD-warpPerspective-Example
Sample program that detects quadrilaterals using M-LSD and applies a perspective transformation
☆10 · Jun 2, 2021 · Updated 5 years ago
tzler / mochi_code
Evaluating Multiview Object Correspondence between Humans and Image models
☆20 · Feb 12, 2025 · Updated last year
SCZwangxiao / video-ReTaKe
Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding
☆40 · Mar 16, 2025 · Updated last year
JingyangXiang / DFRot
[COLM 2025] DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation; Zhihu: https://zhuanlan.zhihu.c…
☆30 · Mar 5, 2025 · Updated last year
alibaba / wan-toy-transform
This is a LoRA model finetuned on Wan-I2V-14B-480P. It turns things in the image into fluffy toys.
☆19 · Nov 10, 2025 · Updated 7 months ago
YutaroOgawa / ddpm_cifer10
For beginners who want to learn diffusion models. Based on the chapter 「イマドキノ拡散モデル」 ("Modern Diffusion Models") in the book 「コンピュータビジョン最前線 Summer 2023」, it generates images with CIFAR-10.
☆19 · Jul 2, 2023 · Updated 3 years ago
PhysGame / PhysGame
PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos
☆49 · Jul 3, 2025 · Updated last year
FAVOR-Bench / FAVOR-Bench
Accepted by the 39th Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track
☆25 · Nov 17, 2025 · Updated 7 months ago
xiong-jie-y / depth_based_visual_haptic
Visual haptic using depth image
☆19 · Dec 20, 2021 · Updated 4 years ago
ChenWu98 / generative-visual-prompt
[NeurIPS 2022] (Amortized) distributional control for pre-trained generative models
☆121 · Sep 4, 2023 · Updated 2 years ago
maifoundations / Streamo
Streaming Video Instruction Tuning
☆76 · Feb 25, 2026 · Updated 4 months ago
mk322 / LaDiR
☆50 · Oct 23, 2025 · Updated 8 months ago
Jiaxin-Lu / humoto
[ICCV 2025] HUMOTO Dataset Code Release
☆65 · Nov 6, 2025 · Updated 7 months ago
EnVision-Research / LatentMorph
[ICML 2026] LatentMorph: Morphing Latent Reasoning into Image Generation
☆46 · May 5, 2026 · Updated 2 months ago
szacho / pointcam
Self-supervised adversarial masking for point clouds
☆11 · Jul 12, 2023 · Updated 2 years ago
DwangoMediaVillage / manga_frame_extraction
☆15 · Jan 17, 2018 · Updated 8 years ago
sungnyun / cav2vec
(ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
☆16 · Apr 29, 2025 · Updated last year
see-say-segment / sesame
🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"
☆47 · Jun 16, 2024 · Updated 2 years ago
buiquangmanhhp1999 / age_gender_estimation
Keras implementation of EfficientNet model for age and gender estimation
☆14 · Mar 18, 2020 · Updated 6 years ago
Egg-Hu / SMI
[ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination
☆14 · Apr 29, 2025 · Updated last year
RyannDaGreat / MotionV2V
☆57 · Mar 5, 2026 · Updated 4 months ago
mwbini / ether
[ICML24] Official Implementation of "ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections"
☆16 · May 31, 2024 · Updated 2 years ago
thu-ml / TetraJet-MXFP4Training
PyTorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT Model Pre-training
☆40 · May 4, 2026 · Updated 2 months ago
MasterZhou1 / Reasoning-Flow
Code for Paper "The Geometry of Reasoning: Flowing Logics in Representation Space" (ICLR 2026)
☆57 · Jan 31, 2026 · Updated 5 months ago
sypsyp97 / diffct
A CUDA-based library for computed tomography (CT) reconstruction with differentiable operators.
☆23 · May 21, 2026 · Updated last month
DeNA / Face2Speech
☆20 · Mar 16, 2020 · Updated 6 years ago
KhachDavid / franka_drake
Franka simulator in Drake compatible with existing libfranka programs
☆24 · Aug 29, 2025 · Updated 10 months ago
DensoITLab / bitprune
☆11 · Apr 5, 2023 · Updated 3 years ago
UnicomAI / LeMiCa
[NeurIPS 2025 Spotlight] LeMiCa: Lexicographic Minimax Path Caching for Efficient Diffusion-Based Video Generation
☆121 · Jun 22, 2026 · Updated last week
yxlao / pointcloud_cropper
☆12 · Dec 21, 2021 · Updated 4 years ago
chs20 / fuselip
FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens
☆17 · Sep 8, 2025 · Updated 9 months ago
SivanDoveh / IPLoc
Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples
☆40 · Nov 27, 2024 · Updated last year
chinmay5 / vesselformer
☆14 · Jul 8, 2023 · Updated 2 years ago
ShiheWang / FIMA-Q
[CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation
☆29 · Jun 16, 2025 · Updated last year
davidliyutong / fsglove
☆26 · May 4, 2026 · Updated 2 months ago