YiwengXie/FluxMem

[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding

☆63

Alternatives and similar repositories for FluxMem

Users who are interested in FluxMem are comparing it to the repositories listed below.

NJU-LINK / OmniVideoBench
The Source Code for OmniVideoBench @ICLR 2026
☆73 • Feb 12, 2026 • Updated 3 months ago
wrchen530 / nova3r
[ICLR 2026] NOVA3R: Non-pixel-aligned Visual Transformer for Amodal 3D Reconstruction
☆126 • May 20, 2026 • Updated 2 weeks ago
neu-vi / struct2d
Code release for 'Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs' (NeurIPS 2025)
☆30 • Oct 28, 2025 • Updated 7 months ago
Mansoor-at / Semi-supervised-surgical-tool-detection
This repository contains code for our paper titled "A semi-supervised teacher-student framework for surgical tool detection and localizat…
☆10 • Nov 16, 2023 • Updated 2 years ago
lcqysl / FrameThinker
[ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"
☆48 • Oct 9, 2025 • Updated 7 months ago
IVUL-KAUST / VideoAuto-R1
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆86 • Feb 27, 2026 • Updated 3 months ago
KD-TAO / OmniZip
[CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
☆87 • Apr 20, 2026 • Updated last month
huaixuheqing / VPPO-RL
[ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"
☆68 • Apr 3, 2026 • Updated 2 months ago
RuishengSu / CAVE_DSA
☆24 • Nov 13, 2024 • Updated last year
JaaackHongggg / WorldSense
WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs
☆49 • May 7, 2026 • Updated last month
CAMMA-public / ivtmetrics
A Python evaluation metrics package for surgical action triplet recognition
☆17 • Dec 10, 2024 • Updated last year
WennyJJ / Coronary-Artery-Vein-Segmentation
☆22 • Oct 19, 2023 • Updated 2 years ago
InternLM / ARC-VL
[CVPR 2026] An official implementation of "Think Visually, Reason Textually: Vision-Language Synergy in ARC"
☆44 • Nov 26, 2025 • Updated 6 months ago
ChengHan111 / VPT-or-FT
Official Pytorch implementation of 'Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning'? (ICLR2024)
☆13 • Mar 8, 2024 • Updated 2 years ago
MRUIL / LoViT
Long Surgical Phase Recognition
☆26 • Nov 7, 2024 • Updated last year
DaoyiG / MeshArt
MeshArt: Generating Articulated Meshes with Structure-Guided Transformers (CVPR2025)
☆55 • Jun 9, 2025 • Updated 11 months ago
MinglangYin / DIMON
☆58 • May 13, 2025 • Updated last year
wangzhichuan123 / DAC
[ICCV 2025] Official PyTorch Code for "Describe, Adapt and Combine: Empowering CLIP Encoders for Open-set 3D Object Retrieval"
☆18 • Aug 23, 2025 • Updated 9 months ago
xbyym / StableWorld
StableWorld: Towards Stable and Consistent Long Interactive Video Generation
☆95 • Mar 18, 2026 • Updated 2 months ago
rkzheng99 / ViLLa
Video Reasoning Segmentation
☆27 • Nov 29, 2024 • Updated last year
fujiso / SODA
SODA: Story Oriented Dense Video Captioning Evaluation Framework
☆14 • May 3, 2024 • Updated 2 years ago
TIGER-AI-Lab / Context-Forcing
Consistent Autoregressive Video Generation with Long Context
☆88 • Feb 6, 2026 • Updated 4 months ago
qiujihao19 / LongVideo-R1
[CVPR 2026] LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding
☆47 • Feb 28, 2026 • Updated 3 months ago
ealicesora / Awesome-Autoregressive-Video-Diffusion
Collection of forcing related autoregressive video Gen
☆98 • Mar 31, 2026 • Updated 2 months ago
nailwatts / FNIN
FNIN: A Fourier Neural Operator-based Numerical Integration Network for Surface-form-gradients
☆14 • Jan 22, 2025 • Updated last year
OmniMMI / OmniMMI
[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
☆23 • Apr 10, 2026 • Updated last month
Yangr116 / VST
Visual Spatial Tuning
☆197 • Mar 25, 2026 • Updated 2 months ago
Jayce1kk / SpaceVLLM
SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability
☆17 • May 8, 2025 • Updated last year
WayneJin0918 / SRUM
Official repo of paper "SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models". A post-training framework that creates a cost-e…
☆92 • Nov 26, 2025 • Updated 6 months ago
KaiyangLi1992 / Uni-LoRA
☆43 • Jan 16, 2026 • Updated 4 months ago
dengandong / GroundMoRe
☆18 • May 18, 2026 • Updated 2 weeks ago
EsYoon7 / RLHF-TLCR
[ACL'24 Findings] Official code for "TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback"
☆12 • Dec 6, 2024 • Updated last year
weijielyu / FaceCam
[CVPR 2026] FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning
☆56 • Mar 26, 2026 • Updated 2 months ago
Lliar-liar / Daily-Omni
This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities
☆42 • Apr 28, 2026 • Updated last month
JohnZhan2023 / PerpetualWonder
☆71 • Apr 12, 2026 • Updated last month
AVoCaDO-Captioner / AVoCaDO
https://avocado-captioner.github.io/
☆36 • Oct 16, 2025 • Updated 7 months ago
bobhash / Google-Streetview-Panoramas-Collection
Mini library for collecting images from google streets view. Generally designed for collecting datasets for ML
☆11 • Nov 15, 2021 • Updated 4 years ago
JoeLeelyf / OVO-Bench
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆143 • Jul 24, 2025 • Updated 10 months ago
hanxunyu / VisionTrim
[ICLR 2026] Official code repository for "⚡️VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration"
☆49 • Feb 24, 2026 • Updated 3 months ago