iLearn-Lab/CVPR25-LION-FS

[CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant

☆29

Alternatives and similar repositories for CVPR25-LION-FS

Users who are interested in CVPR25-LION-FS are comparing it to the repositories listed below.

iLearn-Lab / CVPR26-HiconAgent
[CVPR 2026] HiconAgent: History Context-aware Policy Optimization for GUI Agents
☆31 • Mar 9, 2026 • Updated 4 months ago
iLearn-Lab / ACM-MM25-PUMA
[ACM MM 2025] PUMA: Layer-Pruned Language Model for Efficient Unified Multimodal Retrieval with Modality-Adaptive Learning
☆18 • Jun 6, 2026 • Updated last month
JiuTian-VL / UniEmo
[TIP 2026] UniEmo: Unifying Emotional Understanding and Generation with Learnable Expert Queries
☆34 • May 7, 2026 • Updated 2 months ago
iLearn-Lab / ACL26-PersonalAlign
[ACL 2026 main] PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records
☆21 • Apr 11, 2026 • Updated 3 months ago
iLearn-Lab / MM25-EmoSym
[ACM MM 2025] Official repository of "EmoSym: A Symbiotic Framework for Unified Emotional Understanding and Generation via Latent Reasoni…
☆30 • May 6, 2026 • Updated 2 months ago
OmniMMI / M4
[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
☆18 • Apr 2, 2025 • Updated last year
pro-assist / ProAssist
☆20 • Jul 21, 2025 • Updated last year
JiuTian-VL / SimpAgent
[ICCV 2025 Highlight] Less is More: Empowering GUI Agent with Context-Aware Simplification
☆48 • Mar 12, 2026 • Updated 4 months ago
iLearn-Lab / CVPR26-OptimusVLA
[CVPR 2026] Official Implementation for Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Effi…
☆25 • Updated this week
JoeLeelyf / OVO-Bench
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆153 • Jul 24, 2025 • Updated 11 months ago
OzymandiasChen / PCGR
Prototype Conditioned Generative Replay for Continual Learning in NLP - NAACL 2025
☆26 • Apr 9, 2026 • Updated 3 months ago
iLearn-Lab / AAAI26-SemanticVLA
[AAAI 2026 Oral] SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation
☆70 • Apr 5, 2026 • Updated 3 months ago
HumanMLLM / ViSpeak
(ICCV2025) Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos"
☆52 • Jul 1, 2025 • Updated last year
MCG-NJU / StreamForest
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆131 • Nov 4, 2025 • Updated 8 months ago
iLearn-Lab / NeurIPS25-CogVLA
[NeurIPS 2025] CogVLA: Cognition-Aligned Vision-Language-Action Models via Instruction-Driven Routing & Sparsification
☆185 • Jun 17, 2026 • Updated last month
patrick-tssn / VSTAR
[ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information
☆16 • Oct 27, 2024 • Updated last year
daeunni / Video-Skill-CoT
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Findings]"
☆18 • Aug 27, 2025 • Updated 10 months ago
aurooj / SHG-VQA
Learning Situation Hyper-Graphs for Video Question Answering
☆23 • Feb 16, 2024 • Updated 2 years ago
Steve2457 / Context-Agent
[ACL 2026] Context-Agent: Dynamic Discourse Trees for Non-Linear Dialogue
☆24 • Apr 14, 2026 • Updated 3 months ago
dibschat / ProVideLLM
[ICCV 2025] Streaming VideoLLMs for Real-time Procedural Video Understanding
☆18 • Oct 26, 2025 • Updated 8 months ago
TencentYoutuResearch / HighlightDetection-CLC
Code for CVPR2023 paper "Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies"
☆18 • Mar 21, 2023 • Updated 3 years ago
SooLab / EyeWO
[NeurIPS2025] The official PyTorch implementation of the "Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video".
☆34 • Dec 25, 2025 • Updated 6 months ago
xiZAIzai / JailExpert
This is the official repository for JailExpert
☆23 • Sep 9, 2025 • Updated 10 months ago
RayYoh / GaussianCross
[MM'25] GaussianCross: Cross-modal Self-supervised 3D Representation Learning via Gaussian Splatting
☆19 • Nov 24, 2025 • Updated 7 months ago
wlin-at / MAXI
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023)
☆31 • Sep 5, 2023 • Updated 2 years ago
Yuzhuo-Dang / MLLMRec
☆23 • Apr 26, 2026 • Updated 2 months ago
OzymandiasChen / ActorMind
ActorMind: Emulating Human Actor Reasoning for Speech Role-Playing - ACL Findings 2026
☆25 • Updated this week
fansunqi / VideoTool
Official Repository for NeurIPS'25 Paper "Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task"
☆23 • May 18, 2026 • Updated 2 months ago
SII-dannyXSC / Human2Robot
AAAI 2026 Oral
☆18 • Dec 23, 2025 • Updated 6 months ago
gccnlp / Light-PEFT
[ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
☆13 • Sep 2, 2024 • Updated last year
alibaba / Deep-Vision
☆37 • Apr 7, 2022 • Updated 4 years ago
xinding-bot / StreamMind
[ICCV 2025] StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition
☆72 • Jun 25, 2025 • Updated last year
JiuTian-VL / MoME
[NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
☆85 • Dec 27, 2025 • Updated 6 months ago
hmxiong / StreamChat
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆111 • Mar 14, 2025 • Updated last year
apple / ml-streambridge
☆40 • Nov 5, 2025 • Updated 8 months ago
JiuTian-VL / Optimus-3
Official Implementation for Optimus-3: Dual-Router Aligned Mixture-of-Experts Agent with Dual-Granularity Reasoning-Aware Policy Optimiza…
☆69 • Apr 14, 2026 • Updated 3 months ago
SaraGhazanfari / CoF
Chain-of-Frames [CVPR 2026]
☆40 • Jul 2, 2025 • Updated last year
yellow-binary-tree / MMDuet
Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact…
☆44 • Feb 5, 2025 • Updated last year
iLearn-Lab / CVPR25-Optimus-2
[CVPR 2025] Official Implementation for Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy
☆27 • Jun 17, 2025 • Updated last year