☆171Apr 27, 2026Updated last month
Alternatives and similar repositories for EasyVideoR1
Users that are interested in EasyVideoR1 are comparing it to the libraries listed below.
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training.☆91Dec 24, 2025Updated 5 months ago
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning☆35Jan 14, 2026Updated 4 months ago
- [CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression☆67Updated this week
- This is the official repo of MLLM-CL.☆65May 16, 2026Updated 3 weeks ago
- [ACL 2025] PruneVid: Visual Token Pruning for Efficient Video Large Language Models☆71May 15, 2025Updated last year
- TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models☆17Jan 2, 2025Updated last year
- ☆12Dec 6, 2024Updated last year
- Implementation of Variational Hierarchical User-based Conversation Model☆10Jul 2, 2021Updated 4 years ago
- Streaming Thinking for VideoLLM Streaming Video Understanding☆104May 21, 2026Updated 2 weeks ago
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos"☆21Dec 14, 2025Updated 5 months ago
- Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning☆43Mar 2, 2026Updated 3 months ago
- [CVPR 2025] VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning☆13Jun 7, 2025Updated last year
- [ACL 2025] Official code for "Learning to Reason from Feedback at Test-Time".☆13May 16, 2025Updated last year
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆18Apr 2, 2025Updated last year
- [ICCV 2025] "Fine-grained Spatiotemporal Grounding on Egocentric Videos"☆24Nov 23, 2025Updated 6 months ago
- Awesome paper for multi-modal llm with grounding ability☆20Oct 11, 2025Updated 7 months ago
- Improving Neural Text Generation with Reinforcement Learning☆23Jan 13, 2021Updated 5 years ago
- A benchmark for the task of translation suggestion☆60Jun 23, 2022Updated 3 years ago
- ☆19Oct 28, 2025Updated 7 months ago
- [CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models☆87Apr 20, 2026Updated last month
- ☆12Feb 13, 2025Updated last year
- This repository contains the code for our ICML 2025 paper, LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection🎉☆26May 29, 2025Updated last year
- A curated list of Story Ending Generation models; DASFAA'22: Incorporating Commonsense Knowledge into Story Ending Generation via Heterog…☆14May 12, 2022Updated 4 years ago
- Code for paper: Reinforced Vision Perception with Tools☆72Oct 3, 2025Updated 8 months ago
- InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models☆108Apr 20, 2026Updated last month
- A Holistic Embodied Cognition Benchmark☆19Apr 3, 2025Updated last year
- ☆23Mar 31, 2023Updated 3 years ago
- Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding☆40Mar 16, 2025Updated last year
- Survey of small language models☆18Jul 23, 2024Updated last year
- [EMNLP 2025 Findings] Retrieval-Augmented Machine Translation with Unstructured Knowledge☆15Sep 4, 2025Updated 9 months ago
- [ICLR 2026] "VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?", Yuanxin Liu, Kun Ouyang, Haoning Wu, Yi Liu, L…☆39Jan 30, 2026Updated 4 months ago
- [NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning"☆33Oct 20, 2025Updated 7 months ago
- ☆52Sep 13, 2024Updated last year
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".☆57May 25, 2025Updated last year
- Generating Easy-to-Understand Referring Expressions for Target Identifications☆18Aug 30, 2019Updated 6 years ago
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆65Jul 22, 2025Updated 10 months ago
- [NeurIPS 2025] HoliTom: Holistic Token Merging for Fast Video Large Language Models☆81Oct 10, 2025Updated 7 months ago
- ☆16Feb 27, 2025Updated last year
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning"☆63Oct 9, 2025Updated 7 months ago