☆88Feb 14, 2026Updated 2 weeks ago
Alternatives and similar repositories for LongVideoAgent
Users who are interested in LongVideoAgent are comparing it to the repositories listed below
- ☆13Jun 4, 2025Updated 8 months ago
- ☆47Jan 31, 2026Updated last month
- [ICLR 2026] Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos☆26Jan 26, 2026Updated last month
- The official code for ICML 2024 "FedREDefense: Defending against Model Poisoning Attacks for Federated Learning using Model Update Recons…☆29Jun 6, 2024Updated last year
- [CVPR 2025] Official code for "Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation"☆65Jun 6, 2025Updated 8 months ago
- ☆11Jul 17, 2024Updated last year
- [ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models☆24Jan 1, 2026Updated 2 months ago
- code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"☆60Aug 23, 2024Updated last year
- The first open-domain closed-loop revisited benchmark for evaluating memory consistency and action control in world models.☆41Feb 10, 2026Updated 2 weeks ago
- ☆84Jan 4, 2026Updated last month
- Video Diffusion State Space Models☆19Mar 27, 2024Updated last year
- Official implementation of "PyVision-RL: Forging Open Agentic Vision Models via RL."☆61Updated this week
- [AAAI 2026] SlideTailor: Personalized Presentation Slide Generation for Scientific Papers☆43Jan 1, 2026Updated 2 months ago
- Official Repository of Personalized Visual Instruct Tuning☆34Mar 6, 2025Updated 11 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding"☆57Jan 23, 2026Updated last month
- [CVPR 2025] GaussHDR: High Dynamic Range Gaussian Splatting via Learning Unified 3D and 2D Local Tone Mapping☆42Oct 22, 2025Updated 4 months ago
- Official implementation of "RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics"☆63Jan 19, 2026Updated last month
- [ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models☆39Jun 14, 2025Updated 8 months ago
- ASTRA is an end-to-end system for synthesizing agentic trajectories and rule-verifiable environments for SFT and RL training, developed b…☆114Jan 30, 2026Updated last month
- Real-time face swap and one-click video deepfake with only a single image☆12Sep 13, 2024Updated last year
- Official implementation of FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment☆28Updated this week
- ☆14Feb 13, 2026Updated 2 weeks ago
- We explore the fascinating domain of text-to-image generation using the powerful capabilities of the Flux API. The objective is to trans…☆12Aug 14, 2024Updated last year
- A hand-gesture recognition system using the Doppler effect of ultrasound.☆11Mar 2, 2019Updated 7 years ago
- Create short vertical videos for TikTok, YouTube Shorts, and Instagram Reels using AI. Fully automated pipeline with traceability. 🚀🎥☆21Updated this week
- A large-scale training and benchmarking framework for rPPG.☆10Nov 26, 2024Updated last year
- [ICLR2026] The code for "Interp3D: Correspondence-Aware Interpolation for Generative Textured 3D Morphing."☆24Jan 21, 2026Updated last month
- [ACM MobiSys 2024 Demo] Image-based Indoor Localization using Object Detection and LSTM☆12Feb 12, 2026Updated 2 weeks ago
- (AAAI 2026) OSVBench, a new benchmark for evaluating Large Language Models (LLMs) in generating complete specification code pertaining to…☆13May 13, 2025Updated 9 months ago
- ComfyUI custom node implementation of VideoMaMa for video matting with mask conditioning.☆34Feb 9, 2026Updated 3 weeks ago
- [WACV 2025] Follow-Your-Handle: This repo is the official implementation of "MagicStick: Controllable Video Editing via Control Handle Tr…☆98Dec 7, 2023Updated 2 years ago
- ICML2025, I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models☆193Sep 7, 2025Updated 5 months ago
- An open source code repository of driving world models, with training, inferencing, evaluation tools, and pretrained checkpoints.☆370Jun 19, 2025Updated 8 months ago
- Official code for ECCV2024 paper: GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal☆104Nov 25, 2025Updated 3 months ago
- [ICLR2026] Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models☆115Jan 30, 2026Updated last month
- ☆75Dec 23, 2025Updated 2 months ago
- Official implementation of "ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation"☆89Dec 24, 2025Updated 2 months ago
- Official code for paper: F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Aggregative Gaussian Splatting☆50Mar 11, 2025Updated 11 months ago