Implementation of the MC-ViT model from the paper "Memory Consolidation Enables Long-Context Video Understanding"
☆27 · Jan 17, 2026 · Updated last month
Alternatives and similar repositories for MC-ViT
Users who are interested in MC-ViT are comparing it to the libraries listed below.
- Code for our ACL 2025 paper "Language Repository for Long Video Understanding" ☆34 · Jun 17, 2024 · Updated last year
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo… ☆15 · Nov 11, 2024 · Updated last year
- Official code for DAM: Dynamic Adapter Merging for Continual Video QA Learning ☆14 · Apr 25, 2024 · Updated last year
- [ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics ☆38 · Sep 10, 2025 · Updated 5 months ago
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron… ☆17 · Nov 11, 2024 · Updated last year
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆12 · Jan 29, 2024 · Updated 2 years ago
- Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta ☆16 · Nov 11, 2024 · Updated last year
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023) ☆66 · Jun 7, 2024 · Updated last year
- ☆42 · Apr 7, 2024 · Updated last year
- Implementation of the report: on the domain robustness of prefix and prompt tuning ☆20 · Mar 10, 2022 · Updated 3 years ago
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer". ☆55 · Oct 21, 2025 · Updated 4 months ago
- rmp data ranking ☆13 · Nov 4, 2025 · Updated 4 months ago
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ☆56 · Jul 1, 2025 · Updated 8 months ago
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin… ☆21 · Feb 9, 2026 · Updated 3 weeks ago
- ☆21 · May 11, 2025 · Updated 9 months ago
- A simpler Pytorch + Zeta Implementation of the paper: "SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series… ☆28 · Nov 11, 2024 · Updated last year
- An experiment to see if we can process G2 reviews to extract topics from reviews ☆10 · Feb 5, 2024 · Updated 2 years ago
- Implementation of the "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with… ☆36 · Jan 31, 2026 · Updated last month
- Community Open Source Implementation of GPT4o in PyTorch ☆26 · Feb 9, 2026 · Updated last month
- Learning Representational Invariances for Data-Efficient Action Recognition ☆33 · Oct 26, 2021 · Updated 4 years ago
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024) ☆32 · Mar 29, 2024 · Updated last year
- High Security Surveillance Camera using OpenCV, Python & Arduino ☆12 · Jun 20, 2020 · Updated 5 years ago
- This project is an AI Recruitment System designed to accelerate the hiring process for HR and technical recruiters. ☆14 · Jan 3, 2025 · Updated last year
- Repository dedicated to developing a robust and modular framework for Multi-Agent Reinforcement Learning (MARL) algorithms. ☆13 · Mar 3, 2024 · Updated 2 years ago
- Implementation of the model from "Faster sorting algorithms discovered using deep reinforcement learning" that discovered an all-new ult… ☆11 · Aug 29, 2023 · Updated 2 years ago
- ☆13 · Apr 27, 2021 · Updated 4 years ago
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking ☆13 · Apr 12, 2023 · Updated 2 years ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents" ☆29 · Oct 23, 2025 · Updated 4 months ago
- This repository contains the registries for components, agents and services, the second part of the autonolas-v1 protocol. ☆15 · Updated this week
- Distribution-Aware Prompt Tuning for Vision-Language Models (ICCV 2023) ☆45 · Dec 11, 2023 · Updated 2 years ago
- 【CVPR'24】OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition ☆38 · Apr 27, 2024 · Updated last year
- This is the official implementation of ICCV 2025 "Flash-VStream: Efficient Real-Time Understanding for Long Video Streams" ☆273 · Oct 15, 2025 · Updated 4 months ago
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications ☆40 · Nov 11, 2024 · Updated last year
- Implementation of AutoRT: "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents" ☆42 · Nov 11, 2024 · Updated last year
- Paper list of Video LLM hallucination. Welcome to Star and Contribute! ☆20 · Feb 18, 2026 · Updated 2 weeks ago
- Kait's Site ☆14 · Sep 7, 2021 · Updated 4 years ago
- A platform aimed at creating websites that perform self-optimization ☆12 · May 4, 2024 · Updated last year
- Concurrent data extraction from unstructured text and images using AI models. ☆18 · Aug 10, 2025 · Updated 6 months ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling" ☆19 · Nov 3, 2025 · Updated 4 months ago